Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e946127613 
								
							 
						 
						
							
							
								
								glm 4v 1st sdp for vision ( #12904 )  
							
							 
							
							
							
							
							* glm4v 1st sdp
* update glm4v example
* meet code review
* fix style 
							
						 
						
							2025-02-28 13:23:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5c100ac105 
								
							 
						 
						
							
							
								
								Add ENTRYPOINT to Dockerfile to auto-start vllm service on container launch (for CVTE customer) ( #12901 )  
							
							 
							
							
							
							
							* Add ENTRYPOINT to Dockerfile to auto-start service on container launch (for CVTE client)
* Update start-vllm-service.sh
* Update README.md
* Update README.md
* Update start-vllm-service.sh
* Update README.md 
							
						 
						
							2025-02-27 17:33:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								be1f073866 
								
							 
						 
						
							
							
								
								add fuse moe optimization for moonlight ( #12898 )  
							
							 
							
							
							
						 
						
							2025-02-27 09:15:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ad65e2b03a 
								
							 
						 
						
							
							
								
								Update README.md ( #12900 )  
							
							 
							
							
							
						 
						
							2025-02-27 08:30:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5faba06409 
								
							 
						 
						
							
							
								
								simple optimization for moonlight moe decoding forward ( #12891 )  
							
							 
							
							
							
						 
						
							2025-02-25 16:18:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ae9f5320da 
								
							 
						 
						
							
							
								
								vLLM CPU: Fix Triton Version to Resolve Related Error ( #12893 )  
							
							 
							
							
							
						 
						
							2025-02-25 15:00:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ab3fc66eb7 
								
							 
						 
						
							
							
								
								optimize attention part of moonlight-14B-A3B ( #12886 )  
							
							 
							
							
							
						 
						
							2025-02-25 09:38:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								dd30d12cb6 
								
							 
						 
						
							
							
								
								Fix serving-cpu image: setuptools-scm requires setuptools>=61 ( #12876 )  
							
							 
							
							
							
							
							* setuptools-scm requires setuptools>=61
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile 
							
						 
						
							2025-02-25 09:10:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								06694ba61a 
								
							 
						 
						
							
							
								
								Further fix portable zip file link ( #12885 )  
							
							 
							
							
							
						 
						
							2025-02-24 18:06:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								671ddfd847 
								
							 
						 
						
							
							
								
								Update wrong file name for portable zip quickstart ( #12883 )  
							
							 
							
							
							
						 
						
							2025-02-24 17:52:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a9c8e73a77 
								
							 
						 
						
							
							
								
								Update llama.cpp Prerequisites guide regarding oneAPI 2025.0 ( #12881 )  
							
							 
							
							
							
							
							* Update llama.cpp Prerequisites guide regarding oneAPI 2025.0
* Update based on comments
* Small fix
* Small fix 
							
						 
						
							2025-02-24 16:32:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4f2f92afa3 
								
							 
						 
						
							
							
								
								Update inference-cpp docker ( #12882 )  
							
							 
							
							
							
							
							* remove unused run.py
* add WORKDIR /llm 
							
						 
						
							2025-02-24 14:32:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3f6ecce508 
								
							 
						 
						
							
							
								
								support using xgrammar to get json output ( #12870 )  
							
							 
							
							
							
						 
						
							2025-02-24 14:10:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								afad979168 
								
							 
						 
						
							
							
								
								Add Apache 2.0 License Information in Dockerfile to Comply with OSPDT Requirements ( #12878 )  
							
							 
							
							
							
							
							* ospdt: add Header for Dockerfile
* OSPDT: add Header for Dockerfile
* OSPDT: add Header for Dockerfile
* OSPDT: add Header for Dockerfile 
							
						 
						
							2025-02-24 14:00:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								02ec313eab 
								
							 
						 
						
							
							
								
								Update README.md ( #12877 )  
							
							 
							
							
							
						 
						
							2025-02-24 09:59:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								10400abfb7 
								
							 
						 
						
							
							
								
								Fix CodeQL workflow ( #12875 )  
							
							 
							
							
							
							
							* Update codeql.yml
* Update codeql.yml 
							
						 
						
							2025-02-24 09:16:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1e00bed001 
								
							 
						 
						
							
							
								
								Add GPU example for Janus-Pro ( #12869 )  
							
							 
							
							
							
							
							* Add example for Janus-Pro
* Update model link
* Fixes
* Fixes
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2025-02-21 18:36:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								21d6a78be0 
								
							 
						 
						
							
							
								
								Update Ollama portable zip QuickStart to fit new version ( #12871 )  
							
							 
							
							
							
							
							* Update ollama portable zip quickstart
* Update demo images 
							
						 
						
							2025-02-21 17:54:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3ea5389a99 
								
							 
						 
						
							
							
								
								Fix vllm api_server v1/models error ( #12867 )  
							
							 
							
							
							
						 
						
							2025-02-21 11:08:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8077850452 
								
							 
						 
						
							
							
								
								[NPU GGUF] Add simple example ( #12853 )  
							
							 
							
							
							
						 
						
							2025-02-21 09:58:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								348dc8056d 
								
							 
						 
						
							
							
								
								Fix vllm gptq awq error ( #12863 )  
							
							 
							
							
							
							
							* fix gptq awq error
* fix python style 
							
						 
						
							2025-02-20 16:27:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a488981f3f 
								
							 
						 
						
							
							
								
								Ollama portable zip QuickStart tiny fix ( #12862 )  
							
							 
							
							
							
							
							* Tiny fix to ollama portable zip quickstart
* Tiny fix 
							
						 
						
							2025-02-20 14:11:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0f2706be42 
								
							 
						 
						
							
							
								
								Update CN Ollama portable zip QuickStart for troubleshooting & tips ( #12860 )  
							
							 
							
							
							
							
							* Small fix for english version
* Update CN ollama portable zip quickstart for troubleshooting & tips
* Small fix 
							
						 
						
							2025-02-20 11:32:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								38a682adb1 
								
							 
						 
						
							
							
								
								Update Readme ( #12855 )  
							
							 
							
							
							
						 
						
							2025-02-19 19:55:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4eed0c7d99 
								
							 
						 
						
							
							
								
								initial implementation for low_bit_loader vLLM ( #12838 )  
							
							 
							
							
							
							
							* initial
* add logic for handling tensor parallel models
* fix
* Add some comments
* add doc
* fix done 
							
						 
						
							2025-02-19 19:45:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c81b7fc003 
								
							 
						 
						
							
							
								
								Add Portable zip Linux QuickStart ( #12849 )  
							
							 
							
							
							
							
							* linux doc
* update
* Update ollama_portablze_zip_quickstart.md
* Update ollama_portablze_zip_quickstart.md
* Update ollama_portablze_zip_quickstart.zh-CN.md
* Update ollama_portablze_zip_quickstart.md
* meet code review
* update
* Add tips & troubleshooting sections for both Linux & Windows
* Rebase
* Fix based on comments
* Small fix
* Fix img
* Update table for linux
* Small fix
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2025-02-19 19:13:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b26409d53f 
								
							 
						 
						
							
							
								
								R1 Hybrid: Add Benchmark for DeepSeek R1 transformers example ( #12854 )  
							
							 
							
							
							
							
							* init
* fix
* update
* update
* fix
* fix 
							
						 
						
							2025-02-19 18:33:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5d041f9ebf 
								
							 
						 
						
							
							
								
								Add latest models list in ollama quickstart ( #12850 )  
							
							 
							
							
							
							
							* Add latest models list on ollama quickstart
* update oneapi version description
* move models list to ollama_portable_zip doc
* update CN readme 
							
						 
						
							2025-02-19 18:29:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								aee2db30f9 
								
							 
						 
						
							
							
								
								update sdp support ( #12847 )  
							
							 
							
							
							
						 
						
							2025-02-19 12:07:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								93c10be762 
								
							 
						 
						
							
							
								
								LLM: Support hybrid convert for DeepSeek V3/R1 ( #12834 )  
							
							 
							
							
							
							
							LLM: Support hybrid convert for DeepSeek V3/R1 
							
						 
						
							2025-02-19 11:31:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								637543e135 
								
							 
						 
						
							
							
								
								Update Ollama portable zip QuickStart with troubleshooting ( #12846 )  
							
							 
							
							
							
							
							* Update ollama portable zip quickstart with runtime configurations
* Small fix
* Update based on comments
* Small fix
* Small fix 
							
						 
						
							2025-02-19 11:04:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bde8acc303 
								
							 
						 
						
							
							
								
								[NPU] Update doc of gguf support ( #12837 )  
							
							 
							
							
							
						 
						
							2025-02-19 10:46:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e1809a6295 
								
							 
						 
						
							
							
								
								Update multimodal on vllm 0.6.6 ( #12816 )  
							
							 
							
							
							
							
							* add glm4v and minicpmv example
* fix 
							
						 
						
							2025-02-19 10:04:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								09150b6058 
								
							 
						 
						
							
							
								
								Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 ( #12832 )  
							
							 
							
							
							
							
							Initiate CPU-XPU Hybrid Inference for DeepSeek-R1, with DeepseekV3Attention
and DeepseekV3MLP placed on XPU 
							
						 
						
							2025-02-18 13:34:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								09ed96082b 
								
							 
						 
						
							
							
								
								Add DeepSeek V3/R1 CPU example ( #12836 )  
							
							 
							
							
							
							
							Add DeepSeek V3/R1 CPU example for bf16 model 
							
						 
						
							2025-02-18 12:45:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8418450300 
								
							 
						 
						
							
							
								
								optimize minicpm-o's tts part ( #12833 )  
							
							 
							
							
							
						 
						
							2025-02-17 14:53:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f7b5a093a7 
								
							 
						 
						
							
							
								
								Merge CPU & XPU Dockerfiles with Serving Images and Refactor ( #12815 )  
							
							 
							
							
							
							
							* Update Dockerfile
* Update Dockerfile
* Ensure scripts are executable
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* update
* Update Dockerfile
* remove inference-cpu and inference-xpu
* update README 
							
						 
						
							2025-02-17 14:23:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								eaec64baca 
								
							 
						 
						
							
							
								
								Update README.md ( #12826 )  
							
							 
							
							
							
						 
						
							2025-02-14 21:20:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									joan726 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								59e8e1e91e 
								
							 
						 
						
							
							
								
								Added ollama_portablze_zip_quickstart.zh-CN.md ( #12822 )  
							
							 
							
							
							
						 
						
							2025-02-14 18:54:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a09552e59a 
								
							 
						 
						
							
							
								
								Update ollama quickstart ( #12823 )  
							
							 
							
							
							
						 
						
							2025-02-14 09:55:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f67986021c 
								
							 
						 
						
							
							
								
								Update download link for Ollama portable zip QuickStart ( #12821 )  
							
							 
							
							
							
							
							* Update download link for Ollama portable zip quickstart
* Update based on comments 
							
						 
						
							2025-02-13 17:48:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								16e63cbc18 
								
							 
						 
						
							
							
								
								Update readme ( #12820 )  
							
							 
							
							
							
						 
						
							2025-02-13 14:26:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								68414afcb9 
								
							 
						 
						
							
							
								
								Add initial QuickStart for Ollama portable zip ( #12817 )  
							
							 
							
							
							
							
							* Add initial quickstart for Ollama portable zip
* Small fix
* Fixed based on comments
* Small fix
* Add demo image for run ollama
* Update download link 
							
						 
						
							2025-02-13 13:18:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1083fe5508 
								
							 
						 
						
							
							
								
								Reenable pp and lightweight-serving on 0.6.6 ( #12814 )  
							
							 
							
							
							
							
							* reenable pp and lightweight serving on 0.6.6
* update readme
* update
* update tag 
							
						 
						
							2025-02-13 10:16:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								af693425f1 
								
							 
						 
						
							
							
								
								Upgrade to vLLM 0.6.6 ( #12796 )  
							
							 
							
							
							
							
							* init
* update engine init
* fix serving load_in_low_bit problem
* temp
* temp
* temp
* temp
* temp
* fix
* fixed
* done
* fix
* fix all arguments
* fix
* fix throughput script
* fix
* fix
* use official ipex-llm
* Fix readme
* fix
---------
Co-authored-by: hzjane <a1015616934@qq.com> 
							
						 
						
							2025-02-12 16:47:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f8ab833f74 
								
							 
						 
						
							
							
								
								support and optimize janus pro ( #12813 )  
							
							 
							
							
							
						 
						
							2025-02-12 15:07:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bd815a4d96 
								
							 
						 
						
							
							
								
								Update the base image of inference-cpp image to oneapi 2025.0.2 ( #12802 )  
							
							 
							
							
							
							
							* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile 
							
						 
						
							2025-02-12 14:15:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								73cfe293fa 
								
							 
						 
						
							
							
								
								add basic support for Baichuan-M1-14B-Instruct ( #12808 )  
							
							 
							
							
							
						 
						
							2025-02-11 17:27:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d093b75aa0 
								
							 
						 
						
							
							
								
								[NPU] Update driver installation in QuickStart ( #12807 )  
							
							 
							
							
							
						 
						
							2025-02-11 15:49:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b70ad902b4 
								
							 
						 
						
							
							
								
								Fix ipex-llm CPU linear dtype not match ( #12805 )  
							
							 
							
							
							
						 
						
							2025-02-11 10:34:44 +08:00