Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								93c10be762 
								
							 
						 
						
							
							
								
								LLM: Support hybrid convert for DeepSeek V3/R1 ( #12834 )  
							
							 
							
							
							
							
							LLM: Support hybrid convert for DeepSeek V3/R1 
							
						 
						
							2025-02-19 11:31:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								637543e135 
								
							 
						 
						
							
							
								
								Update Ollama portable zip QuickStart with troubleshooting ( #12846 )  
							
							 
							
							
							
							
							* Update ollama portable zip quickstart with runtime configurations
* Small fix
* Update based on comments
* Small fix
* Small fix 
							
						 
						
							2025-02-19 11:04:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bde8acc303 
								
							 
						 
						
							
							
								
								[NPU] Update doc of gguf support ( #12837 )  
							
							 
							
							
							
						 
						
							2025-02-19 10:46:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e1809a6295 
								
							 
						 
						
							
							
								
								Update multimodal on vllm 0.6.6 ( #12816 )  
							
							 
							
							
							
							
							* add glm4v and minicpmv example
* fix 
							
						 
						
							2025-02-19 10:04:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								09150b6058 
								
							 
						 
						
							
							
								
								Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 ( #12832 )  
							
							 
							
							
							
							
							Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 with DeepseekV3Attention
and DeepseekV3MLP to XPU 
							
						 
						
							2025-02-18 13:34:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								09ed96082b 
								
							 
						 
						
							
							
								
								Add DeepSeek V3/R1 CPU example ( #12836 )  
							
							 
							
							
							
							
							Add DeepSeek V3/R1 CPU example for bf16 model 
							
						 
						
							2025-02-18 12:45:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8418450300 
								
							 
						 
						
							
							
								
								optimize minicpm-o's tts part ( #12833 )  
							
							 
							
							
							
						 
						
							2025-02-17 14:53:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f7b5a093a7 
								
							 
						 
						
							
							
								
								Merge CPU & XPU Dockerfiles with Serving Images and Refactor ( #12815 )  
							
							 
							
							
							
							
							* Update Dockerfile
* Update Dockerfile
* Ensure scripts are executable
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* update
* Update Dockerfile
* remove inference-cpu and inference-xpu
* update README 
							
						 
						
							2025-02-17 14:23:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								eaec64baca 
								
							 
						 
						
							
							
								
								Update README.md ( #12826 )  
							
							 
							
							
							
						 
						
							2025-02-14 21:20:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									joan726 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								59e8e1e91e 
								
							 
						 
						
							
							
								
								Added ollama_portablze_zip_quickstart.zh-CN.md ( #12822 )  
							
							 
							
							
							
						 
						
							2025-02-14 18:54:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a09552e59a 
								
							 
						 
						
							
							
								
								Update ollama quickstart ( #12823 )  
							
							 
							
							
							
						 
						
							2025-02-14 09:55:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f67986021c 
								
							 
						 
						
							
							
								
								Update download link for Ollama portable zip QuickStart ( #12821 )  
							
							 
							
							
							
							
							* Update download link for Ollama portable zip quickstart
* Update based on comments 
							
						 
						
							2025-02-13 17:48:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								16e63cbc18 
								
							 
						 
						
							
							
								
								Update readme ( #12820 )  
							
							 
							
							
							
						 
						
							2025-02-13 14:26:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								68414afcb9 
								
							 
						 
						
							
							
								
								Add initial QuickStart for Ollama portable zip ( #12817 )  
							
							 
							
							
							
							
							* Add initial quickstart for Ollama portable zip
* Small fix
* Fixed based on comments
* Small fix
* Add demo image for run ollama
* Update download link 
							
						 
						
							2025-02-13 13:18:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1083fe5508 
								
							 
						 
						
							
							
								
								Reenable pp and lightweight-serving serving on 0.6.6 ( #12814 )  
							
							 
							
							
							
							
							* reenable pp and lightweight serving on 066
* update readme
* update
* update tag 
							
						 
						
							2025-02-13 10:16:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								af693425f1 
								
							 
						 
						
							
							
								
								Upgrade to vLLM 0.6.6 ( #12796 )  
							
							 
							
							
							
							
							* init
* update engine init
* fix serving load_in_low_bit problem
* temp
* temp
* temp
* temp
* temp
* fix
* fixed
* done
* fix
* fix all arguments
* fix
* fix throughput script
* fix
* fix
* use official ipex-llm
* Fix readme
* fix
---------
Co-authored-by: hzjane <a1015616934@qq.com> 
							
						 
						
							2025-02-12 16:47:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f8ab833f74 
								
							 
						 
						
							
							
								
								support and optimize janus pro ( #12813 )  
							
							 
							
							
							
						 
						
							2025-02-12 15:07:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bd815a4d96 
								
							 
						 
						
							
							
								
								Update the base image of inference-cpp image to oneapi 2025.0.2 ( #12802 )  
							
							 
							
							
							
							
							* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile 
							
						 
						
							2025-02-12 14:15:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								73cfe293fa 
								
							 
						 
						
							
							
								
								add basic support for Baichuan-M1-14B-Instruct ( #12808 )  
							
							 
							
							
							
						 
						
							2025-02-11 17:27:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d093b75aa0 
								
							 
						 
						
							
							
								
								[NPU] Update driver installation in QuickStart ( #12807 )  
							
							 
							
							
							
						 
						
							2025-02-11 15:49:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b70ad902b4 
								
							 
						 
						
							
							
								
								Fix ipex-llm CPU linear dtype not match ( #12805 )  
							
							 
							
							
							
						 
						
							2025-02-11 10:34:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2701a9d1e3 
								
							 
						 
						
							
							
								
								Remove Migrated Workflows to Avoid Duplication and Confusion ( #12801 )  
							
							 
							
							
							
							
							* Delete .github/actions/llm directory
* Delete .github/workflows/release-ipex-llm.yaml
* Delete .github/workflows/llm-nightly-test.yml
* Delete .github/workflows/llm_unit_tests.yml
* Delete .github/workflows/llm-binary-build.yml
* Delete .github/workflows/llm_example_tests.yml
* Delete .github/workflows/llm_performance_tests.yml
* Delete .github/workflows/manually_build.yml
* Delete .github/workflows/manually_build_for_testing.yml
* Delete .github/workflows/release-pypi.yml 
							
						 
						
							2025-02-10 14:58:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								eb2df5ed70 
								
							 
						 
						
							
							
								
								common.h -> npu/npu_common.h ( #12800 )  
							
							 
							
							
							
						 
						
							2025-02-10 14:38:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e4ceb722b6 
								
							 
						 
						
							
							
								
								fix qwen2 vl ( #12798 )  
							
							 
							
							
							
						 
						
							2025-02-10 13:25:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3fee838b14 
								
							 
						 
						
							
							
								
								[NPU] Fix of c++ convert example ( #12797 )  
							
							 
							
							
							
						 
						
							2025-02-10 11:17:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								468d3f22fc 
								
							 
						 
						
							
							
								
								Rename NPU public example to llm-cli ( #12790 )  
							
							 
							
							
							
							
							* rename to llm-cli
* update readme 
							
						 
						
							2025-02-08 10:19:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e90a9ad196 
								
							 
						 
						
							
							
								
								[NPU] Support non-const parameter for decoder layers when keep_ir=True ( #12789 )  
							
							 
							
							
							
							
							* support layernorm=False for decoder layers
* rename to meet review
* fix style
* rename to const_parameter
* fix rebase error
* fix rebase error 
							
						 
						
							2025-02-08 09:58:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8aea5319bb 
								
							 
						 
						
							
							
								
								update more lora example ( #12785 )  
							
							 
							
							
							
						 
						
							2025-02-08 09:46:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fd28cf1672 
								
							 
						 
						
							
							
								
								Upgrade ipex-llm[cpp] to oneAPI 2025.0 on Windows ( #12778 )  
							
							 
							
							
							
							
							* Upgrade ipex-llm[cpp] to oneAPI 2025.0
* Fit oneapi pypi dependency on Windows for now 
							
						 
						
							2025-02-07 18:29:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ca1d7b7c2c 
								
							 
						 
						
							
							
								
								[NPU] Support qwen models with cos_sin_input=True ( #12788 )  
							
							 
							
							
							
						 
						
							2025-02-07 16:41:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6ff7faa781 
								
							 
						 
						
							
							
								
								[NPU] Update deepseek support in python examples and quickstart ( #12786 )  
							
							 
							
							
							
						 
						
							2025-02-07 11:25:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b4f2be2b09 
								
							 
						 
						
							
							
								
								[NPU] Update C++ example to add DeepSeek-R1 ( #12787 )  
							
							 
							
							
							
						 
						
							2025-02-07 11:23:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d0d9c9d636 
								
							 
						 
						
							
							
								
								remove load_in_8bit usage as it has not been supported for a long time ( #12779 )  
							
							 
							
							
							
						 
						
							2025-02-07 11:21:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9e9b6c9f2b 
								
							 
						 
						
							
							
								
								Fix cpu serving docker image ( #12783 )  
							
							 
							
							
							
						 
						
							2025-02-07 11:12:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b4c9e23f73 
								
							 
						 
						
							
							
								
								fix galore and peft finetune example ( #12776 )  
							
							 
							
							
							
						 
						
							2025-02-06 16:36:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c0d6b282b8 
								
							 
						 
						
							
							
								
								fix lisa finetune example ( #12775 )  
							
							 
							
							
							
						 
						
							2025-02-06 16:35:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2e5f2e5dda 
								
							 
						 
						
							
							
								
								fix dpo finetune ( #12774 )  
							
							 
							
							
							
						 
						
							2025-02-06 16:35:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9697197f3e 
								
							 
						 
						
							
							
								
								fix qlora finetune example ( #12769 )  
							
							 
							
							
							
						 
						
							2025-02-06 11:18:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								094a25b740 
								
							 
						 
						
							
							
								
								[NPU] Expose parameter to control blob / IR save logic ( #12767 )  
							
							 
							
							
							
							
							* update api
* fix convert.py
* fix style
* remove unnecessary bin file
* fix style 
							
						 
						
							2025-02-06 10:07:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9c0daf6396 
								
							 
						 
						
							
							
								
								Fix readme links ( #12771 )  
							
							 
							
							
							
						 
						
							2025-02-05 19:24:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a1e7bfc638 
								
							 
						 
						
							
							
								
								Update Readme ( #12770 )  
							
							 
							
							
							
						 
						
							2025-02-05 19:19:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0237ffb302 
								
							 
						 
						
							
							
								
								refactor xpu linear forward ( #12768 )  
							
							 
							
							
							
						 
						
							2025-02-05 17:40:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Danciu Georgian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								413d6c2b66 
								
							 
						 
						
							
							
								
								Update check.py removing a twice defined function ( #12760 )  
							
							 
							
							
							
							
							Remove duplicate function 
							
						 
						
							2025-02-05 11:37:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								184adb2653 
								
							 
						 
						
							
							
								
								Small fix to MiniCPM-o-2_6 GPU example ( #12766 )  
							
							 
							
							
							
						 
						
							2025-02-05 11:32:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ee809e71df 
								
							 
						 
						
							
							
								
								add troubleshooting section ( #12755 )  
							
							 
							
							
							
						 
						
							2025-01-26 11:03:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5fb87d7486 
								
							 
						 
						
							
							
								
								remove ${HF_TOKEN} ( #12742 )  
							
							 
							
							
							
						 
						
							2025-01-26 10:31:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xiangyu Tian 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f924880694 
								
							 
						 
						
							
							
								
								vLLM: Fix vLLM-CPU docker image ( #12741 )  
							
							 
							
							
							
						 
						
							2025-01-24 10:00:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								69f13c78b8 
								
							 
						 
						
							
							
								
								[NPU] Update layernorm node on MTL/ARL ( #12738 )  
							
							 
							
							
							
							
							* Update layernorm node on MTL/ARL
* Fix on style 
							
						 
						
							2025-01-23 17:25:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d11f257ee7 
								
							 
						 
						
							
							
								
								Add GPU example for MiniCPM-o-2_6 ( #12735 )  
							
							 
							
							
							
							
							* Add init example for omni mode
* Small fix
* Small fix
* Add chat example
* Remove legacy link
* Further update link
* Add readme
* Small fix
* Update main readme link
* Update based on comments
* Small fix
* Small fix
* Small fix 
							
						 
						
							2025-01-23 16:10:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								dcca522618 
								
							 
						 
						
							
							
								
								Remove sdpa available patch ( #12734 )  
							
							 
							
							
							
						 
						
							2025-01-22 17:22:28 +08:00