Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2d08155513 
								
							 
						 
						
							
							
								
								remove bmm, which is only required in ipex 2.0 ( #12630 )  
							
							 
							
							
							
						 
						
							2024-12-27 17:28:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f17ccfa61a 
								
							 
						 
						
							
							
								
								[NPU] Fix save-load usage of minicpm models ( #12628 )  
							
							 
							
							
							
						 
						
							2024-12-27 15:56:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c72a5db757 
								
							 
						 
						
							
							
								
								remove unused code again ( #12624 )  
							
							 
							
							
							
						 
						
							2024-12-27 14:17:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								46eeab4479 
								
							 
						 
						
							
							
								
								[NPU] Fix regression caused by layer_norm change ( #12627 )  
							
							 
							
							
							
						 
						
							2024-12-27 14:08:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								90f6709486 
								
							 
						 
						
							
							
								
								remove pipeline examples ( #12626 )  
							
							 
							
							
							
						 
						
							2024-12-27 13:42:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5f04ed7254 
								
							 
						 
						
							
							
								
								[NPU] Update prompt format for baichuan2-pipeline ( #12625 )  
							
							 
							
							
							
						 
						
							2024-12-27 11:30:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								34dbdb8ee3 
								
							 
						 
						
							
							
								
								small fix ( #12623 )  
							
							 
							
							
							
						 
						
							2024-12-27 10:19:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								55ce091242 
								
							 
						 
						
							
							
								
								Add GLM4-Edge-V GPU example ( #12596 )  
							
							 
							
							
							
							
							* Add GLM4-Edge-V examples
* polish readme
* revert wrong changes
* polish readme
* polish readme
* little polish in reference info and indent
* Small fix and sample output updates
* Update main readme
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-12-27 09:40:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								796ee571a5 
								
							 
						 
						
							
							
								
								[NPU doc] Update verified platforms ( #12621 )  
							
							 
							
							
							
						 
						
							2024-12-26 17:39:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bbdbbb0d88 
								
							 
						 
						
							
							
								
								[NPU] Compatible with other third-party models like auto-round ( #12620 )  
							
							 
							
							
							
							
							* support third party model
* simplify code
* fix style
* fix sym int4 GW
* code refactor
* fix 
							
						 
						
							2024-12-26 17:25:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a9abde0b5d 
								
							 
						 
						
							
							
								
								support passing attn_scale to sdpa ( #12619 )  
							
							 
							
							
							
						 
						
							2024-12-26 16:58:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								40a7d2b4f0 
								
							 
						 
						
							
							
								
								Consolidated C-Eval Benchmark Guide for Single-GPU and Multi-GPU Environments ( #12618 )  
							
							 
							
							
							
							
							* run c-eval on multi-GPUs
* Update README.md 
							
						 
						
							2024-12-26 15:23:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ccc4055058 
								
							 
						 
						
							
							
								
								[NPU] Update prompt format for baichuan2 ( #12615 )  
							
							 
							
							
							
							
							* Update baichuan2.py
* style fix 
							
						 
						
							2024-12-26 11:41:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1604b4ead8 
								
							 
						 
						
							
							
								
								small fix ( #12616 )  
							
							 
							
							
							
						 
						
							2024-12-26 11:35:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d841e1dc0d 
								
							 
						 
						
							
							
								
								[NPU] update convert script based on latest usage ( #12617 )  
							
							 
							
							
							
						 
						
							2024-12-26 11:23:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ef585d3360 
								
							 
						 
						
							
							
								
								Polish Readme for ModelScope-related examples ( #12603 )  
							
							 
							
							
							
						 
						
							2024-12-26 10:52:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								28737c250c 
								
							 
						 
						
							
							
								
								Update Dockerfile ( #12585 )  
							
							 
							
							
							
						 
						
							2024-12-26 10:20:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a596f1ae5f 
								
							 
						 
						
							
							
								
								remove bigdl-llm test to fix langchain UT ( #12613 )  
							
							 
							
							
							
						 
						
							2024-12-26 10:17:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9e895f04ec 
								
							 
						 
						
							
							
								
								[NPU] fix npu save ( #12614 )  
							
							 
							
							
							
							
							* fix npu save
* update 
							
						 
						
							2024-12-26 09:21:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Mingqi Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0477fe6480 
								
							 
						 
						
							
							
								
								[docs] Update doc for latest open webui: 0.4.8 ( #12591 )  
							
							 
							
							
							
							
							* Update open webui doc
* Resolve comments 
							
						 
						
							2024-12-26 09:18:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6249c1e373 
								
							 
						 
						
							
							
								
								rewrite llama optimization ( #12609 )  
							
							 
							
							
							
						 
						
							2024-12-25 17:04:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5f5ac8a856 
								
							 
						 
						
							
							
								
								fix llama related import ( #12611 )  
							
							 
							
							
							
						 
						
							2024-12-25 16:23:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								54b1d7d333 
								
							 
						 
						
							
							
								
								Update README.zh-CN.md ( #12610 )  
							
							 
							
							
							
						 
						
							2024-12-25 15:38:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4e6b9d804f 
								
							 
						 
						
							
							
								
								add compresskv back for mistral ( #12607 )  
							
							 
							
							
							
							
							* add compresskv back for mistral
* fix
* fix 
							
						 
						
							2024-12-25 11:06:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									joan726 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9c9800be31 
								
							 
						 
						
							
							
								
								Update README.zh-CN.md ( #12570 )  
							
							 
							
							
							
						 
						
							2024-12-24 20:32:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4135b895b3 
								
							 
						 
						
							
							
								
								refactor chatglm2, internlm, stablelm and qwen ( #12604 )  
							
							 
							
							
							
						 
						
							2024-12-24 18:18:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								073f936c37 
								
							 
						 
						
							
							
								
								refactor mistral and phi3 ( #12605 )  
							
							 
							
							
							
						 
						
							2024-12-24 17:52:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								45f8f72a28 
								
							 
						 
						
							
							
								
								[NPU] Fix minicpm on MTL ( #12599 )  
							
							 
							
							
							
						 
						
							2024-12-24 15:37:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ad2dc965c5 
								
							 
						 
						
							
							
								
								refactor mllama, gpt2 and internvl ( #12602 )  
							
							 
							
							
							
						 
						
							2024-12-24 14:18:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7aaf02f602 
								
							 
						 
						
							
							
								
								refactor baichuan, glm4 and minicpm3 ( #12600 )  
							
							 
							
							
							
						 
						
							2024-12-24 14:16:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c410d9cf73 
								
							 
						 
						
							
							
								
								[NPU] support asym_int4 for baichuan ( #12576 )  
							
							 
							
							
							
							
							* add npu support for baichuan
* Update baichuan_mp.py
* Update baichuan_mp.py 
							
						 
						
							2024-12-24 09:17:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								098eb335b2 
								
							 
						 
						
							
							
								
								refactor sd 1.5 and qwen2-vl and fix ( #12590 )  
							
							 
							
							
							
						 
						
							2024-12-20 17:34:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b050368efc 
								
							 
						 
						
							
							
								
								refactor yuan2 and starcoder2 and fix ( #12589 )  
							
							 
							
							
							
						 
						
							2024-12-20 16:41:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6ea8033635 
								
							 
						 
						
							
							
								
								refactor glm edge ( #12588 )  
							
							 
							
							
							
						 
						
							2024-12-20 15:36:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b0338c5529 
								
							 
						 
						
							
							
								
								Add --modelscope option for glm-v4 MiniCPM-V-2_6 glm-edge and internvl2 ( #12583 )  
							
							 
							
							
							
							
							* Add --modelscope option for glm-v4 and MiniCPM-V-2_6
* glm-edge
* minicpm-v-2_6:don't use model_hub=modelscope when use lowbit; internvl2
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2024-12-20 13:54:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f3b5fad3be 
								
							 
						 
						
							
							
								
								refactor qwen2 and llama3 ( #12587 )  
							
							 
							
							
							
						 
						
							2024-12-20 13:25:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								51ff9ebd8a 
								
							 
						 
						
							
							
								
								Upgrade oneccl version to 0.0.6.3 ( #12560 )  
							
							 
							
							
							
							
							* Update Dockerfile
* Update Dockerfile
* Update start-vllm-service.sh 
							
						 
						
							2024-12-20 09:29:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								47da3c999f 
								
							 
						 
						
							
							
								
								Add --modelscope in GPU examples for minicpm, minicpm3, baichuan2 ( #12564 )  
							
							 
							
							
							
							
							* Add --modelscope for more models
* minicpm
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2024-12-19 17:25:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3eeb02f1be 
								
							 
						 
						
							
							
								
								support Megrez-3B-Omni ( #12582 )  
							
							 
							
							
							
						 
						
							2024-12-19 17:23:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4e7e988f70 
								
							 
						 
						
							
							
								
								[NPU] Fix MTL and ARL support ( #12580 )  
							
							 
							
							
							
						 
						
							2024-12-19 16:55:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								80f2fdc37b 
								
							 
						 
						
							
							
								
								optimize new minicpm model ( #12579 )  
							
							 
							
							
							
						 
						
							2024-12-19 14:22:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4540424271 
								
							 
						 
						
							
							
								
								optimize siglip attention again ( #12578 )  
							
							 
							
							
							
						 
						
							2024-12-19 13:40:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e0921f80c1 
								
							 
						 
						
							
							
								
								padding mask on torch side ( #12577 )  
							
							 
							
							
							
						 
						
							2024-12-19 10:53:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								47e90a362f 
								
							 
						 
						
							
							
								
								Add --modelscope in GPU examples for glm4, codegeex2, qwen2 and qwen2.5  ( #12561 )  
							
							 
							
							
							
							
							* Add --modelscope for more models
* improve readme
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2024-12-19 10:00:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								28e81fda8e 
								
							 
						 
						
							
							
								
								Replace runner doc in ollama quickstart ( #12575 )  
							
							 
							
							
							
						 
						
							2024-12-18 19:05:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f7a2bd21cf 
								
							 
						 
						
							
							
								
								Update ollama and llama.cpp readme ( #12574 )  
							
							 
							
							
							
						 
						
							2024-12-18 17:33:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e2ae42929a 
								
							 
						 
						
							
							
								
								small fix ( #12573 )  
							
							 
							
							
							
						 
						
							2024-12-18 15:48:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a4eb561f36 
								
							 
						 
						
							
							
								
								optimize siglip attention on arc ( #12569 )  
							
							 
							
							
							
						 
						
							2024-12-18 14:19:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1a2ab12876 
								
							 
						 
						
							
							
								
								[NPU] support asym_int4 for minicpm ( #12567 )  
							
							 
							
							
							
						 
						
							2024-12-18 10:55:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6e801bc4e1 
								
							 
						 
						
							
							
								
								Update readme ( #12565 )  
							
							 
							
							
							
						 
						
							2024-12-18 09:33:16 +08:00