Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								aa9e70a347 
								
							 
						 
						
							
							
								
								Update B580 Doc ( #12678 )  
							
							 
							
							
							
						 
						
							2025-01-08 22:36:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c6f57ad6ed 
								
							 
						 
						
							
							
								
								Update README.md ( #12677 )  
							
							 
							
							
							
						 
						
							2025-01-08 21:55:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2321e8d60c 
								
							 
						 
						
							
							
								
								Update README.md ( #12676 )  
							
							 
							
							
							
						 
						
							2025-01-08 21:54:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5c24276fc4 
								
							 
						 
						
							
							
								
								fix custom kernel registration ( #12674 )  
							
							 
							
							
							
						 
						
							2025-01-08 17:39:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a22a8c21bb 
								
							 
						 
						
							
							
								
								small fix and remove unused code about ipex ( #12671 )  
							
							 
							
							
							
						 
						
							2025-01-08 17:39:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c11f5f0fcd 
								
							 
						 
						
							
							
								
								also convert SdpaAttention in optimize_model ( #12673 )  
							
							 
							
							
							
						 
						
							2025-01-08 16:48:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2c23ce2553 
								
							 
						 
						
							
							
								
								Create a BattleMage QuickStart ( #12663 )  
							
							 
							
							
							
							
							* Create bmg_quickstart.md
* Update bmg_quickstart.md
* Clarify IPEX-LLM package installation based on use case
* Update bmg_quickstart.md
* Update bmg_quickstart.md 
							
						 
						
							2025-01-08 14:58:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7dd156d292 
								
							 
						 
						
							
							
								
								small fix and add comment ( #12670 )  
							
							 
							
							
							
						 
						
							2025-01-08 10:56:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ccf618ff4a 
								
							 
						 
						
							
							
								
								Remove all ipex usage ( #12666 )  
							
							 
							
							
							
						 
						
							2025-01-08 10:31:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									logicat 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0534d7254f 
								
							 
						 
						
							
							
								
								Update docker_cpp_xpu_quickstart.md ( #12667 )  
							
							 
							
							
							
						 
						
							2025-01-08 09:56:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5db6f9dcde 
								
							 
						 
						
							
							
								
								Add option with PyTorch 2.6 RC version for testing purposes ( #12668 )  
							
							 
							
							
							
							
							* Add option with PyTorch 2.6 RC version for testing purposes
* Small update 
							
						 
						
							2025-01-07 18:28:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f9ee7898c8 
								
							 
						 
						
							
							
								
								fix onednn dependency bug ( #12665 )  
							
							 
							
							
							
						 
						
							2025-01-07 16:26:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								29ad5c449e 
								
							 
						 
						
							
							
								
								refactor codegeex to remove ipex kernel usage ( #12664 )  
							
							 
							
							
							
						 
						
							2025-01-07 16:17:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								525b0ee991 
								
							 
						 
						
							
							
								
								[NPU] Tiny fixes on examples ( #12661 )  
							
							 
							
							
							
						 
						
							2025-01-07 14:30:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ebdf19fa7e 
								
							 
						 
						
							
							
								
								[NPU] Further fix saving of generation config ( #12657 )  
							
							 
							
							
							
							
							* Further fix saving of generation config
* Fix based on comments
* Small fix 
							
						 
						
							2025-01-07 13:53:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								381d448ee2 
								
							 
						 
						
							
							
								
								[NPU] Example & Quickstart updates ( #12650 )  
							
							 
							
							
							
							
							* Remove model with optimize_model=False in NPU verified models tables, and remove related example
* Remove experimental in run optimized model section title
* Unify model table order & example cmd
* Move embedding example to separate folder & update quickstart example link
* Add Quickstart reference in main NPU readme
* Small fix
* Small fix
* Move save/load examples under NPU/HF-Transformers-AutoModels
* Add low-bit and polish arguments for LLM Python examples
* Small fix
* Add low-bit and polish arguments for Multi-Model examples
* Polish argument for Embedding models
* Polish argument for LLM CPP examples
* Add low-bit and polish argument for Save-Load examples
* Add accuracy tuning tips for examples
* Update NPU quickstart accuracy tuning with low-bit optimizations
* Add save/load section to quickstart
* Update CPP example sample output to EN
* Add installation regarding cmake for CPP examples
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Unify max prompt length to 512
* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4
* Update based on comments
* Small fix 
							
						 
						
							2025-01-07 13:52:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ddc0ef3993 
								
							 
						 
						
							
							
								
								refactor device check and remove cohere/mixtral support ( #12659 )  
							
							 
							
							
							
						 
						
							2025-01-07 11:15:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ea65e4fecc 
								
							 
						 
						
							
							
								
								remove falcon support and related UT ( #12656 )  
							
							 
							
							
							
						 
						
							2025-01-07 09:26:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fae73eee79 
								
							 
						 
						
							
							
								
								[NPU] Support save npu quantized model without npu dependency ( #12647 )  
							
							 
							
							
							
							
							* support save awq
* load quantized model & save npu compiled model
* fix style
* update
* fix dll load issue
* update error message
* fix style 
							
						 
						
							2025-01-06 18:06:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								502461d836 
								
							 
						 
						
							
							
								
								remove unnecessary ipex kernel usage ( #12649 )  
							
							 
							
							
							
						 
						
							2025-01-03 16:45:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9f8b134889 
								
							 
						 
						
							
							
								
								add ipex-llm custom kernel registration ( #12648 )  
							
							 
							
							
							
						 
						
							2025-01-03 16:45:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0b377100c5 
								
							 
						 
						
							
							
								
								Add guide for save-load usage ( #12498 )  
							
							 
							
							
							
						 
						
							2025-01-03 16:30:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6711a48a36 
								
							 
						 
						
							
							
								
								Enable internvl2-8b on vllm ( #12645 )  
							
							 
							
							
							
						 
						
							2025-01-03 14:49:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8fd2dcba86 
								
							 
						 
						
							
							
								
								Add benchmark_util for transformers >= 4.47.0 ( #12644 )  
							
							 
							
							
							
						 
						
							2025-01-03 10:48:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								550fa01649 
								
							 
						 
						
							
							
								
								[Doc] Update ipex-llm ollama troubleshooting for v0.4.6 ( #12642 )  
							
							 
							
							
							
							
							* update ollama v0.4.6 troubleshooting
* update chinese ollama-doc 
							
						 
						
							2025-01-02 17:28:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8e5328e9b4 
								
							 
						 
						
							
							
								
								add disable opts for awq ( #12641 )  
							
							 
							
							
							
						 
						
							2025-01-02 15:45:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								62318964fa 
								
							 
						 
						
							
							
								
								Update llama example information ( #12640 )  
							
							 
							
							
							
							
							Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2025-01-02 13:48:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								81211fd010 
								
							 
						 
						
							
							
								
								remove unused code ( #12635 )  
							
							 
							
							
							
						 
						
							2025-01-02 13:31:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								534566e290 
								
							 
						 
						
							
							
								
								[NPU] Support minicpm-v with python cpp backend ( #12637 )  
							
							 
							
							
							
						 
						
							2025-01-02 11:13:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f289f68d57 
								
							 
						 
						
							
							
								
								small fix ( #12634 )  
							
							 
							
							
							
						 
						
							2024-12-30 17:14:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2d08155513 
								
							 
						 
						
							
							
								
								remove bmm, which is only required in ipex 2.0 ( #12630 )  
							
							 
							
							
							
						 
						
							2024-12-27 17:28:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f17ccfa61a 
								
							 
						 
						
							
							
								
								[NPU] Fix save-load usage of minicpm models ( #12628 )  
							
							 
							
							
							
						 
						
							2024-12-27 15:56:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c72a5db757 
								
							 
						 
						
							
							
								
								remove unused code again ( #12624 )  
							
							 
							
							
							
						 
						
							2024-12-27 14:17:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								46eeab4479 
								
							 
						 
						
							
							
								
								[NPU] Fix regression caused by layer_norm change ( #12627 )  
							
							 
							
							
							
						 
						
							2024-12-27 14:08:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								90f6709486 
								
							 
						 
						
							
							
								
								remove pipeline examples ( #12626 )  
							
							 
							
							
							
						 
						
							2024-12-27 13:42:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5f04ed7254 
								
							 
						 
						
							
							
								
								[NPU] Update prompt format for baichuan2-pipeline ( #12625 )  
							
							 
							
							
							
						 
						
							2024-12-27 11:30:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								34dbdb8ee3 
								
							 
						 
						
							
							
								
								small fix ( #12623 )  
							
							 
							
							
							
						 
						
							2024-12-27 10:19:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								55ce091242 
								
							 
						 
						
							
							
								
								Add GLM4-Edge-V GPU example ( #12596 )  
							
							 
							
							
							
							
							* Add GLM4-Edge-V examples
* polish readme
* revert wrong changes
* polish readme
* polish readme
* little polish in reference info and indent
* Small fix and sample output updates
* Update main readme
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-12-27 09:40:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								796ee571a5 
								
							 
						 
						
							
							
								
								[NPU doc] Update verified platforms ( #12621 )  
							
							 
							
							
							
						 
						
							2024-12-26 17:39:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bbdbbb0d88 
								
							 
						 
						
							
							
								
								[NPU] Compatible with other third-party models like auto-round ( #12620 )  
							
							 
							
							
							
							
							* support third party model
* simplify code
* fix style
* fix sym int4 GW
* code refactor
* fix 
							
						 
						
							2024-12-26 17:25:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a9abde0b5d 
								
							 
						 
						
							
							
								
								support passing attn_scale to sdpa ( #12619 )  
							
							 
							
							
							
						 
						
							2024-12-26 16:58:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								40a7d2b4f0 
								
							 
						 
						
							
							
								
								Consolidated C-Eval Benchmark Guide for Single-GPU and Multi-GPU Environments ( #12618 )  
							
							 
							
							
							
							
							* run c-eval on multi-GPUs
* Update README.md 
							
						 
						
							2024-12-26 15:23:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ccc4055058 
								
							 
						 
						
							
							
								
								[NPU] Update prompt format for baichuan2 ( #12615 )  
							
							 
							
							
							
							
							* Update baichuan2.py
* style fix 
							
						 
						
							2024-12-26 11:41:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1604b4ead8 
								
							 
						 
						
							
							
								
								small fix ( #12616 )  
							
							 
							
							
							
						 
						
							2024-12-26 11:35:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d841e1dc0d 
								
							 
						 
						
							
							
								
								[NPU] update convert script based on latest usage ( #12617 )  
							
							 
							
							
							
						 
						
							2024-12-26 11:23:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ef585d3360 
								
							 
						 
						
							
							
								
								Polish Readme for ModelScope-related examples ( #12603 )  
							
							 
							
							
							
						 
						
							2024-12-26 10:52:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								28737c250c 
								
							 
						 
						
							
							
								
								Update Dockerfile ( #12585 )  
							
							 
							
							
							
						 
						
							2024-12-26 10:20:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a596f1ae5f 
								
							 
						 
						
							
							
								
								remove bigdl-llm test to fix langchain UT ( #12613 )  
							
							 
							
							
							
						 
						
							2024-12-26 10:17:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9e895f04ec 
								
							 
						 
						
							
							
								
								[NPU] fix npu save ( #12614 )  
							
							 
							
							
							
							
							* fix npu save
* update 
							
						 
						
							2024-12-26 09:21:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Mingqi Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0477fe6480 
								
							 
						 
						
							
							
								
								[docs] Update doc for latest open webui: 0.4.8 ( #12591 )  
							
							 
							
							
							
							
							* Update open webui doc
* Resolve comments 
							
						 
						
							2024-12-26 09:18:20 +08:00