Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								78cca0a68c 
								
							 
						 
						
							
							
								
								[NPU] update llm-npu-cli example ( #12729 )  
							
							 
							
							
							
							
							* update cli example
* add license
* rename
* update readme sample output 
							
						 
						
							2025-01-22 09:59:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7e29edcc4b 
								
							 
						 
						
							
							
								
								Update Readme ( #12730 )  
							
							 
							
							
							
						 
						
							2025-01-22 08:43:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6789e5d92f 
								
							 
						 
						
							
							
								
								small fix ( #12727 )  
							
							 
							
							
							
						 
						
							2025-01-21 17:27:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								412bfd6644 
								
							 
						 
						
							
							
								
								Update readme ( #12724 )  
							
							 
							
							
							
						 
						
							2025-01-21 10:59:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								716d4fe563 
								
							 
						 
						
							
							
								
								Add vllm 0.6.2 vision offline example ( #12721 )  
							
							 
							
							
							
							
							* add vision offline example
* add to docker 
							
						 
						
							2025-01-21 09:58:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								085974e307 
								
							 
						 
						
							
							
								
								fix nf4 to cpu ( #12722 )  
							
							 
							
							
							
						 
						
							2025-01-21 09:23:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9aa4be8ced 
								
							 
						 
						
							
							
								
								Update runtime configuration on MTL ( #12720 )  
							
							 
							
							
							
						 
						
							2025-01-20 11:06:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bda87c21eb 
								
							 
						 
						
							
							
								
								add support and optimization for minicpmo audio part ( #12716 )  
							
							 
							
							
							
						 
						
							2025-01-16 16:39:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								53aae24616 
								
							 
						 
						
							
							
								
								Add note about enabling Resizable BAR in BIOS for GPU setup ( #12715 )  
							
							 
							
							
							
						 
						
							2025-01-16 16:22:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								534e0e6774 
								
							 
						 
						
							
							
								
								Update dependency for PyTorch 2.6 RC support for woq int4 ( #12714 )  
							
							 
							
							
							
						 
						
							2025-01-16 15:51:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								54d6328b3c 
								
							 
						 
						
							
							
								
								woq int4 fwd ( #12711 )  
							
							 
							
							
							
						 
						
							2025-01-16 15:48:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b62734748f 
								
							 
						 
						
							
							
								
								add support and optimization for minicpmo vision part ( #12713 )  
							
							 
							
							
							
						 
						
							2025-01-16 14:51:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c52bdff76b 
								
							 
						 
						
							
							
								
								Update Deepseek coder GPU example ( #12712 )  
							
							 
							
							
							
							
							* Update Deepseek coder GPU example
* Fix based on comment 
							
						 
						
							2025-01-16 14:05:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9d65dcd7ef 
								
							 
						 
						
							
							
								
								Fix deepseek coder with linear rope type support on GPU ( #12709 )  
							
							 
							
							
							
							
							* Fix deepseek coder with linear rope type
* Style fix
* Move to optimize_pre
* Small fix
* Small fix
* Small fix to not affect other cases
* Style fixes
* Update function name
* Small fix
* Small fix
* Small fix
* Fix for low transformers version first
* Style fix
* Small fix 
							
						 
						
							2025-01-15 21:12:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								36bf3d8e29 
								
							 
						 
						
							
							
								
								[NPU doc] Update ARL product in QuickStart ( #12708 )  
							
							 
							
							
							
						 
						
							2025-01-15 15:57:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9930351112 
								
							 
						 
						
							
							
								
								LLM: add new qtype woq_int4 to support gemm int4 temporarily. ( #12706 )  
							
							 
							
							
							
							
							This PR adds a temporary qtype, woq_int4, to avoid affecting other qtypes and models.
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com> 
							
						 
						
							2025-01-15 14:41:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6d03d06ebb 
								
							 
						 
						
							
							
								
								Change runtime configurations for perf test on Windows ( #12705 )  
							
							 
							
							
							
							
							* Change runtime configurations for perf test on Windows
* Small fix 
							
						 
						
							2025-01-14 17:54:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								350fae285d 
								
							 
						 
						
							
							
								
								Add Qwen2-VL HF GPU example with ModelScope Support ( #12606 )  
							
							 
							
							
							
							
							* Add qwen2-vl example
* complete generate.py & readme
* improve lint style
* update 1-6
* update main readme
* Format and other small fixes
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2025-01-13 15:42:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a1da7908b9 
								
							 
						 
						
							
							
								
								Fix name device is not found bug ( #12703 )  
							
							 
							
							
							
						 
						
							2025-01-13 10:11:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e2d58f733e 
								
							 
						 
						
							
							
								
								Update ollama v0.5.1 document ( #12699 )  
							
							 
							
							
							
							
							* Update ollama document version and known issue 
							
						 
						
							2025-01-10 18:04:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								db9db51e2c 
								
							 
						 
						
							
							
								
								fix lnl perf ( #12700 )  
							
							 
							
							
							
						 
						
							2025-01-10 18:00:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4bf93c66e8 
								
							 
						 
						
							
							
								
								Support install from source for PyTorch 2.6 RC in UT ( #12697 )  
							
							 
							
							
							
							
							* Support install from source for PyTorch 2.6 RC in UT
* Remove expecttest 
							
						 
						
							2025-01-10 16:44:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								da8bcb7db1 
								
							 
						 
						
							
							
								
								[NPU] fix load logic of glm-edge models ( #12698 )  
							
							 
							
							
							
						 
						
							2025-01-10 16:08:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									joan726 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								584c1c5373 
								
							 
						 
						
							
							
								
								Update B580 CN doc ( #12695 )  
							
							 
							
							
							
						 
						
							2025-01-10 11:20:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cbb8e2a2d5 
								
							 
						 
						
							
							
								
								Update documents ( #12693 )  
							
							 
							
							
							
						 
						
							2025-01-10 10:47:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f8dc408888 
								
							 
						 
						
							
							
								
								fix user issue ( #12692 )  
							
							 
							
							
							
						 
						
							2025-01-10 10:18:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								68857494a5 
								
							 
						 
						
							
							
								
								refactor to simplify following upgrade 2 ( #12685 )  
							
							 
							
							
							
						 
						
							2025-01-10 09:29:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2673792de6 
								
							 
						 
						
							
							
								
								Update Dockerfile ( #12688 )  
							
							 
							
							
							
						 
						
							2025-01-10 09:01:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f9b29a4f56 
								
							 
						 
						
							
							
								
								Update B580 doc ( #12691 )  
							
							 
							
							
							
						 
						
							2025-01-10 08:59:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									joan726 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								66d4385cc9 
								
							 
						 
						
							
							
								
								Update B580 CN Doc ( #12686 )  
							
							 
							
							
							
						 
						
							2025-01-09 19:10:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c24741584d 
								
							 
						 
						
							
							
								
								Support PyTorch 2.6 RC perf test on Windows ( #12683 )  
							
							 
							
							
							
						 
						
							2025-01-09 18:17:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7234c9b27b 
								
							 
						 
						
							
							
								
								update quantize kv cache condition ( #12681 )  
							
							 
							
							
							
						 
						
							2025-01-09 15:23:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5d8081afbc 
								
							 
						 
						
							
							
								
								Remove dummy model from performance tests ( #12682 )  
							
							 
							
							
							
						 
						
							2025-01-09 14:50:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1ec40cd09e 
								
							 
						 
						
							
							
								
								refactor to simplify following upgrade ( #12680 )  
							
							 
							
							
							
						 
						
							2025-01-09 13:34:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								aa9e70a347 
								
							 
						 
						
							
							
								
								Update B580 Doc ( #12678 )  
							
							 
							
							
							
						 
						
							2025-01-08 22:36:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c6f57ad6ed 
								
							 
						 
						
							
							
								
								Update README.md ( #12677 )  
							
							 
							
							
							
						 
						
							2025-01-08 21:55:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2321e8d60c 
								
							 
						 
						
							
							
								
								Update README.md ( #12676 )  
							
							 
							
							
							
						 
						
							2025-01-08 21:54:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5c24276fc4 
								
							 
						 
						
							
							
								
								fix custom kernel registration ( #12674 )  
							
							 
							
							
							
						 
						
							2025-01-08 17:39:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a22a8c21bb 
								
							 
						 
						
							
							
								
								small fix and remove unused code about ipex ( #12671 )  
							
							 
							
							
							
						 
						
							2025-01-08 17:39:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c11f5f0fcd 
								
							 
						 
						
							
							
								
								also convert SdpaAttention in optimize_model ( #12673 )  
							
							 
							
							
							
						 
						
							2025-01-08 16:48:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2c23ce2553 
								
							 
						 
						
							
							
								
								Create a BattleMage QuickStart ( #12663 )  
							
							 
							
							
							
							
							* Create bmg_quickstart.md
* Update bmg_quickstart.md
* Clarify IPEX-LLM package installation based on use case
* Update bmg_quickstart.md
* Update bmg_quickstart.md 
							
						 
						
							2025-01-08 14:58:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7dd156d292 
								
							 
						 
						
							
							
								
								small fix and add comment ( #12670 )  
							
							 
							
							
							
						 
						
							2025-01-08 10:56:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ccf618ff4a 
								
							 
						 
						
							
							
								
								Remove all ipex usage ( #12666 )  
							
							 
							
							
							
						 
						
							2025-01-08 10:31:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									logicat 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0534d7254f 
								
							 
						 
						
							
							
								
								Update docker_cpp_xpu_quickstart.md ( #12667 )  
							
							 
							
							
							
						 
						
							2025-01-08 09:56:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5db6f9dcde 
								
							 
						 
						
							
							
								
								Add option with PyTorch 2.6 RC version for testing purposes ( #12668 )  
							
							 
							
							
							
							
							* Add option with PyTorch 2.6 RC version for testing purposes
* Small update 
							
						 
						
							2025-01-07 18:28:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f9ee7898c8 
								
							 
						 
						
							
							
								
								fix onednn dependency bug ( #12665 )  
							
							 
							
							
							
						 
						
							2025-01-07 16:26:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								29ad5c449e 
								
							 
						 
						
							
							
								
								refactor codegeex to remove ipex kernel usage ( #12664 )  
							
							 
							
							
							
						 
						
							2025-01-07 16:17:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								525b0ee991 
								
							 
						 
						
							
							
								
								[NPU] Tiny fixes on examples ( #12661 )  
							
							 
							
							
							
						 
						
							2025-01-07 14:30:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ebdf19fa7e 
								
							 
						 
						
							
							
								
								[NPU] Further fix saving of generation config ( #12657 )  
							
							 
							
							
							
							
							* Further fix saving of generation config
* Fix based on comments
* Small fix 
							
						 
						
							2025-01-07 13:53:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								381d448ee2 
								
							 
						 
						
							
							
								
								[NPU] Example & Quickstart updates ( #12650 )  
							
							 
							
							
							
							
							* Remove model with optimize_model=False in NPU verified models tables, and remove related example
* Remove experimental in run optimized model section title
* Unify model table order & example cmd
* Move embedding example to separate folder & update quickstart example link
* Add Quickstart reference in main NPU readme
* Small fix
* Small fix
* Move save/load examples under NPU/HF-Transformers-AutoModels
* Add low-bit and polish arguments for LLM Python examples
* Small fix
* Add low-bit and polish arguments for Multi-Model examples
* Polish argument for Embedding models
* Polish argument for LLM CPP examples
* Add low-bit and polish argument for Save-Load examples
* Add accuracy tuning tips for examples
* Update NPU quickstart accuracy tuning with low-bit optimizations
* Add save/load section to quickstart
* Update CPP example sample output to EN
* Add installation regarding cmake for CPP examples
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Unify max prompt length to 512
* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4
* Update based on comments
* Small fix 
							
						 
						
							2025-01-07 13:52:41 +08:00