Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8fa98e2742 
								
							 
						 
						
							
							
								
								Remove Qwen2-7b from NPU example for "Run Optimized Models (Experimental)" ( #12245 )  
							
							 
							
							
							
							
							* Remove qwen2-7b from npu example readme
* fix 
							
						 
						
							2024-10-22 17:07:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2bedb17be7 
								
							 
						 
						
							
							
								
								Add Qwen2.5 NPU Example ( #12110 )  
							
							 
							
							
							
							
							* Add Qwen2.5 NPU Example
* fix
* Merge qwen2.py and qwen2.5.py into qwen.py
* Fix description 
							
						 
						
							2024-09-25 15:20:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								828fa01ad3 
								
							 
						 
						
							
							
								
								[NPU] Add mixed_precision for Qwen2 7B ( #12098 )  
							
							 
							
							
							
							
							* Add mix_precision argument to control whether to use INT8 lm_head for Qwen2-7B-Instruct
* Small fix
* Fix loading low-bit models with mixed precision
* Small fix
* Update example accordingly
* Update for default prompt
* Update based on comments
* Final fix 
							
						 
						
							2024-09-20 16:36:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e78e45ee01 
								
							 
						 
						
							
							
								
								update NPU readme: run conhost as administrator ( #12066 )  
							
							 
							
							
							
						 
						
							2024-09-11 17:54:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c5fdfde1bd 
								
							 
						 
						
							
							
								
								fix npu-model prompt ( #12057 )  
							
							 
							
							
							
						 
						
							2024-09-11 10:06:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								73a4360f3f 
								
							 
						 
						
							
							
								
								update lowbit path for baichuan2, qwen2, generate.py ( #12051 )  
							
							 
							
							
							
							
							* update lowbit path for baichuan2, qwen2, `generate.py`
* update readme 
							
						 
						
							2024-09-10 15:35:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f61b1785fb 
								
							 
						 
						
							
							
								
								Small update to NPU example readme ( #12034 )  
							
							 
							
							
							
							
							* Small update to NPU example readme
* Small fix 
							
						 
						
							2024-09-06 15:54:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0d04531ae0 
								
							 
						 
						
							
							
								
								update NPU readme of Qwen2 ( #12032 )  
							
							 
							
							
							
							
							* update readme
* update broadcast 
							
						 
						
							2024-09-06 15:02:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5b18bb3c4a 
								
							 
						 
						
							
							
								
								Add recommended version for mtl npu ( #12024 )  
							
							 
							
							
							
						 
						
							2024-09-05 16:28:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								820f8a4554 
								
							 
						 
						
							
							
								
								add --lowbit-path option for NPU llama example ( #12020 )  
							
							 
							
							
							
							
							* add option `--lowbit-path`
* add descriptions in `README.md` and formatting
* Update llama.py 
							
						 
						
							2024-09-05 15:31:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cd077881f1 
								
							 
						 
						
							
							
								
								Disable lm head ( #11972 )  
							
							 
							
							
							
						 
						
							2024-08-30 11:05:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								431affd0a0 
								
							 
						 
						
							
							
								
								Update README.md ( #11964 )  
							
							 
							
							
							
						 
						
							2024-08-29 18:56:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								14b2c8dc32 
								
							 
						 
						
							
							
								
								Update qwen2-7b example script ( #11961 )  
							
							 
							
							
							
						 
						
							2024-08-29 18:25:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5f7ff76ea5 
								
							 
						 
						
							
							
								
								update troubleshooting ( #11960 )  
							
							 
							
							
							
						 
						
							2024-08-29 17:44:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								882f4a5ff7 
								
							 
						 
						
							
							
								
								Add recommended lnl npu driver version and enable cpu_lm_head on llama3 ( #11952 )  
							
							 
							
							
							
							
							* update lnl npu driver version and enable cpu_lm_head on llama3
* update
* fix style
* typo
* address comments
* update
* add qwen2-7b 
							
						 
						
							2024-08-29 15:01:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								71f03dcc39 
								
							 
						 
						
							
							
								
								Support qwen2-7b with fused decoderlayer optimization on NPU ( #11912 )  
							
							 
							
							
							
						 
						
							2024-08-29 13:34:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5ca7390082 
								
							 
						 
						
							
							
								
								[NPU] Add minicpm-2b support for npu multi-processing ( #11949 )  
							
							 
							
							
							
							
							* add minicpm-2b support
* update example for minicpm-2b
* add LNL NPU driver requirement in readme 
							
						 
						
							2024-08-28 18:08:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								90f692937d 
								
							 
						 
						
							
							
								
								Update npu baichuan2 ( #11939 )  
							
							 
							
							
							
						 
						
							2024-08-27 16:56:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a81a329a5f 
								
							 
						 
						
							
							
								
								[NPU] Add example for NPU multi-processing minicpm-1b model ( #11935 )  
							
							 
							
							
							
							
							* add minicpm example 
							
						 
						
							2024-08-27 14:57:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e246f1e258 
								
							 
						 
						
							
							
								
								update llama3 npu example ( #11933 )  
							
							 
							
							
							
						 
						
							2024-08-27 13:03:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								14dddfc0d6 
								
							 
						 
						
							
							
								
								Update NPU example readme ( #11931 )  
							
							 
							
							
							
						 
						
							2024-08-27 12:44:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								dd303776cf 
								
							 
						 
						
							
							
								
								Add troubleshooting about transpose value setting  
							
							 
							
							
							
						 
						
							2024-08-26 16:06:32 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								794abe2ce8 
								
							 
						 
						
							
							
								
								update npu-readme ( #11900 )  
							
							 
							
							
							
						 
						
							2024-08-22 17:49:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								72a7bf624b 
								
							 
						 
						
							
							
								
								Support qwen2-1.5b with fused decoderlayer optimization on NPU ( #11888 )  
							
							 
							
							
							
						 
						
							2024-08-22 11:09:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8c5c7f32dd 
								
							 
						 
						
							
							
								
								Update doc for running npu generate example with ipex-llm[npu] ( #11876 )  
							
							 
							
							
							
							
							* update doc for running npu generate example with ipex-llm[npu]
* switch max_prompt_len to 512 to fix compile error on mtl 
							
						 
						
							2024-08-21 13:45:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5b83493b1a 
								
							 
						 
						
							
							
								
								Add ipex-llm npu option in setup.py ( #11858 )  
							
							 
							
							
							
							
							* add ipex-llm npu release
* update example doc
* meet latest release changes 
							
						 
						
							2024-08-20 17:29:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7380823f3f 
								
							 
						 
						
							
							
								
								Update Llama2 multi-processes example ( #11852 )  
							
							 
							
							
							
							
							* update llama2 multi-processes examples
* update
* update readme
* update 
							
						 
						
							2024-08-19 19:49:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								99b05ba1dc 
								
							 
						 
						
							
							
								
								separate prefill into a process ( #11787 )  
							
							 
							
							
							
							
							* separate prefill into a process
* using model.share_memory()
* might work
* worked
* use long prompt
* refactor
* cleanup
* fix bug
* clean up
* changeable inter and intra process stages
* refactor
* add max output len
* fix npu_model changes that may cause generate to fail
* fix npu_model generate import error
* fix generate forward error
---------
Co-authored-by: sgwhat <ge.song@intel.com> 
							
						 
						
							2024-08-19 17:53:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								23d3acdc77 
								
							 
						 
						
							
							
								
								Add experimental support of fused decoder layer for llama2 ( #11768 )  
							
							 
							
							
							
						 
						
							2024-08-13 14:41:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								05989ad0f9 
								
							 
						 
						
							
							
								
								Update npu example and all-in-one benchmark ( #11766 )  
							
							 
							
							
							
						 
						
							2024-08-12 16:46:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a44ab32153 
								
							 
						 
						
							
							
								
								Switch to conhost when running on NPU ( #11687 )  
							
							 
							
							
							
						 
						
							2024-07-30 17:08:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								06745e5742 
								
							 
						 
						
							
							
								
								Add npu benchmark all-in-one script ( #11571 )  
							
							 
							
							
							
							
							* npu benchmark 
							
						 
						
							2024-07-15 10:42:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b9c66994a5 
								
							 
						 
						
							
							
								
								add npu sdp ( #11562 )  
							
							 
							
							
							
						 
						
							2024-07-11 16:57:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3c16c9f725 
								
							 
						 
						
							
							
								
								Optimize baichuan on NPU ( #11548 )  
							
							 
							
							
							
							
							* baichuan_npu 
							
						 
						
							2024-07-10 13:18:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								76a5802acf 
								
							 
						 
						
							
							
								
								update NPU examples ( #11540 )  
							
							 
							
							
							
							
							* update NPU examples 
							
						 
						
							2024-07-09 17:19:42 +08:00