Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6f22133efc 
								
							 
						 
						
							
							
								
								Update AWQ and GPTQ GPU example ( #12300 )  
							
							 
							
							
							
						 
						
							2024-10-31 09:35:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								41b8064554 
								
							 
						 
						
							
							
								
								Support minicpm-1B in level0 pipeline ( #12297 )  
							
							 
							
							
							
						 
						
							2024-10-30 17:21:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								46d8300f6b 
								
							 
						 
						
							
							
								
								bugfix for qlora finetuning on GPU ( #12298 )  
							
							 
							
							
							
							
							* bugfix for qlora 100 step error
* indent fix
* annotation fix 
							
						 
						
							2024-10-30 16:54:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2b2cb9c693 
								
							 
						 
						
							
							
								
								[NPU pipeline] Support save & load and update examples ( #12293 )  
							
							 
							
							
							
							
							* support save & load, update llama examples
* update baichuan2 example
* update readme 
							
						 
						
							2024-10-30 10:02:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3feb58d1e4 
								
							 
						 
						
							
							
								
								Support baichuan2 for level0 pipeline ( #12289 )  
							
							 
							
							
							
						 
						
							2024-10-29 19:24:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4467645088 
								
							 
						 
						
							
							
								
								[NPU] Support l0 Llama groupwise ( #12276 )  
							
							 
							
							
							
							
							* except lm_head
* remove
* support gw lm_head
* update
* fix
* remove run.bat
* fix style
* support llama3 
							
						 
						
							2024-10-28 17:06:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3fe2ea3081 
								
							 
						 
						
							
							
								
								[NPU] Reuse prefill of acc lib for pipeline ( #12279 )  
							
							 
							
							
							
							
							* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens 
							
						 
						
							2024-10-28 16:05:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ec362e6133 
								
							 
						 
						
							
							
								
								Add llama3 level0 example ( #12275 )  
							
							 
							
							
							
						 
						
							2024-10-28 09:24:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a0c6432899 
								
							 
						 
						
							
							
								
								[NPU] Add support for loading a FunASR model ( #12073 )  
							
							 
							
							
							
							
							* add support for loading funasr model
* add initial support for paraformer-encoder
* add npu ops impl
* add encoder-decoder npu pipeline
* move the first 30 paraformer encoder layers to npu and keep the remaining layers on cpu
							
						 
						
							2024-10-25 17:22:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								854398f6e0 
								
							 
						 
						
							
							
								
								update example to reduce peak memory usage ( #12274 )  
							
							 
							
							
							
						 
						
							2024-10-25 17:09:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ae57e23e4f 
								
							 
						 
						
							
							
								
								fix incompatibility between llama GW & llama pipeline ( #12267 )  
							
							 
							
							
							
							
							* fix
* fix 
							
						 
						
							2024-10-25 10:31:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								821fd96367 
								
							 
						 
						
							
							
								
								Initial integrate our L0 Llama impl into ipex-llm ( #12255 )  
							
							 
							
							
							
							
							* temp save
* initial support
* fix
* simplify code
* fix style
* fix example
* make default value of pipeline as False 
							
						 
						
							2024-10-24 09:49:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8fa98e2742 
								
							 
						 
						
							
							
								
								Remove Qwen2-7b from NPU example for "Run Optimized Models (Experimental)" ( #12245 )  
							
							 
							
							
							
							
							* Remove qwen2-7b from npu example readme
* fix 
							
						 
						
							2024-10-22 17:07:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9ea694484d 
								
							 
						 
						
							
							
								
								refactor to remove old rope usage ( #12224 )
							
							 
							
							
							
						 
						
							2024-10-17 17:06:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								667f0db466 
								
							 
						 
						
							
							
								
								Update Eagle example to Eagle2+ipex-llm integration ( #11717 )  
							
							 
							
							
							
							
							* update to e2 example
* update
* update 
							
						 
						
							2024-10-16 23:16:14 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f983f1a8f4 
								
							 
						 
						
							
							
								
								Add Qwen2-VL gpu example ( #12135 )  
							
							 
							
							
							
							
							* qwen2-vl readme
* add qwen2-vl example
* fix
* fix
* fix
* add link
* Update regarding modules_to_not_convert and readme
* Further fix
* Small fix
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-10-11 18:25:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4d93bb81fe 
								
							 
						 
						
							
							
								
								Initial support of NPU level0 Model ( #12177 )  
							
							 
							
							
							
							
							* first commit to support load dll and init llm pipeline
* add init generate
* fix style
* small updates
* fix style and check tokens number 
							
						 
						
							2024-10-11 09:45:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3d044dbf53 
								
							 
						 
						
							
							
								
								add llama3.2-vision Pytorch example ( #12165 )  
							
							 
							
							
							
						 
						
							2024-10-09 09:20:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								17c23cd759 
								
							 
						 
						
							
							
								
								add llama3.2 GPU example ( #12137 )  
							
							 
							
							
							
							
							* add llama3.2 GPU example
* change prompt format reference url
* update
* add Meta-Llama-3.2-1B-Instruct sample output
* update wording 
							
						 
						
							2024-09-29 14:41:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f71b38a994 
								
							 
						 
						
							
							
								
								Update MiniCPM_V_26 GPU example with save & load ( #12127 )  
							
							 
							
							
							
						 
						
							2024-09-26 17:40:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2ea13d502f 
								
							 
						 
						
							
							
								
								Add minicpm3 gpu example ( #12114 )  
							
							 
							
							
							
							
							* add minicpm3 gpu example
* update GPU example
* update
---------
Co-authored-by: Huang, Xinshengzi <xinshengzi.huang@intel.com> 
							
						 
						
							2024-09-26 13:51:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2bedb17be7 
								
							 
						 
						
							
							
								
								Add Qwen2.5 NPU Example ( #12110 )  
							
							 
							
							
							
							
							* Add Qwen2.5 NPU Example
* fix
* Merge qwen2.py and qwen2.5.py into qwen.py
* Fix description 
							
						 
						
							2024-09-25 15:20:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								828fa01ad3 
								
							 
						 
						
							
							
								
								[NPU] Add mixed_precision for Qwen2 7B ( #12098 )  
							
							 
							
							
							
							
							* Add mix_precision argument to control whether use INT8 lm_head for Qwen2-7B-Instruct
* Small fix
* Fixed on load low bit with mixed precision
* Small fix
* Update example accordingly
* Update for default prompt
* Update based on comments
* Final fix 
							
						 
						
							2024-09-20 16:36:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2269768e71 
								
							 
						 
						
							
							
								
								add internvl2 example ( #12102 )  
							
							 
							
							
							
							
							* add internvl2 example
* add to README.md
* update
* add link to zh-CN readme 
							
						 
						
							2024-09-20 16:31:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								db7500bfd4 
								
							 
						 
						
							
							
								
								Add Qwen2.5 GPU example ( #12101 )  
							
							 
							
							
							
							
							* Add Qwen2.5 GPU example
* fix end line
* fix description 
							
						 
						
							2024-09-20 15:55:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b4b8c3e495 
								
							 
						 
						
							
							
								
								add lowbit_path for generate.py, fix npu_model ( #12077 )  
							
							 
							
							
							
							
							* add `lowbit_path` for `generate.py`, fix `npu_model`
* update `README.md` 
							
						 
						
							2024-09-13 17:28:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d703e4f127 
								
							 
						 
						
							
							
								
								Enable vllm multimodal minicpm-v-2-6 ( #12074 )  
							
							 
							
							
							
							
							* enable minicpm-v-2-6
* add image_url readme 
							
						 
						
							2024-09-13 13:28:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								e78e45ee01 
								
							 
						 
						
							
							
								
								update NPU readme: run conhost as administrator ( #12066 )  
							
							 
							
							
							
						 
						
							2024-09-11 17:54:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4ca330da15 
								
							 
						 
						
							
							
								
								Fix NPU load error message and add minicpm npu lowbit feat ( #12064 )  
							
							 
							
							
							
							
							* fix npu_model raise sym_int4 error
* add load_lowbit
* remove print&perf 
							
						 
						
							2024-09-11 16:56:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								32e8362da7 
								
							 
						 
						
							
							
								
								added minicpm cpu examples ( #12027 )  
							
							 
							
							
							
							
							* minicpm cpu examples
* add link for minicpm-2 
							
						 
						
							2024-09-11 15:51:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c5fdfde1bd 
								
							 
						 
						
							
							
								
								fix npu-model prompt ( #12057 )  
							
							 
							
							
							
						 
						
							2024-09-11 10:06:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								73a4360f3f 
								
							 
						 
						
							
							
								
								update lowbit path for baichuan2, qwen2, generate.py ( #12051 )  
							
							 
							
							
							
							
							* update lowbit path for baichuan2, qwen2, `generate.py`
* update readme 
							
						 
						
							2024-09-10 15:35:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f61b1785fb 
								
							 
						 
						
							
							
								
								Small update to NPU example readme ( #12034 )  
							
							 
							
							
							
							
							* Small update to NPU example readme
* Small fix 
							
						 
						
							2024-09-06 15:54:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0d04531ae0 
								
							 
						 
						
							
							
								
								update NPU readme of Qwen2 ( #12032 )  
							
							 
							
							
							
							
							* update readme
* update broadcast 
							
						 
						
							2024-09-06 15:02:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5b18bb3c4a 
								
							 
						 
						
							
							
								
								Add recommended version for mtl npu ( #12024 )
							
							 
							
							
							
						 
						
							2024-09-05 16:28:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								820f8a4554 
								
							 
						 
						
							
							
								
								add --lowbit-path option for NPU llama example ( #12020 )  
							
							 
							
							
							
							
							* add option `--lowbit-path`
* add descriptions in `README.md` and formatting
* Update llama.py 
							
						 
						
							2024-09-05 15:31:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b3b2cd64b4 
								
							 
						 
						
							
							
								
								Support lightweight-serving glm-4v-9b  ( #11994 )  
							
							 
							
							
							
							
							* enable glm-4v-9b serving
* update readme
* update for no image input 
							
						 
						
							2024-09-05 09:25:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								164f47adbd 
								
							 
						 
						
							
							
								
								MiniCPM-V-2 & MiniCPM-Llama3-V-2_5 example updates ( #11988 )  
							
							 
							
							
							
							
							* minicpm example updates
* --stream 
							
						 
						
							2024-09-03 17:02:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2e54f4402b 
								
							 
						 
						
							
							
								
								Rename MiniCPM-V-2_6 CPU example ( #11998 )  
							
							 
							
							
							
						 
						
							2024-09-03 16:50:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								65e281bb29 
								
							 
						 
						
							
							
								
								Add MiniCPM-V cpu example ( #11975 )  
							
							 
							
							
							
							
							* Add MiniCPM-V cpu example
* fix
* fix
* fix
* fix 
							
						 
						
							2024-09-02 10:17:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								79978e6f36 
								
							 
						 
						
							
							
								
								update npu multimodal readme ( #11979 )  
							
							 
							
							
							
							
							* update npu readme of multimodal
* small fix
* meet comment 
							
						 
						
							2024-08-30 19:02:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4811a490ef 
								
							 
						 
						
							
							
								
								small fix ( #11978 )  
							
							 
							
							
							
							
							* fix
* meet comment 
							
						 
						
							2024-08-30 17:55:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								573c20bae6 
								
							 
						 
						
							
							
								
								fix npu lm_head cpu condition ( #11976 )  
							
							 
							
							
							
							
							* fix
* fix
* fix
* fix style
* fix style
* fix style 
							
						 
						
							2024-08-30 17:11:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								60aa1a2c0f 
								
							 
						 
						
							
							
								
								Initial NPU support for MiniCPM-V-2_6 ( #11966 )  
							
							 
							
							
							
							
							* initial pr
* update npu model
* fix
* fix kv cache type
* fix
* small fix
* fix style
* fix model id
* change inter_pp=4
* address comment
* fix
* fix style
* fix
* rebase 
							
						 
						
							2024-08-30 16:34:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								158289d205 
								
							 
						 
						
							
							
								
								[NPU] Add initial support for minicpm-llama-v2.5 ( #11962 )  
							
							 
							
							
							
							
							* add initial support for minicpm-llama-v2.5
* update impl
* add minicpm-llama3-v2.5 example 
							
						 
						
							2024-08-30 16:00:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cd077881f1 
								
							 
						 
						
							
							
								
								Disable lm head ( #11972 )  
							
							 
							
							
							
						 
						
							2024-08-30 11:05:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2e49e1f8e9 
								
							 
						 
						
							
							
								
								Further fix for MiniCPM-V-2_6 example ( #11965 )  
							
							 
							
							
							
						 
						
							2024-08-29 19:14:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								431affd0a0 
								
							 
						 
						
							
							
								
								Update README.md ( #11964 )  
							
							 
							
							
							
						 
						
							2024-08-29 18:56:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								14b2c8dc32 
								
							 
						 
						
							
							
								
								Update qwen2-7b example script ( #11961 )  
							
							 
							
							
							
						 
						
							2024-08-29 18:25:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7abe17d6f7 
								
							 
						 
						
							
							
								
								Update MiniCPM-V-2_6 Example ( #11958 )  
							
							 
							
							
							
							
							* Update example scripts regarding warmup, stream generate, modules to not convert, etc.
* Update readme accordingly
* Fix based on comments
* Small fix
* Remove n_predict 
							
						 
						
							2024-08-29 18:23:48 +08:00