Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b9abb8a285 
								
							 
						 
						
							
							
								
								Support qwen2.5 3B for NPU & update related examples ( #12438 )  
							
							 
							
							
							
							
							* update qwen2.5-3B
* update convert
* small fix
* replace load_in_low_bit with low_bit
* small fix 
							
						 
						
							2024-11-25 16:38:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b633fbf26c 
								
							 
						 
						
							
							
								
								add chinese prompt troubleshooting for npu cpp examples ( #12437 )  
							
							 
							
							
							
							
							* add chinese prompt troubleshooting
* add chinese prompt troubleshooting 
							
						 
						
							2024-11-25 15:28:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f41405368a 
								
							 
						 
						
							
							
								
								Support minicpm for NPU C++ ( #12434 )  
							
							 
							
							
							
							
							* support minicpm-1b
* update
* tune fused_layers
* update readme.md 
							
						 
						
							2024-11-25 10:42:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0819fad34e 
								
							 
						 
						
							
							
								
								support Llama2-7B / Llama3-8B for NPU C++ ( #12431 )  
							
							 
							
							
							
							
							* support llama2
* update
* support fused_layers=4 for Llama2-7B 
							
						 
						
							2024-11-22 18:47:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4ffa6c752c 
								
							 
						 
						
							
							
								
								New convert support for C++ NPU ( #12430 )  
							
							 
							
							
							
							
							* initial commit
* fix
* fix style
* fix style
* fix
* fix 
							
						 
						
							2024-11-22 14:28:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2935e97610 
								
							 
						 
						
							
							
								
								small fix of cpp readme ( #12425 )  
							
							 
							
							
							
						 
						
							2024-11-21 18:21:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7e0a840f74 
								
							 
						 
						
							
							
								
								add optimization to openjourney ( #12423 )  
							
							 
							
							
							
							
							* add optimization to openjourney
* add optimization to openjourney 
							
						 
						
							2024-11-21 15:23:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7288c759ce 
								
							 
						 
						
							
							
								
								Initial NPU C++ Example ( #12417 )  
							
							 
							
							
							
							
							* temp save
* meet review, update
* update
* meet review, add license
* typo 
							
						 
						
							2024-11-21 10:09:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d2a37b6ab2 
								
							 
						 
						
							
							
								
								add Stable diffusion examples ( #12418 )  
							
							 
							
							
							
							
							* add openjourney example
* add timing
* add stable diffusion to model page
* 4.1 fix
* small fix 
							
						 
						
							2024-11-20 17:18:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ff3f7cb25f 
								
							 
						 
						
							
							
								
								Fix speech_paraformer issue with unexpected changes ( #12416 )  
							
							 
							
							
							
							
							* Fix speech_paraformer issue with unexpected changes
* Add paraformer version specified 
							
						 
						
							2024-11-19 15:01:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7e50ff113c 
								
							 
						 
						
							
							
								
								Add padding_token=eos_token for GPU trl QLora example ( #12398 )  
							
							 
							
							
							
							
							* Avoid the "tokenizer doesn't have a padding token" error. 
							
						 
						
							2024-11-14 10:51:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d2cbcb060c 
								
							 
						 
						
							
							
								
								Add initial support for modeling_xlm encoder on NPU ( #12393 )  
							
							 
							
							
							
							
							* Add initial support for modeling_xlm encoder on NPU
* Add EmbeddingModel class to keep the same usage with bce and npu fp16 linear convert
* Optimize the current implementation to support the EmbeddingModel.encode API and convert other torch modules to NPU
* Add related example and documents 
							
						 
						
							2024-11-14 10:50:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0ee54fc55f 
								
							 
						 
						
							
							
								
								Upgrade to vllm 0.6.2 ( #12338 )  
							
							 
							
							
							
							
							* Initial updates for vllm 0.6.2
* fix
* Change Dockerfile to support v062
* Fix
* fix examples
* Fix
* done
* fix
* Update engine.py
* Fix Dockerfile to original path
* fix
* add option
* fix
* fix
* fix
* fix
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com> 
							
						 
						
							2024-11-12 20:35:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7a97fbb779 
								
							 
						 
						
							
							
								
								Support vpm and resampler module of minicpm-v on NPU ( #12375 )  
							
							 
							
							
							
						 
						
							2024-11-12 15:59:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2dfcc36825 
								
							 
						 
						
							
							
								
								Fix trl version and padding in trl qlora example ( #12368 )  
							
							 
							
							
							
							
							* Change trl to 0.9.6
* Enable padding to avoid padding related errors. 
							
						 
						
							2024-11-08 16:05:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b2e69a896c 
								
							 
						 
						
							
							
								
								[NPU] Support Baichuan groupwise & gw code refactor ( #12337 )  
							
							 
							
							
							
							
							* support minicpm 1b & qwen 1.5b gw
* support minicpm 1b
* baichuan part
* update
* support minicpm 1b & qwen 1.5b gw
* support minicpm 1b
* baichuan part
* update
* update
* update
* baichuan support
* code refactor
* remove code
* fix style
* address comments
* revert 
							
						 
						
							2024-11-08 11:42:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								812d5cc32e 
								
							 
						 
						
							
							
								
								[NPU L0] Support llama3.2 in L0 pipeline ( #12361 )  
							
							 
							
							
							
						 
						
							2024-11-08 10:01:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a7b66683f1 
								
							 
						 
						
							
							
								
								[NPU] Add Optimized Support for Llama3.2-1B/3B on NPU ( #12339 )  
							
							 
							
							
							
							
							* Add initial support for llama3.2-1b/3b
* move llama3.2 support into current llama_mp impl 
							
						 
						
							2024-11-06 19:21:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d872639395 
								
							 
						 
						
							
							
								
								[NPU] Llama3, Qwen2 1.5b, MiniCPM 1/2B groupwise support ( #12327 )  
							
							 
							
							
							
							
							* support minicpm 1b & qwen 1.5b gw
* support minicpm 1b
* support minicpm 2b
* fix style & error
* fix style & update
* remove print 
							
						 
						
							2024-11-05 15:51:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								82a61b5cf3 
								
							 
						 
						
							
							
								
								Limit trl version in example ( #12332 )  
							
							 
							
							
							
							
							* Limit trl version in example
* Limit trl version in example 
							
						 
						
							2024-11-05 14:50:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c8679ad592 
								
							 
						 
						
							
							
								
								Qwen layernorm as input ( #12309 )  
							
							 
							
							
							
							
							* qwen layernorm as input
* add group size 
							
						 
						
							2024-11-04 09:51:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cd5e22cee5 
								
							 
						 
						
							
							
								
								Update Llava GPU Example ( #12311 )  
							
							 
							
							
							
							
							* update-llava-example
* add warmup
* small fix on llava example
* remove space & extra print prompt
* renew example
* small fix
---------
Co-authored-by: Jinhe Tang <jin.tang1337@gmail.com> 
							
						 
						
							2024-11-01 17:06:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d409d9d0eb 
								
							 
						 
						
							
							
								
								[NPU L0] Update streaming mode of example ( #12312 )  
							
							 
							
							
							
						 
						
							2024-11-01 15:38:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								126f95be80 
								
							 
						 
						
							
							
								
								Fix DPO finetuning example ( #12313 )  
							
							 
							
							
							
						 
						
							2024-11-01 13:29:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								eda764909c 
								
							 
						 
						
							
							
								
								Add minicpm-2b in L0 pipeline ( #12308 )  
							
							 
							
							
							
						 
						
							2024-11-01 09:30:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3df6195cb0 
								
							 
						 
						
							
							
								
								Fix application quickstart ( #12305 )  
							
							 
							
							
							
							
							* fix graphrag quickstart
* fix axolotl quickstart
* fix ragflow quickstart
* fix ragflow quickstart
* fix graphrag toc
* fix comments
* fix comment
* fix comments 
							
						 
						
							2024-10-31 16:57:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4892df61c9 
								
							 
						 
						
							
							
								
								Add qwen2-1.5b in l0 pipeline example ( #12306 )  
							
							 
							
							
							
						 
						
							2024-10-31 16:44:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								30f668c206 
								
							 
						 
						
							
							
								
								updated transformers & accelerate requirements ( #12301 )  
							
							 
							
							
							
						 
						
							2024-10-31 15:59:40 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								416c19165c 
								
							 
						 
						
							
							
								
								Add Qwen pipeline and example ( #12292 )  
							
							 
							
							
							
							
							* support qwen pipeline
* update error msg
* style
* meet review
* minor 
							
						 
						
							2024-10-31 11:25:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Rahul Nair 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4cf1ccc43a 
								
							 
						 
						
							
							
								
								Update DPO README.md ( #12162 )  
							
							 
							
							
							
							
							The bitsandbytes multi-backend is now available and is required; without it, the example errors out saying that no CUDA is available. 
							
						 
						
							2024-10-31 10:56:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chu,Youcheng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								29400e2e75 
								
							 
						 
						
							
							
								
								feat: change oneccl to internal ( #12296 )  
							
							 
							
							
							
							
							* feat: change oneccl
* fix: restore llama-70b
* fix: remove tab
* fix: remove extra blank
* small fix
* add comments
* fix: add a blank space 
							
						 
						
							2024-10-31 09:51:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6f22133efc 
								
							 
						 
						
							
							
								
								Update AWQ and GPTQ GPU example ( #12300 )  
							
							 
							
							
							
						 
						
							2024-10-31 09:35:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								41b8064554 
								
							 
						 
						
							
							
								
								Support minicpm-1B in level0 pipeline ( #12297 )  
							
							 
							
							
							
						 
						
							2024-10-30 17:21:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								46d8300f6b 
								
							 
						 
						
							
							
								
								bugfix for qlora finetuning on GPU ( #12298 )  
							
							 
							
							
							
							
							* bugfix for qlora 100 step error
* indent fix
* annotation fix 
							
						 
						
							2024-10-30 16:54:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2b2cb9c693 
								
							 
						 
						
							
							
								
								[NPU pipeline] Support save & load and update examples ( #12293 )  
							
							 
							
							
							
							
							* support save & load, update llama examples
* update baichuan2 example
* update readme 
							
						 
						
							2024-10-30 10:02:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3feb58d1e4 
								
							 
						 
						
							
							
								
								Support baichuan2 for level0 pipeline ( #12289 )  
							
							 
							
							
							
						 
						
							2024-10-29 19:24:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4467645088 
								
							 
						 
						
							
							
								
								[NPU] Support l0 Llama groupwise ( #12276 )  
							
							 
							
							
							
							
							* except lm_head
* remove
* support gw lm_head
* update
* fix
* remove run.bat
* fix style
* support llama3 
							
						 
						
							2024-10-28 17:06:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3fe2ea3081 
								
							 
						 
						
							
							
								
								[NPU] Reuse prefill of acc lib for pipeline ( #12279 )  
							
							 
							
							
							
							
							* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens 
							
						 
						
							2024-10-28 16:05:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ec362e6133 
								
							 
						 
						
							
							
								
								Add llama3 level0 example ( #12275 )  
							
							 
							
							
							
						 
						
							2024-10-28 09:24:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a0c6432899 
								
							 
						 
						
							
							
								
								[NPU] Add support for loading a FunASR model ( #12073 )  
							
							 
							
							
							
							
							* add support for loading funasr model
* add initial support for paraformer-encoder
* add npu ops impl
* add encoder-decoder npu pipeline
* move the first 30 paraformer encoder layers to npu and keep the remaining layers on cpu 
							
						 
						
							2024-10-25 17:22:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								854398f6e0 
								
							 
						 
						
							
							
								
								update example to reduce peak memory usage ( #12274 )  
							
							 
							
							
							
						 
						
							2024-10-25 17:09:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ae57e23e4f 
								
							 
						 
						
							
							
								
								fix incompatibility between llama GW & llama pipeline ( #12267 )  
							
							 
							
							
							
							
							* fix
* fix 
							
						 
						
							2024-10-25 10:31:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								821fd96367 
								
							 
						 
						
							
							
								
								Initial integration of our L0 Llama impl into ipex-llm ( #12255 )  
							
							 
							
							
							
							
							* temp save
* initial support
* fix
* simplify code
* fix style
* fix example
* make default value of pipeline as False 
							
						 
						
							2024-10-24 09:49:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8fa98e2742 
								
							 
						 
						
							
							
								
								Remove Qwen2-7b from NPU example for "Run Optimized Models (Experimental)" ( #12245 )  
							
							 
							
							
							
							
							* Remove qwen2-7b from npu example readme
* fix 
							
						 
						
							2024-10-22 17:07:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9ea694484d 
								
							 
						 
						
							
							
								
								refactor to remove old rope usage ( #12224 )  
							
							 
							
							
							
						 
						
							2024-10-17 17:06:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								667f0db466 
								
							 
						 
						
							
							
								
								Update Eagle example to Eagle2+ipex-llm integration ( #11717 )  
							
							 
							
							
							
							
							* update to e2 example
* update
* update 
							
						 
						
							2024-10-16 23:16:14 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f983f1a8f4 
								
							 
						 
						
							
							
								
								Add Qwen2-VL gpu example ( #12135 )  
							
							 
							
							
							
							
							* qwen2-vl readme
* add qwen2-vl example
* fix
* fix
* fix
* add link
* Update regarding modules_to_not_convert and readme
* Further fix
* Small fix
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-10-11 18:25:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4d93bb81fe 
								
							 
						 
						
							
							
								
								Initial support of NPU level0 Model ( #12177 )  
							
							 
							
							
							
							
							* first commit to support load dll and init llm pipeline
* add init generate
* fix style
* small updates
* fix style and check tokens number 
							
						 
						
							2024-10-11 09:45:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3d044dbf53 
								
							 
						 
						
							
							
								
								add llama3.2-vision Pytorch example ( #12165 )  
							
							 
							
							
							
						 
						
							2024-10-09 09:20:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ch1y0q 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								17c23cd759 
								
							 
						 
						
							
							
								
								add llama3.2 GPU example ( #12137 )  
							
							 
							
							
							
							
							* add llama3.2 GPU example
* change prompt format reference url
* update
* add Meta-Llama-3.2-1B-Instruct sample output
* update wording 
							
						 
						
							2024-09-29 14:41:54 +08:00