Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								350fae285d 
								
							 
						 
						
							
							
								
								Add Qwen2-VL HF GPU example with ModelScope Support ( #12606 )  
							
							 
							
							
							
							
							* Add qwen2-vl example
* complete generate.py & readme
* improve lint style
* update 1-6
* update main readme
* Format and other small fixes
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2025-01-13 15:42:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								525b0ee991 
								
							 
						 
						
							
							
								
								[NPU] Tiny fixes on examples ( #12661 )  
							
							 
							
							
							
						 
						
							2025-01-07 14:30:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								381d448ee2 
								
							 
						 
						
							
							
								
								[NPU] Example & Quickstart updates ( #12650 )  
							
							 
							
							
							
							
							* Remove model with optimize_model=False in NPU verified models tables, and remove related example
* Remove experimental in run optimized model section title
* Unify model table order & example cmd
* Move embedding example to separate folder & update quickstart example link
* Add Quickstart reference in main NPU readme
* Small fix
* Small fix
* Move save/load examples under NPU/HF-Transformers-AutoModels
* Add low-bit and polish arguments for LLM Python examples
* Small fix
* Add low-bit and polish arguments for Multi-Modal examples
* Polish argument for Embedding models
* Polish argument for LLM CPP examples
* Add low-bit and polish argument for Save-Load examples
* Add accuracy tuning tips for examples
* Update NPU quickstart accuracy tuning with low-bit optimizations
* Add save/load section to quickstart
* Update CPP example sample output to EN
* Add installation regarding cmake for CPP examples
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Unify max prompt length to 512
* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4
* Update based on comments
* Small fix 
							
						 
						
							2025-01-07 13:52:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0b377100c5 
								
							 
						 
						
							
							
								
								Add guide for save-load usage ( #12498 )  
							
							 
							
							
							
						 
						
							2025-01-03 16:30:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								62318964fa 
								
							 
						 
						
							
							
								
								Update llama example information ( #12640 )  
							
							 
							
							
							
							
							Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2025-01-02 13:48:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c72a5db757 
								
							 
						 
						
							
							
								
								remove unused code again ( #12624 )  
							
							 
							
							
							
						 
						
							2024-12-27 14:17:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								90f6709486 
								
							 
						 
						
							
							
								
								remove pipeline examples ( #12626 )  
							
							 
							
							
							
						 
						
							2024-12-27 13:42:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5f04ed7254 
								
							 
						 
						
							
							
								
								[NPU] Update prompt format for baichuan2-pipeline ( #12625 )  
							
							 
							
							
							
						 
						
							2024-12-27 11:30:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								55ce091242 
								
							 
						 
						
							
							
								
								Add GLM4-Edge-V GPU example ( #12596 )  
							
							 
							
							
							
							
							* Add GLM4-Edge-V examples
* polish readme
* revert wrong changes
* polish readme
* polish readme
* little polish in reference info and indent
* Small fix and sample output updates
* Update main readme
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-12-27 09:40:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								796ee571a5 
								
							 
						 
						
							
							
								
								[NPU doc] Update verified platforms ( #12621 )  
							
							 
							
							
							
						 
						
							2024-12-26 17:39:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zijie Li 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ccc4055058 
								
							 
						 
						
							
							
								
								[NPU] Update prompt format for baichuan2 ( #12615 )  
							
							 
							
							
							
							
							* Update baichuan2.py
* style fix 
							
						 
						
							2024-12-26 11:41:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d841e1dc0d 
								
							 
						 
						
							
							
								
								[NPU] update convert script based on latest usage ( #12617 )  
							
							 
							
							
							
						 
						
							2024-12-26 11:23:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ef585d3360 
								
							 
						 
						
							
							
								
								Polish Readme for ModelScope-related examples ( #12603 )  
							
							 
							
							
							
						 
						
							2024-12-26 10:52:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b0338c5529 
								
							 
						 
						
							
							
								
								Add --modelscope option for glm-v4 MiniCPM-V-2_6 glm-edge and internvl2 ( #12583 )  
							
							 
							
							
							
							
							* Add --modelscope option for glm-v4 and MiniCPM-V-2_6
* glm-edge
* minicpm-v-2_6:don't use model_hub=modelscope when use lowbit; internvl2
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2024-12-20 13:54:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								47da3c999f 
								
							 
						 
						
							
							
								
								Add --modelscope in GPU examples for minicpm, minicpm3, baichuan2 ( #12564 )  
							
							 
							
							
							
							
							* Add --modelscope for more models
* minicpm
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2024-12-19 17:25:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								47e90a362f 
								
							 
						 
						
							
							
								
								Add --modelscope in GPU examples for glm4, codegeex2, qwen2 and qwen2.5  ( #12561 )  
							
							 
							
							
							
							
							* Add --modelscope for more models
* improve readme
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2024-12-19 10:00:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								680ea7e4a8 
								
							 
						 
						
							
							
								
								[NPU doc] Update configuration for different platforms ( #12554 )  
							
							 
							
							
							
						 
						
							2024-12-17 10:15:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xu, Shuo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ccc18eefb5 
								
							 
						 
						
							
							
								
								Add Modelscope option for chatglm3 on GPU ( #12545 )  
							
							 
							
							
							
							
							* Add Modelscope option for GPU model chatglm3
* Update readme
* Update readme
* Update readme
* Update readme
* format update
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com> 
							
						 
						
							2024-12-16 20:00:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chu,Youcheng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a86487c539 
								
							 
						 
						
							
							
								
								Add GLM-Edge GPU example ( #12483 )  
							
							 
							
							
							
							
							* feat: initial commit
* generate.py and README updates
* Update link for main readme
* Update based on comments
* Small fix
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-12-16 14:39:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jun Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0b953e61ef 
								
							 
						 
						
							
							
								
								[REFINE] graphmode code ( #12540 )  
							
							 
							
							
							
						 
						
							2024-12-16 09:17:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								caf15cc5ef 
								
							 
						 
						
							
							
								
								[NPU] Add IPEX_LLM_NPU_MTL to enable support on mtl ( #12543 )  
							
							 
							
							
							
						 
						
							2024-12-13 17:01:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d20a968ce2 
								
							 
						 
						
							
							
								
								[NPU] Fix generate example ( #12541 )  
							
							 
							
							
							
						 
						
							2024-12-13 14:07:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								fa261b8af1 
								
							 
						 
						
							
							
								
								torch 2.3 inference docker ( #12517 )  
							
							 
							
							
							
							
							* torch 2.3 inference docker
* Update README.md
* add convert code
* rename image
* remove 2.1 and add graph example
* Update README.md 
							
						 
						
							2024-12-13 10:47:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								dbaf4abcb3 
								
							 
						 
						
							
							
								
								[NPU] Update C++ example with repetition_penalty & update Python code accordingly ( #12528 )  
							
							 
							
							
							
							
							* Update c++ npu examples with repetition penalty
* Fit python with updated C++ API
* Style fix
* Small fix
* Small fix 
							
						 
						
							2024-12-12 13:42:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6fc27da9c1 
								
							 
						 
						
							
							
								
								[NPU] Update glm-edge support in docs ( #12529 )  
							
							 
							
							
							
						 
						
							2024-12-12 11:14:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ea55235cbd 
								
							 
						 
						
							
							
								
								[NPU] Support glm-edge models ( #12511 )  
							
							 
							
							
							
						 
						
							2024-12-09 14:06:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								12c78978dd 
								
							 
						 
						
							
							
								
								[NPU C++] Update example with conversation mode support ( #12510 )  
							
							 
							
							
							
						 
						
							2024-12-06 12:46:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5e1416c9aa 
								
							 
						 
						
							
							
								
								fix readme for npu cpp examples and llama.cpp ( #12505 )  
							
							 
							
							
							
							
							* fix cpp readme
* fix cpp readme
* fix cpp readme 
							
						 
						
							2024-12-05 12:32:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chu,Youcheng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ffa9a9e1b3 
								
							 
						 
						
							
							
								
								Update streaming in npu examples ( #12495 )  
							
							 
							
							
							
							
							* feat: add streaming
* Update readme accordingly
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2024-12-04 17:51:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ef4028ac2d 
								
							 
						 
						
							
							
								
								[NPU] Support split lm_head for Qwen2 with CPP ( #12491 )  
							
							 
							
							
							
							
							* Use split for Qwen2 lm_head instead of slice in optimize_pre
* Support split lm_head in Qwen2 python cpp backend
* Fit with Python acc lib pipeline
* Removed default mixed_precision=True in all-in-one and related examples
* Small fix
* Style fix
* Fix based on comments
* Fix based on comments
* Style fix 
							
						 
						
							2024-12-04 14:41:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7082844f3f 
								
							 
						 
						
							
							
								
								Fix NPU LLM example save/load tokenizer ( #12485 )  
							
							 
							
							
							
						 
						
							2024-12-03 16:30:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ab01753b1c 
								
							 
						 
						
							
							
								
								[NPU] update save-load API usage ( #12473 )  
							
							 
							
							
							
						 
						
							2024-12-03 09:46:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								aee9acb303 
								
							 
						 
						
							
							
								
								Add NPU QuickStart & update example links ( #12470 )  
							
							 
							
							
							
							
							* Add initial NPU quickstart (c++ part unfinished)
* Small update
* Update based on comments
* Update main readme
* Remove LLaMA description
* Small fix
* Small fix
* Remove subsection link in main README
* Small fix
* Update based on comments
* Small fix
* TOC update and other small fixes
* Update for Chinese main readme
* Update based on comments and other small fixes
* Change order 
							
						 
						
							2024-12-02 17:03:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c911026f03 
								
							 
						 
						
							
							
								
								[NPU C++] Update model support & examples & benchmark  ( #12466 )  
							
							 
							
							
							
						 
						
							2024-11-29 13:35:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								14d8d3d8af 
								
							 
						 
						
							
							
								
								Integrate NPU C++ implementation into ipex-llm ( #12461 )  
							
							 
							
							
							
						 
						
							2024-11-29 09:25:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d272f6b471 
								
							 
						 
						
							
							
								
								remove nf4 unsupported comment in cpu finetuning ( #12460 )  
							
							 
							
							
							
							
							Co-authored-by: Ariadne <wyn2000330@126.com> 
							
						 
						
							2024-11-28 13:26:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chu,Youcheng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ce6fcaa9ba 
								
							 
						 
						
							
							
								
								update transformers version in example of glm4 ( #12453 )  
							
							 
							
							
							
							
							* fix: update transformers version in example of glm4
* fix: textual adjustments
* fix: textual adjustment 
							
						 
						
							2024-11-27 15:02:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								effb9bb41c 
								
							 
						 
						
							
							
								
								Small update to LangChain examples readme ( #12452 )  
							
							 
							
							
							
						 
						
							2024-11-27 14:02:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chu,Youcheng 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								acd77d9e87 
								
							 
						 
						
							
							
								
								Remove env variable BIGDL_LLM_XMX_DISABLED in documentation ( #12445 )  
							
							 
							
							
							
							
							* fix: remove BIGDL_LLM_XMX_DISABLED in mddocs
* fix: remove set SYCL_CACHE_PERSISTENT=1 in example
* fix: remove BIGDL_LLM_XMX_DISABLED in workflows
* fix: merge igpu and A-series Graphics
* fix: remove set BIGDL_LLM_XMX_DISABLED=1 in example
* fix: remove BIGDL_LLM_XMX_DISABLED in workflows
* fix: merge igpu and A-series Graphics
* fix: textual adjustment
* fix: textual adjustment
* fix: textual adjustment 
							
						 
						
							2024-11-27 11:16:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f8c2bb2943 
								
							 
						 
						
							
							
								
								[NPU] optimize qwen2 prefill performance for C++ ( #12451 )  
							
							 
							
							
							
						 
						
							2024-11-27 10:46:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin, Qiao 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c2efa264d9 
								
							 
						 
						
							
							
								
								Update LangChain examples to use upstream ( #12388 )  
							
							 
							
							
							
							
							* Update LangChain examples to use upstream
* Update README and fix links
* Update LangChain CPU examples to use upstream
* Update LangChain CPU voice_assistant example
* Update CPU README
* Update GPU README
* Remove GPU Langchain vLLM example and fix comments
* Change langchain -> LangChain
* Add reference for both upstream llms and embeddings
* Fix comments
* Fix comments
* Fix comments
* Fix comments
* Fix comment 
							
						 
						
							2024-11-26 16:43:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								66bd7abae4 
								
							 
						 
						
							
							
								
								add sdxl and lora-lcm optimization ( #12444 )  
							
							 
							
							
							
							
							* add sdxl and lora-lcm optimization
* fix openjourney speed drop 
							
						 
						
							2024-11-26 11:38:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0e23bd779f 
								
							 
						 
						
							
							
								
								Add support of llama3.2 for NPU C++ ( #12442 )  
							
							 
							
							
							
							
							* initial support of llama3.2
* update
* update
* fix style
* fix style
* fix
* small fix 
							
						 
						
							2024-11-26 09:26:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b9abb8a285 
								
							 
						 
						
							
							
								
								Support qwen2.5 3B for NPU & update related examples ( #12438 )  
							
							 
							
							
							
							
							* update qwen2.5-3B
* update convert
* small fix
* replace load_in_low_bit with low_bit
* small fix 
							
						 
						
							2024-11-25 16:38:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b633fbf26c 
								
							 
						 
						
							
							
								
								add Chinese prompt troubleshooting for npu cpp examples ( #12437 )  
							
							 
							
							
							
							
							* add Chinese prompt troubleshooting
* add Chinese prompt troubleshooting 
							
						 
						
							2024-11-25 15:28:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								f41405368a 
								
							 
						 
						
							
							
								
								Support minicpm for NPU C++ ( #12434 )  
							
							 
							
							
							
							
							* support minicpm-1b
* update
* tune fused_layers
* update readme.md 
							
						 
						
							2024-11-25 10:42:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								0819fad34e 
								
							 
						 
						
							
							
								
								support Llama2-7B / Llama3-8B for NPU C++ ( #12431 )  
							
							 
							
							
							
							
							* support llama2
* update
* support fused_layers=4 for Llama2-7B 
							
						 
						
							2024-11-22 18:47:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4ffa6c752c 
								
							 
						 
						
							
							
								
								New convert support for C++ NPU ( #12430 )  
							
							 
							
							
							
							
							* initial commit
* fix
* fix style
* fix style
* fix
* fix 
							
						 
						
							2024-11-22 14:28:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								2935e97610 
								
							 
						 
						
							
							
								
								small fix of cpp readme ( #12425 )  
							
							 
							
							
							
						 
						
							2024-11-21 18:21:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinhe 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								7e0a840f74 
								
							 
						 
						
							
							
								
								add optimization to openjourney ( #12423 )  
							
							 
							
							
							
							
							* add optimization to openjourney
* add optimization to openjourney 
							
						 
						
							2024-11-21 15:23:51 +08:00