Ruonan Wang | 0e23bd779f | Add support of llama3.2 for NPU C++ (#12442)
* initial support of llama3.2
* update
* update
* fix style
* fix style
* fix
* small fix
| 2024-11-26 09:26:55 +08:00 |
					
				
					
						
							
								
								
									 
Ruonan Wang | b9abb8a285 | Support qwen2.5 3B for NPU & update related examples (#12438)
* update qwen2.5-3B
* update convert
* small fix
* replace load_in_low_bit with low_bit
* small fix
| 2024-11-25 16:38:31 +08:00 |
					
				
					
						
							
								
								
									 
Yina Chen | b2e69a896c | [NPU] Support Baichuan groupwise & gw code refactor (#12337)
* support minicpm 1b & qwen 1.5b gw
* support minicpm 1b
* baichuan part
* update
* support minicpm 1b & qwen 1.5b gw
* support minicpm 1b
* baichuan part
* update
* update
* update
* baichuan support
* code refactor
* remove code
* fix style
* address comments
* revert
| 2024-11-08 11:42:42 +08:00 |
					
				
					
						
							
								
								
									 
binbin Deng | 812d5cc32e | [NPU L0] Support llama3.2 in L0 pipeline (#12361) | 2024-11-08 10:01:23 +08:00 |
					
				
					
						
							
								
								
									 
Yina Chen | d872639395 | [NPU] Llama3, Qwen2 1.5b, MiniCPM 1/2B groupwise support (#12327)
* support minicpm 1b & qwen 1.5b gw
* support minicpm 1b
* support minicpm 2b
* fix style & error
* fix style & update
* remove print
| 2024-11-05 15:51:31 +08:00 |
					
				
					
						
							
								
								
									 
Kai Huang | c8679ad592 | Qwen layernorm as input (#12309)
* qwen layernorm as input
* add group size
| 2024-11-04 09:51:15 +08:00 |
					
				
					
						
							
								
								
									 
binbin Deng | d409d9d0eb | [NPU L0] Update streaming mode of example (#12312) | 2024-11-01 15:38:10 +08:00 |
					
				
					
						
							
								
								
									 
binbin Deng | eda764909c | Add minicpm-2b in L0 pipeline (#12308) | 2024-11-01 09:30:01 +08:00 |
					
				
					
						
							
								
								
									 
binbin Deng | 4892df61c9 | Add qwen2-1.5b in l0 pipeline example (#12306) | 2024-10-31 16:44:25 +08:00 |
					
				
					
						
							
								
								
									 
Kai Huang | 416c19165c | Add Qwen pipeline and example (#12292)
* support qwen pipeline
* update error msg
* style
* meet review
* minor
| 2024-10-31 11:25:25 +08:00 |
					
				
					
						
							
								
								
									 
binbin Deng | 41b8064554 | Support minicpm-1B in level0 pipeline (#12297) | 2024-10-30 17:21:47 +08:00 |
					
				
					
						
							
								
								
									 
Ruonan Wang | 2b2cb9c693 | [NPU pipeline] Support save & load and update examples (#12293)
* support save & load, update llama examples
* update baichuan2 example
* update readme
| 2024-10-30 10:02:00 +08:00 |
					
				
					
						
							
								
								
									 
binbin Deng | 3feb58d1e4 | Support baichuan2 for level0 pipeline (#12289) | 2024-10-29 19:24:16 +08:00 |
					
				
					
						
							
								
								
									 
Yina Chen | 4467645088 | [NPU] Support l0 Llama groupwise (#12276)
* except lm_head
* remove
* support gw lm_head
* update
* fix
* remove run.bat
* fix style
* support llama3
| 2024-10-28 17:06:55 +08:00 |
					
				
					
						
							
								
								
									 
Ruonan Wang | 3fe2ea3081 | [NPU] Reuse prefill of acc lib for pipeline (#12279)
* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens
| 2024-10-28 16:05:49 +08:00 |
					
				
					
						
							
								
								
									 
binbin Deng | ec362e6133 | Add llama3 level0 example (#12275) | 2024-10-28 09:24:51 +08:00 |
					
				
					
						
							
								
								
									 
Ruonan Wang | 854398f6e0 | update example to reduce peak memory usage (#12274) | 2024-10-25 17:09:26 +08:00 |
					
				
					
						
							
								
								
									 
Ruonan Wang | ae57e23e4f | fix incompatibility between llama GW & llama pipeline (#12267)
* fix
* fix
| 2024-10-25 10:31:44 +08:00 |
					
				
					
						
							
								
								
									 
Ruonan Wang | 821fd96367 | Initial integrate our L0 Llama impl into ipex-llm (#12255)
* temp save
* initial support
* fix
* simplify code
* fix style
* fix example
* make default value of pipeline as False
| 2024-10-24 09:49:27 +08:00 |
					
				
					
						
							
								
								
									 
Ruonan Wang | 4d93bb81fe | Initial support of NPU level0 Model (#12177)
* first commit to support load dll and init llm pipeline
* add init generate
* fix style
* small updates
* fix style and check tokens number
| 2024-10-11 09:45:53 +08:00 |