binbin Deng | c911026f03 | [NPU C++] Update model support & examples & benchmark (#12466) | 2024-11-29 13:35:58 +08:00 |

binbin Deng | 14d8d3d8af | Integrate NPU C++ imple into ipex-llm (#12461) | 2024-11-29 09:25:37 +08:00 |

Ruonan Wang | b9abb8a285 | Support qwen2.5 3B for NPU & update related examples (#12438) | 2024-11-25 16:38:31 +08:00 |
* update qwen2.5-3B
* update convert
* small fix
* replace load_in_low_bit with low_bit
* small fix

Ruonan Wang | 3fe2ea3081 | [NPU] Reuse prefill of acc lib for pipeline (#12279) | 2024-10-28 16:05:49 +08:00 |
* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens

Jin, Qiao | 8fa98e2742 | Remove Qwen2-7b from NPU example for "Run Optimized Models (Experimental)" (#12245) | 2024-10-22 17:07:51 +08:00 |
* Remove qwen2-7b from npu example readme
* fix

Jin, Qiao | 2bedb17be7 | Add Qwen2.5 NPU Example (#12110) | 2024-09-25 15:20:03 +08:00 |
* Add Qwen2.5 NPU Example
* fix
* Merge qwen2.py and qwen2.5.py into qwen.py
* Fix description