binbin Deng | 23d3acdc77 | Add experimental support of fused decoder layer for llama2 (#11768) | 2024-08-13 14:41:36 +08:00
Zhao Changmin | f7e957aaf9 | Clean npu dtype branch (#11515) | 2024-07-05 15:45:26 +08:00
  * clean branch
  * create_npu_kernels
Zhao Changmin | 57b8adb189 | [WIP] Support npu load_low_bit method (#11502) | 2024-07-04 17:15:34 +08:00
  * npu_load_low_bit
Zhao Changmin | 6a0134a9b2 | support q4_0_rtn (#11477) | 2024-07-02 16:57:02 +08:00
  * q4_0_rtn
Zhao Changmin | cf8eb7b128 | Init NPU quantize method and support q8_0_rtn (#11452) | 2024-07-01 13:45:07 +08:00
  * q8_0_rtn
  * fix float point
Yishuo Wang | ca0e69c3a7 | optimize npu llama perf again (#11431) | 2024-06-26 10:52:54 +08:00
Yishuo Wang | 9f6e5b4fba | optimize llama npu perf (#11426) | 2024-06-25 17:43:20 +08:00
Yishuo Wang | a5e7d93242 | Add initial save/load low bit support for NPU(now only fp16 is supported) (#11359) | 2024-06-20 10:49:39 +08:00
Yishuo Wang | ae7b662ed2 | add fp16 NPU Linear support and fix intel_npu_acceleration_library version 1.0 support (#11352) | 2024-06-19 09:14:59 +08:00
Yishuo Wang | 83082e5cc7 | add initial support for intel npu acceleration library (#11347) | 2024-06-18 16:07:16 +08:00