- 
				
					
						
						
							
							
						
						
							ddcdf47539
					
					
						Support Windows ARL release (#12183)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-10-11 18:30:52 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							f983f1a8f4
					
					
						Add Qwen2-VL gpu example (#12135)
					
					
						
					
					
						
						
							
							Jinhe
						
					
					2024-10-11 18:25:23 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							310f18c8af
					
					
						update NPU pipeline generate (#12182)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-10-11 17:39:20 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							1daab4531f
					
					
						Upgrade oneccl to 0.0.4 in serving-xpu image (#12185)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-10-11 16:54:50 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							724b2ae66d
					
					
						add npu-level0 pipeline.dll to ipex-llm (#12181)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-10-11 16:05:20 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							4d93bb81fe
					
					
						Initial support of NPU level0 Model (#12177)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-10-11 09:45:53 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							ac44e98b7d
					
					
						Update Windows guide regarding LNL support (#12178)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-10-11 09:20:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							0ef7e1d101
					
					
						fix vllm docs (#12176)
					
					
						
					
					
						
						
							
							Guancheng Fu
						
					
					2024-10-10 15:44:36 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							890662610b
					
					
						Fix auto importer for LNL release (#12175)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-10-10 15:17:43 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							535bee5381
					
					
						fix qwen2 vl again (#12174)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-10-10 13:50:01 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							aef1f671bd
					
					
						Support LNL Windows release (#12169)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-10-09 17:41:10 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							78d253165d
					
					
						optimize qwen2 vl perf again (#12167)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-10-09 16:43:48 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							412cf8e20c
					
					
						[UPDATE] update mddocs/DockerGuides/vllm_docker_quickstart.md (#12166)
					
					
						
					
					
						
						
							
							Jun Wang
						
					
					2024-10-09 11:19:32 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							3d044dbf53
					
					
						add llama3.2-vision Pytorch example (#12165)
					
					
						
					
					
						
						
							
							Zijie Li
						
					
					2024-10-08 21:20:42 -0400
				
			 
		
			- 
				
					
						
						
							
							
						
						
							e2ef9e938e
					
					
						Delete deprecated docs/readthedocs directory (#12164)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-10-08 14:48:02 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							644af2a76e
					
					
						add basic llama 3.2 vision support (#12163)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-10-08 10:46:48 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							9b75806d14
					
					
						Update Windows GPU quickstart regarding demo (#12124)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-29 18:08:49 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							17c23cd759
					
					
						add llama3.2 GPU example (#12137)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-29 14:41:54 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							f71b38a994
					
					
						Update MiniCPM_V_26 GPU example with save & load (#12127)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-26 17:40:22 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							669ff1a97b
					
					
						fix sd1.5 (#12129)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-26 17:15:16 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							a266528719
					
					
						optimize llama 3.2 rope (#12128)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-26 16:08:10 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							584c3489e7
					
					
						add basic support for llama3.2 (#12125)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-26 15:46:19 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							66f419f8b7
					
					
						fix qwen2 vl (#12126)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-26 15:44:02 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							2ea13d502f
					
					
						Add minicpm3 gpu example (#12114)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-26 13:51:37 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							77af9bc5fa
					
					
						support passing None to low_bit in optimize_model (#12121)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-26 11:09:35 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							47e0b83cbf
					
					
						optimize sd 1.5 (#12119)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-25 15:45:13 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							2bedb17be7
					
					
						Add Qwen2.5 NPU Example (#12110)
					
					
						
					
					
						
						
							
							Jin, Qiao
						
					
					2024-09-25 15:20:03 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							657889e3e4
					
					
						use english prompt by default (#12115)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-24 17:40:50 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							5d63aef60b
					
					
						optimize qwen2 vl again (#12109)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-23 13:22:01 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							03bd01c99c
					
					
						optimize npu qwen2 (#12107)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-20 04:46:16 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							02399021d6
					
					
						add npu load_low_bit api in all-in-one benchmark (#12103)
					
					
						
					
					
						
						
							
							Jinhe
						
					
					2024-09-20 17:56:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							47a9597f24
					
					
						Add missing link for Qwen2.5 to CN-ZH readme (#12106)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-20 17:30:30 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							9239fd4f12
					
					
						add basic support and optimization for qwen2-vl (#12104)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-20 17:23:06 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							828fa01ad3
					
					
						[NPU] Add mixed_precision for Qwen2 7B (#12098)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-20 16:36:21 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							2269768e71
					
					
						add internvl2 example (#12102)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-20 16:31:54 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							ad1fe77fe6
					
					
						Add language switching (#12096)
					
					
						
					
					
						
						
							
							joan726
						
					
					2024-09-20 16:05:20 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							09b8c80d9d
					
					
						update code for NPU qwen2 (#12094)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-20 00:58:32 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							db7500bfd4
					
					
						Add Qwen2.5 GPU example (#12101)
					
					
						
					
					
						
						
							
							Jin, Qiao
						
					
					2024-09-20 15:55:57 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							b36359e2ab
					
					
						Fix xpu serving image oneccl (#12100)
					
					
						
					
					
						
						
							
							Guancheng Fu
						
					
					2024-09-20 15:25:41 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							54b973c744
					
					
						fix ipex_llm import in transformers 4.45 (#12099)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-20 15:24:59 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							a6cbc01911
					
					
						Use new oneccl for ipex-llm serving image (#12097)
					
					
						
					
					
						
						
							
							Guancheng Fu
						
					
					2024-09-20 14:52:49 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							1295898830
					
					
						update vllm_online_benchmark script to support long input (#12095)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-20 14:18:30 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							9650bf616a
					
					
						add transpose_value_cache for NPU benchmark (#12092)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-19 18:45:05 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							f7fb3c896c
					
					
						Update lm_head optimization for Qwen2 7B (#12090)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-18 17:02:02 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							ee33b93464
					
					
						Longbench: NV code to ipex-llm (#11662)
					
					
						
					
					
						
						
							
							Xu, Shuo
						
					
					2024-09-18 15:55:14 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							40e463c66b
					
					
						Enable vllm load gptq model (#12083)
					
					
						
					
					
						
						
							
							Wang, Jian4
						
					
					2024-09-18 14:41:00 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							c2774e1a43
					
					
						Update oneccl to 0.0.3 in serving-xpu image (#12088)
					
					
						
					
					
						
						
							
							Xiangyu Tian
						
					
					2024-09-18 14:29:17 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							081af41def
					
					
						[NPU] Optimize Qwen2 lm_head to use INT4 (#12072)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-14 00:26:46 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							18714ceac7
					
					
						Update README.md (#12084)
					
					
						
					
					
						
						
							
							joan726
						
					
					2024-09-14 15:24:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							b4b8c3e495
					
					
						add lowbit_path for generate.py, fix npu_model (#12077)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-13 17:28:05 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							d703e4f127
					
					
						Enable vllm multimodal minicpm-v-2-6 (#12074)
					
					
						
					
					
						
						
							
							Wang, Jian4
						
					
					2024-09-13 13:28:35 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							a767438546
					
					
						fix typo (#12076)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-12 20:44:42 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							3f0b24ae2b
					
					
						update cpp quickstart (#12075)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-12 20:35:32 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							9b4fee8b5b
					
					
						disable nightly release for finetune images (#12070)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-12 15:10:50 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							beb876665d
					
					
						pin gradio version to fix connection error (#12069)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-12 14:36:09 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							48d9092b5a
					
					
						upgrade OneAPI version for cpp Windows (#12063)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-11 20:12:12 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							e78e45ee01
					
					
						update NPU readme: run conhost as administrator (#12066)
					
					
						
					
					
						
						
							
							Jinhe
						
					
					2024-09-11 17:54:04 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							4ca330da15
					
					
						Fix NPU load error message and add minicpm npu lowbit feat (#12064)
					
					
						
					
					
						
						
							
							Jinhe
						
					
					2024-09-11 16:56:35 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							32e8362da7
					
					
						added minicpm cpu examples (#12027)
					
					
						
					
					
						
						
							
							Jinhe
						
					
					2024-09-11 15:51:21 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							a0c73c26d8
					
					
						clean NPU code (#12060)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-11 00:10:35 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							c75f3dd874
					
					
						vllm no padding glm4 to avoid nan error (#12062)
					
					
						
					
					
						
						
							
							Wang, Jian4
						
					
					2024-09-11 13:44:40 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							649390c464
					
					
						fix: textual and env variable adjustment (#12038)
					
					
						
					
					
						
						
							
							Chu,Youcheng
						
					
					2024-09-11 13:38:01 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							c94032f97e
					
					
						Try to fix llamaindex ut again (#12061)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-11 12:11:04 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							7e1e51d91a
					
					
						Update vllm setting (#12059)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-11 11:45:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							30a8680645
					
					
						Update for vllm one card padding (#12058)
					
					
						
					
					
						
						
							
							Wang, Jian4
						
					
					2024-09-11 10:52:55 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							c5fdfde1bd
					
					
						fix npu-model prompt (#12057)
					
					
						
					
					
						
						
							
							Zijie Li
						
					
					2024-09-11 10:06:45 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							94dade9aca
					
					
						Fix UT of ipex_llm.llamaindex (#12055)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-11 09:58:43 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							52863dd567
					
					
						fix vllm_online_benchmark.py (#12056)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-11 09:45:30 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							d8c044e79d
					
					
						optimize minicpm3 kv cache (#12052)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-10 16:51:21 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							5d3ab16a80
					
					
						Add vllm glm and baichuan padding (#12053)
					
					
						
					
					
						
						
							
							Wang, Jian4
						
					
					2024-09-10 15:57:28 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							69c8d36f16
					
					
						Switching from vLLM v0.3.3 to vLLM 0.5.4 (#12042)
					
					
						
					
					
						
						
							
							Guancheng Fu
						
					
					2024-09-10 15:37:43 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							73a4360f3f
					
					
						update lowbit path for baichuan2, qwen2, generate.py (#12051)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-10 15:35:24 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							dc4af02b2a
					
					
						Fix qwen2 1.5B NPU load error (#12049)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-09 23:41:18 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							abc370728c
					
					
						optimize minicpm3 again (#12047)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-10 14:19:57 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							f0061a9916
					
					
						remove local import os to fix Baichuan NPU load issue (#12044)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-10 14:13:24 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							640998edea
					
					
						update inter_pp of qwen2 (#12041)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-09 19:34:17 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							048b4590aa
					
					
						add basic minicpm3 optimization (#12039)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-09 17:25:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							16c658e732
					
					
						LLM: add known issues to harness evaluation (#12036)
					
					
						
					
					
						
						
							
							Chu,Youcheng
						
					
					2024-09-09 14:15:42 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							6cedb601e4
					
					
						remove some useless code (#12035)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-06 17:51:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							d2e1b9aaff
					
					
						Add input padding during prefill for qwen2-7b (#12033)
					
					
						
					
					
						
						
							
							binbin Deng
						
					
					2024-09-06 16:39:59 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							f61b1785fb
					
					
						Small update to NPU example readme (#12034)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-06 15:54:23 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							0d04531ae0
					
					
						update NPU readme of Qwen2 (#12032)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-06 00:02:39 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							58555bd9de
					
					
						Optimize broadcast for npu llama (#12028)
					
					
						
					
					
						
						
							
							Yang Wang
						
					
					2024-09-05 22:28:20 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							e5581e6ded
					
					
						Select the Appropriate APT Repository Based on CPU Type (#12023)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-05 17:06:07 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							5b18bb3c4a
					
					
						Add recommend version for mtl npu (#12024)
					
					
						
					
					
						
						
							
							binbin Deng
						
					
					2024-09-05 16:28:53 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							845e5dc89e
					
					
						Support lm_head of minicpm-2b on NPU (#12019)
					
					
						
					
					
						
						
							
							binbin Deng
						
					
					2024-09-05 16:19:22 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							820f8a4554
					
					
						add --lowbit-path option for NPU llama example (#12020)
					
					
						
					
					
						
						
							
							Ch1y0q
						
					
					2024-09-05 15:31:01 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							8803242f5c
					
					
						fix llama on cpu (#12018)
					
					
						
					
					
						
						
							
							Guoqiong Song
						
					
					2024-09-04 19:17:54 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							b3b2cd64b4
					
					
						Support lightweight-serving glm-4v-9b (#11994)
					
					
						
					
					
						
						
							
							Wang, Jian4
						
					
					2024-09-05 09:25:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							75b19f8522
					
					
						revert actions/download-artifact version to 3 (#12017)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-04 22:39:07 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							c6348a4666
					
					
						Update action.yml (#12016)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-04 22:12:24 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							b1408a1f1c
					
					
						fix UT (#12005)
					
					
						
					
					
						
						
							
							Yishuo Wang
						
					
					2024-09-04 18:02:49 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							77cb348220
					
					
						fix dependabot alerts (#12006)
					
					
						
					
					
						
						
							
							Shaojun Liu
						
					
					2024-09-04 17:13:45 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							2b993ad479
					
					
						vllm update for glm-4 model automatic not_convert (#12003)
					
					
						
					
					
						
						
							
							Wang, Jian4
						
					
					2024-09-04 13:50:32 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							9eaff5e47d
					
					
						add save & load support for NPU optimized model (#11999)
					
					
						
					
					
						
						
							
							Ruonan Wang
						
					
					2024-09-03 05:53:22 -0700
				
			 
		
			- 
				
					
						
						
							
							
						
						
							6eb55653ba
					
					
						Performance mode strategy update for input_embeds input (#11997)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-03 17:46:16 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							164f47adbd
					
					
						MiniCPM-V-2 & MiniCPM-Llama3-V-2_5 example updates (#11988)
					
					
						
					
					
						
						
							
							Jinhe
						
					
					2024-09-03 17:02:06 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							2e54f4402b
					
					
						Rename MiniCPM-V-2_6 CPU example (#11998)
					
					
						
					
					
						
						
							
							Jin, Qiao
						
					
					2024-09-03 16:50:42 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							643458d8f0
					
					
						Update GraphRAG QuickStart (#11995)
					
					
						
					
					
						
						
							
							Yuwen Hu
						
					
					2024-09-03 15:52:08 +0800
				
			 
		
			- 
				
					
						
						
							
							
						
						
							01099f08ee
					
					
						Revert prefill logic of qwen2-7b (#11992)
					
					
						
					
					
						
						
							
							binbin Deng
						
					
					2024-09-03 14:45:01 +0800