zxue2 | 6ba3138d7c | Fix ambiguous boolean evaluation in bert.py (#13236) | 2025-06-30 14:14:01 +08:00
    Signed-off-by: Xue, Zhan <zhan.xue@intel.com>
Guancheng Fu | 3f6d407be4 | Fix engine.py (#13215) | 2025-06-09 09:03:17 +08:00
Shaojun Liu | 5a629ae470 | update vllm patch (#13211) | 2025-06-06 17:20:45 +08:00
    Co-authored-by: gc-fu <guancheng.fu@intel.com>
Guancheng Fu | ac04992278 | Update engine.py (#13209) | 2025-06-06 15:47:33 +08:00
Ruonan Wang | dd49368e0c | only install onednn for windows when torch 2.6 (#13207) | 2025-06-05 17:28:21 +08:00
Wang, Jian4 | 5a1c1297e1 | Fix internvl fp16 error (#13205) | 2025-06-05 11:17:44 +08:00
Wang, Jian4 | 45864790f7 | Enable phi-4 with vision and audio (#13203) | 2025-06-05 10:15:20 +08:00
    * add phi4
    * update
    * enable audio
    * update and add readme
Yina Chen | e032156518 | Support torch_fp8 (#13196) | 2025-06-04 20:08:01 +08:00
    * support torch_fp8
Guancheng Fu | 3accc31b86 | Update 1ccl_for_multi_arc.patch (#13199) | 2025-05-30 17:13:59 +08:00
Guancheng Fu | bb50cd0881 | Update api_server.py (#13198) | 2025-05-30 09:26:53 +08:00
Ruonan Wang | 9df610f80d | fix trl import when not running speculative (#13187) | 2025-05-26 13:21:54 +08:00
    * fix trl import when not running speculative
    * fix style
Shaojun Liu | c5d919b151 | update vllm patch (#13185) | 2025-05-23 15:02:50 +08:00
    Co-authored-by: gc-fu <guancheng.fu@intel.com>
Xiangyu Tian | 531bef2810 | vLLM: Fix conver_to_half condition (#13177) | 2025-05-22 15:44:10 +08:00
    * fix
    * format
Wang, Jian4 | e3130a06ed | Fix multimodal errors (#13178) | 2025-05-22 15:39:27 +08:00
    * fix glm4v int4 output error
    * fix glm-4v qwen2.5-vl fp16 error
    * update
Xiangyu Tian | 154af7d7f7 | vLLM: set convert_to_half to False by default (#13172) | 2025-05-21 18:41:28 +08:00
    * init
    * remove
    * fix
Shaojun Liu | 1576347892 | Update Dockerfile (#13168) | 2025-05-20 16:41:13 +08:00
Wang, Jian4 | 66eb054988 | Update vllm patch (#13164) | 2025-05-19 16:54:21 +08:00
Wang, Jian4 | d83e5068d2 | Enable whisper (#13162) | 2025-05-19 14:07:51 +08:00
    * fix error
    * update dockerfile
Yina Chen | 8ba57b41cd | Add merge quantized qkv (#13160) | 2025-05-16 15:46:47 +08:00
    * add merge quantized qkv
    * fix style & device
    * add check
Emmanuel Ferdman | 1e4e1353a0 | Resolve messages formatting issues (#13095) | 2025-05-15 16:46:52 +08:00
    Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Kai Huang | 35b49e4d91 | Add trl version in error message (#13049) | 2025-05-15 09:16:27 +08:00
    * add version in error msg
    * fix style
Pranav Singh | bd45bf7584 | Update llama_cpp_quickstart.md (#13145) | 2025-05-15 08:40:53 +08:00
    Signed-off-by: Pranav Singh <pranav.singh@intel.com>
Shaojun Liu | bd71739e64 | Update docs and scripts to align with new Docker image release (#13156) | 2025-05-13 17:06:29 +08:00
    * Update vllm_docker_quickstart.md
    * Update start-vllm-service.sh
    * Update vllm_docker_quickstart.md
    * Update start-vllm-service.sh
Yina Chen | f6441b4e3d | Add moe_softmax_topk (#13157) | 2025-05-13 14:50:59 +08:00
    * add moe_softmax_topk
    * address comments
    * update
Yuwen Hu | aa12f69bbf | Update Ollama portable zip QuickStart regarding saving VRAM (#13155) | 2025-05-13 13:25:22 +08:00
    * Update Ollama portable zip quickstart regarding saving VRAM
    * Small fix
Jason Dai | 086a8b3ab9 | Update flashmoe_quickstart (#13154) | 2025-05-13 07:56:09 +08:00
Xiangyu Tian | 886c7632b2 | Add IPEX_LLM_FORCE_BATCH_FORWARD for vLLM docker image (#13151) | 2025-05-12 13:44:33 +08:00
Wang, Jian4 | 5df03ced2c | Update vllm patch for fix telechat2 and baichuan2 error(#13150) | 2025-05-12 10:54:22 +08:00
Jason Dai | 9da1c56fa8 | Create flashmoe quickstart (#13147) | 2025-05-12 10:11:22 +08:00
Guancheng Fu | da08c9ca60 | Update Dockerfile (#13148) | 2025-05-12 09:19:18 +08:00
Yuwen Hu | 0438e39f3e | Add PyTorch 2.6 support in Latest Update (#13144) | 2025-05-09 13:26:49 +08:00
Shaojun Liu | 45f7bf6688 | Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability (#13141) | 2025-05-09 10:19:42 +08:00
    * update vllm doc
    * update image name
    * update
    * update
    * update
    * update
Ruonan Wang | f5d9c49a2a | add rotary_half_with_cache_inplaced to ipex_llm.transformers.models.common (#13143) | 2025-05-09 09:20:44 +08:00
    * update
    * small fix
Wang, Jian4 | f2598b119e | update for bge-m3 (#13138) | 2025-05-07 16:59:52 +08:00
SONG Ge | e88a2aa65b | Modify ollama num_ctx related doc (#13139) | 2025-05-07 16:44:58 +08:00
    * Modify ollama num_ctx related doc
    * meet comments
Yishuo Wang | 3a28b69202 | Add qwen3 support (#13137) | 2025-05-07 14:03:16 +08:00
Wang, Jian4 | be76918b61 | Update 083 multimodal benchmark (#13135) | 2025-05-07 09:35:09 +08:00
    * update multimodal benchmark
    * update
Wang, Jian4 | 01bc7e9eb9 | Fix 083 lm_head error (#13132) | 2025-05-06 15:47:20 +08:00
    * fix no quantize error
    * update
    * update style
SONG Ge | 685a749adb | Update ollama-release doc into v0.6.2 (#13094) | 2025-04-30 16:22:42 +08:00
    * Update ollama-release doc into v0.6.2
    * update
    * revert signature changes
Xiangyu Tian | 51b41faad7 | vLLM: update vLLM XPU to 0.8.3 version (#13118) | 2025-04-30 14:40:53 +08:00
    vLLM: update vLLM XPU to 0.8.3 version
Yuwen Hu | f66eee1d1d | Update BMG troubleshooting guides regarding PPA installation (#13119) | 2025-04-28 15:48:17 +08:00
    * Update bmg troubleshooting guides regarding PPA installation
    * Small fix
    * Update based on comments
    * Small fix
Jason Dai | ad741503a9 | Update bmg_quickstart.md (#13117) | 2025-04-27 22:03:14 +08:00
Jason Dai | 6b033f8982 | Update readme (#13116) | 2025-04-27 18:18:19 +08:00
Guancheng Fu | d222eaffd7 | Update README.md (#13113) | 2025-04-27 17:13:18 +08:00
Wang, Jian4 | 16fa778e65 | enable glm4v and gemma-3 on vllm 083 (#13114) | 2025-04-27 17:10:56 +08:00
    * enable glm4v and gemma-3
    * update
    * add qwen2.5-vl
Guancheng Fu | cf97d8f1d7 | Update start-vllm-service.sh (#13109) | 2025-04-25 15:42:15 +08:00
Ruonan Wang | 9808fb1ac2 | update doc about flash-moe (#13103) | 2025-04-24 17:53:14 +08:00
    * update doc about flashmoe
    * revert toc
    * meet review, add version note
    * small fix
Guancheng Fu | 0cfdd399e7 | Update README.md (#13104) | 2025-04-24 10:21:17 +08:00
Yishuo Wang | 908fdb982e | small refactor and fix (#13101) | 2025-04-22 14:45:31 +08:00
Guancheng Fu | 14cd613fe1 | Update vLLM docs with some new features (#13092) | 2025-04-22 14:39:28 +08:00
    * done
    * fix
    * done
    * Update README.md