Shaojun Liu | 25e1709050 | 2025-08-14 14:52:47 +08:00
    To avoid errors caused by a Transformers version that is too new. (#13291)

Shaojun Liu | cac90a9238 | 2025-08-14 10:15:48 +08:00
    update patches (#13290)
    Signed-off-by: liu-shaojun <shaojun.liu@intel.com>

Yina Chen | 9cfdf143a2 | 2025-08-01 11:27:46 +08:00
    delete the deprecated llm win test (#13275)

Qiyuan Gong | 891e1f511b | 2025-07-30 13:58:52 +08:00
    [Doc] Add note about avoiding sourcing oneAPI for flashmoe and llama.cpp portable zip (#13274)
    * Add note about avoiding sourcing oneAPI
    * Move note ahead of cli

SheldonChen | 951c23739d | 2025-07-21 16:20:20 +08:00
    update quickstart md related to llama.cpp/ollama (#13265)
    * update quickstart md related to llama.cpp/ollama
    * update troubleshooting
    * update quickstart/troubleshooting according to RuonanWang's comments

Emmanuel Ferdman | 68c5103a0a | 2025-07-21 09:55:40 +08:00
    [NPU] Update quickstart reference (#13262)
    Fix the wrong QuickStart URLs in NPU `Save-Load/README.md`.
    Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

Jason Dai | b229e5ad60 | 2025-07-18 07:27:01 +08:00
    Update README.md (#13258)

Yina Chen | f0b600da77 | 2025-07-09 17:30:27 +08:00
    update llama.cpp version (#13251)

Ruonan Wang | 28f72123bd | 2025-07-01 09:20:46 +08:00
    update ollama version (#13244)

zxue2 | 6ba3138d7c | 2025-06-30 14:14:01 +08:00
    Fix ambiguous boolean evaluation in bert.py (#13236)
    Signed-off-by: Xue, Zhan <zhan.xue@intel.com>

Guancheng Fu | 3f6d407be4 | 2025-06-09 09:03:17 +08:00
    Fix engine.py (#13215)

Shaojun Liu | 5a629ae470 | 2025-06-06 17:20:45 +08:00
    update vllm patch (#13211)
    Co-authored-by: gc-fu <guancheng.fu@intel.com>

Guancheng Fu | ac04992278 | 2025-06-06 15:47:33 +08:00
    Update engine.py (#13209)

Ruonan Wang | dd49368e0c | 2025-06-05 17:28:21 +08:00
    only install onednn for windows when torch 2.6 (#13207)

Wang, Jian4 | 5a1c1297e1 | 2025-06-05 11:17:44 +08:00
    Fix internvl fp16 error (#13205)

Wang, Jian4 | 45864790f7 | 2025-06-05 10:15:20 +08:00
    Enable phi-4 with vision and audio (#13203)
    * add phi4
    * update
    * enable audio
    * update and add readme

Yina Chen | e032156518 | 2025-06-04 20:08:01 +08:00
    Support torch_fp8 (#13196)

Guancheng Fu | 3accc31b86 | 2025-05-30 17:13:59 +08:00
    Update 1ccl_for_multi_arc.patch (#13199)

Guancheng Fu | bb50cd0881 | 2025-05-30 09:26:53 +08:00
    Update api_server.py (#13198)

Ruonan Wang | 9df610f80d | 2025-05-26 13:21:54 +08:00
    fix trl import when not running speculative (#13187)
    * fix trl import when not running speculative
    * fix style

Shaojun Liu | c5d919b151 | 2025-05-23 15:02:50 +08:00
    update vllm patch (#13185)
    Co-authored-by: gc-fu <guancheng.fu@intel.com>

Xiangyu Tian | 531bef2810 | 2025-05-22 15:44:10 +08:00
    vLLM: Fix conver_to_half condition (#13177)
    * fix
    * format

Wang, Jian4 | e3130a06ed | 2025-05-22 15:39:27 +08:00
    Fix multimodal errors (#13178)
    * fix glm4v int4 output error
    * fix glm-4v qwen2.5-vl fp16 error
    * update

Xiangyu Tian | 154af7d7f7 | 2025-05-21 18:41:28 +08:00
    vLLM: set convert_to_half to False by default (#13172)
    * init
    * remove
    * fix

Shaojun Liu | 1576347892 | 2025-05-20 16:41:13 +08:00
    Update Dockerfile (#13168)

Wang, Jian4 | 66eb054988 | 2025-05-19 16:54:21 +08:00
    Update vllm patch (#13164)

Wang, Jian4 | d83e5068d2 | 2025-05-19 14:07:51 +08:00
    Enable whisper (#13162)
    * fix error
    * update dockerfile

Yina Chen | 8ba57b41cd | 2025-05-16 15:46:47 +08:00
    Add merge quantized qkv (#13160)
    * add merge quantized qkv
    * fix style & device
    * add check

Emmanuel Ferdman | 1e4e1353a0 | 2025-05-15 16:46:52 +08:00
    Resolve messages formatting issues (#13095)
    Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

Kai Huang | 35b49e4d91 | 2025-05-15 09:16:27 +08:00
    Add trl version in error message (#13049)
    * add version in error msg
    * fix style

Pranav Singh | bd45bf7584 | 2025-05-15 08:40:53 +08:00
    Update llama_cpp_quickstart.md (#13145)
    Signed-off-by: Pranav Singh <pranav.singh@intel.com>

Shaojun Liu | bd71739e64 | 2025-05-13 17:06:29 +08:00
    Update docs and scripts to align with new Docker image release (#13156)
    * Update vllm_docker_quickstart.md
    * Update start-vllm-service.sh
    * Update vllm_docker_quickstart.md
    * Update start-vllm-service.sh

Yina Chen | f6441b4e3d | 2025-05-13 14:50:59 +08:00
    Add moe_softmax_topk (#13157)
    * add moe_softmax_topk
    * address comments
    * update

Yuwen Hu | aa12f69bbf | 2025-05-13 13:25:22 +08:00
    Update Ollama portable zip QuickStart regarding saving VRAM (#13155)
    * Update Ollama portable zip quickstart regarding saving VRAM
    * Small fix

Jason Dai | 086a8b3ab9 | 2025-05-13 07:56:09 +08:00
    Update flashmoe_quickstart (#13154)

Xiangyu Tian | 886c7632b2 | 2025-05-12 13:44:33 +08:00
    Add IPEX_LLM_FORCE_BATCH_FORWARD for vLLM docker image (#13151)

Wang, Jian4 | 5df03ced2c | 2025-05-12 10:54:22 +08:00
    Update vllm patch for fix telechat2 and baichuan2 error (#13150)

Jason Dai | 9da1c56fa8 | 2025-05-12 10:11:22 +08:00
    Create flashmoe quickstart (#13147)

Guancheng Fu | da08c9ca60 | 2025-05-12 09:19:18 +08:00
    Update Dockerfile (#13148)

Yuwen Hu | 0438e39f3e | 2025-05-09 13:26:49 +08:00
    Add PyTorch 2.6 support in Latest Update (#13144)

Shaojun Liu | 45f7bf6688 | 2025-05-09 10:19:42 +08:00
    Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability (#13141)
    * update vllm doc
    * update image name
    * update

Ruonan Wang | f5d9c49a2a | 2025-05-09 09:20:44 +08:00
    add rotary_half_with_cache_inplaced to ipex_llm.transformers.models.common (#13143)
    * update
    * small fix

Wang, Jian4 | f2598b119e | 2025-05-07 16:59:52 +08:00
    update for bge-m3 (#13138)

SONG Ge | e88a2aa65b | 2025-05-07 16:44:58 +08:00
    Modify ollama num_ctx related doc (#13139)
    * Modify ollama num_ctx related doc
    * meet comments

Yishuo Wang | 3a28b69202 | 2025-05-07 14:03:16 +08:00
    Add qwen3 support (#13137)

Wang, Jian4 | be76918b61 | 2025-05-07 09:35:09 +08:00
    Update 083 multimodal benchmark (#13135)
    * update multimodal benchmark
    * update

Wang, Jian4 | 01bc7e9eb9 | 2025-05-06 15:47:20 +08:00
    Fix 083 lm_head error (#13132)
    * fix no quantize error
    * update
    * update style

SONG Ge | 685a749adb | 2025-04-30 16:22:42 +08:00
    Update ollama-release doc into v0.6.2 (#13094)
    * Update ollama-release doc into v0.6.2
    * update
    * revert signature changes

Xiangyu Tian | 51b41faad7 | 2025-04-30 14:40:53 +08:00
    vLLM: update vLLM XPU to 0.8.3 version (#13118)

Yuwen Hu | f66eee1d1d | 2025-04-28 15:48:17 +08:00
    Update BMG troubleshooting guides regarding PPA installation (#13119)
    * Update bmg troubleshooting guides regarding PPA installation
    * Small fix
    * Update based on comments
    * Small fix