-
25e1709050
To avoid errors caused by a Transformers version that is too new. (#13291)
main
Shaojun Liu
2025-08-14 14:52:47 +0800
-
cac90a9238
update patches (#13290)
Shaojun Liu
2025-08-14 10:15:48 +0800
-
9cfdf143a2
delete the deprecated llm win test (#13275)
Yina Chen
2025-08-01 11:27:46 +0800
-
891e1f511b
[Doc] Add note about avoiding sourcing oneAPI for flashmoe and llama.cpp portable zip (#13274)
Qiyuan Gong
2025-07-30 13:58:52 +0800
-
951c23739d
update quickstart md related to llama.cpp/ollama (#13265)
SheldonChen
2025-07-21 16:20:20 +0800
-
68c5103a0a
[NPU] Update quickstart reference (#13262)
Emmanuel Ferdman
2025-07-21 04:55:40 +0300
-
b229e5ad60
Update README.md (#13258)
Jason Dai
2025-07-18 07:27:01 +0800
-
f0b600da77
update llama.cpp version (#13251)
Yina Chen
2025-07-09 17:30:27 +0800
-
28f72123bd
update ollama version (#13244)
Ruonan Wang
2025-07-01 09:20:46 +0800
-
6ba3138d7c
Fix ambiguous boolean evaluation in bert.py (#13236)
zxue2
2025-06-30 14:14:01 +0800
-
3f6d407be4
Fix engine.py (#13215)
Guancheng Fu
2025-06-09 09:03:17 +0800
-
5a629ae470
update vllm patch (#13211)
Shaojun Liu
2025-06-06 17:20:45 +0800
-
ac04992278
Update engine.py (#13209)
Guancheng Fu
2025-06-06 15:47:33 +0800
-
dd49368e0c
only install onednn for windows when torch 2.6 (#13207)
Ruonan Wang
2025-06-05 17:28:21 +0800
-
5a1c1297e1
Fix internvl fp16 error (#13205)
Wang, Jian4
2025-06-05 11:17:44 +0800
-
45864790f7
Enable phi-4 with vision and audio (#13203)
Wang, Jian4
2025-06-05 10:15:20 +0800
-
e032156518
Support torch_fp8 (#13196)
Yina Chen
2025-06-04 20:08:01 +0800
-
3accc31b86
Update 1ccl_for_multi_arc.patch (#13199)
Guancheng Fu
2025-05-30 17:13:59 +0800
-
bb50cd0881
Update api_server.py (#13198)
Guancheng Fu
2025-05-30 09:26:53 +0800
-
9df610f80d
fix trl import when not running speculative (#13187)
Ruonan Wang
2025-05-26 13:21:54 +0800
-
c5d919b151
update vllm patch (#13185)
Shaojun Liu
2025-05-23 15:02:50 +0800
-
531bef2810
vLLM: Fix convert_to_half condition (#13177)
Xiangyu Tian
2025-05-22 15:44:10 +0800
-
e3130a06ed
Fix multimodal errors (#13178)
Wang, Jian4
2025-05-22 15:39:27 +0800
-
154af7d7f7
vLLM: set convert_to_half to False by default (#13172)
Xiangyu Tian
2025-05-21 18:41:28 +0800
-
1576347892
Update Dockerfile (#13168)
Shaojun Liu
2025-05-20 16:41:13 +0800
-
66eb054988
Update vllm patch (#13164)
Wang, Jian4
2025-05-19 16:54:21 +0800
-
d83e5068d2
Enable whisper (#13162)
Wang, Jian4
2025-05-19 14:07:51 +0800
-
8ba57b41cd
Add merge quantized qkv (#13160)
Yina Chen
2025-05-16 15:46:47 +0800
-
1e4e1353a0
Resolve messages formatting issues (#13095)
Emmanuel Ferdman
2025-05-15 11:46:52 +0300
-
35b49e4d91
Add trl version in error message (#13049)
Kai Huang
2025-05-15 09:16:27 +0800
-
bd45bf7584
Update llama_cpp_quickstart.md (#13145)
Pranav Singh
2025-05-15 06:10:53 +0530
-
bd71739e64
Update docs and scripts to align with new Docker image release (#13156)
Shaojun Liu
2025-05-13 17:06:29 +0800
-
f6441b4e3d
Add moe_softmax_topk (#13157)
Yina Chen
2025-05-13 14:50:59 +0800
-
aa12f69bbf
Update Ollama portable zip QuickStart regarding saving VRAM (#13155)
Yuwen Hu
2025-05-13 13:25:22 +0800
-
086a8b3ab9
Update flashmoe_quickstart (#13154)
Jason Dai
2025-05-13 07:56:09 +0800
-
886c7632b2
Add IPEX_LLM_FORCE_BATCH_FORWARD for vLLM docker image (#13151)
Xiangyu Tian
2025-05-12 13:44:33 +0800
-
5df03ced2c
Update vllm patch to fix telechat2 and baichuan2 error (#13150)
Wang, Jian4
2025-05-12 10:54:22 +0800
-
9da1c56fa8
Create flashmoe quickstart (#13147)
Jason Dai
2025-05-12 10:11:22 +0800
-
da08c9ca60
Update Dockerfile (#13148)
Guancheng Fu
2025-05-12 09:19:18 +0800
-
0438e39f3e
Add PyTorch 2.6 support in Latest Update (#13144)
Yuwen Hu
2025-05-09 13:26:49 +0800
-
45f7bf6688
Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability (#13141)
Shaojun Liu
2025-05-09 10:19:42 +0800
-
f5d9c49a2a
add rotary_half_with_cache_inplaced to ipex_llm.transformers.models.common (#13143)
Ruonan Wang
2025-05-09 09:20:44 +0800
-
f2598b119e
update for bge-m3 (#13138)
Wang, Jian4
2025-05-07 16:59:52 +0800
-
e88a2aa65b
Modify ollama num_ctx related doc (#13139)
SONG Ge
2025-05-07 16:44:58 +0800
-
3a28b69202
Add qwen3 support (#13137)
Yishuo Wang
2025-05-07 14:03:16 +0800
-
be76918b61
Update 083 multimodal benchmark (#13135)
Wang, Jian4
2025-05-07 09:35:09 +0800
-
01bc7e9eb9
Fix 083 lm_head error (#13132)
Wang, Jian4
2025-05-06 15:47:20 +0800
-
685a749adb
Update ollama-release doc into v0.6.2 (#13094)
SONG Ge
2025-04-30 16:22:42 +0800
-
51b41faad7
vLLM: update vLLM XPU to 0.8.3 version (#13118)
Xiangyu Tian
2025-04-30 14:40:53 +0800
-
f66eee1d1d
Update BMG troubleshooting guides regarding PPA installation (#13119)
Yuwen Hu
2025-04-28 15:48:17 +0800
-
ad741503a9
Update bmg_quickstart.md (#13117)
Jason Dai
2025-04-27 22:03:14 +0800
-
6b033f8982
Update readme (#13116)
Jason Dai
2025-04-27 18:18:19 +0800
-
d222eaffd7
Update README.md (#13113)
Guancheng Fu
2025-04-27 17:13:18 +0800
-
16fa778e65
enable glm4v and gemma-3 on vllm 083 (#13114)
Wang, Jian4
2025-04-27 17:10:56 +0800
-
cf97d8f1d7
Update start-vllm-service.sh (#13109)
Guancheng Fu
2025-04-25 15:42:15 +0800
-
9808fb1ac2
update doc about flash-moe (#13103)
Ruonan Wang
2025-04-24 17:53:14 +0800
-
0cfdd399e7
Update README.md (#13104)
Guancheng Fu
2025-04-24 10:21:17 +0800
-
908fdb982e
small refactor and fix (#13101)
Yishuo Wang
2025-04-22 14:45:31 +0800
-
14cd613fe1
Update vLLM docs with some new features (#13092)
Guancheng Fu
2025-04-22 14:39:28 +0800
-
0801d27a6f
Remove PyTorch 2.3 support for Intel GPU (#13097)
Yuwen Hu
2025-04-22 10:26:16 +0800
-
a2a35fdfad
Update portable zip link (#13098)
Yina Chen
2025-04-21 17:25:35 +0800
-
2f78afcd2a
Refactor some functions to ipex_llm.transformers.models.common (#13091)
Ruonan Wang
2025-04-18 11:15:43 +0800
-
73198d5b80
Update to b17 image (#13085)
Shaojun Liu
2025-04-17 16:18:22 +0800
-
db5edba786
Update Dockerfile (#13081)
Shaojun Liu
2025-04-16 09:18:46 +0800
-
fa56212bb3
Update vLLM patch (#13079)
Shaojun Liu
2025-04-15 16:55:29 +0800
-
f5aaa83649
Update serving-xpu Dockerfile (#13077)
Shaojun Liu
2025-04-15 13:34:14 +0800
-
cfadf3f2f7
upgrade linux-libc-dev to fix CVEs (#13076)
Shaojun Liu
2025-04-15 11:43:53 +0800
-
e08c6bd018
Fix several models based on sdp api change (#13075)
Ruonan Wang
2025-04-15 11:13:12 +0800
-
7826152f5a
update vllm patch (#13072)
Shaojun Liu
2025-04-14 14:56:10 +0800
-
10c30cdba9
set woq_int4 as default int4 (#13021)
Yishuo Wang
2025-04-14 14:10:59 +0800
-
6693e8ab04
Deepseek kv / sdp support (#13068)
Ruonan Wang
2025-04-11 11:26:15 +0800
-
3ee6dec0f8
update vllm patch (#13064)
Guancheng Fu
2025-04-10 15:03:37 +0800
-
1d7f4a83ac
Update documentation to build Docker image from Dockerfile instead of pulling from registry (#13057)
Shaojun Liu
2025-04-09 16:40:20 +0800
-
cd0d4857b8
ipex-llm 2.2.0 post-release update (#13053)
Yuwen Hu
2025-04-07 17:41:22 +0800
-
ef852dcb4a
add audio optimization for qwen2.5-omni (#13037)
Yishuo Wang
2025-04-07 17:20:26 +0800
-
7548c12b2c
Update portable zip QuickStart regarding signature verification (#13050)
Yuwen Hu
2025-04-07 13:34:00 +0800
-
33ae52d083
Small doc fix (#13045)
Yuwen Hu
2025-04-03 17:35:22 +0800
-
3cb718d715
Small updates to Ollama portable zip quickstart (#13043)
Yuwen Hu
2025-04-03 17:18:22 +0800
-
b73728c7ce
Small updates to Ollama portable zip Quickstart (#13040)
Yuwen Hu
2025-04-02 18:44:36 +0800
-
4427012672
Link updates to pytorch 2.6 quickstart (#13032)
Yuwen Hu
2025-04-01 10:35:22 +0800
-
633d1c72e7
Add PyTorch 2.6 QuickStart for Intel GPU (#13024)
Yuwen Hu
2025-04-01 10:21:38 +0800
-
34b1b14225
vLLM: Fix vLLM CPU dockerfile to resolve cmake deprecated issue (#13026)
Xiangyu Tian
2025-03-31 16:09:25 +0800
-
300eb01d98
Add basic optimization for Qwen2.5 omni (#13022)
Yishuo Wang
2025-03-28 17:21:52 +0800
-
61c2e9c271
Refactor docker image by applying patch method (#13011)
Guancheng Fu
2025-03-28 08:13:50 +0800
-
7809ca9864
Reuse --privileged (#13015)
Wang, Jian4
2025-03-27 10:00:50 +0800
-
f437b36678
Fix vllm glm edge model (#13007)
Guancheng Fu
2025-03-26 09:25:32 +0800
-
374747b492
Update bert optimization to fit higher transformers/torch version (#13006)
Yuwen Hu
2025-03-25 16:12:03 +0800
-
27d669210f
remove fschat in EAGLE example (#13005)
Ruonan Wang
2025-03-25 15:48:48 +0800
-
08f96a5139
Rename LICENSE-Intel®-OpenMP*-Runtime-Library.txt to LICENSE-Intel®-OpenMP-Runtime-Library.txt (#13002)
Shaojun Liu
2025-03-25 10:07:55 +0800
-
0e0786a63c
update llama.cpp related quickstart with rebased llama.cpp (#12996)
Ruonan Wang
2025-03-25 09:49:39 +0800
-
7a86dd0569
Remove unused Gradio (#12995)
Shaojun Liu
2025-03-24 10:51:06 +0800
-
46a4f53967
OSPDT: add tpp licenses for release 2.2.0 (#12840)
Shaojun Liu
2025-03-21 15:52:22 +0800
-
5bdf57327d
Remove ipex import in fastchat loader (#12984)
Yuwen Hu
2025-03-20 18:29:00 +0800
-
6f634b41da
Update model support list regarding Gemma3 for Ollama portable zip QuickStart (#12979)
Yuwen Hu
2025-03-19 11:16:45 +0800
-
dd026db50b
Add SNC to llama.cpp portable zip quick start (#12972)
Qiyuan Gong
2025-03-17 10:58:06 +0800
-
b0d56273a8
Fix Docker build failure due to outdated ipex-llm pip index URL (#12977)
Shaojun Liu
2025-03-17 10:46:01 +0800
-
760abc47aa
Fix Docker build failure due to outdated ipex-llm pip index URL (#12976)
Shaojun Liu
2025-03-17 09:50:09 +0800
-
03c9024209
Update README (#12973)
Jason Dai
2025-03-14 19:04:10 +0800
-
6a7819f1ac
Update portable zip related quickstart regarding recommended driver (#12970)
Yuwen Hu
2025-03-14 16:34:24 +0800
-
c9ecb7a113
Fix qwen nan value issue on vllm (#12971)
Wang, Jian4
2025-03-14 14:43:54 +0800