Wang, Jian4
16fa778e65
enable glm4v and gemma-3 on vllm 083 ( #13114 )
...
* enable glm4v and gemma-3
* update
* add qwen2.5-vl
2025-04-27 17:10:56 +08:00
Guancheng Fu
cf97d8f1d7
Update start-vllm-service.sh ( #13109 )
2025-04-25 15:42:15 +08:00
Ruonan Wang
9808fb1ac2
update doc about flash-moe ( #13103 )
...
* update doc about flashmoe
* revert toc
* meet review, add version note
* small fix
2025-04-24 17:53:14 +08:00
Guancheng Fu
0cfdd399e7
Update README.md ( #13104 )
2025-04-24 10:21:17 +08:00
Yishuo Wang
908fdb982e
small refactor and fix ( #13101 )
2025-04-22 14:45:31 +08:00
Guancheng Fu
14cd613fe1
Update vLLM docs with some new features ( #13092 )
...
* done
* fix
* done
* Update README.md
2025-04-22 14:39:28 +08:00
Yuwen Hu
0801d27a6f
Remove PyTorch 2.3 support for Intel GPU ( #13097 )
...
* Remove PyTorch 2.3 installation option for GPU
* Remove xpu_lnl option in installation guides for docs
* Update BMG quickstart
* Remove PyTorch 2.3 dependencies for GPU examples
* Update the graphmode example to use stable version 2.2.0
* Fix based on comments
2025-04-22 10:26:16 +08:00
Yina Chen
a2a35fdfad
Update portable zip link ( #13098 )
...
* update portable zip link
* update CN
* address comments
* update latest updates
* revert
2025-04-21 17:25:35 +08:00
Ruonan Wang
2f78afcd2a
Refactor some functions to ipex_llm.transformers.models.common ( #13091 )
...
* add quantize_linear & linear_forward
* add moe_group_topk
* rotary_two_with_cache_inplaced
* fix code style
* update related models
2025-04-18 11:15:43 +08:00
Shaojun Liu
73198d5b80
Update to b17 image ( #13085 )
...
* update vllm patch
* fix
* fix triton
---------
Co-authored-by: gc-fu <guancheng.fu@intel.com>
2025-04-17 16:18:22 +08:00
Shaojun Liu
db5edba786
Update Dockerfile ( #13081 )
2025-04-16 09:18:46 +08:00
Shaojun Liu
fa56212bb3
Update vLLM patch ( #13079 )
...
* update vllm patch
* Update Dockerfile
2025-04-15 16:55:29 +08:00
Shaojun Liu
f5aaa83649
Update serving-xpu Dockerfile ( #13077 )
...
* Update Dockerfile
* Update Dockerfile
2025-04-15 13:34:14 +08:00
Shaojun Liu
cfadf3f2f7
upgrade linux-libc-dev to fix CVEs ( #13076 )
2025-04-15 11:43:53 +08:00
Ruonan Wang
e08c6bd018
Fix several models based on sdp api change ( #13075 )
...
* fix baichuan based on sdp api change
* fix several models based on api change
* fix style
2025-04-15 11:13:12 +08:00
Shaojun Liu
7826152f5a
update vllm patch ( #13072 )
2025-04-14 14:56:10 +08:00
Yishuo Wang
10c30cdba9
set woq_int4 as default int4 ( #13021 )
2025-04-14 14:10:59 +08:00
Ruonan Wang
6693e8ab04
Deepseek kv / sdp support ( #13068 )
...
* update kv
* fix
* fix style
2025-04-11 11:26:15 +08:00
Guancheng Fu
3ee6dec0f8
update vllm patch ( #13064 )
2025-04-10 15:03:37 +08:00
Shaojun Liu
1d7f4a83ac
Update documentation to build Docker image from Dockerfile instead of pulling from registry ( #13057 )
...
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update docker_cpp_xpu_quickstart.md
* Update vllm_cpu_docker_quickstart.md
* Update docker_cpp_xpu_quickstart.md
* Update vllm_docker_quickstart.md
* Update fastchat_docker_quickstart.md
* Update docker_pytorch_inference_gpu.md
2025-04-09 16:40:20 +08:00
Yuwen Hu
cd0d4857b8
ipex-llm 2.2.0 post-release update ( #13053 )
...
* Update ollama/llama.cpp release link to 2.2.0 (#13052 )
* Post-update for releasing ipex-llm 2.2.0
2025-04-07 17:41:22 +08:00
Yishuo Wang
ef852dcb4a
add audio optimization for qwen2.5-omni ( #13037 )
2025-04-07 17:20:26 +08:00
Yuwen Hu
7548c12b2c
Update portable zip QuickStart regarding signature verification ( #13050 )
...
* Update portable zip QuickStart regarding signature verification
* Small fix
* Small fix
2025-04-07 13:34:00 +08:00
Yuwen Hu
33ae52d083
Small doc fix ( #13045 )
2025-04-03 17:35:22 +08:00
Yuwen Hu
3cb718d715
Small updates to Ollama portable zip quickstart ( #13043 )
2025-04-03 17:18:22 +08:00
Yuwen Hu
b73728c7ce
Small updates to Ollama portable zip Quickstart ( #13040 )
2025-04-02 18:44:36 +08:00
Yuwen Hu
4427012672
Link updates to pytorch 2.6 quickstart ( #13032 )
2025-04-01 10:35:22 +08:00
Yuwen Hu
633d1c72e7
Add PyTorch 2.6 QuickStart for Intel GPU ( #13024 )
...
* Add quickstart for install IPEX-LLM with PyTorch 2.6 on Intel GPUs
* Add jump links
* Rename
* Small fix
* Small fix
* Update based on comments
* Small fix
2025-04-01 10:21:38 +08:00
Xiangyu Tian
34b1b14225
vLLM: Fix vLLM CPU dockerfile to resolve cmake deprecated issue ( #13026 )
2025-03-31 16:09:25 +08:00
Yishuo Wang
300eb01d98
Add basic optimization for Qwen2.5 omni ( #13022 )
2025-03-28 17:21:52 +08:00
Guancheng Fu
61c2e9c271
Refactor docker image by applying patch method ( #13011 )
...
* first stage try
* second try
* add ninja
* Done
* fix
2025-03-28 08:13:50 +08:00
Wang, Jian4
7809ca9864
Reuse --privileged ( #13015 )
...
* fix
* add
2025-03-27 10:00:50 +08:00
Guancheng Fu
f437b36678
Fix vllm glm edge model ( #13007 )
...
* fix done
* fix
2025-03-26 09:25:32 +08:00
Yuwen Hu
374747b492
Update bert optimization to fit higher transformers/torch version ( #13006 )
2025-03-25 16:12:03 +08:00
Ruonan Wang
27d669210f
remove fschat in EAGLE example ( #13005 )
...
* update fschat version
* fix
2025-03-25 15:48:48 +08:00
Shaojun Liu
08f96a5139
Rename LICENSE-Intel®-OpenMP*-Runtime-Library.txt to LICENSE-Intel®-OpenMP-Runtime-Library.txt ( #13002 )
2025-03-25 10:07:55 +08:00
Ruonan Wang
0e0786a63c
update llama.cpp related quickstart with rebased llama.cpp ( #12996 )
...
* update doc with rebased llama.cpp
* revert table of contents
* update demo output log
2025-03-25 09:49:39 +08:00
Shaojun Liu
7a86dd0569
Remove unused Gradio ( #12995 )
2025-03-24 10:51:06 +08:00
Shaojun Liu
46a4f53967
OSPDT: add tpp licenses for release 2.2.0 ( #12840 )
...
* Create LICENSE-zstd.txt
* Create LICENSE-libcxx.txt
* Create LICENSE-libcxxabi.txt
* Create LICENSE-safestring.txt
* Create LICENSE-stb-image.txt
* Create LICENSE-cluster-agent.txt
* Create LICENSE-hd-agent.txt
* Create LICENSE-platform-telemetry-agent.txt
* Create LICENSE-platform-update-agent.txt
* Create LICENSE-OpenCL-ICD-Loader.txt
* Create LICENSE-xptifw.txt
* Create LICENSE-intel-openmp.txt
* Create LICENSE-Intel®-OpenMP*-Runtime-Library.txt
* Create LICENSE-Intel®-C-C++-Fortran-Compiler-Mainline.txt
* add TPP files
* Add TPP files
* add tpp
* add tpp
* update
* update
2025-03-21 15:52:22 +08:00
Yuwen Hu
5bdf57327d
Remove ipex import in fastchat loader ( #12984 )
2025-03-20 18:29:00 +08:00
Yuwen Hu
6f634b41da
Update model support list regarding Gemma3 for Ollama portable zip QuickStart ( #12979 )
...
* Update model support list regarding Gemma3 for Ollama portable zip QuickStart
* Small fix
* Small fix
* Small fix
2025-03-19 11:16:45 +08:00
Qiyuan Gong
dd026db50b
Add SNC to llama.cpp portable zip quick start ( #12972 )
...
* Add SNC to quick start
2025-03-17 10:58:06 +08:00
Shaojun Liu
b0d56273a8
Fix Docker build failure due to outdated ipex-llm pip index URL ( #12977 )
2025-03-17 10:46:01 +08:00
Shaojun Liu
760abc47aa
Fix Docker build failure due to outdated ipex-llm pip index URL ( #12976 )
2025-03-17 09:50:09 +08:00
Jason Dai
03c9024209
Update README ( #12973 )
2025-03-14 19:04:10 +08:00
Yuwen Hu
6a7819f1ac
Update portable zip related quickstart regarding recommended driver ( #12970 )
2025-03-14 16:34:24 +08:00
Wang, Jian4
c9ecb7a113
Fix qwen nan value issue on vllm ( #12971 )
...
* add to fix qwen nan value issue
* update
2025-03-14 14:43:54 +08:00
Heyang Sun
cd109bb061
Gemma QLoRA example ( #12969 )
...
* Gemma QLoRA example
* Update README.md
* Update README.md
---------
Co-authored-by: sgwhat <ge.song@intel.com>
2025-03-14 14:27:51 +08:00
Yuwen Hu
8bc41c13ab
Support PyTorch 2.6 with Arrow Lake-H AOT on Windows ( #12967 )
2025-03-13 15:29:47 +08:00
Wang, Jian4
c8a0462507
Add vllm api_server input output log ( #12962 )
2025-03-12 20:58:04 +08:00