Guancheng Fu | 3accc31b86 | Update 1ccl_for_multi_arc.patch (#13199) | 2025-05-30 17:13:59 +08:00

Shaojun Liu | c5d919b151 | update vllm patch (#13185) | 2025-05-23 15:02:50 +08:00
    Co-authored-by: gc-fu <guancheng.fu@intel.com>

Shaojun Liu | 1576347892 | Update Dockerfile (#13168) | 2025-05-20 16:41:13 +08:00

Wang, Jian4 | 66eb054988 | Update vllm patch (#13164) | 2025-05-19 16:54:21 +08:00

Wang, Jian4 | d83e5068d2 | Enable whisper (#13162) | 2025-05-19 14:07:51 +08:00
    * fix error
    * update dockerfile

Shaojun Liu | bd71739e64 | Update docs and scripts to align with new Docker image release (#13156) | 2025-05-13 17:06:29 +08:00
    * Update vllm_docker_quickstart.md
    * Update start-vllm-service.sh
    * Update vllm_docker_quickstart.md
    * Update start-vllm-service.sh

Xiangyu Tian | 886c7632b2 | Add IPEX_LLM_FORCE_BATCH_FORWARD for vLLM docker image (#13151) | 2025-05-12 13:44:33 +08:00

Wang, Jian4 | 5df03ced2c | Update vllm patch for fix telechat2 and baichuan2 error (#13150) | 2025-05-12 10:54:22 +08:00

Guancheng Fu | da08c9ca60 | Update Dockerfile (#13148) | 2025-05-12 09:19:18 +08:00

Shaojun Liu | 45f7bf6688 | Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability (#13141) | 2025-05-09 10:19:42 +08:00
    * update vllm doc
    * update image name
    * update
    * update
    * update
    * update

Wang, Jian4 | be76918b61 | Update 083 multimodal benchmark (#13135) | 2025-05-07 09:35:09 +08:00
    * update multimodal benchmark
    * update

Wang, Jian4 | 01bc7e9eb9 | Fix 083 lm_head error (#13132) | 2025-05-06 15:47:20 +08:00
    * fix no quantize error
    * update
    * update style

Xiangyu Tian | 51b41faad7 | vLLM: update vLLM XPU to 0.8.3 version (#13118) | 2025-04-30 14:40:53 +08:00

Wang, Jian4 | 16fa778e65 | enable glm4v and gemma-3 on vllm 083 (#13114) | 2025-04-27 17:10:56 +08:00
    * enable glm4v and gemma-3
    * update
    * add qwen2.5-vl

Guancheng Fu | cf97d8f1d7 | Update start-vllm-service.sh (#13109) | 2025-04-25 15:42:15 +08:00

Shaojun Liu | 73198d5b80 | Update to b17 image (#13085) | 2025-04-17 16:18:22 +08:00
    * update vllm patch
    * fix
    * fix triton
    Co-authored-by: gc-fu <guancheng.fu@intel.com>

Shaojun Liu | db5edba786 | Update Dockerfile (#13081) | 2025-04-16 09:18:46 +08:00

Shaojun Liu | fa56212bb3 | Update vLLM patch (#13079) | 2025-04-15 16:55:29 +08:00
    * update vllm patch
    * Update Dockerfile

Shaojun Liu | f5aaa83649 | Update serving-xpu Dockerfile (#13077) | 2025-04-15 13:34:14 +08:00
    * Update Dockerfile
    * Update Dockerfile

Shaojun Liu | cfadf3f2f7 | upgrade linux-libc-dev to fix CVEs (#13076) | 2025-04-15 11:43:53 +08:00

Shaojun Liu | 7826152f5a | update vllm patch (#13072) | 2025-04-14 14:56:10 +08:00

Guancheng Fu | 3ee6dec0f8 | update vllm patch (#13064) | 2025-04-10 15:03:37 +08:00

Shaojun Liu | 1d7f4a83ac | Update documentation to build Docker image from Dockerfile instead of pulling from registry (#13057) | 2025-04-09 16:40:20 +08:00
    * Update README.md
    * Update README.md
    * Update README.md
    * Update README.md
    * Update README.md
    * Update docker_cpp_xpu_quickstart.md
    * Update vllm_cpu_docker_quickstart.md
    * Update docker_cpp_xpu_quickstart.md
    * Update vllm_docker_quickstart.md
    * Update fastchat_docker_quickstart.md
    * Update docker_pytorch_inference_gpu.md

Xiangyu Tian | 34b1b14225 | vLLM: Fix vLLM CPU dockerfile to resolve cmake deprecated issue (#13026) | 2025-03-31 16:09:25 +08:00

Guancheng Fu | 61c2e9c271 | Refactor docker image by applying patch method (#13011) | 2025-03-28 08:13:50 +08:00
    * first stage try
    * second try
    * add ninja
    * Done
    * fix

Wang, Jian4 | 7809ca9864 | Reuse --privileged (#13015) | 2025-03-27 10:00:50 +08:00
    * fix
    * add

Shaojun Liu | 7a86dd0569 | Remove unused Gradio (#12995) | 2025-03-24 10:51:06 +08:00

Shaojun Liu | b0d56273a8 | Fix Docker build failure due to outdated ipex-llm pip index URL (#12977) | 2025-03-17 10:46:01 +08:00

Shaojun Liu | 760abc47aa | Fix Docker build failure due to outdated ipex-llm pip index URL (#12976) | 2025-03-17 09:50:09 +08:00

Shaojun Liu | 6a2d87e40f | add --entrypoint /bin/bash (#12957) | 2025-03-10 10:10:27 +08:00
    Co-authored-by: gc-fu <guancheng.fu@intel.com>

Shaojun Liu | 015a4c8c43 | Add CPU and GPU Frequency Locking Instructions to Documentation (#12947) | 2025-03-07 09:20:40 +08:00

Shaojun Liu | f81d89d908 | Remove Unnecessary --privileged Flag While Keeping It for WSL Users (#12920) | 2025-03-03 11:11:42 +08:00

Shaojun Liu | 7810b8fb49 | OSPDT: update dockerfile header (#12908) | 2025-03-03 09:59:11 +08:00
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile

Shaojun Liu | 5c100ac105 | Add ENTRYPOINT to Dockerfile to auto-start vllm service on container launch (for CVTE customer) (#12901) | 2025-02-27 17:33:58 +08:00
    * Add ENTRYPOINT to Dockerfile to auto-start service on container launch (for CVTE client)
    * Update start-vllm-service.sh
    * Update README.md
    * Update README.md
    * Update start-vllm-service.sh
    * Update README.md

Xiangyu Tian | ae9f5320da | vLLM CPU: Fix Triton Version to Resolve Related Error (#12893) | 2025-02-25 15:00:41 +08:00

Shaojun Liu | dd30d12cb6 | Fix serving-cpu image: setuptools-scm requires setuptools>=61 (#12876) | 2025-02-25 09:10:14 +08:00
    * setuptools-scm requires setuptools>=61
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile

Wang, Jian4 | 4f2f92afa3 | Update inference-cpp docker (#12882) | 2025-02-24 14:32:44 +08:00
    * remove unused run.py
    * add WORKDIR /llm

Shaojun Liu | afad979168 | Add Apache 2.0 License Information in Dockerfile to Comply with OSPDT Requirements (#12878) | 2025-02-24 14:00:46 +08:00
    * ospdt: add Header for Dockerfile
    * OSPDT: add Header for Dockerfile
    * OSPDT: add Header for Dockerfile
    * OSPDT: add Header for Dockerfile

Wang, Jian4 | e1809a6295 | Update multimodal on vllm 0.6.6 (#12816) | 2025-02-19 10:04:42 +08:00
    * add glm4v and minicpmv example
    * fix

Shaojun Liu | f7b5a093a7 | Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815) | 2025-02-17 14:23:22 +08:00
    * Update Dockerfile
    * Update Dockerfile
    * Ensure scripts are executable
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * update
    * Update Dockerfile
    * remove inference-cpu and inference-xpu
    * update README

Wang, Jian4 | 1083fe5508 | Reenable pp and lightweight-serving serving on 0.6.6 (#12814) | 2025-02-13 10:16:00 +08:00
    * reenable pp and lightweight serving on 066
    * update readme
    * update
    * update tag

Guancheng Fu | af693425f1 | Upgrade to vLLM 0.6.6 (#12796) | 2025-02-12 16:47:51 +08:00
    * init
    * update engine init
    * fix serving load_in_low_bit problem
    * temp
    * temp
    * temp
    * temp
    * temp
    * fix
    * fixed
    * done
    * fix
    * fix all arguments
    * fix
    * fix throughput script
    * fix
    * fix
    * use official ipex-llm
    * Fix readme
    * fix
    Co-authored-by: hzjane <a1015616934@qq.com>

Shaojun Liu | bd815a4d96 | Update the base image of inference-cpp image to oneapi 2025.0.2 (#12802) | 2025-02-12 14:15:08 +08:00
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile
    * Update Dockerfile

Xiangyu Tian | 9e9b6c9f2b | Fix cpu serving docker image (#12783) | 2025-02-07 11:12:42 +08:00

Xiangyu Tian | f924880694 | vLLM: Fix vLLM-CPU docker image (#12741) | 2025-01-24 10:00:29 +08:00

Xiangyu Tian | c9b6c94a59 | vLLM: Update vLLM-cpu to v0.6.6-post1 (#12728) | 2025-01-22 15:03:01 +08:00

Wang, Jian4 | 716d4fe563 | Add vllm 0.6.2 vision offline example (#12721) | 2025-01-21 09:58:01 +08:00
    * add vision offline example
    * add to docker

Shaojun Liu | 2673792de6 | Update Dockerfile (#12688) | 2025-01-10 09:01:29 +08:00

Shaojun Liu | 28737c250c | Update Dockerfile (#12585) | 2024-12-26 10:20:52 +08:00

Shaojun Liu | 51ff9ebd8a | Upgrade oneccl version to 0.0.6.3 (#12560) | 2024-12-20 09:29:16 +08:00
    * Update Dockerfile
    * Update Dockerfile
    * Update start-vllm-service.sh