Wang, Jian4
|
be76918b61
|
Update 083 multimodal benchmark (#13135)
* update multimodal benchmark
* update
|
2025-05-07 09:35:09 +08:00 |
|
Wang, Jian4
|
01bc7e9eb9
|
Fix 083 lm_head error (#13132)
* fix no quantize error
* update
* update style
|
2025-05-06 15:47:20 +08:00 |
|
Xiangyu Tian
|
51b41faad7
|
vLLM: update vLLM XPU to 0.8.3 version (#13118)
|
2025-04-30 14:40:53 +08:00 |
|
Wang, Jian4
|
16fa778e65
|
enable glm4v and gemma-3 on vllm 083 (#13114)
* enable glm4v and gemma-3
* update
* add qwen2.5-vl
|
2025-04-27 17:10:56 +08:00 |
|
Guancheng Fu
|
cf97d8f1d7
|
Update start-vllm-service.sh (#13109)
|
2025-04-25 15:42:15 +08:00 |
|
Shaojun Liu
|
73198d5b80
|
Update to b17 image (#13085)
* update vllm patch
* fix
* fix triton
---------
Co-authored-by: gc-fu <guancheng.fu@intel.com>
|
2025-04-17 16:18:22 +08:00 |
|
Shaojun Liu
|
db5edba786
|
Update Dockerfile (#13081)
|
2025-04-16 09:18:46 +08:00 |
|
Shaojun Liu
|
fa56212bb3
|
Update vLLM patch (#13079)
* update vllm patch
* Update Dockerfile
|
2025-04-15 16:55:29 +08:00 |
|
Shaojun Liu
|
f5aaa83649
|
Update serving-xpu Dockerfile (#13077)
* Update Dockerfile
* Update Dockerfile
|
2025-04-15 13:34:14 +08:00 |
|
Shaojun Liu
|
cfadf3f2f7
|
upgrade linux-libc-dev to fix CVEs (#13076)
|
2025-04-15 11:43:53 +08:00 |
|
Shaojun Liu
|
7826152f5a
|
update vllm patch (#13072)
|
2025-04-14 14:56:10 +08:00 |
|
Guancheng Fu
|
3ee6dec0f8
|
update vllm patch (#13064)
|
2025-04-10 15:03:37 +08:00 |
|
Shaojun Liu
|
1d7f4a83ac
|
Update documentation to build Docker image from Dockerfile instead of pulling from registry (#13057)
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update docker_cpp_xpu_quickstart.md
* Update vllm_cpu_docker_quickstart.md
* Update docker_cpp_xpu_quickstart.md
* Update vllm_docker_quickstart.md
* Update fastchat_docker_quickstart.md
* Update docker_pytorch_inference_gpu.md
|
2025-04-09 16:40:20 +08:00 |
|
Xiangyu Tian
|
34b1b14225
|
vLLM: Fix vLLM CPU dockerfile to resolve cmake deprecated issue (#13026)
|
2025-03-31 16:09:25 +08:00 |
|
Guancheng Fu
|
61c2e9c271
|
Refactor docker image by applying patch method (#13011)
* first stage try
* second try
* add ninja
* Done
* fix
|
2025-03-28 08:13:50 +08:00 |
|
Wang, Jian4
|
7809ca9864
|
Reuse --privileged (#13015)
* fix
* add
|
2025-03-27 10:00:50 +08:00 |
|
Shaojun Liu
|
7a86dd0569
|
Remove unused Gradio (#12995)
|
2025-03-24 10:51:06 +08:00 |
|
Shaojun Liu
|
b0d56273a8
|
Fix Docker build failure due to outdated ipex-llm pip index URL (#12977)
|
2025-03-17 10:46:01 +08:00 |
|
Shaojun Liu
|
760abc47aa
|
Fix Docker build failure due to outdated ipex-llm pip index URL (#12976)
|
2025-03-17 09:50:09 +08:00 |
|
Shaojun Liu
|
6a2d87e40f
|
add --entrypoint /bin/bash (#12957)
Co-authored-by: gc-fu <guancheng.fu@intel.com>
|
2025-03-10 10:10:27 +08:00 |
|
Shaojun Liu
|
015a4c8c43
|
Add CPU and GPU Frequency Locking Instructions to Documentation (#12947)
|
2025-03-07 09:20:40 +08:00 |
|
Shaojun Liu
|
7810b8fb49
|
OSPDT: update dockerfile header (#12908)
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
|
2025-03-03 09:59:11 +08:00 |
|
Shaojun Liu
|
5c100ac105
|
Add ENTRYPOINT to Dockerfile to auto-start vllm service on container launch (for CVTE customer) (#12901)
* Add ENTRYPOINT to Dockerfile to auto-start service on container launch (for CVTE client)
* Update start-vllm-service.sh
* Update README.md
* Update README.md
* Update start-vllm-service.sh
* Update README.md
|
2025-02-27 17:33:58 +08:00 |
|
Xiangyu Tian
|
ae9f5320da
|
vLLM CPU: Fix Triton Version to Resolve Related Error (#12893)
|
2025-02-25 15:00:41 +08:00 |
|
Shaojun Liu
|
dd30d12cb6
|
Fix serving-cpu image: setuptools-scm requires setuptools>=61 (#12876)
* setuptools-scm requires setuptools>=61
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
|
2025-02-25 09:10:14 +08:00 |
|
Shaojun Liu
|
afad979168
|
Add Apache 2.0 License Information in Dockerfile to Comply with OSPDT Requirements (#12878)
* OSPDT: add Header for Dockerfile
* OSPDT: add Header for Dockerfile
* OSPDT: add Header for Dockerfile
* OSPDT: add Header for Dockerfile
|
2025-02-24 14:00:46 +08:00 |
|
Wang, Jian4
|
e1809a6295
|
Update multimodal on vllm 0.6.6 (#12816)
* add glm4v and minicpmv example
* fix
|
2025-02-19 10:04:42 +08:00 |
|
Shaojun Liu
|
f7b5a093a7
|
Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815)
* Update Dockerfile
* Update Dockerfile
* Ensure scripts are executable
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* update
* Update Dockerfile
* remove inference-cpu and inference-xpu
* update README
|
2025-02-17 14:23:22 +08:00 |
|
Wang, Jian4
|
1083fe5508
|
Reenable pp and lightweight-serving on 0.6.6 (#12814)
* reenable pp and lightweight serving on 066
* update readme
* update
* update tag
|
2025-02-13 10:16:00 +08:00 |
|
Guancheng Fu
|
af693425f1
|
Upgrade to vLLM 0.6.6 (#12796)
* init
* update engine init
* fix serving load_in_low_bit problem
* temp
* temp
* temp
* temp
* temp
* fix
* fixed
* done
* fix
* fix all arguments
* fix
* fix throughput script
* fix
* fix
* use official ipex-llm
* Fix readme
* fix
---------
Co-authored-by: hzjane <a1015616934@qq.com>
|
2025-02-12 16:47:51 +08:00 |
|
Xiangyu Tian
|
9e9b6c9f2b
|
Fix cpu serving docker image (#12783)
|
2025-02-07 11:12:42 +08:00 |
|
Xiangyu Tian
|
f924880694
|
vLLM: Fix vLLM-CPU docker image (#12741)
|
2025-01-24 10:00:29 +08:00 |
|
Xiangyu Tian
|
c9b6c94a59
|
vLLM: Update vLLM-cpu to v0.6.6-post1 (#12728)
|
2025-01-22 15:03:01 +08:00 |
|
Wang, Jian4
|
716d4fe563
|
Add vllm 0.6.2 vision offline example (#12721)
* add vision offline example
* add to docker
|
2025-01-21 09:58:01 +08:00 |
|
Shaojun Liu
|
28737c250c
|
Update Dockerfile (#12585)
|
2024-12-26 10:20:52 +08:00 |
|
Shaojun Liu
|
51ff9ebd8a
|
Upgrade oneccl version to 0.0.6.3 (#12560)
* Update Dockerfile
* Update Dockerfile
* Update start-vllm-service.sh
|
2024-12-20 09:29:16 +08:00 |
|
Shaojun Liu
|
429bf1ffeb
|
Change: Use cn mirror for PyTorch extension installation to resolve network issues (#12559)
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
|
2024-12-17 14:22:50 +08:00 |
|
Wang, Jian4
|
922958c018
|
vllm oneccl upgrade to b9 (#12520)
|
2024-12-10 15:02:56 +08:00 |
|
Guancheng Fu
|
8331875f34
|
Fix (#12390)
|
2024-11-27 10:41:58 +08:00 |
|
Pepijn de Vos
|
71e1f11aa6
|
update serving image runtime (#12433)
|
2024-11-26 14:55:30 +08:00 |
|
Shaojun Liu
|
c089b6c10d
|
Update english prompt to 34k (#12429)
|
2024-11-22 11:20:35 +08:00 |
|
Wang, Jian4
|
1bfcbc0640
|
Add multimodal benchmark (#12415)
* add benchmark multimodal
* update
* update
* update
|
2024-11-20 14:21:13 +08:00 |
|
Guancheng Fu
|
d6057f6dd2
|
Update benchmark_vllm_throughput.py (#12414)
|
2024-11-19 10:41:43 +08:00 |
|
Xu, Shuo
|
6726b198fd
|
Update readme & doc for the vllm upgrade to v0.6.2 (#12399)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
|
2024-11-14 10:28:15 +08:00 |
|
Guancheng Fu
|
0ee54fc55f
|
Upgrade to vllm 0.6.2 (#12338)
* Initial updates for vllm 0.6.2
* fix
* Change Dockerfile to support v062
* Fix
* fix examples
* Fix
* done
* fix
* Update engine.py
* Fix Dockerfile to original path
* fix
* add option
* fix
* fix
* fix
* fix
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
|
2024-11-12 20:35:34 +08:00 |
|
Shaojun Liu
|
c92d76b997
|
Update oneccl-binding.patch (#12377)
* Add files via upload
* upload oneccl-binding.patch
* Update Dockerfile
|
2024-11-11 22:34:08 +08:00 |
|
Shaojun Liu
|
fad15c8ca0
|
Update fastchat demo script (#12367)
* Update README.md
* Update vllm_docker_quickstart.md
|
2024-11-08 15:42:17 +08:00 |
|
Xu, Shuo
|
ce0c6ae423
|
Update Readme for FastChat docker demo (#12354)
* update Readme for FastChat docker demo
* update readme
* add 'Serving with FastChat' part in docs
* polish docs
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
|
2024-11-07 15:22:42 +08:00 |
|
Xu, Shuo
|
899a30331a
|
Replace gradio_web_server.patch to adjust webui (#12329)
* replace gradio_web_server.patch to adjust webui
* fix patch problem
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
|
2024-11-06 09:16:32 +08:00 |
|
Jun Wang
|
3700e81977
|
[fix] vllm-online-benchmark first token latency error (#12271)
|
2024-10-29 17:54:36 +08:00 |
|