Commit graph

56 commits

Author · SHA1 · Message · Date
Guancheng Fu
da08c9ca60
Update Dockerfile (#13148) 2025-05-12 09:19:18 +08:00
Wang, Jian4
be76918b61
Update 083 multimodal benchmark (#13135)
* update multimodal benchmark

* update
2025-05-07 09:35:09 +08:00
Xiangyu Tian
51b41faad7
vLLM: update vLLM XPU to 0.8.3 version (#13118)
vLLM: update vLLM XPU to 0.8.3 version
2025-04-30 14:40:53 +08:00
Shaojun Liu
73198d5b80
Update to b17 image (#13085)
* update vllm patch

* fix

* fix triton

---------

Co-authored-by: gc-fu <guancheng.fu@intel.com>
2025-04-17 16:18:22 +08:00
Shaojun Liu
db5edba786
Update Dockerfile (#13081) 2025-04-16 09:18:46 +08:00
Shaojun Liu
fa56212bb3
Update vLLM patch (#13079)
* update vllm patch

* Update Dockerfile
2025-04-15 16:55:29 +08:00
Shaojun Liu
f5aaa83649
Update serving-xpu Dockerfile (#13077)
* Update Dockerfile

* Update Dockerfile
2025-04-15 13:34:14 +08:00
Shaojun Liu
cfadf3f2f7
upgrade linux-libc-dev to fix CVEs (#13076) 2025-04-15 11:43:53 +08:00
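For context on the CVE fix above: in an apt-based image this kind of change usually comes down to explicitly upgrading the affected package during the build. A minimal sketch, assuming a Debian/Ubuntu base image; not the literal layer from #13076:

```dockerfile
# Minimal sketch, assuming an apt-based base image; not the exact layer from #13076.
# Upgrade linux-libc-dev to the patched distro version to clear the reported CVEs.
RUN apt-get update && \
    apt-get install -y --only-upgrade linux-libc-dev && \
    rm -rf /var/lib/apt/lists/*
```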
Guancheng Fu
61c2e9c271
Refactor docker image by applying patch method (#13011)
* first stage try

* second try

* add ninja

* Done

* fix
2025-03-28 08:13:50 +08:00
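The "patch method" in #13011 means the image builds upstream vLLM from source and applies ipex-llm's changes as a patch, rather than shipping a pre-built fork. A rough sketch of the idea follows; the branch, patch file name, and paths are placeholders, and the `ninja` install mirrors the "add ninja" step in the commit body:

```dockerfile
# Rough sketch of the patch-based build; branch, patch name, and paths are placeholders.
RUN pip install ninja
COPY ./vllm_xpu.patch /tmp/vllm_xpu.patch
RUN git clone -b v0.6.6 https://github.com/vllm-project/vllm.git /llm/vllm && \
    cd /llm/vllm && \
    git apply /tmp/vllm_xpu.patch && \
    VLLM_TARGET_DEVICE=xpu pip install -v . --no-build-isolation
```

Keeping the delta as a patch file is presumably what makes the later upstream bumps in this log (0.6.6, then 0.8.3) mostly a matter of regenerating the patch rather than rebasing a fork.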
Shaojun Liu
7a86dd0569
Remove unused Gradio (#12995) 2025-03-24 10:51:06 +08:00
Shaojun Liu
b0d56273a8
Fix Docker build failure due to outdated ipex-llm pip index URL (#12977) 2025-03-17 10:46:01 +08:00
Shaojun Liu
760abc47aa
Fix Docker build failure due to outdated ipex-llm pip index URL (#12976) 2025-03-17 09:50:09 +08:00
Shaojun Liu
7810b8fb49
OSPDT: update dockerfile header (#12908)
* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile
2025-03-03 09:59:11 +08:00
Shaojun Liu
5c100ac105
Add ENTRYPOINT to Dockerfile to auto-start vllm service on container launch (for CVTE customer) (#12901)
* Add ENTRYPOINT to Dockerfile to auto-start service on container launch (for CVTE client)

* Update start-vllm-service.sh

* Update README.md

* Update README.md

* Update start-vllm-service.sh

* Update README.md
2025-02-27 17:33:58 +08:00
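For reference, the ENTRYPOINT change above boils down to pointing the container's entrypoint at the bundled start script. A minimal sketch, with the script name taken from the commit and the install path assumed:

```dockerfile
# Minimal sketch; script name from the commit, install path assumed.
COPY ./start-vllm-service.sh /llm/start-vllm-service.sh
RUN chmod +x /llm/start-vllm-service.sh
# Auto-start the vLLM service when the container launches.
ENTRYPOINT ["/bin/bash", "/llm/start-vllm-service.sh"]
```

With this in place, `docker run <image>` brings the service up directly, while `docker run -it --entrypoint /bin/bash <image>` still drops into a shell for debugging.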
Shaojun Liu
afad979168
Add Apache 2.0 License Information in Dockerfile to Comply with OSPDT Requirements (#12878)
* ospdt: add Header for Dockerfile

* OSPDT: add Header for Dockerfile

* OSPDT: add Header for Dockerfile

* OSPDT: add Header for Dockerfile
2025-02-24 14:00:46 +08:00
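The OSPDT work above amounts to prepending a license header comment to each Dockerfile. A representative Apache 2.0 header is sketched below; the copyright line is the one commonly used in this repo, but the exact wording in these files may differ:

```dockerfile
#
# Copyright 2016 The BigDL Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
```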
Shaojun Liu
f7b5a093a7
Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815)
* Update Dockerfile

* Update Dockerfile

* Ensure scripts are executable

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* update

* Update Dockerfile

* remove inference-cpu and inference-xpu

* update README
2025-02-17 14:23:22 +08:00
Wang, Jian4
1083fe5508
Reenable pp and lightweight-serving serving on 0.6.6 (#12814)
* reenable pp and lightweight serving on 0.6.6

* update readme

* update

* update tag
2025-02-13 10:16:00 +08:00
Guancheng Fu
af693425f1
Upgrade to vLLM 0.6.6 (#12796)
* init

* update engine init

* fix serving load_in_low_bit problem

* temp

* temp

* temp

* temp

* temp

* fix

* fixed

* done

* fix

* fix all arguments

* fix

* fix throughput script

* fix

* fix

* use official ipex-llm

* Fix readme

* fix

---------

Co-authored-by: hzjane <a1015616934@qq.com>
2025-02-12 16:47:51 +08:00
Wang, Jian4
716d4fe563
Add vllm 0.6.2 vision offline example (#12721)
* add vision offline example

* add to docker
2025-01-21 09:58:01 +08:00
Shaojun Liu
28737c250c
Update Dockerfile (#12585) 2024-12-26 10:20:52 +08:00
Shaojun Liu
51ff9ebd8a
Upgrade oneccl version to 0.0.6.3 (#12560)
* Update Dockerfile

* Update Dockerfile

* Update start-vllm-service.sh
2024-12-20 09:29:16 +08:00
Shaojun Liu
429bf1ffeb
Change: Use cn mirror for PyTorch extension installation to resolve network issues (#12559)
* Update Dockerfile

* Update Dockerfile

* Update Dockerfile
2024-12-17 14:22:50 +08:00
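The mirror switch above is a one-line change to pip's extra index. A hedged sketch, assuming the publicly documented us/cn endpoints for the Intel PyTorch extension wheels; the actual Dockerfile line may differ:

```dockerfile
# Hedged sketch; index URL assumed from the publicly documented cn endpoint.
# The default "us" endpoint is .../release-whl/stable/xpu/us/.
RUN pip install --pre --upgrade "ipex-llm[xpu]" \
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
```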
Wang, Jian4
922958c018
vllm oneccl upgrade to b9 (#12520) 2024-12-10 15:02:56 +08:00
Pepijn de Vos
71e1f11aa6
update serving image runtime (#12433) 2024-11-26 14:55:30 +08:00
Guancheng Fu
0ee54fc55f
Upgrade to vllm 0.6.2 (#12338)
* Initial updates for vllm 0.6.2

* fix

* Change Dockerfile to support v062

* Fix

* fix examples

* Fix

* done

* fix

* Update engine.py

* Fix Dockerfile to original path

* fix

* add option

* fix

* fix

* fix

* fix

---------

Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
2024-11-12 20:35:34 +08:00
Shaojun Liu
c92d76b997
Update oneccl-binding.patch (#12377)
* Add files via upload

* upload oneccl-binding.patch

* Update Dockerfile
2024-11-11 22:34:08 +08:00
Guancheng Fu
67014cb29f
Add benchmark_latency.py to docker serving image (#12283) 2024-10-28 16:19:59 +08:00
Shaojun Liu
48fc63887d
use oneccl 0.0.5.1 (#12262) 2024-10-24 16:12:24 +08:00
Shaojun Liu
7825dc1398
Upgrade oneccl to 0.0.5 (#12223) 2024-10-18 09:29:19 +08:00
Shaojun Liu
26390f9213
Update oneccl_wks_installer to 2024.0.0.4.1 (#12217) 2024-10-17 10:11:55 +08:00
Shaojun Liu
1daab4531f
Upgrade oneccl to 0.0.4 in serving-xpu image (#12185)
* Update oneccl to 0.0.4

* upgrade transformers to 4.44.2
2024-10-11 16:54:50 +08:00
Guancheng Fu
b36359e2ab
Fix xpu serving image oneccl (#12100) 2024-09-20 15:25:41 +08:00
Guancheng Fu
a6cbc01911
Use new oneccl for ipex-llm serving image (#12097) 2024-09-20 14:52:49 +08:00
Xiangyu Tian
c2774e1a43
Update oneccl to 0.0.3 in serving-xpu image (#12088) 2024-09-18 14:29:17 +08:00
Shaojun Liu
beb876665d
pin gradio version to fix connection error (#12069) 2024-09-12 14:36:09 +08:00
Shaojun Liu
7e1e51d91a
Update vllm setting (#12059)
* revert

* update

* update

* update
2024-09-11 11:45:08 +08:00
Guancheng Fu
69c8d36f16
Switching from vLLM v0.3.3 to vLLM 0.5.4 (#12042)
* Enable single card sync engine

* enable ipex-llm optimizations for vllm

* enable optimizations for lm_head

* Fix chatglm multi-reference problem

* Remove duplicate layer

* LLM: Update vLLM to v0.5.4 (#11746)

* Enable single card sync engine

* enable ipex-llm optimizations for vllm

* enable optimizations for lm_head

* Fix chatglm multi-reference problem

* update 0.5.4 api_server

* add dockerfile

* fix

* fix

* refine

* fix

---------

Co-authored-by: gc-fu <guancheng.fu@intel.com>

* Add vllm-0.5.4 Dockerfile (#11838)

* Update BIGDL_LLM_SDP_IGNORE_MASK in start-vllm-service.sh (#11957)

* Fix vLLM not convert issues (#11817) (#11918)

* Fix not convert issues

* refine

Co-authored-by: Guancheng Fu <110874468+gc-fu@users.noreply.github.com>

* Fix glm4-9b-chat nan error on vllm 0.5.4 (#11969)

* init

* update mlp forward

* fix minicpm error in vllm 0.5.4

* fix dependabot alerts (#12008)

* Update 0.5.4 dockerfile (#12021)

* Add vllm awq loading logic (#11987)

* [ADD] Add vllm awq loading logic

* [FIX] fix the module.linear_method path

* [FIX] fix quant_config path error

* Enable Qwen padding mlp to 256 to support batch_forward (#12030)

* Enable padding mlp

* padding to 256

* update style

* Install 27191 runtime in 0.5.4 docker image (#12040)

* fix rebase error

* fix rebase error

* vLLM: format for 0.5.4 rebase (#12043)

* format

* Update model_convert.py

* Fix serving docker related modifications (#12046)

* Fix undesired modifications (#12048)

* fix

* Refine offline_inference arguments

---------

Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: Jun Wang <thoughts.times@gmail.com>
Co-authored-by: Wang, Jian4 <61138589+hzjane@users.noreply.github.com>
Co-authored-by: liu-shaojun <johnssalyn@outlook.com>
Co-authored-by: Shaojun Liu <61072813+liu-shaojun@users.noreply.github.com>
2024-09-10 15:37:43 +08:00
Shaojun Liu
4cf640c548
update docker image tag to 2.2.0-SNAPSHOT (#11904) 2024-08-23 13:57:41 +08:00
Guancheng Fu
86fc0492f4
Update oneccl used (#11647)
* Add internal oneccl

* fix

* fix

* add oneccl
2024-07-26 09:38:39 +08:00
Wang, Jian4
1eed0635f2
Add lightweight serving and support tgi parameter (#11600)
* init tgi request

* update openai api

* update for pp

* update and add readme

* add to docker

* add start bash

* update

* update

* update
2024-07-19 13:15:56 +08:00
Wang, Jian4
e000ac90c4
Add pp_serving example to serving image (#11433)
* init pp

* update

* update

* do not clone ipex-llm again
2024-06-28 16:45:25 +08:00
Wang, Jian4
b7bc1023fb
Add vllm_online_benchmark.py (#11458)
* init

* update and add

* update
2024-06-28 14:59:06 +08:00
Shaojun Liu
5aa3e427a9
Fix docker images (#11362)
* Fix docker images

* add-apt-repository requires gnupg, gpg-agent, software-properties-common

* update

* avoid importing ipex again
2024-06-20 15:44:55 +08:00
Guancheng Fu
c9b4cadd81
fix vLLM/docker issues (#11348)
* fix

* fix

* fix
2024-06-18 16:23:53 +08:00
Shaojun Liu
9760ffc256
Fix SDLe CT222 Vulnerabilities (#11237)
* fix ct222 vuln

* update

* fix

* update ENTRYPOINT

* revert ENTRYPOINT

* Fix CT222 Vulns

* fix

* revert changes

* fix

* revert

* add sudo permission to ipex-llm user

* do not use ipex-llm user
2024-06-13 15:31:22 +08:00
Guancheng Fu
3ef4aa98d1
Refine vllm_quickstart doc (#11199)
* refine doc

* refine
2024-06-04 18:46:27 +08:00
Guancheng Fu
7e29928865
refactor serving docker image (#11028) 2024-05-16 09:30:36 +08:00
Guancheng Fu
2c64754eb0
Add vLLM to ipex-llm serving image (#10807)
* add vllm

* done

* doc work

* fix done

* temp

* add docs

* format

* add start-fastchat-service.sh

* fix
2024-04-29 17:25:42 +08:00
Shaojun Liu
59058bb206
replace 2.5.0-SNAPSHOT with 2.1.0-SNAPSHOT for llm docker images (#10603) 2024-04-01 09:58:51 +08:00
Wang, Jian4
e2d25de17d
Update_docker by heyang (#29) 2024-03-25 10:05:46 +08:00