Wang, Jian4 | 1bfcbc0640 | 2024-11-20 14:21:13 +08:00
Add multimodal benchmark (#12415)
* add multimodal benchmark
* update

Xu, Shuo | 6726b198fd | 2024-11-14 10:28:15 +08:00
Update readme & doc for the vllm upgrade to v0.6.2 (#12399)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>

Shaojun Liu | fad15c8ca0 | 2024-11-08 15:42:17 +08:00
Update fastchat demo script (#12367)
* Update README.md
* Update vllm_docker_quickstart.md

Xu, Shuo | ce0c6ae423 | 2024-11-07 15:22:42 +08:00
Update Readme for FastChat docker demo (#12354)
* update Readme for FastChat docker demo
* update readme
* add 'Serving with FastChat' part in docs
* polish docs
Co-authored-by: ATMxsp01 <shou.xu@intel.com>

Jun Wang | b10fc892e1 | 2024-10-18 13:18:08 +08:00
Update new reference link of xpu/docker/readme.md (#12188)
* [ADD] rewrite new vllm docker quick start
* [ADD] LoRA adapter doc finished
* [ADD] multi LoRA adapter tested successfully
* [ADD] add ipex-llm quantization doc
* [Merge] rebase main
* [REMOVE] rm tmp file
* [ADD] add prefix caching experiment and result
* [REMOVE] rm cpu offloading chapter
* [UPDATE] update the link to new vllm-docker-quickstart

Shaojun Liu | 1295898830 | 2024-09-20 14:18:30 +08:00
update vllm_online_benchmark script to support long input (#12095)
* update vllm_online_benchmark script to support long input
* update guide

Shaojun Liu | 4cf640c548 | 2024-08-23 13:57:41 +08:00
update docker image tag to 2.2.0-SNAPSHOT (#11904)

Wang, Jian4 | 1eed0635f2 | 2024-07-19 13:15:56 +08:00
Add lightweight serving and support tgi parameter (#11600)
* init tgi request
* update openai api
* update for pp
* update and add readme
* add to docker
* add start bash
* update

Xiangyu Tian | 7f5111a998 | 2024-07-11 15:45:27 +08:00
LLM: Refine start script for Pipeline Parallel Serving (#11557)
Refine start script and readme for Pipeline Parallel Serving

Wang, Jian4 | e000ac90c4 | 2024-06-28 16:45:25 +08:00
Add pp_serving example to serving image (#11433)
* init pp
* update
* do not clone ipex-llm again

Wang, Jian4 | b7bc1023fb | 2024-06-28 14:59:06 +08:00
Add vllm_online_benchmark.py (#11458)
* init
* update and add
* update

Guancheng Fu | 7e29928865 | 2024-05-16 09:30:36 +08:00
refactor serving docker image (#11028)

Guancheng Fu | 2c64754eb0 | 2024-04-29 17:25:42 +08:00
Add vLLM to ipex-llm serving image (#10807)
* add vllm
* done
* doc work
* fix done
* temp
* add docs
* format
* add start-fastchat-service.sh
* fix

Shaojun Liu | 59058bb206 | 2024-04-01 09:58:51 +08:00
replace 2.5.0-SNAPSHOT with 2.1.0-SNAPSHOT for llm docker images (#10603)

Wang, Jian4 | e2d25de17d | 2024-03-25 10:05:46 +08:00
Update_docker by heyang (#29)

Shaojun Liu | 0e5ab5ebfc | 2023-11-13 16:53:40 +08:00
update docker tag to 2.5.0-SNAPSHOT (#9443)

Guancheng Fu | cc84ed70b3 | 2023-09-25 15:51:45 +08:00
Create serving images (#9048)
* Finished & Tested
* Install latest pip from base images
* Add blank line
* Delete unused comment
* fix typos