ipex-llm

Author	SHA1	Message	Date
Wang, Jian4	1bfcbc0640	Add multimodal benchmark (#12415) * add benchmark multimodal * update * update * update	2024-11-20 14:21:13 +08:00
Xu, Shuo	6726b198fd	Update readme & doc for the vllm upgrade to v0.6.2 (#12399) Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-11-14 10:28:15 +08:00
Shaojun Liu	fad15c8ca0	Update fastchat demo script (#12367) * Update README.md * Update vllm_docker_quickstart.md	2024-11-08 15:42:17 +08:00
Xu, Shuo	ce0c6ae423	Update Readme for FastChat docker demo (#12354) * update Readme for FastChat docker demo * update readme * add 'Serving with FastChat' part in docs * polish docs --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-11-07 15:22:42 +08:00
Jun Wang	b10fc892e1	Update new reference link of xpu/docker/readme.md (#12188) * [ADD] rewrite new vllm docker quick start * [ADD] lora adapter doc finished * [ADD] mulit lora adapter test successfully * [ADD] add ipex-llm quantization doc * [Merge] rebase main * [REMOVE] rm tmp file * [Merge] rebase main * [ADD] add prefix caching experiment and result * [REMOVE] rm cpu offloading chapter * [ADD] rewrite new vllm docker quick start * [ADD] lora adapter doc finished * [ADD] mulit lora adapter test successfully * [ADD] add ipex-llm quantization doc * [Merge] rebase main * [REMOVE] rm tmp file * [Merge] rebase main * [ADD] rewrite new vllm docker quick start * [ADD] lora adapter doc finished * [ADD] mulit lora adapter test successfully * [ADD] add ipex-llm quantization doc * [Merge] rebase main * [REMOVE] rm tmp file * [Merge] rebase main * [UPDATE] update the link to new vllm-docker-quickstart	2024-10-18 13:18:08 +08:00
Shaojun Liu	1295898830	update vllm_online_benchmark script to support long input (#12095) * update vllm_online_benchmark script to support long input * update guide	2024-09-20 14:18:30 +08:00
Shaojun Liu	4cf640c548	update docker image tag to 2.2.0-SNAPSHOT (#11904)	2024-08-23 13:57:41 +08:00
Wang, Jian4	1eed0635f2	Add lightweight serving and support tgi parameter (#11600) * init tgi request * update openai api * update for pp * update and add readme * add to docker * add start bash * update * update * update	2024-07-19 13:15:56 +08:00
Xiangyu Tian	7f5111a998	LLM: Refine start script for Pipeline Parallel Serving (#11557) Refine start script and readme for Pipeline Parallel Serving	2024-07-11 15:45:27 +08:00
Wang, Jian4	e000ac90c4	Add pp_serving example to serving image (#11433) * init pp * update * update * no clone ipex-llm again	2024-06-28 16:45:25 +08:00
Wang, Jian4	b7bc1023fb	Add vllm_online_benchmark.py (#11458) * init * update and add * update	2024-06-28 14:59:06 +08:00
Guancheng Fu	7e29928865	refactor serving docker image (#11028)	2024-05-16 09:30:36 +08:00
Guancheng Fu	2c64754eb0	Add vLLM to ipex-llm serving image (#10807) * add vllm * done * doc work * fix done * temp * add docs * format * add start-fastchat-service.sh * fix	2024-04-29 17:25:42 +08:00
Shaojun Liu	59058bb206	replace 2.5.0-SNAPSHOT with 2.1.0-SNAPSHOT for llm docker images (#10603)	2024-04-01 09:58:51 +08:00
Wang, Jian4	e2d25de17d	Update_docker by heyang (#29)	2024-03-25 10:05:46 +08:00
Shaojun Liu	0e5ab5ebfc	update docker tag to 2.5.0-SNAPSHOT (#9443)	2023-11-13 16:53:40 +08:00
Guancheng Fu	cc84ed70b3	Create serving images (#9048) * Finished & Tested * Install latest pip from base images * Add blank line * Delete unused comment * fix typos	2023-09-25 15:51:45 +08:00

17 commits