ipex-llm

Shaojun Liu 25e1709050 To avoid errors caused by a Transformers version that is too new. (#13291)	2025-08-14 14:52:47 +08:00
1ccl_for_multi_arc.patch	Update 1ccl_for_multi_arc.patch (#13199)	2025-05-30 17:13:59 +08:00
audio_language.py	Enable phi-4 with vision and audio (#13203)	2025-06-05 10:15:20 +08:00
benchmark.sh	Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815)	2025-02-17 14:23:22 +08:00
benchmark_vllm_latency.py	Upgrade to vLLM 0.6.6 (#12796)	2025-02-12 16:47:51 +08:00
benchmark_vllm_throughput.py	Upgrade to vLLM 0.6.6 (#12796)	2025-02-12 16:47:51 +08:00
ccl_torch.patch	Upgrade to vLLM 0.6.6 (#12796)	2025-02-12 16:47:51 +08:00
chat.py	Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815)	2025-02-17 14:23:22 +08:00
Dockerfile	To avoid errors caused by a Transformers version that is too new. (#13291)	2025-08-14 14:52:47 +08:00
oneccl-binding.patch	Update oneccl-binding.patch (#12377)	2024-11-11 22:34:08 +08:00
payload-1024.lua	Add vLLM to ipex-llm serving image (#10807)	2024-04-29 17:25:42 +08:00
README.md	Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability (#13141)	2025-05-09 10:19:42 +08:00
setvars.sh	Refactor docker image by applying patch method (#13011)	2025-03-28 08:13:50 +08:00
start-lightweight_serving-service.sh	Reenable pp and lightweight-serving serving on 0.6.6 (#12814)	2025-02-13 10:16:00 +08:00
start-pp_serving-service.sh	Reenable pp and lightweight-serving serving on 0.6.6 (#12814)	2025-02-13 10:16:00 +08:00
start-vllm-service.sh	Update docs and scripts to align with new Docker image release (#13156)	2025-05-13 17:06:29 +08:00
vllm_for_multi_arc.patch	update patches (#13290)	2025-08-14 10:15:48 +08:00
vllm_offline_inference.py	Fix (#12390)	2024-11-27 10:41:58 +08:00
vllm_offline_inference_vision_language.py	Enable phi-4 with vision and audio (#13203)	2025-06-05 10:15:20 +08:00
vllm_online_benchmark.py	Update english prompt to 34k (#12429)	2024-11-22 11:20:35 +08:00
vllm_online_benchmark_multimodal.py	Update 083 multimodal benchmark (#13135)	2025-05-07 09:35:09 +08:00

README.md

💡 Tip: For a detailed, up-to-date guide to running vLLM serving with IPEX-LLM on Intel GPUs via Docker, see the official documentation: vllm_docker_quickstart