1ccl_for_multi_arc.patch
Update 1ccl_for_multi_arc.patch ( #13199 )
2025-05-30 17:13:59 +08:00
audio_language.py
Enable phi-4 with vision and audio ( #13203 )
2025-06-05 10:15:20 +08:00
benchmark.sh
Merge CPU & XPU Dockerfiles with Serving Images and Refactor ( #12815 )
2025-02-17 14:23:22 +08:00
benchmark_vllm_latency.py
Upgrade to vLLM 0.6.6 ( #12796 )
2025-02-12 16:47:51 +08:00
benchmark_vllm_throughput.py
Upgrade to vLLM 0.6.6 ( #12796 )
2025-02-12 16:47:51 +08:00
ccl_torch.patch
Upgrade to vLLM 0.6.6 ( #12796 )
2025-02-12 16:47:51 +08:00
chat.py
Merge CPU & XPU Dockerfiles with Serving Images and Refactor ( #12815 )
2025-02-17 14:23:22 +08:00
Dockerfile
To avoid errors caused by a Transformers version that is too new. ( #13291 )
2025-08-14 14:52:47 +08:00
oneccl-binding.patch
Update oneccl-binding.patch ( #12377 )
2024-11-11 22:34:08 +08:00
payload-1024.lua
Add vLLM to ipex-llm serving image ( #10807 )
2024-04-29 17:25:42 +08:00
README.md
Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability ( #13141 )
2025-05-09 10:19:42 +08:00
setvars.sh
Refactor docker image by applying patch method ( #13011 )
2025-03-28 08:13:50 +08:00
start-lightweight_serving-service.sh
Reenable pp and lightweight-serving serving on 0.6.6 ( #12814 )
2025-02-13 10:16:00 +08:00
start-pp_serving-service.sh
Reenable pp and lightweight-serving serving on 0.6.6 ( #12814 )
2025-02-13 10:16:00 +08:00
start-vllm-service.sh
Update docs and scripts to align with new Docker image release ( #13156 )
2025-05-13 17:06:29 +08:00
vllm_for_multi_arc.patch
update patches ( #13290 )
2025-08-14 10:15:48 +08:00
vllm_offline_inference.py
Fix ( #12390 )
2024-11-27 10:41:58 +08:00
vllm_offline_inference_vision_language.py
Enable phi-4 with vision and audio ( #13203 )
2025-06-05 10:15:20 +08:00
vllm_online_benchmark.py
Update english prompt to 34k ( #12429 )
2024-11-22 11:20:35 +08:00
vllm_online_benchmark_multimodal.py
Update 083 multimodal benchmark ( #13135 )
2025-05-07 09:35:09 +08:00