Wang, Jian4
45864790f7
Enable phi-4 with vision and audio ( #13203 )
...
* add phi4
* update
* enable audio
* update and add readme
2025-06-05 10:15:20 +08:00
Shaojun Liu
bd71739e64
Update docs and scripts to align with new Docker image release ( #13156 )
...
* Update vllm_docker_quickstart.md
* Update start-vllm-service.sh
* Update vllm_docker_quickstart.md
* Update start-vllm-service.sh
2025-05-13 17:06:29 +08:00
Shaojun Liu
45f7bf6688
Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability ( #13141 )
...
* update vllm doc
* update image name
* update
* update
* update
* update
2025-05-09 10:19:42 +08:00
Xiangyu Tian
51b41faad7
vLLM: update vLLM XPU to 0.8.3 version ( #13118 )
...
vLLM: update vLLM XPU to 0.8.3 version
2025-04-30 14:40:53 +08:00
Shaojun Liu
1d7f4a83ac
Update documentation to build Docker image from Dockerfile instead of pulling from registry ( #13057 )
...
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update docker_cpp_xpu_quickstart.md
* Update vllm_cpu_docker_quickstart.md
* Update docker_cpp_xpu_quickstart.md
* Update vllm_docker_quickstart.md
* Update fastchat_docker_quickstart.md
* Update docker_pytorch_inference_gpu.md
2025-04-09 16:40:20 +08:00
Wang, Jian4
7809ca9864
Reuse --privileged ( #13015 )
...
* fix
* add
2025-03-27 10:00:50 +08:00
Shaojun Liu
6a2d87e40f
add --entrypoint /bin/bash ( #12957 )
...
Co-authored-by: gc-fu <guancheng.fu@intel.com>
2025-03-10 10:10:27 +08:00
Shaojun Liu
015a4c8c43
Add CPU and GPU Frequency Locking Instructions to Documentation ( #12947 )
2025-03-07 09:20:40 +08:00
Shaojun Liu
f81d89d908
Remove Unnecessary --privileged Flag While Keeping It for WSL Users ( #12920 )
2025-03-03 11:11:42 +08:00
Shaojun Liu
f7b5a093a7
Merge CPU & XPU Dockerfiles with Serving Images and Refactor ( #12815 )
...
* Update Dockerfile
* Update Dockerfile
* Ensure scripts are executable
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* Update Dockerfile
* update
* Update Dockerfile
* remove inference-cpu and inference-xpu
* update README
2025-02-17 14:23:22 +08:00
logicat
0534d7254f
Update docker_cpp_xpu_quickstart.md ( #12667 )
2025-01-08 09:56:56 +08:00
Chu,Youcheng
acd77d9e87
Remove env variable BIGDL_LLM_XMX_DISABLED in documentation ( #12445 )
...
* fix: remove BIGDL_LLM_XMX_DISABLED in mddocs
* fix: remove set SYCL_CACHE_PERSISTENT=1 in example
* fix: remove BIGDL_LLM_XMX_DISABLED in workflows
* fix: merge igpu and A-series Graphics
* fix: remove set BIGDL_LLM_XMX_DISABLED=1 in example
* fix: remove BIGDL_LLM_XMX_DISABLED in workflows
* fix: merge igpu and A-series Graphics
* fix: textual adjustment
* fix: textual adjustment
* fix: textual adjustment
2024-11-27 11:16:36 +08:00
Jun Wang
cb7b08948b
update vllm-docker-quick-start for vllm0.6.2 ( #12392 )
...
* update vllm-docker-quick-start for vllm0.6.2
* [UPDATE] rm max-num-seqs parameter in vllm-serving script
2024-11-27 08:47:03 +08:00
Xu, Shuo
6726b198fd
Update readme & doc for the vllm upgrade to v0.6.2 ( #12399 )
...
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-11-14 10:28:15 +08:00
Jun Wang
4376fdee62
Decouple the openwebui and the ollama in inference-cpp-xpu dockerfile ( #12382 )
...
* remove the openwebui in inference-cpp-xpu dockerfile
* update docker_cpp_xpu_quickstart.md
* add sample output in inference-cpp/readme
* remove the openwebui in main readme
* remove the openwebui in main readme
2024-11-12 20:15:23 +08:00
Shaojun Liu
fad15c8ca0
Update fastchat demo script ( #12367 )
...
* Update README.md
* Update vllm_docker_quickstart.md
2024-11-08 15:42:17 +08:00
Xu, Shuo
ce0c6ae423
Update Readme for FastChat docker demo ( #12354 )
...
* update Readme for FastChat docker demo
* update readme
* add 'Serving with FastChat' part in docs
* polish docs
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-11-07 15:22:42 +08:00
Jun Wang
aedc4edfba
[ADD] add open webui + vllm serving ( #12246 )
2024-10-23 10:13:14 +08:00
Jun Wang
fe3b5cd89b
[Update] mddocs/dockerguide vllm-quick-start awq,gptq online serving document ( #12227 )
...
* [FIX] fix the docker start script error
* [ADD] add awq online serving doc
* [ADD] add gptq online serving doc
* [Fix] small fix
2024-10-18 09:46:59 +08:00
Shaojun Liu
49eb20613a
add --blocksize to doc and script ( #12187 )
2024-10-12 09:17:42 +08:00
Jun Wang
6ffaec66a2
[UPDATE] add prefix caching document into vllm_docker_quickstart.md ( #12173 )
...
* [ADD] rewrite new vllm docker quick start
* [ADD] lora adapter doc finished
* [ADD] multi lora adapter test successfully
* [ADD] add ipex-llm quantization doc
* [Merge] rebase main
* [REMOVE] rm tmp file
* [Merge] rebase main
* [ADD] add prefix caching experiment and result
* [REMOVE] rm cpu offloading chapter
2024-10-11 19:12:22 +08:00
Jun Wang
412cf8e20c
[UPDATE] update mddocs/DockerGuides/vllm_docker_quickstart.md ( #12166 )
...
* [ADD] rewrite new vllm docker quick start
* [ADD] lora adapter doc finished
* [ADD] multi lora adapter test successfully
* [ADD] add ipex-llm quantization doc
* [UPDATE] update mddocs vllm_docker_quickstart content
* [REMOVE] rm tmp file
* [UPDATE] tp and pp explanation and readthedoc link change
* [FIX] fix the error description of tp+pp and quantization part
* [FIX] fix the table of verified model
* [UPDATE] add full low bit para list
* [UPDATE] update the load_in_low_bit params to verified dtype
2024-10-09 11:19:32 +08:00
Shaojun Liu
fac4c01a6e
Revert to use out-of-tree GPU driver ( #11761 )
...
* Revert to use out-of-tree GPU driver since the performance with out-of-tree driver is better than upstream's
* add spaces
* add troubleshooting case
* update Troubleshooting
2024-08-12 13:41:47 +08:00
binbin Deng
66f6ffe4b2
Update GPU HF-Transformers example structure ( #11526 )
2024-07-08 17:58:06 +08:00
Shaojun Liu
72b4efaad4
Enhanced XPU Dockerfiles: Optimized Environment Variables and Documentation ( #11506 )
...
* Added SYCL_CACHE_PERSISTENT=1 to xpu Dockerfile
* Update the document to add explanations for environment variables.
* update quickstart
2024-07-04 20:18:38 +08:00
Yuwen Hu
4cb9a4728e
Add index page for API doc & links update in mddocs ( #11393 )
...
* Small fixes
* Add initial api doc index
* Change index.md -> README.md
* Fix on API links
2024-06-21 17:34:34 +08:00
Yuwen Hu
a027121530
Small mddoc fixes based on review ( #11391 )
...
* Fix based on review
* Further fix
* Small fix
* Small fix
2024-06-21 17:09:30 +08:00
Yuwen Hu
54f9d07d8f
Further mddocs fixes ( #11386 )
...
* Update mddocs for ragflow quickstart
* Fixes for docker guides mddocs
* Further fixes
2024-06-21 13:27:43 +08:00
Yuwen Hu
9b475c07db
Add missing ragflow quickstart in mddocs and update legacy contents ( #11385 )
2024-06-21 12:28:26 +08:00
Xu, Shuo
fed79f106b
Update mddocs for DockerGuides ( #11380 )
...
* transfer files in DockerGuides from rst to md
* add some dividing lines
* adjust the title hierarchy in docker_cpp_xpu_quickstart.md
* restore
* switch to the correct branch
* small change
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-06-21 12:10:35 +08:00
Yuwen Hu
769728c1eb
Add initial md docs ( #11371 )
2024-06-20 13:47:49 +08:00