ipex-llm

Author	SHA1	Message	Date
Wang, Jian4	45864790f7	Enable phi-4 with vision and audio (#13203) * add phi4 * update * enable audio * update and add readme	2025-06-05 10:15:20 +08:00
Shaojun Liu	bd71739e64	Update docs and scripts to align with new Docker image release (#13156) * Update vllm_docker_quickstart.md * Update start-vllm-service.sh * Update vllm_docker_quickstart.md * Update start-vllm-service.sh	2025-05-13 17:06:29 +08:00
Shaojun Liu	45f7bf6688	Refactor vLLM Documentation: Centralize Benchmarking and Improve Readability (#13141) * update vllm doc * update image name * update * update * update * update	2025-05-09 10:19:42 +08:00
Xiangyu Tian	51b41faad7	vLLM: update vLLM XPU to 0.8.3 version (#13118) vLLM: update vLLM XPU to 0.8.3 version	2025-04-30 14:40:53 +08:00
Shaojun Liu	1d7f4a83ac	Update documentation to build Docker image from Dockerfile instead of pulling from registry (#13057) * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update docker_cpp_xpu_quickstart.md * Update vllm_cpu_docker_quickstart.md * Update docker_cpp_xpu_quickstart.md * Update vllm_docker_quickstart.md * Update fastchat_docker_quickstart.md * Update docker_pytorch_inference_gpu.md	2025-04-09 16:40:20 +08:00
Wang, Jian4	7809ca9864	Reuse --privileged (#13015) * fix * add	2025-03-27 10:00:50 +08:00
Shaojun Liu	6a2d87e40f	add `--entrypoint /bin/bash` (#12957) Co-authored-by: gc-fu <guancheng.fu@intel.com>	2025-03-10 10:10:27 +08:00
Shaojun Liu	015a4c8c43	Add CPU and GPU Frequency Locking Instructions to Documentation (#12947)	2025-03-07 09:20:40 +08:00
Shaojun Liu	f81d89d908	Remove Unnecessary --privileged Flag While Keeping It for WSL Users (#12920)	2025-03-03 11:11:42 +08:00
Shaojun Liu	f7b5a093a7	Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815) * Update Dockerfile * Update Dockerfile * Ensure scripts are executable * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile * update * Update Dockerfile * remove inference-cpu and inference-xpu * update README	2025-02-17 14:23:22 +08:00
logicat	0534d7254f	Update docker_cpp_xpu_quickstart.md (#12667)	2025-01-08 09:56:56 +08:00
Chu,Youcheng	acd77d9e87	Remove env variable `BIGDL_LLM_XMX_DISABLED` in documentation (#12445) * fix: remove BIGDL_LLM_XMX_DISABLED in mddocs * fix: remove set SYCL_CACHE_PERSISTENT=1 in example * fix: remove BIGDL_LLM_XMX_DISABLED in workflows * fix: merge igpu and A-series Graphics * fix: remove set BIGDL_LLM_XMX_DISABLED=1 in example * fix: remove BIGDL_LLM_XMX_DISABLED in workflows * fix: merge igpu and A-series Graphics * fix: textual adjustment * fix: textual adjustment * fix: textual adjustment	2024-11-27 11:16:36 +08:00
Jun Wang	cb7b08948b	update vllm-docker-quick-start for vllm0.6.2 (#12392) * update vllm-docker-quick-start for vllm0.6.2 * [UPDATE] rm max-num-seqs parameter in vllm-serving script	2024-11-27 08:47:03 +08:00
Xu, Shuo	6726b198fd	Update readme & doc for the vllm upgrade to v0.6.2 (#12399) Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-11-14 10:28:15 +08:00
Jun Wang	4376fdee62	Decouple the openwebui and the ollama. in inference-cpp-xpu dockerfile (#12382) * remove the openwebui in inference-cpp-xpu dockerfile * update docker_cpp_xpu_quickstart.md * add sample output in inference-cpp/readme * remove the openwebui in main readme * remove the openwebui in main readme	2024-11-12 20:15:23 +08:00
Shaojun Liu	fad15c8ca0	Update fastchat demo script (#12367) * Update README.md * Update vllm_docker_quickstart.md	2024-11-08 15:42:17 +08:00
Xu, Shuo	ce0c6ae423	Update Readme for FastChat docker demo (#12354) * update Readme for FastChat docker demo * update readme * add 'Serving with FastChat' part in docs * polish docs --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-11-07 15:22:42 +08:00
Jun Wang	aedc4edfba	[ADD] add open webui + vllm serving (#12246)	2024-10-23 10:13:14 +08:00
Jun Wang	fe3b5cd89b	[Update] mmdocs/dockerguide vllm-quick-start awq,gptq online serving document (#12227) * [FIX] fix the docker start script error * [ADD] add awq online serving doc * [ADD] add gptq online serving doc * [Fix] small fix	2024-10-18 09:46:59 +08:00
Shaojun Liu	49eb20613a	add --blocksize to doc and script (#12187)	2024-10-12 09:17:42 +08:00
Jun Wang	6ffaec66a2	[UPDATE] add prefix caching document into `vllm_docker_quickstart.md` (#12173) * [ADD] rewrite new vllm docker quick start * [ADD] lora adapter doc finished * [ADD] mulit lora adapter test successfully * [ADD] add ipex-llm quantization doc * [Merge] rebase main * [REMOVE] rm tmp file * [Merge] rebase main * [ADD] add prefix caching experiment and result * [REMOVE] rm cpu offloading chapter	2024-10-11 19:12:22 +08:00
Jun Wang	412cf8e20c	[UPDATE] update mddocs/DockerGuides/vllm_docker_quickstart.md (#12166) * [ADD] rewrite new vllm docker quick start * [ADD] lora adapter doc finished * [ADD] mulit lora adapter test successfully * [ADD] add ipex-llm quantization doc * [UPDATE] update mmdocs vllm_docker_quickstart content * [REMOVE] rm tmp file * [UPDATE] tp and pp explaination and readthedoc link change * [FIX] fix the error description of tp+pp and quantization part * [FIX] fix the table of verifed model * [UPDATE] add full low bit para list * [UPDATE] update the load_in_low_bit params to verifed dtype	2024-10-09 11:19:32 +08:00
Shaojun Liu	fac4c01a6e	Revert to use out-of-tree GPU driver (#11761) * Revert to use out-of-tree GPU driver since the performance with out-of-tree driver is better than upsteam's * add spaces * add troubleshooting case * update Troubleshooting	2024-08-12 13:41:47 +08:00
binbin Deng	66f6ffe4b2	Update GPU HF-Transformers example structure (#11526)	2024-07-08 17:58:06 +08:00
Shaojun Liu	72b4efaad4	Enhanced XPU Dockerfiles: Optimized Environment Variables and Documentation (#11506) * Added SYCL_CACHE_PERSISTENT=1 to xpu Dockerfile * Update the document to add explanations for environment variables. * update quickstart	2024-07-04 20:18:38 +08:00
Yuwen Hu	4cb9a4728e	Add index page for API doc & links update in mddocs (#11393) * Small fixes * Add initial api doc index * Change index.md -> README.md * Fix on API links	2024-06-21 17:34:34 +08:00
Yuwen Hu	a027121530	Small mddoc fixed based on review (#11391) * Fix based on review * Further fix * Small fix * Small fix	2024-06-21 17:09:30 +08:00
Yuwen Hu	54f9d07d8f	Further mddocs fixes (#11386) * Update mddocs for ragflow quickstart * Fixes for docker guides mddocs * Further fixes	2024-06-21 13:27:43 +08:00
Yuwen Hu	9b475c07db	Add missing ragflow quickstart in mddocs and update legecy contents (#11385)	2024-06-21 12:28:26 +08:00
Xu, Shuo	fed79f106b	Update mddocs for DockerGuides (#11380) * transfer files in DockerGuides from rst to md * add some dividing lines * adjust the title hierarchy in docker_cpp_xpu_quickstart.md * restore * switch to the correct branch * small change --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com>	2024-06-21 12:10:35 +08:00
Yuwen Hu	769728c1eb	Add initial md docs (#11371)	2024-06-20 13:47:49 +08:00

31 commits