ipex-llm

Author	SHA1	Message	Date
Jason Dai	2a8f624f4b	Update README (#12956 )	2025-03-09 09:04:13 +08:00
binbin Deng	5ee09b4b28	[NPU] Small update about zip doc (#12951 )	2025-03-07 15:22:14 +08:00
Shaojun Liu	015a4c8c43	Add CPU and GPU Frequency Locking Instructions to Documentation (#12947 )	2025-03-07 09:20:40 +08:00
Jason Dai	cb3c4b26ad	Update llamacpp_portable_zip_gpu_quickstart.md (#12945 )	2025-03-06 11:58:11 +08:00
Jason Dai	1432c5d9a0	Update llamacpp_portable_zip_gpu_quickstart (#12941 )	2025-03-06 10:01:56 +08:00
Jason Dai	32480cc8ed	Update llamacpp_portable_zip_gpu_quickstart (#12940 )	2025-03-06 08:42:18 +08:00
Jason Dai	975cf5f21f	Update README.md (#12939 )	2025-03-06 08:04:27 +08:00
joan726	eccb5b817e	Add llamacpp_portable_zip_gpu_quickstart.zh-CN.md (#12930 ) * Add llamacpp_portable_zip_gpu_quickstart.zh-CN.md Add llamacpp_portable_zip_gpu_quickstart.zh-CN.md * Update README.zh-CN.md Changed and Linked to llamacpp portable zip.zh-CN.md. * Update llamacpp_portable_zip_gpu_quickstart.md Added CN version link * Update README.zh-CN.md Update all links to "llamacpp_portable_zip_gpu_quickstart.zh-CN.md * Update llama_cpp_quickstart.zh-CN.md * Update llamacpp_portable_zip_gpu_quickstart.zh-CN.md Modify based on comments. * Update llamacpp_portable_zip_gpu_quickstart.zh-CN.md Modify based on comments. * Update llamacpp_portable_zip_gpu_quickstart.zh-CN.md Update the doc based on #12928 * Update llamacpp_portable_zip_gpu_quickstart.zh-CN.md Add “More Details” on Table of Contents * Update README.zh-CN.md Update llamacpp_portable_zip_gpu_quickstart CN link * Update README.zh-CN.md Change llama.cpp link * Update README.zh-CN.md * Update README.md	2025-03-05 14:55:44 +08:00
Yuwen Hu	7c0c77cce3	Tiny fixes (#12936 )	2025-03-05 14:55:26 +08:00
Yuwen Hu	68a770745b	Add moonlight GPU example (#12929 ) * Add moonlight GPU example and update table * Small fix * Fix based on comments * Small fix	2025-03-05 11:31:14 +08:00
Xin Qiu	33da3a3cb7	Update llama cpp portable zip quickstart (#12928 ) * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md	2025-03-05 09:22:10 +08:00
Jason Dai	de09590ca3	Update llamacpp_portable_zip_gpu_quickstart.md (#12932 )	2025-03-05 07:59:32 +08:00
Jason Dai	69edc8b6f6	Update quickstart (#12927 )	2025-03-04 15:34:52 +08:00
Qiyuan Gong	0b5079833c	llama.cpp portable Zip for Linux quickstart (#12923 ) * llamacpp Linux portable doc & flashmoe	2025-03-04 14:50:21 +08:00
binbin Deng	091ab2bd59	[NPU] Add troubleshooting in portable zip doc (#12924 )	2025-03-04 10:41:39 +08:00
Yuwen Hu	b2d676f1c6	Further update Ollama portable zip quickstart (#12921 ) * Update Chinese doc for ollama quickstart tips and troubleshooting * Update for recommended Windows OS * Small fix * Small fix	2025-03-03 18:07:57 +08:00
Shaojun Liu	f81d89d908	Remove Unnecessary --privileged Flag While Keeping It for WSL Users (#12920 )	2025-03-03 11:11:42 +08:00
Shaojun Liu	7810b8fb49	OSPDT: update dockerfile header (#12908 ) * Update Dockerfile * Update Dockerfile * Update Dockerfile * Update Dockerfile	2025-03-03 09:59:11 +08:00
Yishuo Wang	b6f33d5c4d	optimize moonlight again (#12909 )	2025-03-03 09:21:15 +08:00
Jason Dai	35e5fa851c	Update README.md (#12911 )	2025-02-28 17:55:45 +08:00
binbin Deng	8351f6c455	[NPU] Add QuickStart for llama.cpp NPU portable zip (#12899 )	2025-02-28 17:19:18 +08:00
Xin Qiu	029480f4a8	llama cpp portable zip Quickstart (#12894 ) * llamacpp_quickstart * update * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md * Update llamacpp_portable_zip_gpu_quickstart.md	2025-02-28 15:45:11 +08:00
Yuwen Hu	443cb5d4e0	Update Janus-Pro GPU example (#12906 )	2025-02-28 15:39:03 +08:00
Yuwen Hu	8d94752c4b	Ollama portable zip QuickStart updates regarding more tips (#12905 ) * Update for select multiple GPUs * Update Ollama portable zip quickstarts regarding more tips * Small fix	2025-02-28 15:10:56 +08:00
Yishuo Wang	39e360fe9d	add grouped topk optimization for moonlight (#12903 )	2025-02-28 13:25:56 +08:00
Xin Qiu	e946127613	glm 4v 1st sdp for vision (#12904 ) * glm4v 1st sdp * update glm4v example * meet code review * fix style	2025-02-28 13:23:27 +08:00
Shaojun Liu	5c100ac105	Add ENTRYPOINT to Dockerfile to auto-start vllm service on container launch (for CVTE customer) (#12901 ) * Add ENTRYPOINT to Dockerfile to auto-start service on container launch (for CVTE client) * Update start-vllm-service.sh * Update README.md * Update README.md * Update start-vllm-service.sh * Update README.md	2025-02-27 17:33:58 +08:00
Yishuo Wang	be1f073866	add fuse moe optimization for moonlight (#12898 )	2025-02-27 09:15:24 +08:00
Jason Dai	ad65e2b03a	Update README.md (#12900 )	2025-02-27 08:30:06 +08:00
Yishuo Wang	5faba06409	simple optimization for moonlight moe decoding forward (#12891 )	2025-02-25 16:18:27 +08:00
Xiangyu Tian	ae9f5320da	vLLM CPU: Fix Triton Version to Resolve Related Error(#12893 )	2025-02-25 15:00:41 +08:00
Yishuo Wang	ab3fc66eb7	optimize attention part of moonlight-14B-A3B (#12886 )	2025-02-25 09:38:13 +08:00
Shaojun Liu	dd30d12cb6	Fix serving-cpu image: setuptools-scm requires setuptools>=61 (#12876 ) * setuptools-scm requires setuptools>=61 * Update Dockerfile * Update Dockerfile * Update Dockerfile	2025-02-25 09:10:14 +08:00
Yuwen Hu	06694ba61a	Further fix portable zip file link (#12885 )	2025-02-24 18:06:57 +08:00
Yuwen Hu	671ddfd847	Update wrong file name for portable zip quickstart (#12883 )	2025-02-24 17:52:09 +08:00
Yuwen Hu	a9c8e73a77	Update llama.cpp Prerequisites guide regarding oneAPI 2025.0 (#12881 ) * Update llama.cpp Prerequisites guide regarding oneAPI 2025.0 * Update based on comments * Small fix * Small fix	2025-02-24 16:32:23 +08:00
Wang, Jian4	4f2f92afa3	Update inference-cpp docker (#12882 ) * remove unused run.py * add WORKDIR /llm	2025-02-24 14:32:44 +08:00
Yishuo Wang	3f6ecce508	support using xgrammar to get json output (#12870 )	2025-02-24 14:10:58 +08:00
Shaojun Liu	afad979168	Add Apache 2.0 License Information in Dockerfile to Comply with OSPDT Requirements (#12878 ) * ospdt: add Header for Dockerfile * OSPDT: add Header for Dockerfile * OSPDT: add Header for Dockerfile * OSPDT: add Header for Dockerfile	2025-02-24 14:00:46 +08:00
Guancheng Fu	02ec313eab	Update README.md (#12877 )	2025-02-24 09:59:17 +08:00
Shaojun Liu	10400abfb7	Fix CodeQL workflow (#12875 ) * Update codeql.yml * Update codeql.yml	2025-02-24 09:16:54 +08:00
Xu, Shuo	1e00bed001	Add GPU example for Janus-Pro (#12869 ) * Add example for Janus-Pro * Update model link * Fixes * Fixes --------- Co-authored-by: ATMxsp01 <shou.xu@intel.com> Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>	2025-02-21 18:36:50 +08:00
Yuwen Hu	21d6a78be0	Update Ollama portable zip QuickStart to fit new version (#12871 ) * Update ollama portable zip quickstart * Update demo images	2025-02-21 17:54:14 +08:00
Wang, Jian4	3ea5389a99	Fix vllm api_server v1/models error (#12867 )	2025-02-21 11:08:29 +08:00
binbin Deng	8077850452	[NPU GGUF] Add simple example (#12853 )	2025-02-21 09:58:00 +08:00
Wang, Jian4	348dc8056d	Fix vllm gptq awq error (#12863 ) * fix gptq awq error * fix python style	2025-02-20 16:27:23 +08:00
Yuwen Hu	a488981f3f	Ollama portable zip QuickStart tiny fix (#12862 ) * Tiny fix to ollama portable zip quickstart * Tiny fix	2025-02-20 14:11:12 +08:00
Yuwen Hu	0f2706be42	Update CN Ollama portable zip QuickStart for troubleshooting & tips (#12860 ) * Small fix for english version * Update CN ollama portable zip quickstart for troubleshooting & tips * Small fix	2025-02-20 11:32:06 +08:00
Jason Dai	38a682adb1	Update Readme (#12855 )	2025-02-19 19:55:29 +08:00
Guancheng Fu	4eed0c7d99	initial implementation for low_bit_loader vLLM (#12838 ) * initial * add logic for handling tensor parallel models * fix * Add some comments * add doc * fix done	2025-02-19 19:45:34 +08:00

4103 commits