Commit graph

904 commits

Author SHA1 Message Date
Jason Dai
35e5fa851c
Update README.md (#12911) 2025-02-28 17:55:45 +08:00
binbin Deng
8351f6c455
[NPU] Add QuickStart for llama.cpp NPU portable zip (#12899) 2025-02-28 17:19:18 +08:00
Xin Qiu
029480f4a8
llama cpp portable zip Quickstart (#12894)
* llamacpp_quickstart

* update

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md
2025-02-28 15:45:11 +08:00
Yuwen Hu
8d94752c4b
Ollama portable zip QuickStart updates regarding more tips (#12905)
* Update for select multiple GPUs

* Update Ollama portable zip quickstarts regarding more tips

* Small fix
2025-02-28 15:10:56 +08:00
Yuwen Hu
671ddfd847
Update wrong file name for portable zip quickstart (#12883) 2025-02-24 17:52:09 +08:00
Yuwen Hu
a9c8e73a77
Update llama.cpp Prerequisites guide regarding oneAPI 2025.0 (#12881)
* Update llama.cpp Prerequisites guide regarding oneAPI 2025.0

* Update based on comments

* Small fix

* Small fix
2025-02-24 16:32:23 +08:00
Yuwen Hu
21d6a78be0
Update Ollama portable zip QuickStart to fit new version (#12871)
* Update ollama portable zip quickstart

* Update demo images
2025-02-21 17:54:14 +08:00
binbin Deng
8077850452
[NPU GGUF] Add simple example (#12853) 2025-02-21 09:58:00 +08:00
Yuwen Hu
a488981f3f
Ollama portable zip QuickStart tiny fix (#12862)
* Tiny fix to ollama portable zip quickstart

* Tiny fix
2025-02-20 14:11:12 +08:00
Yuwen Hu
0f2706be42
Update CN Ollama portable zip QuickStart for troubleshooting & tips (#12860)
* Small fix for english version

* Update CN ollama portable zip quickstart for troubleshooting & tips

* Small fix
2025-02-20 11:32:06 +08:00
Jason Dai
38a682adb1
Update Readme (#12855) 2025-02-19 19:55:29 +08:00
Xin Qiu
c81b7fc003
Add Portable zip Linux QuickStart (#12849)
* linux doc

* update

* Update ollama_portablze_zip_quickstart.md

* Update ollama_portablze_zip_quickstart.md

* Update ollama_portablze_zip_quickstart.zh-CN.md

* Update ollama_portablze_zip_quickstart.md

* meet code review

* update

* Add tips & troubleshooting sections for both Linux & Windows

* Rebase

* Fix based on comments

* Small fix

* Fix img

* Update table for linux

* Small fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-02-19 19:13:55 +08:00
SONG Ge
5d041f9ebf
Add latest models list in ollama quickstart (#12850)
* Add latest models list on ollama quickstart

* update oneAPI version description

* move models list to ollama_portable_zip doc

* update CN readme
2025-02-19 18:29:43 +08:00
Yuwen Hu
637543e135
Update Ollama portable zip QuickStart with troubleshooting (#12846)
* Update ollama portable zip quickstart with runtime configurations

* Small fix

* Update based on comments

* Small fix

* Small fix
2025-02-19 11:04:03 +08:00
binbin Deng
bde8acc303
[NPU] Update doc of gguf support (#12837) 2025-02-19 10:46:35 +08:00
Shaojun Liu
f7b5a093a7
Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815)
* Update Dockerfile

* Update Dockerfile

* Ensure scripts are executable

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* update

* Update Dockerfile

* remove inference-cpu and inference-xpu

* update README
2025-02-17 14:23:22 +08:00
joan726
59e8e1e91e
Added ollama_portablze_zip_quickstart.zh-CN.md (#12822) 2025-02-14 18:54:12 +08:00
Jason Dai
a09552e59a
Update ollama quickstart (#12823) 2025-02-14 09:55:48 +08:00
Yuwen Hu
f67986021c
Update download link for Ollama portable zip QuickStart (#12821)
* Update download link for Ollama portable zip quickstart

* Update based on comments
2025-02-13 17:48:02 +08:00
Jason Dai
16e63cbc18
Update readme (#12820) 2025-02-13 14:26:04 +08:00
Yuwen Hu
68414afcb9
Add initial QuickStart for Ollama portable zip (#12817)
* Add initial quickstart for Ollama portable zip

* Small fix

* Fixed based on comments

* Small fix

* Add demo image for run ollama

* Update download link
2025-02-13 13:18:14 +08:00
binbin Deng
d093b75aa0
[NPU] Update driver installation in QuickStart (#12807) 2025-02-11 15:49:21 +08:00
binbin Deng
6ff7faa781
[NPU] Update deepseek support in python examples and quickstart (#12786) 2025-02-07 11:25:16 +08:00
Shaojun Liu
ee809e71df
add troubleshooting section (#12755) 2025-01-26 11:03:58 +08:00
Shaojun Liu
53aae24616
Add note about enabling Resizable BAR in BIOS for GPU setup (#12715) 2025-01-16 16:22:35 +08:00
binbin Deng
36bf3d8e29
[NPU doc] Update ARL product in QuickStart (#12708) 2025-01-15 15:57:06 +08:00
SONG Ge
e2d58f733e
Update ollama v0.5.1 document (#12699)
* Update ollama document version and known issue
2025-01-10 18:04:49 +08:00
joan726
584c1c5373
Update B580 CN doc (#12695) 2025-01-10 11:20:47 +08:00
Jason Dai
cbb8e2a2d5
Update documents (#12693) 2025-01-10 10:47:11 +08:00
Jason Dai
f9b29a4f56
Update B580 doc (#12691) 2025-01-10 08:59:35 +08:00
joan726
66d4385cc9
Update B580 CN Doc (#12686) 2025-01-09 19:10:57 +08:00
Jason Dai
aa9e70a347
Update B580 Doc (#12678) 2025-01-08 22:36:48 +08:00
Shaojun Liu
2c23ce2553
Create a BattleMage QuickStart (#12663)
* Create bmg_quickstart.md

* Update bmg_quickstart.md

* Clarify IPEX-LLM package installation based on use case

* Update bmg_quickstart.md

* Update bmg_quickstart.md
2025-01-08 14:58:37 +08:00
logicat
0534d7254f
Update docker_cpp_xpu_quickstart.md (#12667) 2025-01-08 09:56:56 +08:00
Yuwen Hu
381d448ee2
[NPU] Example & Quickstart updates (#12650)
* Remove model with optimize_model=False in NPU verified models tables, and remove related example

* Remove experimental in run optimized model section title

* Unify model table order & example cmd

* Move embedding example to separate folder & update quickstart example link

* Add Quickstart reference in main NPU readme

* Small fix

* Small fix

* Move save/load examples under NPU/HF-Transformers-AutoModels

* Add low-bit and polish arguments for LLM Python examples

* Small fix

* Add low-bit and polish arguments for Multi-Model examples

* Polish argument for Embedding models

* Polish argument for LLM CPP examples

* Add low-bit and polish argument for Save-Load examples

* Add accuracy tuning tips for examples

* Update NPU quickstart accuracy tuning with low-bit optimizations

* Add save/load section to quickstart

* Update CPP example sample output to EN

* Add installation regarding cmake for CPP examples

* Small fix

* Small fix

* Small fix

* Small fix

* Small fix

* Small fix

* Unify max prompt length to 512

* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4

* Update based on comments

* Small fix
2025-01-07 13:52:41 +08:00
SONG Ge
550fa01649
[Doc] Update ipex-llm ollama troubleshooting for v0.4.6 (#12642)
* update ollama v0.4.6 troubleshooting

* update chinese ollama-doc
2025-01-02 17:28:54 +08:00
Yishuo Wang
2d08155513
remove bmm, which is only required in ipex 2.0 (#12630) 2024-12-27 17:28:57 +08:00
binbin Deng
796ee571a5
[NPU doc] Update verified platforms (#12621) 2024-12-26 17:39:13 +08:00
Mingqi Hu
0477fe6480
[docs] Update doc for latest open webui: 0.4.8 (#12591)
* Update open webui doc

* Resolve comments
2024-12-26 09:18:20 +08:00
binbin Deng
4e7e988f70
[NPU] Fix MTL and ARL support (#12580) 2024-12-19 16:55:30 +08:00
SONG Ge
28e81fda8e
Replace runner doc in ollama quickstart (#12575) 2024-12-18 19:05:28 +08:00
SONG Ge
f7a2bd21cf
Update ollama and llama.cpp readme (#12574) 2024-12-18 17:33:20 +08:00
binbin Deng
694d14b2b4
[NPU doc] Add ARL runtime configuration (#12562) 2024-12-17 16:08:42 +08:00
Yuwen Hu
d127a8654c
Small typo fixes (#12558) 2024-12-17 13:54:13 +08:00
binbin Deng
680ea7e4a8
[NPU doc] Update configuration for different platforms (#12554) 2024-12-17 10:15:09 +08:00
binbin Deng
caf15cc5ef
[NPU] Add IPEX_LLM_NPU_MTL to enable support on mtl (#12543) 2024-12-13 17:01:13 +08:00
SONG Ge
5402fc65c8
[Ollama] Update ipex-llm ollama readme to v0.4.6 (#12542)
* Update ipex-llm ollama readme to v0.4.6
2024-12-13 16:26:12 +08:00
Yuwen Hu
b747f3f6b8
Small fix to GPU installation guide (#12536) 2024-12-13 10:02:47 +08:00
binbin Deng
6fc27da9c1
[NPU] Update glm-edge support in docs (#12529) 2024-12-12 11:14:09 +08:00
Jinhe
5e1416c9aa
fix readme for npu cpp examples and llama.cpp (#12505)
* fix cpp readme

* fix cpp readme

* fix cpp readme
2024-12-05 12:32:42 +08:00
joan726
ae9c2154f4
Added cross-links (#12494)
* Update install_linux_gpu.zh-CN.md

Add the link for guide of windows installation.

* Update install_windows_gpu.zh-CN.md

Add the link for guide of linux installation.

* Update install_windows_gpu.md

Add the link for guide of Linux installation.

* Update install_linux_gpu.md

Add the link for guide of Windows installation.

* Update install_linux_gpu.md

Modify based on comments.

* Update install_windows_gpu.md

Modify based on comments
2024-12-04 16:53:13 +08:00
Yuwen Hu
aee9acb303
Add NPU QuickStart & update example links (#12470)
* Add initial NPU quickstart (c++ part unfinished)

* Small update

* Update based on comments

* Update main readme

* Remove LLaMA description

* Small fix

* Small fix

* Remove subsection link in main README

* Small fix

* Update based on comments

* Small fix

* TOC update and other small fixes

* Update for Chinese main readme

* Update based on comments and other small fixes

* Change order
2024-12-02 17:03:10 +08:00
Yuwen Hu
a2272b70d3
Small fix in llama.cpp troubleshooting guide (#12457) 2024-11-27 19:22:11 +08:00
Chu,Youcheng
acd77d9e87
Remove env variable BIGDL_LLM_XMX_DISABLED in documentation (#12445)
* fix: remove BIGDL_LLM_XMX_DISABLED in mddocs

* fix: remove set SYCL_CACHE_PERSISTENT=1 in example

* fix: remove BIGDL_LLM_XMX_DISABLED in workflows

* fix: merge igpu and A-series Graphics

* fix: remove set BIGDL_LLM_XMX_DISABLED=1 in example

* fix: remove BIGDL_LLM_XMX_DISABLED in workflows

* fix: merge igpu and A-series Graphics

* fix: textual adjustment

* fix: textual adjustment

* fix: textual adjustment
2024-11-27 11:16:36 +08:00
Jun Wang
cb7b08948b
update vllm-docker-quick-start for vllm0.6.2 (#12392)
* update vllm-docker-quick-start for vllm0.6.2

* [UPDATE] rm max-num-seqs parameter in vllm-serving script
2024-11-27 08:47:03 +08:00
joan726
a9cb70a71c
Add install_windows_gpu.zh-CN.md and install_linux_gpu.zh-CN.md (#12409)
* Add install_linux_gpu.zh-CN.md

* Add install_windows_gpu.zh-CN.md

* Update llama_cpp_quickstart.zh-CN.md

Related links updated to zh-CN version.

* Update install_linux_gpu.zh-CN.md

Added link to English version.

* Update install_windows_gpu.zh-CN.md

Add the link to English version.

* Update install_windows_gpu.md

Add the link to CN version.

* Update install_linux_gpu.md

Add the link to CN version.

* Update README.zh-CN.md

Modified the related link to zh-CN version.
2024-11-19 14:39:53 +08:00
Yuwen Hu
d1cde7fac4
Tiny doc fix (#12405) 2024-11-15 10:28:38 +08:00
Xu, Shuo
6726b198fd
Update readme & doc for the vllm upgrade to v0.6.2 (#12399)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-11-14 10:28:15 +08:00
Jun Wang
4376fdee62
Decouple the openwebui and the ollama in inference-cpp-xpu dockerfile (#12382)
* remove the openwebui in inference-cpp-xpu dockerfile

* update docker_cpp_xpu_quickstart.md

* add sample output in inference-cpp/readme

* remove the openwebui in main readme

* remove the openwebui in main readme
2024-11-12 20:15:23 +08:00
Shaojun Liu
fad15c8ca0
Update fastchat demo script (#12367)
* Update README.md

* Update vllm_docker_quickstart.md
2024-11-08 15:42:17 +08:00
Xin Qiu
7ef7696956
update linux installation doc (#12365)
* update linux doc

* update
2024-11-08 09:44:58 +08:00
Xin Qiu
520af4e9b5
Update install_linux_gpu.md (#12353) 2024-11-07 16:08:01 +08:00
Jinhe
71ea539351
Add troubleshootings for ollama and llama.cpp (#12358)
* add ollama troubleshoot en

* zh ollama troubleshoot

* llamacpp trouble shoot

* llamacpp trouble shoot

* fix

* save gpu memory
2024-11-07 15:49:20 +08:00
Xu, Shuo
ce0c6ae423
Update Readme for FastChat docker demo (#12354)
* update Readme for FastChat docker demo

* update readme

* add 'Serving with FastChat' part in docs

* polish docs

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-11-07 15:22:42 +08:00
Jin, Qiao
3df6195cb0
Fix application quickstart (#12305)
* fix graphrag quickstart

* fix axolotl quickstart

* fix ragflow quickstart

* fix ragflow quickstart

* fix graphrag toc

* fix comments

* fix comment

* fix comments
2024-10-31 16:57:35 +08:00
joan726
0bbc04b5ec
Add ollama_quickstart.zh-CN.md (#12284)
* Add ollama_quickstart.zh-CN.md

Add ollama_quickstart.zh-CN.md

* Update ollama_quickstart.zh-CN.md

Add Chinese and English switching

* Update ollama_quickstart.md

Add Chinese and English switching

* Update README.zh-CN.md

Modify the related link to ollama_quickstart.zh-CN.md

* Update ollama_quickstart.zh-CN.md

Modified based on comments.

* Update ollama_quickstart.zh-CN.md

Modified based on comments
2024-10-29 15:12:44 +08:00
Yuwen Hu
42a528ded9
Small update to MTL iGPU Linux Prerequisites installation guide (#12281)
* Small update MTL iGPU Linux Prerequisites installation guide

* Small fix
2024-10-28 14:12:07 +08:00
Yuwen Hu
16074ae2a4
Update Linux prerequisites installation guide for MTL iGPU (#12263)
* Update Linux prerequisites installation guide for MTL iGPU

* Further link update

* Small fixes

* Small fix

* Update based on comments

* Small fix

* Make oneAPI installation a shared section for both MTL iGPU and other GPU

* Small fix

* Small fix

* Clarify description
2024-10-28 09:27:14 +08:00
Yuwen Hu
94c4568988
Update windows installation guide regarding troubleshooting (#12270) 2024-10-25 14:32:38 +08:00
joan726
e0a95eb2d6
Add llama_cpp_quickstart.zh-CN.md (#12221) 2024-10-24 16:08:31 +08:00
Jun Wang
aedc4edfba
[ADD] add open webui + vllm serving (#12246) 2024-10-23 10:13:14 +08:00
Jun Wang
fe3b5cd89b
[Update] mmdocs/dockerguide vllm-quick-start awq,gptq online serving document (#12227)
* [FIX] fix the docker start script error

* [ADD] add awq online serving doc

* [ADD] add gptq online serving doc

* [Fix] small fix
2024-10-18 09:46:59 +08:00
Yuwen Hu
a768d71581
Small fix to LNL installation guide (#12192) 2024-10-14 12:03:03 +08:00
Shaojun Liu
49eb20613a
add --blocksize to doc and script (#12187) 2024-10-12 09:17:42 +08:00
Jun Wang
6ffaec66a2
[UPDATE] add prefix caching document into vllm_docker_quickstart.md (#12173)
* [ADD] rewrite new vllm docker quick start

* [ADD] lora adapter doc finished

* [ADD] multi lora adapter tested successfully

* [ADD] add ipex-llm quantization doc

* [Merge] rebase main

* [REMOVE] rm tmp file

* [Merge] rebase main

* [ADD] add prefix caching experiment and result

* [REMOVE] rm cpu offloading chapter
2024-10-11 19:12:22 +08:00
Yuwen Hu
ddcdf47539
Support Windows ARL release (#12183)
* Support release for ARL

* Small fix

* Small fix to doc

* Temp for test

* Remove temp commit for test
2024-10-11 18:30:52 +08:00
Yuwen Hu
ac44e98b7d
Update Windows guide regarding LNL support (#12178)
* Update windows guide regarding LNL support

* Update based on comments
2024-10-11 09:20:08 +08:00
Guancheng Fu
0ef7e1d101
fix vllm docs (#12176) 2024-10-10 15:44:36 +08:00
Jun Wang
412cf8e20c
[UPDATE] update mddocs/DockerGuides/vllm_docker_quickstart.md (#12166)
* [ADD] rewrite new vllm docker quick start

* [ADD] lora adapter doc finished

* [ADD] multi lora adapter tested successfully

* [ADD] add ipex-llm quantization doc

* [UPDATE] update mddocs vllm_docker_quickstart content

* [REMOVE] rm tmp file

* [UPDATE] tp and pp explanation and readthedoc link change

* [FIX] fix the error description of tp+pp and quantization part

* [FIX] fix the table of verified models

* [UPDATE] add full low-bit parameter list

* [UPDATE] update the load_in_low_bit params to verified dtype
2024-10-09 11:19:32 +08:00
Shaojun Liu
e2ef9e938e
Delete deprecated docs/readthedocs directory (#12164) 2024-10-08 14:48:02 +08:00
Ch1y0q
9b75806d14
Update Windows GPU quickstart regarding demo (#12124)
* use Qwen2-1.5B-Instruct in demo

* update

* add reference link

* update

* update
2024-09-29 18:08:49 +08:00
Ruonan Wang
a767438546
fix typo (#12076)
* fix typo

* fix
2024-09-13 11:44:42 +08:00
Ruonan Wang
3f0b24ae2b
update cpp quickstart (#12075)
* update cpp quickstart

* fix style
2024-09-13 11:35:32 +08:00
Ruonan Wang
48d9092b5a
upgrade OneAPI version for cpp Windows (#12063)
* update version

* update quickstart
2024-09-12 11:12:12 +08:00
Shaojun Liu
e5581e6ded
Select the Appropriate APT Repository Based on CPU Type (#12023) 2024-09-05 17:06:07 +08:00
Yuwen Hu
643458d8f0
Update GraphRAG QuickStart (#11995)
* Update GraphRAG QuickStart

* Further updates

* Small fixes

* Small fix
2024-09-03 15:52:08 +08:00
Jinhe
e895e1b4c5
modification on llamacpp readme after Ipex-llm latest update (#11971)
* update on readme after ipex-llm update

* update on readme after ipex-llm update

* rebase & delete redundancy

* revise

* add numbers for troubleshooting
2024-08-30 11:36:45 +08:00
Ch1y0q
77b04efcc5
add notes for SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS (#11936)
* add notes for `SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS`

* also update other quickstart
2024-08-30 09:26:47 +08:00
Jinhe
6fc9340d53
restore ollama webui quickstart (#11955) 2024-08-29 17:53:19 +08:00
Jinhe
ec67ee7177
added accelerate version specification in open webui quickstart(#11948) 2024-08-28 15:02:39 +08:00
Ruonan Wang
460bc96d32
update version of llama.cpp / ollama (#11930)
* update version

* fix version
2024-08-27 21:21:44 +08:00
Ch1y0q
5a8fc1baa2
update troubleshooting for llama.cpp and ollama (#11890)
* update troubleshooting for llama.cpp and ollama

* update

* update
2024-08-26 20:55:23 +08:00
Jinhe
dbd14251dd
Troubleshoot for sycl not found (#11774)
* added troubleshoot for sycl not found problem

* added troubleshoot for sycl not found problem

* revision on troubleshoot

* revision on troubleshoot
2024-08-14 10:26:01 +08:00
Shaojun Liu
fac4c01a6e
Revert to use out-of-tree GPU driver (#11761)
* Revert to use out-of-tree GPU driver since the performance with out-of-tree driver is better than upstream's

* add spaces

* add troubleshooting case

* update Troubleshooting
2024-08-12 13:41:47 +08:00
Yuwen Hu
7e61fa1af7
Revise GPU driver related guide for Windows users (#11740) 2024-08-08 11:26:26 +08:00
Jinhe
d0c89fb715
updated llama.cpp and ollama quickstart (#11732)
* updated llama.cpp and ollama quickstart.md

* added qwen2-1.5B sample output

* revision on quickstart updates

* revision on quickstart updates

* revision on qwen2 readme

* added 2 troubleshoots

* troubleshoot revision
2024-08-08 11:04:01 +08:00
Qiyuan Gong
e32d13d78c
Remove Out of tree Driver from GPU driver installation document (#11728)
GPU drivers are already upstreamed to Kernel 6.2+. Remove the out-of-tree driver (intel-i915-dkms) for 6.2-6.5. https://dgpu-docs.intel.com/driver/kernel-driver-types.html#gpu-driver-support
* Remove intel-i915-dkms intel-fw-gpu (only for kernel 5.19)
2024-08-07 09:38:19 +08:00
Jason Dai
418640e466
Update install_gpu.md 2024-07-27 08:30:10 +08:00
Ruonan Wang
ac97b31664
update cpp quickstart about ONEAPI_DEVICE_SELECTOR (#11630)
* update

* update

* small fix
2024-07-22 13:40:28 +08:00
Yuwen Hu
af6d406178
Add section title for conduct graphrag indexing (#11628) 2024-07-22 10:23:26 +08:00