Commit graph

904 commits

Author SHA1 Message Date
Jason Dai
35e5fa851c
Update README.md (#12911) 2025-02-28 17:55:45 +08:00
binbin Deng
8351f6c455
[NPU] Add QuickStart for llama.cpp NPU portable zip (#12899) 2025-02-28 17:19:18 +08:00
Xin Qiu
029480f4a8
llama cpp portable zip Quickstart (#12894)
* llamacpp_quickstart

* update

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md

* Update llamacpp_portable_zip_gpu_quickstart.md
2025-02-28 15:45:11 +08:00
Yuwen Hu
8d94752c4b
Ollama portable zip QuickStart updates regarding more tips (#12905)
* Update for select multiple GPUs

* Update Ollama portable zip quickstarts regarding more tips

* Small fix
2025-02-28 15:10:56 +08:00
Yuwen Hu
671ddfd847
Update wrong file name for portable zip quickstart (#12883) 2025-02-24 17:52:09 +08:00
Yuwen Hu
a9c8e73a77
Update llama.cpp Prerequisites guide regarding oneAPI 2025.0 (#12881)
* Update llama.cpp Prerequisites guide regarding oneAPI 2025.0

* Update based on comments

* Small fix

* Small fix
2025-02-24 16:32:23 +08:00
Yuwen Hu
21d6a78be0
Update Ollama portable zip QuickStart to fit new version (#12871)
* Update ollama portable zip quickstart

* Update demo images
2025-02-21 17:54:14 +08:00
binbin Deng
8077850452
[NPU GGUF] Add simple example (#12853) 2025-02-21 09:58:00 +08:00
Yuwen Hu
a488981f3f
Ollama portable zip QuickStart tiny fix (#12862)
* Tiny fix to ollama portable zip quickstart

* Tiny fix
2025-02-20 14:11:12 +08:00
Yuwen Hu
0f2706be42
Update CN Ollama portable zip QuickStart for troubleshooting & tips (#12860)
* Small fix for english version

* Update CN ollama portable zip quickstart for troubleshooting & tips

* Small fix
2025-02-20 11:32:06 +08:00
Jason Dai
38a682adb1
Update Readme (#12855) 2025-02-19 19:55:29 +08:00
Xin Qiu
c81b7fc003
Add Portable zip Linux QuickStart (#12849)
* linux doc

* update

* Update ollama_portablze_zip_quickstart.md

* Update ollama_portablze_zip_quickstart.md

* Update ollama_portablze_zip_quickstart.zh-CN.md

* Update ollama_portablze_zip_quickstart.md

* meet code review

* update

* Add tips & troubleshooting sections for both Linux & Windows

* Rebase

* Fix based on comments

* Small fix

* Fix img

* Update table for linux

* Small fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-02-19 19:13:55 +08:00
SONG Ge
5d041f9ebf
Add latest models list in ollama quickstart (#12850)
* Add latest models list on ollama quickstart

* update oneAPI version description

* move models list to ollama_portable_zip doc

* update CN readme
2025-02-19 18:29:43 +08:00
Yuwen Hu
637543e135
Update Ollama portable zip QuickStart with troubleshooting (#12846)
* Update ollama portable zip quickstart with runtime configurations

* Small fix

* Update based on comments

* Small fix

* Small fix
2025-02-19 11:04:03 +08:00
binbin Deng
bde8acc303
[NPU] Update doc of gguf support (#12837) 2025-02-19 10:46:35 +08:00
Shaojun Liu
f7b5a093a7
Merge CPU & XPU Dockerfiles with Serving Images and Refactor (#12815)
* Update Dockerfile

* Update Dockerfile

* Ensure scripts are executable

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* update

* Update Dockerfile

* remove inference-cpu and inference-xpu

* update README
2025-02-17 14:23:22 +08:00
joan726
59e8e1e91e
Added ollama_portablze_zip_quickstart.zh-CN.md (#12822) 2025-02-14 18:54:12 +08:00
Jason Dai
a09552e59a
Update ollama quickstart (#12823) 2025-02-14 09:55:48 +08:00
Yuwen Hu
f67986021c
Update download link for Ollama portable zip QuickStart (#12821)
* Update download link for Ollama portable zip quickstart

* Update based on comments
2025-02-13 17:48:02 +08:00
Jason Dai
16e63cbc18
Update readme (#12820) 2025-02-13 14:26:04 +08:00
Yuwen Hu
68414afcb9
Add initial QuickStart for Ollama portable zip (#12817)
* Add initial quickstart for Ollama portable zip

* Small fix

* Fixed based on comments

* Small fix

* Add demo image for run ollama

* Update download link
2025-02-13 13:18:14 +08:00
binbin Deng
d093b75aa0
[NPU] Update driver installation in QuickStart (#12807) 2025-02-11 15:49:21 +08:00
binbin Deng
6ff7faa781
[NPU] Update deepseek support in python examples and quickstart (#12786) 2025-02-07 11:25:16 +08:00
Shaojun Liu
ee809e71df
add troubleshooting section (#12755) 2025-01-26 11:03:58 +08:00
Shaojun Liu
53aae24616
Add note about enabling Resizable BAR in BIOS for GPU setup (#12715) 2025-01-16 16:22:35 +08:00
binbin Deng
36bf3d8e29
[NPU doc] Update ARL product in QuickStart (#12708) 2025-01-15 15:57:06 +08:00
SONG Ge
e2d58f733e
Update ollama v0.5.1 document (#12699)
* Update ollama document version and known issue
2025-01-10 18:04:49 +08:00
joan726
584c1c5373
Update B580 CN doc (#12695) 2025-01-10 11:20:47 +08:00
Jason Dai
cbb8e2a2d5
Update documents (#12693) 2025-01-10 10:47:11 +08:00
Jason Dai
f9b29a4f56
Update B580 doc (#12691) 2025-01-10 08:59:35 +08:00
joan726
66d4385cc9
Update B580 CN Doc (#12686) 2025-01-09 19:10:57 +08:00
Jason Dai
aa9e70a347
Update B580 Doc (#12678) 2025-01-08 22:36:48 +08:00
Shaojun Liu
2c23ce2553
Create a BattleMage QuickStart (#12663)
* Create bmg_quickstart.md

* Update bmg_quickstart.md

* Clarify IPEX-LLM package installation based on use case

* Update bmg_quickstart.md

* Update bmg_quickstart.md
2025-01-08 14:58:37 +08:00
logicat
0534d7254f
Update docker_cpp_xpu_quickstart.md (#12667) 2025-01-08 09:56:56 +08:00
Yuwen Hu
381d448ee2
[NPU] Example & Quickstart updates (#12650)
* Remove model with optimize_model=False in NPU verified models tables, and remove related example

* Remove experimental in run optimized model section title

* Unify model table order & example cmd

* Move embedding example to separate folder & update quickstart example link

* Add Quickstart reference in main NPU readme

* Small fix

* Small fix

* Move save/load examples under NPU/HF-Transformers-AutoModels

* Add low-bit and polish arguments for LLM Python examples

* Small fix

* Add low-bit and polish arguments for Multi-Model examples

* Polish argument for Embedding models

* Polish argument for LLM CPP examples

* Add low-bit and polish argument for Save-Load examples

* Add accuracy tuning tips for examples

* Update NPU quickstart accuracy tuning with low-bit optimizations

* Add save/load section to quickstart

* Update CPP example sample output to EN

* Add installation regarding cmake for CPP examples

* Small fix

* Small fix

* Small fix

* Small fix

* Small fix

* Small fix

* Unify max prompt length to 512

* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4

* Update based on comments

* Small fix
2025-01-07 13:52:41 +08:00
SONG Ge
550fa01649
[Doc] Update ipex-llm ollama troubleshooting for v0.4.6 (#12642)
* update ollama v0.4.6 troubleshooting

* update chinese ollama-doc
2025-01-02 17:28:54 +08:00
Yishuo Wang
2d08155513
remove bmm, which is only required in ipex 2.0 (#12630) 2024-12-27 17:28:57 +08:00
binbin Deng
796ee571a5
[NPU doc] Update verified platforms (#12621) 2024-12-26 17:39:13 +08:00
Mingqi Hu
0477fe6480
[docs] Update doc for latest open webui: 0.4.8 (#12591)
* Update open webui doc

* Resolve comments
2024-12-26 09:18:20 +08:00
binbin Deng
4e7e988f70
[NPU] Fix MTL and ARL support (#12580) 2024-12-19 16:55:30 +08:00
SONG Ge
28e81fda8e
Replace runner doc in ollama quickstart (#12575) 2024-12-18 19:05:28 +08:00
SONG Ge
f7a2bd21cf
Update ollama and llama.cpp readme (#12574) 2024-12-18 17:33:20 +08:00
binbin Deng
694d14b2b4
[NPU doc] Add ARL runtime configuration (#12562) 2024-12-17 16:08:42 +08:00
Yuwen Hu
d127a8654c
Small typo fixes (#12558) 2024-12-17 13:54:13 +08:00
binbin Deng
680ea7e4a8
[NPU doc] Update configuration for different platforms (#12554) 2024-12-17 10:15:09 +08:00
binbin Deng
caf15cc5ef
[NPU] Add IPEX_LLM_NPU_MTL to enable support on mtl (#12543) 2024-12-13 17:01:13 +08:00
SONG Ge
5402fc65c8
[Ollama] Update ipex-llm ollama readme to v0.4.6 (#12542)
* Update ipex-llm ollama readme to v0.4.6
2024-12-13 16:26:12 +08:00
Yuwen Hu
b747f3f6b8
Small fix to GPU installation guide (#12536) 2024-12-13 10:02:47 +08:00
binbin Deng
6fc27da9c1
[NPU] Update glm-edge support in docs (#12529) 2024-12-12 11:14:09 +08:00
Jinhe
5e1416c9aa
fix readme for npu cpp examples and llama.cpp (#12505)
* fix cpp readme

* fix cpp readme

* fix cpp readme
2024-12-05 12:32:42 +08:00
joan726
ae9c2154f4
Added cross-links (#12494)
* Update install_linux_gpu.zh-CN.md

Add the link for guide of windows installation.

* Update install_windows_gpu.zh-CN.md

Add the link for guide of linux installation.

* Update install_windows_gpu.md

Add the link for guide of Linux installation.

* Update install_linux_gpu.md

Add the link for guide of Windows installation.

* Update install_linux_gpu.md

Modify based on comments.

* Update install_windows_gpu.md

Modify based on comments
2024-12-04 16:53:13 +08:00
Yuwen Hu
aee9acb303
Add NPU QuickStart & update example links (#12470)
* Add initial NPU quickstart (c++ part unfinished)

* Small update

* Update based on comments

* Update main readme

* Remove LLaMA description

* Small fix

* Small fix

* Remove subsection link in main README

* Small fix

* Update based on comments

* Small fix

* TOC update and other small fixes

* Update for Chinese main readme

* Update based on comments and other small fixes

* Change order
2024-12-02 17:03:10 +08:00
Yuwen Hu
a2272b70d3
Small fix in llama.cpp troubleshooting guide (#12457) 2024-11-27 19:22:11 +08:00
Chu,Youcheng
acd77d9e87
Remove env variable BIGDL_LLM_XMX_DISABLED in documentation (#12445)
* fix: remove BIGDL_LLM_XMX_DISABLED in mddocs

* fix: remove set SYCL_CACHE_PERSISTENT=1 in example

* fix: remove BIGDL_LLM_XMX_DISABLED in workflows

* fix: merge igpu and A-series Graphics

* fix: remove set BIGDL_LLM_XMX_DISABLED=1 in example

* fix: remove BIGDL_LLM_XMX_DISABLED in workflows

* fix: merge igpu and A-series Graphics

* fix: textual adjustment

* fix: textual adjustment

* fix: textual adjustment
2024-11-27 11:16:36 +08:00
Jun Wang
cb7b08948b
update vllm-docker-quick-start for vllm0.6.2 (#12392)
* update vllm-docker-quick-start for vllm0.6.2

* [UPDATE] rm max-num-seqs parameter in vllm-serving script
2024-11-27 08:47:03 +08:00
joan726
a9cb70a71c
Add install_windows_gpu.zh-CN.md and install_linux_gpu.zh-CN.md (#12409)
* Add install_linux_gpu.zh-CN.md

* Add install_windows_gpu.zh-CN.md

* Update llama_cpp_quickstart.zh-CN.md

Related links updated to zh-CN version.

* Update install_linux_gpu.zh-CN.md

Added link to English version.

* Update install_windows_gpu.zh-CN.md

Add the link to English version.

* Update install_windows_gpu.md

Add the link to CN version.

* Update install_linux_gpu.md

Add the link to CN version.

* Update README.zh-CN.md

Modified the related link to zh-CN version.
2024-11-19 14:39:53 +08:00
Yuwen Hu
d1cde7fac4
Tiny doc fix (#12405) 2024-11-15 10:28:38 +08:00
Xu, Shuo
6726b198fd
Update readme & doc for the vllm upgrade to v0.6.2 (#12399)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-11-14 10:28:15 +08:00
Jun Wang
4376fdee62
Decouple the openwebui and the ollama in inference-cpp-xpu dockerfile (#12382)
* remove the openwebui in inference-cpp-xpu dockerfile

* update docker_cpp_xpu_quickstart.md

* add sample output in inference-cpp/readme

* remove the openwebui in main readme

* remove the openwebui in main readme
2024-11-12 20:15:23 +08:00
Shaojun Liu
fad15c8ca0
Update fastchat demo script (#12367)
* Update README.md

* Update vllm_docker_quickstart.md
2024-11-08 15:42:17 +08:00
Xin Qiu
7ef7696956
update linux installation doc (#12365)
* update linux doc

* update
2024-11-08 09:44:58 +08:00
Xin Qiu
520af4e9b5
Update install_linux_gpu.md (#12353) 2024-11-07 16:08:01 +08:00
Jinhe
71ea539351
Add troubleshootings for ollama and llama.cpp (#12358)
* add ollama troubleshoot en

* zh ollama troubleshoot

* llamacpp trouble shoot

* llamacpp trouble shoot

* fix

* save gpu memory
2024-11-07 15:49:20 +08:00
Xu, Shuo
ce0c6ae423
Update Readme for FastChat docker demo (#12354)
* update Readme for FastChat docker demo

* update readme

* add 'Serving with FastChat' part in docs

* polish docs

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-11-07 15:22:42 +08:00
Jin, Qiao
3df6195cb0
Fix application quickstart (#12305)
* fix graphrag quickstart

* fix axolotl quickstart

* fix ragflow quickstart

* fix ragflow quickstart

* fix graphrag toc

* fix comments

* fix comment

* fix comments
2024-10-31 16:57:35 +08:00
joan726
0bbc04b5ec
Add ollama_quickstart.zh-CN.md (#12284)
* Add ollama_quickstart.zh-CN.md

Add ollama_quickstart.zh-CN.md

* Update ollama_quickstart.zh-CN.md

Add Chinese and English switching

* Update ollama_quickstart.md

Add Chinese and English switching

* Update README.zh-CN.md

Modify the related link to ollama_quickstart.zh-CN.md

* Update ollama_quickstart.zh-CN.md

Modified based on comments.

* Update ollama_quickstart.zh-CN.md

Modified based on comments
2024-10-29 15:12:44 +08:00
Yuwen Hu
42a528ded9
Small update to MTL iGPU Linux Prerequisites installation guide (#12281)
* Small update MTL iGPU Linux Prerequisites installation guide

* Small fix
2024-10-28 14:12:07 +08:00
Yuwen Hu
16074ae2a4
Update Linux prerequisites installation guide for MTL iGPU (#12263)
* Update Linux prerequisites installation guide for MTL iGPU

* Further link update

* Small fixes

* Small fix

* Update based on comments

* Small fix

* Make oneAPI installation a shared section for both MTL iGPU and other GPU

* Small fix

* Small fix

* Clarify description
2024-10-28 09:27:14 +08:00
Yuwen Hu
94c4568988
Update windows installation guide regarding troubleshooting (#12270) 2024-10-25 14:32:38 +08:00
joan726
e0a95eb2d6
Add llama_cpp_quickstart.zh-CN.md (#12221) 2024-10-24 16:08:31 +08:00
Jun Wang
aedc4edfba
[ADD] add open webui + vllm serving (#12246) 2024-10-23 10:13:14 +08:00
Jun Wang
fe3b5cd89b
[Update] mmdocs/dockerguide vllm-quick-start awq,gptq online serving document (#12227)
* [FIX] fix the docker start script error

* [ADD] add awq online serving doc

* [ADD] add gptq online serving doc

* [Fix] small fix
2024-10-18 09:46:59 +08:00
Yuwen Hu
a768d71581
Small fix to LNL installation guide (#12192) 2024-10-14 12:03:03 +08:00
Shaojun Liu
49eb20613a
add --blocksize to doc and script (#12187) 2024-10-12 09:17:42 +08:00
Jun Wang
6ffaec66a2
[UPDATE] add prefix caching document into vllm_docker_quickstart.md (#12173)
* [ADD] rewrite new vllm docker quick start

* [ADD] lora adapter doc finished

* [ADD] multi lora adapter tested successfully

* [ADD] add ipex-llm quantization doc

* [Merge] rebase main

* [REMOVE] rm tmp file

* [Merge] rebase main

* [ADD] add prefix caching experiment and result

* [REMOVE] rm cpu offloading chapter
2024-10-11 19:12:22 +08:00
Yuwen Hu
ddcdf47539
Support Windows ARL release (#12183)
* Support release for ARL

* Small fix

* Small fix to doc

* Temp for test

* Remove temp commit for test
2024-10-11 18:30:52 +08:00
Yuwen Hu
ac44e98b7d
Update Windows guide regarding LNL support (#12178)
* Update windows guide regarding LNL support

* Update based on comments
2024-10-11 09:20:08 +08:00
Guancheng Fu
0ef7e1d101
fix vllm docs (#12176) 2024-10-10 15:44:36 +08:00
Jun Wang
412cf8e20c
[UPDATE] update mddocs/DockerGuides/vllm_docker_quickstart.md (#12166)
* [ADD] rewrite new vllm docker quick start

* [ADD] lora adapter doc finished

* [ADD] multi lora adapter tested successfully

* [ADD] add ipex-llm quantization doc

* [UPDATE] update mddocs vllm_docker_quickstart content

* [REMOVE] rm tmp file

* [UPDATE] tp and pp explanation and readthedoc link change

* [FIX] fix the error description of tp+pp and quantization part

* [FIX] fix the table of verified models

* [UPDATE] add full low-bit parameter list

* [UPDATE] update the load_in_low_bit params to verified dtype
2024-10-09 11:19:32 +08:00
Shaojun Liu
e2ef9e938e
Delete deprecated docs/readthedocs directory (#12164) 2024-10-08 14:48:02 +08:00
Ch1y0q
9b75806d14
Update Windows GPU quickstart regarding demo (#12124)
* use Qwen2-1.5B-Instruct in demo

* update

* add reference link

* update

* update
2024-09-29 18:08:49 +08:00
Ruonan Wang
a767438546
fix typo (#12076)
* fix typo

* fix
2024-09-13 11:44:42 +08:00
Ruonan Wang
3f0b24ae2b
update cpp quickstart (#12075)
* update cpp quickstart

* fix style
2024-09-13 11:35:32 +08:00
Ruonan Wang
48d9092b5a
upgrade OneAPI version for cpp Windows (#12063)
* update version

* update quickstart
2024-09-12 11:12:12 +08:00
Shaojun Liu
e5581e6ded
Select the Appropriate APT Repository Based on CPU Type (#12023) 2024-09-05 17:06:07 +08:00
Yuwen Hu
643458d8f0
Update GraphRAG QuickStart (#11995)
* Update GraphRAG QuickStart

* Further updates

* Small fixes

* Small fix
2024-09-03 15:52:08 +08:00
Jinhe
e895e1b4c5
modification on llamacpp readme after Ipex-llm latest update (#11971)
* update on readme after ipex-llm update

* update on readme after ipex-llm update

* rebase & delete redundancy

* revise

* add numbers for troubleshooting
2024-08-30 11:36:45 +08:00
Ch1y0q
77b04efcc5
add notes for SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS (#11936)
* add notes for `SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS`

* also update other quickstart
2024-08-30 09:26:47 +08:00
Jinhe
6fc9340d53
restore ollama webui quickstart (#11955) 2024-08-29 17:53:19 +08:00
Jinhe
ec67ee7177
added accelerate version specification in open webui quickstart(#11948) 2024-08-28 15:02:39 +08:00
Ruonan Wang
460bc96d32
update version of llama.cpp / ollama (#11930)
* update version

* fix version
2024-08-27 21:21:44 +08:00
Ch1y0q
5a8fc1baa2
update troubleshooting for llama.cpp and ollama (#11890)
* update troubleshooting for llama.cpp and ollama

* update

* update
2024-08-26 20:55:23 +08:00
Jinhe
dbd14251dd
Troubleshoot for sycl not found (#11774)
* added troubleshoot for sycl not found problem

* added troubleshoot for sycl not found problem

* revision on troubleshoot

* revision on troubleshoot
2024-08-14 10:26:01 +08:00
Shaojun Liu
fac4c01a6e
Revert to use out-of-tree GPU driver (#11761)
* Revert to use out-of-tree GPU driver since the performance with out-of-tree driver is better than upstream's

* add spaces

* add troubleshooting case

* update Troubleshooting
2024-08-12 13:41:47 +08:00
Yuwen Hu
7e61fa1af7
Revise GPU driver related guide for Windows users (#11740) 2024-08-08 11:26:26 +08:00
Jinhe
d0c89fb715
updated llama.cpp and ollama quickstart (#11732)
* updated llama.cpp and ollama quickstart.md

* added qwen2-1.5B sample output

* revision on quickstart updates

* revision on quickstart updates

* revision on qwen2 readme

* added 2 troubleshoots

* troubleshoot revision
2024-08-08 11:04:01 +08:00
Qiyuan Gong
e32d13d78c
Remove Out of tree Driver from GPU driver installation document (#11728)
GPU drivers are already upstreamed to Kernel 6.2+. Remove the out-of-tree driver (intel-i915-dkms) for 6.2-6.5. https://dgpu-docs.intel.com/driver/kernel-driver-types.html#gpu-driver-support
* Remove intel-i915-dkms intel-fw-gpu (only for kernel 5.19)
2024-08-07 09:38:19 +08:00
Jason Dai
418640e466
Update install_gpu.md 2024-07-27 08:30:10 +08:00
Ruonan Wang
ac97b31664
update cpp quickstart about ONEAPI_DEVICE_SELECTOR (#11630)
* update

* update

* small fix
2024-07-22 13:40:28 +08:00
Yuwen Hu
af6d406178
Add section title for conduct graphrag indexing (#11628) 2024-07-22 10:23:26 +08:00