214 commits

Author SHA1 Message Date
Jason Dai
b229e5ad60
Update README.md (#13258) 2025-07-18 07:27:01 +08:00
Jason Dai
086a8b3ab9
Update flashmoe_quickstart (#13154) 2025-05-13 07:56:09 +08:00
Jason Dai
9da1c56fa8
Create flashmoe quickstart (#13147) 2025-05-12 10:11:22 +08:00
Yuwen Hu
0438e39f3e
Add PyTorch 2.6 support in Latest Update (#13144) 2025-05-09 13:26:49 +08:00
Jason Dai
6b033f8982
Update readme (#13116) 2025-04-27 18:18:19 +08:00
Yina Chen
a2a35fdfad
Update portable zip link (#13098)
* update portable zip link

* update CN

* address comments

* update latest updates

* revert
2025-04-21 17:25:35 +08:00
Yuwen Hu
cd0d4857b8
ipex-llm 2.2.0 post-release update (#13053)
* Update ollama/llama.cpp release link to 2.2.0 (#13052)

* Post-update for releasing ipex-llm 2.2.0
2025-04-07 17:41:22 +08:00
Jason Dai
03c9024209
Update README (#12973) 2025-03-14 19:04:10 +08:00
Jason Dai
2a8f624f4b
Update README (#12956) 2025-03-09 09:04:13 +08:00
Jason Dai
975cf5f21f
Update README.md (#12939) 2025-03-06 08:04:27 +08:00
Yuwen Hu
68a770745b
Add moonlight GPU example (#12929)
* Add moonlight GPU example and update table

* Small fix

* Fix based on comments

* Small fix
2025-03-05 11:31:14 +08:00
Jason Dai
69edc8b6f6
Update quickstart (#12927) 2025-03-04 15:34:52 +08:00
Jason Dai
35e5fa851c
Update README.md (#12911) 2025-02-28 17:55:45 +08:00
Jason Dai
ad65e2b03a
Update README.md (#12900) 2025-02-27 08:30:06 +08:00
Yuwen Hu
06694ba61a
Further fix portable zip file link (#12885) 2025-02-24 18:06:57 +08:00
Xu, Shuo
1e00bed001
Add GPU example for Janus-Pro (#12869)
* Add example for Janus-Pro

* Update model link

* Fixes

* Fixes

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-02-21 18:36:50 +08:00
Jason Dai
38a682adb1
Update Readme (#12855) 2025-02-19 19:55:29 +08:00
Jason Dai
eaec64baca
Update README.md (#12826) 2025-02-14 21:20:57 +08:00
Jason Dai
16e63cbc18
Update readme (#12820) 2025-02-13 14:26:04 +08:00
Jason Dai
9c0daf6396
Fix readme links (#12771) 2025-02-05 19:24:25 +08:00
Jason Dai
a1e7bfc638
Update Readme (#12770) 2025-02-05 19:19:57 +08:00
Yuwen Hu
d11f257ee7
Add GPU example for MiniCPM-o-2_6 (#12735)
* Add init example for omni mode

* Small fix

* Small fix

* Add chat example

* Remove lagecy link

* Further update link

* Add readme

* Small fix

* Update main readme link

* Update based on comments

* Small fix

* Small fix

* Small fix
2025-01-23 16:10:19 +08:00
Jason Dai
7e29edcc4b
Update Readme (#12730) 2025-01-22 08:43:32 +08:00
Jason Dai
412bfd6644
Update readme (#12724) 2025-01-21 10:59:14 +08:00
Xu, Shuo
350fae285d
Add Qwen2-VL HF GPU example with ModelScope Support (#12606)
* Add qwen2-vl example

* complete generate.py & readme

* improve lint style

* update 1-6

* update main readme

* Format and other small fixes

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-01-13 15:42:04 +08:00
Jason Dai
cbb8e2a2d5
Update documents (#12693) 2025-01-10 10:47:11 +08:00
Jason Dai
aa9e70a347
Update B580 Doc (#12678) 2025-01-08 22:36:48 +08:00
Jason Dai
c6f57ad6ed
Update README.md (#12677) 2025-01-08 21:55:52 +08:00
Jason Dai
2321e8d60c
Update README.md (#12676) 2025-01-08 21:54:31 +08:00
Yuwen Hu
381d448ee2
[NPU] Example & Quickstart updates (#12650)
* Remove model with optimize_model=False in NPU verified models tables, and remove related example

* Remove experimental in run optimized model section title

* Unify model table order & example cmd

* Move embedding example to separate folder & update quickstart example link

* Add Quickstart reference in main NPU readme

* Small fix

* Small fix

* Move save/load examples under NPU/HF-Transformers-AutoModels

* Add low-bit and polish arguments for LLM Python examples

* Small fix

* Add low-bit and polish arguments for Multi-Model examples

* Polish argument for Embedding models

* Polish argument for LLM CPP examples

* Add low-bit and polish argument for Save-Load examples

* Add accuracy tuning tips for examples

* Update NPU qucikstart accuracy tuning with low-bit optimizations

* Add save/load section to qucikstart

* Update CPP example sample output to EN

* Add installation regarding cmake for CPP examples

* Small fix

* Small fix

* Small fix

* Small fix

* Small fix

* Small fix

* Unify max prompt length to 512

* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4

* Update based on comments

* Small fix
2025-01-07 13:52:41 +08:00
Xu, Shuo
55ce091242
Add GLM4-Edge-V GPU example (#12596)
* Add GLM4-Edge-V examples

* polish readme

* revert wrong changes

* polish readme

* polish readme

* little polish in reference info and indent

* Small fix and sample output updates

* Update main readme

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-12-27 09:40:29 +08:00
SONG Ge
f7a2bd21cf
Update ollama and llama.cpp readme (#12574) 2024-12-18 17:33:20 +08:00
Jason Dai
6e801bc4e1
Update readme (#12565) 2024-12-18 09:33:16 +08:00
Chu,Youcheng
a86487c539
Add GLM-Edge GPU example (#12483)
* feat: initial commit

* generate.py and README updates

* Update link for main readme

* Update based on comments

* Small fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-12-16 14:39:19 +08:00
binbin Deng
6fc27da9c1
[NPU] Update glm-edge support in docs (#12529) 2024-12-12 11:14:09 +08:00
Yuwen Hu
60bafab855
Small fixes to main readme (#12508) 2024-12-05 16:08:43 +08:00
Jason Dai
0a3eda06d0
Update README.md (#12507) 2024-12-05 15:46:53 +08:00
Yuwen Hu
727f29968c
Add NPU demo gif to main readme (#12503)
* Add NPU demo gif to main readme

* Small fix

* Update based on comments

* Test on style fix
2024-12-05 12:24:27 +08:00
Jason Dai
80f15e41f5
Update README.md (#12489) 2024-12-03 18:02:28 +08:00
Yuwen Hu
aee9acb303
Add NPU QuickStart & update example links (#12470)
* Add initial NPU quickstart (c++ part unfinished)

* Small update

* Update based on comments

* Update main readme

* Remove LLaMA description

* Small fix

* Small fix

* Remove subsection link in main README

* Small fix

* Update based on comments

* Small fix

* TOC update and other small fixes

* Update for Chinese main readme

* Update based on comments and other small fixes

* Change order
2024-12-02 17:03:10 +08:00
Jinhe
d2a37b6ab2
add Stable diffusion examples (#12418)
* add openjourney example

* add timing

* add stable diffusion to model page

* 4.1 fix

* small fix
2024-11-20 17:18:36 +08:00
Jun Wang
4376fdee62
Decouple the openwebui and the ollama. in inference-cpp-xpu dockerfile (#12382)
* remove the openwebui in inference-cpp-xpu dockerfile

* update docker_cpp_xpu_quickstart.md

* add sample output in inference-cpp/readme

* remove the openwebui in main readme

* remove the openwebui in main readme
2024-11-12 20:15:23 +08:00
Jason Dai
1cef0c4948
Update README.md (#12286) 2024-10-28 17:06:16 +08:00
Jason Dai
a35cf4d533
Update README.md (#12242) 2024-10-22 10:19:07 +08:00
Yuwen Hu
7da3ab7322
Add missing link for Llama3.2-Vision (#12197) 2024-10-14 17:19:49 +08:00
Jinhe
f983f1a8f4
Add Qwen2-VL gpu example (#12135)
* qwen2-vl readme

* add qwen2-vl example

* fix

* fix

* fix

* add link

* Update regarding modules_to_not_convert and readme

* Further fix

* Small fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-10-11 18:25:23 +08:00
Ch1y0q
17c23cd759
add llama3.2 GPU example (#12137)
* add llama3.2 GPU example

* change prompt format reference url

* update

* add Meta-Llama-3.2-1B-Instruct sample output

* update wording
2024-09-29 14:41:54 +08:00
Ch1y0q
2ea13d502f
Add minicpm3 gpu example (#12114)
* add minicpm3 gpu example

* update GPU example

* update

---------

Co-authored-by: Huang, Xinshengzi <xinshengzi.huang@intel.com>
2024-09-26 13:51:37 +08:00
Ch1y0q
2269768e71
add internvl2 example (#12102)
* add internvl2 example

* add to README.md

* update

* add link to zh-CN readme
2024-09-20 16:31:54 +08:00
joan726
ad1fe77fe6
Add language switching (#12096) 2024-09-20 16:05:20 +08:00