Commit graph

404 commits

Author SHA1 Message Date
Xiangyu Tian
51b41faad7
vLLM: update vLLM XPU to 0.8.3 version (#13118)
2025-04-30 14:40:53 +08:00
Guancheng Fu
d222eaffd7
Update README.md (#13113) 2025-04-27 17:13:18 +08:00
Guancheng Fu
0cfdd399e7
Update README.md (#13104) 2025-04-24 10:21:17 +08:00
Guancheng Fu
14cd613fe1
Update vLLM docs with some new features (#13092)
* done

* fix

* done

* Update README.md
2025-04-22 14:39:28 +08:00
Yuwen Hu
0801d27a6f
Remove PyTorch 2.3 support for Intel GPU (#13097)
* Remove PyTorch 2.3 installation option for GPU

* Remove xpu_lnl option in installation guides for docs

* Update BMG quickstart

* Remove PyTorch 2.3 dependencies for GPU examples

* Update the graphmode example to use stable version 2.2.0

* Fix based on comments
2025-04-22 10:26:16 +08:00
Ruonan Wang
27d669210f
remove fschat in EAGLE example (#13005)
* update fschat version

* fix
2025-03-25 15:48:48 +08:00
Heyang Sun
cd109bb061
Gemma QLoRA example (#12969)
* Gemma QLoRA example

* Update README.md

* Update README.md

---------

Co-authored-by: sgwhat <ge.song@intel.com>
2025-03-14 14:27:51 +08:00
Yuwen Hu
7c0c77cce3
Tiny fixes (#12936) 2025-03-05 14:55:26 +08:00
Yuwen Hu
68a770745b
Add moonlight GPU example (#12929)
* Add moonlight GPU example and update table

* Small fix

* Fix based on comments

* Small fix
2025-03-05 11:31:14 +08:00
Yuwen Hu
443cb5d4e0
Update Janus-Pro GPU example (#12906) 2025-02-28 15:39:03 +08:00
Xin Qiu
e946127613
glm 4v 1st sdp for vision (#12904)
* glm4v 1st sdp

* update glm4v example

* meet code review

* fix style
2025-02-28 13:23:27 +08:00
Guancheng Fu
02ec313eab
Update README.md (#12877) 2025-02-24 09:59:17 +08:00
Xu, Shuo
1e00bed001
Add GPU example for Janus-Pro (#12869)
* Add example for Janus-Pro

* Update model link

* Fixes

* Fixes

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-02-21 18:36:50 +08:00
Guancheng Fu
4eed0c7d99
initial implementation for low_bit_loader vLLM (#12838)
* initial

* add logic for handling tensor parallel models

* fix

* Add some comments

* add doc

* fix done
2025-02-19 19:45:34 +08:00
Xiangyu Tian
b26409d53f
R1 Hybrid: Add Benchmark for DeepSeek R1 transformers example (#12854)
* init

* fix

* update

* update

* fix

* fix
2025-02-19 18:33:21 +08:00
Xiangyu Tian
93c10be762
LLM: Support hybrid convert for DeepSeek V3/R1 (#12834)
2025-02-19 11:31:19 +08:00
Xiangyu Tian
09150b6058
Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 (#12832)
Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 with DeepseekV3Attention
and DeepseekV3MLP to XPU
2025-02-18 13:34:14 +08:00
Yishuo Wang
8aea5319bb
update more lora example (#12785) 2025-02-08 09:46:48 +08:00
Yishuo Wang
d0d9c9d636
remove load_in_8bit usage as it has not been supported for a long time (#12779) 2025-02-07 11:21:29 +08:00
Yishuo Wang
b4c9e23f73
fix galore and peft finetune example (#12776) 2025-02-06 16:36:13 +08:00
Yishuo Wang
c0d6b282b8
fix lisa finetune example (#12775) 2025-02-06 16:35:43 +08:00
Yishuo Wang
2e5f2e5dda
fix dpo finetune (#12774) 2025-02-06 16:35:21 +08:00
Yishuo Wang
9697197f3e
fix qlora finetune example (#12769) 2025-02-06 11:18:28 +08:00
Yuwen Hu
184adb2653
Small fix to MiniCPM-o-2_6 GPU example (#12766) 2025-02-05 11:32:26 +08:00
Yuwen Hu
d11f257ee7
Add GPU example for MiniCPM-o-2_6 (#12735)
* Add init example for omni mode

* Small fix

* Small fix

* Add chat example

* Remove legacy link

* Further update link

* Add readme

* Small fix

* Update main readme link

* Update based on comments

* Small fix

* Small fix

* Small fix
2025-01-23 16:10:19 +08:00
Yuwen Hu
c52bdff76b
Update Deepseek coder GPU example (#12712)
* Update Deepseek coder GPU example

* Fix based on comment
2025-01-16 14:05:31 +08:00
Xu, Shuo
350fae285d
Add Qwen2-VL HF GPU example with ModelScope Support (#12606)
* Add qwen2-vl example

* complete generate.py & readme

* improve lint style

* update 1-6

* update main readme

* Format and other small fixes

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-01-13 15:42:04 +08:00
Xu, Shuo
62318964fa
Update llama example information (#12640)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2025-01-02 13:48:39 +08:00
Yishuo Wang
c72a5db757
remove unused code again (#12624) 2024-12-27 14:17:11 +08:00
Xu, Shuo
55ce091242
Add GLM4-Edge-V GPU example (#12596)
* Add GLM4-Edge-V examples

* polish readme

* revert wrong changes

* polish readme

* polish readme

* little polish in reference info and indent

* Small fix and sample output updates

* Update main readme

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-12-27 09:40:29 +08:00
Xu, Shuo
ef585d3360
Polish Readme for ModelScope-related examples (#12603) 2024-12-26 10:52:47 +08:00
Xu, Shuo
b0338c5529
Add --modelscope option for glm-v4 MiniCPM-V-2_6 glm-edge and internvl2 (#12583)
* Add --modelscope option for glm-v4 and MiniCPM-V-2_6

* glm-edge

* minicpm-v-2_6:don't use model_hub=modelscope when use lowbit; internvl2

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-20 13:54:17 +08:00
Xu, Shuo
47da3c999f
Add --modelscope in GPU examples for minicpm, minicpm3, baichuan2 (#12564)
* Add --modelscope for more models

* minicpm

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-19 17:25:46 +08:00
Xu, Shuo
47e90a362f
Add --modelscope in GPU examples for glm4, codegeex2, qwen2 and qwen2.5 (#12561)
* Add --modelscope for more models

* improve readme

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-19 10:00:39 +08:00
Xu, Shuo
ccc18eefb5
Add Modelscope option for chatglm3 on GPU (#12545)
* Add Modelscope option for GPU model chatglm3

* Update readme

* Update readme

* Update readme

* Update readme

* format update

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-16 20:00:37 +08:00
Chu,Youcheng
a86487c539
Add GLM-Edge GPU example (#12483)
* feat: initial commit

* generate.py and README updates

* Update link for main readme

* Update based on comments

* Small fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-12-16 14:39:19 +08:00
Jun Wang
0b953e61ef
[REFINE] graphmode code (#12540) 2024-12-16 09:17:01 +08:00
Heyang Sun
fa261b8af1
torch 2.3 inference docker (#12517)
* torch 2.3 inference docker

* Update README.md

* add convert code

* rename image

* remove 2.1 and add graph example

* Update README.md
2024-12-13 10:47:04 +08:00
Chu,Youcheng
ce6fcaa9ba
update transformers version in example of glm4 (#12453)
* fix: update transformers version in example of glm4

* fix: textual adjustments

* fix: textual adjustment
2024-11-27 15:02:25 +08:00
Yuwen Hu
effb9bb41c
Small update to LangChain examples readme (#12452) 2024-11-27 14:02:25 +08:00
Chu,Youcheng
acd77d9e87
Remove env variable BIGDL_LLM_XMX_DISABLED in documentation (#12445)
* fix: remove BIGDL_LLM_XMX_DISABLED in mddocs

* fix: remove set SYCL_CACHE_PERSISTENT=1 in example

* fix: remove BIGDL_LLM_XMX_DISABLED in workflows

* fix: merge igpu and A-series Graphics

* fix: remove set BIGDL_LLM_XMX_DISABLED=1 in example

* fix: remove BIGDL_LLM_XMX_DISABLED in workflows

* fix: merge igpu and A-series Graphics

* fix: textual adjustment

* fix: textual adjustment

* fix: textual adjustment
2024-11-27 11:16:36 +08:00
Jin, Qiao
c2efa264d9
Update LangChain examples to use upstream (#12388)
* Update LangChain examples to use upstream

* Update README and fix links

* Update LangChain CPU examples to use upstream

* Update LangChain CPU voice_assistant example

* Update CPU README

* Update GPU README

* Remove GPU Langchain vLLM example and fix comments

* Change langchain -> LangChain

* Add reference for both upstream llms and embeddings

* Fix comments

* Fix comments

* Fix comments

* Fix comments

* Fix comment
2024-11-26 16:43:15 +08:00
Jinhe
66bd7abae4
add sdxl and lora-lcm optimization (#12444)
* add sdxl and lora-lcm optimization

* fix openjourney speed drop
2024-11-26 11:38:09 +08:00
Jinhe
7e0a840f74
add optimization to openjourney (#12423)
* add optimization to openjourney

* add optimization to openjourney
2024-11-21 15:23:51 +08:00
Jinhe
d2a37b6ab2
add Stable diffusion examples (#12418)
* add openjourney example

* add timing

* add stable diffusion to model page

* 4.1 fix

* small fix
2024-11-20 17:18:36 +08:00
Qiyuan Gong
7e50ff113c
Add padding_token=eos_token for GPU trl QLora example (#12398)
* Avoid tokenizer doesn't have a padding token error.
2024-11-14 10:51:30 +08:00
Guancheng Fu
0ee54fc55f
Upgrade to vllm 0.6.2 (#12338)
* Initial updates for vllm 0.6.2

* fix

* Change Dockerfile to support v062

* Fix

* fix examples

* Fix

* done

* fix

* Update engine.py

* Fix Dockerfile to original path

* fix

* add option

* fix

* fix

* fix

* fix

---------

Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
2024-11-12 20:35:34 +08:00
Qiyuan Gong
2dfcc36825
Fix trl version and padding in trl qlora example (#12368)
* Change trl to 0.9.6
* Enable padding to avoid padding related errors.
2024-11-08 16:05:17 +08:00
Jin, Qiao
82a61b5cf3
Limit trl version in example (#12332)
* Limit trl version in example

* Limit trl version in example
2024-11-05 14:50:10 +08:00
Zijie Li
cd5e22cee5
Update Llava GPU Example (#12311)
* update-llava-example

* add warmup

* small fix on llava example

* remove space & extra print prompt

* renew example

* small fix

---------
Co-authored-by: Jinhe Tang <jin.tang1337@gmail.com>
2024-11-01 17:06:00 +08:00