Yuwen Hu
7c0c77cce3
Tiny fixes ( #12936 )
2025-03-05 14:55:26 +08:00
Yuwen Hu
68a770745b
Add moonlight GPU example ( #12929 )
* Add moonlight GPU example and update table
* Small fix
* Fix based on comments
* Small fix
2025-03-05 11:31:14 +08:00
Yuwen Hu
443cb5d4e0
Update Janus-Pro GPU example ( #12906 )
2025-02-28 15:39:03 +08:00
Xin Qiu
e946127613
glm 4v 1st sdp for vision ( #12904 )
* glm4v 1st sdp
* update glm4v example
* meet code review
* fix style
2025-02-28 13:23:27 +08:00
Guancheng Fu
02ec313eab
Update README.md ( #12877 )
2025-02-24 09:59:17 +08:00
Xu, Shuo
1e00bed001
Add GPU example for Janus-Pro ( #12869 )
* Add example for Janus-Pro
* Update model link
* Fixes
* Fixes
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-02-21 18:36:50 +08:00
binbin Deng
8077850452
[NPU GGUF] Add simple example ( #12853 )
2025-02-21 09:58:00 +08:00
Guancheng Fu
4eed0c7d99
initial implementation for low_bit_loader vLLM ( #12838 )
* initial
* add logic for handling tensor parallel models
* fix
* Add some comments
* add doc
* fix done
2025-02-19 19:45:34 +08:00
Xiangyu Tian
b26409d53f
R1 Hybrid: Add Benchmark for DeepSeek R1 transformers example ( #12854 )
* init
* fix
* update
* update
* fix
* fix
2025-02-19 18:33:21 +08:00
Xiangyu Tian
93c10be762
LLM: Support hybrid convert for DeepSeek V3/R1 ( #12834 )
2025-02-19 11:31:19 +08:00
Xiangyu Tian
09150b6058
Initiate CPU-XPU Hybrid Inference for DeepSeek-R1 ( #12832 )
Initiate CPU-XPU Hybrid Inference for DeepSeek-R1, offloading DeepseekV3Attention
and DeepseekV3MLP to XPU
2025-02-18 13:34:14 +08:00
Xiangyu Tian
09ed96082b
Add DeepSeek V3/R1 CPU example ( #12836 )
Add DeepSeek V3/R1 CPU example for bf16 model
2025-02-18 12:45:49 +08:00
Yina Chen
eb2df5ed70
common.h -> npu/npu_common.h ( #12800 )
2025-02-10 14:38:22 +08:00
binbin Deng
3fee838b14
[NPU] Fix of c++ convert example ( #12797 )
2025-02-10 11:17:58 +08:00
Kai Huang
468d3f22fc
Rename NPU public example to llm-cli ( #12790 )
* rename to llm-cli
* update readme
2025-02-08 10:19:59 +08:00
Ruonan Wang
e90a9ad196
[NPU] Support non-const parameter for decoder layers when keep_ir=True ( #12789 )
* support layernorm=False for decoder layers
* rename to meet review
* fix style
* rename to const_parameter
* fix rebase error
* fix rebase error
2025-02-08 09:58:42 +08:00
Yishuo Wang
8aea5319bb
update more lora example ( #12785 )
2025-02-08 09:46:48 +08:00
binbin Deng
6ff7faa781
[NPU] Update deepseek support in python examples and quickstart ( #12786 )
2025-02-07 11:25:16 +08:00
Ruonan Wang
b4f2be2b09
[NPU] Update C++ example to add DeepSeek-R1 ( #12787 )
2025-02-07 11:23:34 +08:00
Yishuo Wang
d0d9c9d636
remove load_in_8bit usage as it has not been supported for a long time ( #12779 )
2025-02-07 11:21:29 +08:00
Yishuo Wang
b4c9e23f73
fix galore and peft finetune example ( #12776 )
2025-02-06 16:36:13 +08:00
Yishuo Wang
c0d6b282b8
fix lisa finetune example ( #12775 )
2025-02-06 16:35:43 +08:00
Yishuo Wang
2e5f2e5dda
fix dpo finetune ( #12774 )
2025-02-06 16:35:21 +08:00
Yishuo Wang
9697197f3e
fix qlora finetune example ( #12769 )
2025-02-06 11:18:28 +08:00
Ruonan Wang
094a25b740
[NPU] Expose parameter to control blob / IR save logic ( #12767 )
* update api
* fix convert.py
* fix style
* remove unnecessary bin file
* fix style
2025-02-06 10:07:45 +08:00
Yuwen Hu
184adb2653
Small fix to MiniCPM-o-2_6 GPU example ( #12766 )
2025-02-05 11:32:26 +08:00
Yuwen Hu
d11f257ee7
Add GPU example for MiniCPM-o-2_6 ( #12735 )
* Add init example for omni mode
* Small fix
* Small fix
* Add chat example
* Remove legacy link
* Further update link
* Add readme
* Small fix
* Update main readme link
* Update based on comments
* Small fix
* Small fix
* Small fix
2025-01-23 16:10:19 +08:00
Ruonan Wang
78cca0a68c
[NPU] update llm-npu-cli example ( #12729 )
* update cli example
* add license
* rename
* update readme sample output
2025-01-22 09:59:27 +08:00
Yuwen Hu
c52bdff76b
Update Deepseek coder GPU example ( #12712 )
* Update Deepseek coder GPU example
* Fix based on comment
2025-01-16 14:05:31 +08:00
Xu, Shuo
350fae285d
Add Qwen2-VL HF GPU example with ModelScope Support ( #12606 )
* Add qwen2-vl example
* complete generate.py & readme
* improve lint style
* update 1-6
* update main readme
* Format and other small fixes
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2025-01-13 15:42:04 +08:00
Yuwen Hu
525b0ee991
[NPU] Tiny fixes on examples ( #12661 )
2025-01-07 14:30:38 +08:00
Yuwen Hu
381d448ee2
[NPU] Example & Quickstart updates ( #12650 )
* Remove model with optimize_model=False in NPU verified models tables, and remove related example
* Remove experimental in run optimized model section title
* Unify model table order & example cmd
* Move embedding example to separate folder & update quickstart example link
* Add Quickstart reference in main NPU readme
* Small fix
* Small fix
* Move save/load examples under NPU/HF-Transformers-AutoModels
* Add low-bit and polish arguments for LLM Python examples
* Small fix
* Add low-bit and polish arguments for Multi-Model examples
* Polish argument for Embedding models
* Polish argument for LLM CPP examples
* Add low-bit and polish argument for Save-Load examples
* Add accuracy tuning tips for examples
* Update NPU quickstart accuracy tuning with low-bit optimizations
* Add save/load section to quickstart
* Update CPP example sample output to EN
* Add installation regarding cmake for CPP examples
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Unify max prompt length to 512
* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4
* Update based on comments
* Small fix
2025-01-07 13:52:41 +08:00
binbin Deng
0b377100c5
Add guide for save-load usage ( #12498 )
2025-01-03 16:30:15 +08:00
Xu, Shuo
62318964fa
Update llama example information ( #12640 )
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2025-01-02 13:48:39 +08:00
Yishuo Wang
c72a5db757
remove unused code again ( #12624 )
2024-12-27 14:17:11 +08:00
Ruonan Wang
90f6709486
remove pipeline examples ( #12626 )
2024-12-27 13:42:28 +08:00
Zijie Li
5f04ed7254
[NPU] Update prompt format for baichuan2-pipeline ( #12625 )
2024-12-27 11:30:54 +08:00
Xu, Shuo
55ce091242
Add GLM4-Edge-V GPU example ( #12596 )
* Add GLM4-Edge-V examples
* polish readme
* revert wrong changes
* polish readme
* polish readme
* little polish in reference info and indent
* Small fix and sample output updates
* Update main readme
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-12-27 09:40:29 +08:00
binbin Deng
796ee571a5
[NPU doc] Update verified platforms ( #12621 )
2024-12-26 17:39:13 +08:00
Zijie Li
ccc4055058
[NPU] Update prompt format for baichuan2 ( #12615 )
* Update baichuan2.py
* style fix
2024-12-26 11:41:37 +08:00
Ruonan Wang
d841e1dc0d
[NPU] update convert script based on latest usage ( #12617 )
2024-12-26 11:23:04 +08:00
Xu, Shuo
ef585d3360
Polish Readme for ModelScope-related examples ( #12603 )
2024-12-26 10:52:47 +08:00
Xu, Shuo
b0338c5529
Add --modelscope option for glm-v4 MiniCPM-V-2_6 glm-edge and internvl2 ( #12583 )
* Add --modelscope option for glm-v4 and MiniCPM-V-2_6
* glm-edge
* minicpm-v-2_6: don't use model_hub=modelscope with lowbit; internvl2
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-20 13:54:17 +08:00
Xu, Shuo
47da3c999f
Add --modelscope in GPU examples for minicpm, minicpm3, baichuan2 ( #12564 )
* Add --modelscope for more models
* minicpm
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-19 17:25:46 +08:00
Xu, Shuo
47e90a362f
Add --modelscope in GPU examples for glm4, codegeex2, qwen2 and qwen2.5 ( #12561 )
* Add --modelscope for more models
* improve readme
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-19 10:00:39 +08:00
binbin Deng
680ea7e4a8
[NPU doc] Update configuration for different platforms ( #12554 )
2024-12-17 10:15:09 +08:00
Xu, Shuo
ccc18eefb5
Add Modelscope option for chatglm3 on GPU ( #12545 )
* Add Modelscope option for GPU model chatglm3
* Update readme
* Update readme
* Update readme
* Update readme
* format update
---------
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-12-16 20:00:37 +08:00
Chu,Youcheng
a86487c539
Add GLM-Edge GPU example ( #12483 )
* feat: initial commit
* generate.py and README updates
* Update link for main readme
* Update based on comments
* Small fix
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-12-16 14:39:19 +08:00
Jun Wang
0b953e61ef
[REFINE] graphmode code ( #12540 )
2024-12-16 09:17:01 +08:00
binbin Deng
caf15cc5ef
[NPU] Add IPEX_LLM_NPU_MTL to enable support on mtl ( #12543 )
2024-12-13 17:01:13 +08:00