binbin Deng
|
14d8d3d8af
|
Integrate NPU C++ implementation into ipex-llm (#12461)
|
2024-11-29 09:25:37 +08:00 |
|
Ruonan Wang
|
f8c2bb2943
|
[NPU] optimize qwen2 prefill performance for C++ (#12451)
|
2024-11-27 10:46:18 +08:00 |
|
Ruonan Wang
|
0e23bd779f
|
Add support of llama3.2 for NPU C++ (#12442)
* initial support of llama3.2
* update
* update
* fix style
* fix style
* fix
* small fix
|
2024-11-26 09:26:55 +08:00 |
|
Ruonan Wang
|
b9abb8a285
|
Support qwen2.5 3B for NPU & update related examples (#12438)
* update qwen2.5-3B
* update convert
* small fix
* replace load_in_low_bit with low_bit
* small fix
|
2024-11-25 16:38:31 +08:00 |
|
Jinhe
|
b633fbf26c
|
add Chinese prompt troubleshooting for npu cpp examples (#12437)
* add Chinese prompt troubleshooting
* add Chinese prompt troubleshooting
|
2024-11-25 15:28:47 +08:00 |
|
Ruonan Wang
|
f41405368a
|
Support minicpm for NPU C++ (#12434)
* support minicpm-1b
* update
* tune fused_layers
* update readme.md
|
2024-11-25 10:42:02 +08:00 |
|
Ruonan Wang
|
0819fad34e
|
support Llama2-7B / Llama3-8B for NPU C++ (#12431)
* support llama2
* update
* support fused_layers=4 for Llama2-7B
|
2024-11-22 18:47:19 +08:00 |
|
Ruonan Wang
|
4ffa6c752c
|
New convert support for C++ NPU (#12430)
* initial commit
* fix
* fix style
* fix style
* fix
* fix
|
2024-11-22 14:28:30 +08:00 |
|
Ruonan Wang
|
2935e97610
|
small fix of cpp readme (#12425)
|
2024-11-21 18:21:34 +08:00 |
|
Ruonan Wang
|
7288c759ce
|
Initial NPU C++ Example (#12417)
* temp save
* meet review, update
* update
* meet review, add license
* typo
|
2024-11-21 10:09:26 +08:00 |
|