binbin Deng | 14d8d3d8af | Integrate NPU C++ imple into ipex-llm (#12461) | 2024-11-29 09:25:37 +08:00

Ruonan Wang | b9abb8a285 | Support qwen2.5 3B for NPU & update related examples (#12438) | 2024-11-25 16:38:31 +08:00
* update qwen2.5-3B
* update convert
* small fix
* replace load_in_low_bit with low_bit
* small fix

Ruonan Wang | 3fe2ea3081 | [NPU] Reuse prefill of acc lib for pipeline (#12279) | 2024-10-28 16:05:49 +08:00
* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens

Jin, Qiao | 8fa98e2742 | Remove Qwen2-7b from NPU example for "Run Optimized Models (Experimental)" (#12245) | 2024-10-22 17:07:51 +08:00
* Remove qwen2-7b from npu example readme
* fix

Jin, Qiao | 2bedb17be7 | Add Qwen2.5 NPU Example (#12110) | 2024-09-25 15:20:03 +08:00
* Add Qwen2.5 NPU Example
* fix
* Merge qwen2.py and qwen2.5.py into qwen.py
* Fix description