binbin Deng | ab01753b1c | [NPU] update save-load API usage (#12473) | 2024-12-03 09:46:15 +08:00

binbin Deng | c911026f03 | [NPU C++] Update model support & examples & benchmark (#12466) | 2024-11-29 13:35:58 +08:00

binbin Deng | 41b8064554 | Support minicpm-1B in level0 pipeline (#12297) | 2024-10-30 17:21:47 +08:00

Ruonan Wang | 3fe2ea3081 | [NPU] Reuse prefill of acc lib for pipeline (#12279) | 2024-10-28 16:05:49 +08:00
* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens

Jinhe | 4ca330da15 | Fix NPU load error message and add minicpm npu lowbit feat (#12064) | 2024-09-11 16:56:35 +08:00
* fix npu_model raise sym_int4 error
* add load_lowbit
* remove print&perf

SONG Ge | a81a329a5f | [NPU] Add example for NPU multi-processing minicpm-1b model (#11935) | 2024-08-27 14:57:46 +08:00
* add minicpm example