6 commits

Author SHA1 Message Date
binbin Deng
ab01753b1c
[NPU] update save-load API usage (#12473) 2024-12-03 09:46:15 +08:00
binbin Deng
c911026f03
[NPU C++] Update model support & examples & benchmark (#12466) 2024-11-29 13:35:58 +08:00
binbin Deng
41b8064554
Support minicpm-1B in level0 pipeline (#12297) 2024-10-30 17:21:47 +08:00
Ruonan Wang
3fe2ea3081
[NPU] Reuse prefill of acc lib for pipeline (#12279)
* first commit

* update example

* fix style

* update example

* embedding as const

* fix generate

* code refactor

* meet code review

* fix style

* change max_output_len to max_context_len

* fix all-in-one

* fix example

* add check for new tokens
2024-10-28 16:05:49 +08:00
Jinhe
4ca330da15
Fix NPU load error message and add minicpm npu lowbit feat (#12064)
* fix npu_model raise sym_int4 error

* add load_lowbit

* remove print&perf
2024-09-11 16:56:35 +08:00
SONG Ge
a81a329a5f
[NPU] Add example for NPU multi-processing minicpm-1b model (#11935)
* add minicpm example
2024-08-27 14:57:46 +08:00