binbin Deng | d409d9d0eb | [NPU L0] Update streaming mode of example (#12312) | 2024-11-01 15:38:10 +08:00

Ruonan Wang | 2b2cb9c693 | [NPU pipeline] Support save & load and update examples (#12293) | 2024-10-30 10:02:00 +08:00
* support save & load, update llama examples
* update baichuan2 example
* update readme

Yina Chen | 4467645088 | [NPU] Support l0 Llama groupwise (#12276) | 2024-10-28 17:06:55 +08:00
* except lm_head
* remove
* support gw lm_head
* update
* fix
* remove run.bat
* fix style
* support llama3

Ruonan Wang | 3fe2ea3081 | [NPU] Reuse prefill of acc lib for pipeline (#12279) | 2024-10-28 16:05:49 +08:00
* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens

binbin Deng | ec362e6133 | Add llama3 level0 example (#12275) | 2024-10-28 09:24:51 +08:00