Kai Huang | 416c19165c | 2024-10-31 11:25:25 +08:00
Add Qwen pipeline and example (#12292)
* support qwen pipeline
* update error msg
* style
* meet review
* minor

binbin Deng | 41b8064554 | 2024-10-30 17:21:47 +08:00
Support minicpm-1B in level0 pipeline (#12297)

Ruonan Wang | 2b2cb9c693 | 2024-10-30 10:02:00 +08:00
[NPU pipeline] Support save & load and update examples (#12293)
* support save & load, update llama examples
* update baichuan2 example
* update readme

binbin Deng | 3feb58d1e4 | 2024-10-29 19:24:16 +08:00
Support baichuan2 for level0 pipeline (#12289)

Yina Chen | 4467645088 | 2024-10-28 17:06:55 +08:00
[NPU] Support l0 Llama groupwise (#12276)
* except lm_head
* remove
* support gw lm_head
* update
* fix
* remove run.bat
* fix style
* support llama3

Ruonan Wang | 3fe2ea3081 | 2024-10-28 16:05:49 +08:00
[NPU] Reuse prefill of acc lib for pipeline (#12279)
* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens

binbin Deng | ec362e6133 | 2024-10-28 09:24:51 +08:00
Add llama3 level0 example (#12275)

Ruonan Wang | 854398f6e0 | 2024-10-25 17:09:26 +08:00
update example to reduce peak memory usage (#12274)

Ruonan Wang | ae57e23e4f | 2024-10-25 10:31:44 +08:00
fix incompatibility between llama GW & llama pipeline (#12267)
* fix
* fix

Ruonan Wang | 821fd96367 | 2024-10-24 09:49:27 +08:00
Initial integrate our L0 Llama impl into ipex-llm (#12255)
* temp save
* initial support
* fix
* simplify code
* fix style
* fix example
* make default value of pipeline as False

Ruonan Wang | 4d93bb81fe | 2024-10-11 09:45:53 +08:00
Initial support of NPU level0 Model (#12177)
* first commit to support load dll and init llm pipeline
* add init generate
* fix style
* small updates
* fix style and check tokens number