Commit graph

11 commits

Author SHA1 Message Date
Kai Huang
416c19165c
Add Qwen pipeline and example (#12292)
* support qwen pipeline

* update error msg

* style

* meet review

* minor
2024-10-31 11:25:25 +08:00
binbin Deng
41b8064554
Support minicpm-1B in level0 pipeline (#12297)
2024-10-30 17:21:47 +08:00
Ruonan Wang
2b2cb9c693
[NPU pipeline] Support save & load and update examples (#12293)
* support save & load, update llama examples

* update baichuan2 example

* update readme
2024-10-30 10:02:00 +08:00
binbin Deng
3feb58d1e4
Support baichuan2 for level0 pipeline (#12289)
2024-10-29 19:24:16 +08:00
Yina Chen
4467645088
[NPU] Support l0 Llama groupwise (#12276)
* except lm_head

* remove

* support gw lm_head

* update

* fix

* remove run.bat

* fix style

* support llama3
2024-10-28 17:06:55 +08:00
Ruonan Wang
3fe2ea3081
[NPU] Reuse prefill of acc lib for pipeline (#12279)
* first commit

* update example

* fix style

* update example

* embedding as const

* fix generate

* code refactor

* meet code review

* fix style

* change max_output_len to max_context_len

* fix all-in-one

* fix example

* add check for new tokens
2024-10-28 16:05:49 +08:00
binbin Deng
ec362e6133
Add llama3 level0 example (#12275)
2024-10-28 09:24:51 +08:00
Ruonan Wang
854398f6e0
update example to reduce peak memory usage (#12274)
2024-10-25 17:09:26 +08:00
Ruonan Wang
ae57e23e4f
fix incompatibility between llama GW & llama pipeline (#12267)
* fix

* fix
2024-10-25 10:31:44 +08:00
Ruonan Wang
821fd96367
Initial integrate our L0 Llama impl into ipex-llm (#12255)
* temp save

* initial support

* fix

* simplify code

* fix style

* fix example

* make default value of pipeline as False
2024-10-24 09:49:27 +08:00
Ruonan Wang
4d93bb81fe
Initial support of NPU level0 Model (#12177)
* first commit to support load dll and init llm pipeline

* add init generate

* fix style

* small updates

* fix style and check tokens number
2024-10-11 09:45:53 +08:00