3 commits

Author SHA1 Message Date
Ruonan Wang
3fe2ea3081
[NPU] Reuse prefill of acc lib for pipeline (#12279)
* first commit

* update example

* fix style

* update example

* embedding as const

* fix generate

* code refactor

* meet code review

* fix style

* change max_output_len to max_context_len

* fix all-in-one

* fix example

* add check for new tokens
2024-10-28 16:05:49 +08:00
Ch1y0q
820f8a4554
add --lowbit-path option for NPU llama example (#12020)
* add option `--lowbit-path`

* add descriptions in `README.md` and formatting

* Update llama.py
2024-09-05 15:31:01 +08:00
Yina Chen
e246f1e258
update llama3 npu example (#11933)
2024-08-27 13:03:18 +08:00
Renamed from python/llm/example/NPU/HF-Transformers-AutoModels/LLM/llama2.py