ipex-llm/python/llm/example/NPU
Yuwen Hu 828fa01ad3
[NPU] Add mixed_precision for Qwen2 7B (#12098)
* Add mixed_precision argument to control whether to use an INT8 lm_head for Qwen2-7B-Instruct

* Small fix

* Fix loading low-bit models with mixed precision

* Small fix

* Update example accordingly

* Update for default prompt

* Update based on comments

* Final fix
2024-09-20 16:36:21 +08:00
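The idea behind the `mixed_precision` flag described above can be sketched with a toy example: quantize most layers to INT4, but keep the `lm_head` at INT8 so the output logits stay more accurate. This is only an illustration of the concept, not the actual ipex-llm NPU implementation; the `quantize` and `quantize_model` helpers below are hypothetical.

```python
# Toy sketch (NOT the ipex-llm implementation) of what a mixed_precision
# flag means: INT4 for most weights, INT8 for the lm_head.

def quantize(values, bits):
    """Symmetric round-to-nearest quantize to `bits` bits, then dequantize."""
    qmax = 2 ** (bits - 1) - 1          # 7 for INT4, 127 for INT8
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) * scale for v in values]

def quantize_model(weights_by_layer, mixed_precision):
    out = {}
    for name, weights in weights_by_layer.items():
        # With mixed precision, only the lm_head gets 8 bits; the rest get 4.
        bits = 8 if (mixed_precision and name == "lm_head") else 4
        out[name] = quantize(weights, bits)
    return out

def recon_error(a, b):
    """Total absolute reconstruction error between two weight lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Tiny fake model: one hidden layer plus an lm_head.
model = {"layer0": [0.11, -0.52, 0.33], "lm_head": [0.101, -0.099, 0.250]}

low   = quantize_model(model, mixed_precision=False)
mixed = quantize_model(model, mixed_precision=True)

# The INT8 lm_head reconstructs its weights more faithfully than the INT4 one.
assert recon_error(mixed["lm_head"], model["lm_head"]) \
     < recon_error(low["lm_head"], model["lm_head"])
```

The trade-off is the one the commit title implies: a slightly larger lm_head (8-bit instead of 4-bit) in exchange for better output quality on a 7B model.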
HF-Transformers-AutoModels