ipex-llm/python
Yuwen Hu 828fa01ad3
[NPU] Add mixed_precision for Qwen2 7B (#12098)
* Add mixed_precision argument to control whether to use an INT8 lm_head for Qwen2-7B-Instruct

* Small fix

* Fix low-bit loading with mixed precision

* Small fix

* Update example accordingly

* Update for default prompt

* Update based on comments

* Final fix
2024-09-20 16:36:21 +08:00
llm [NPU] Add mixed_precision for Qwen2 7B (#12098) 2024-09-20 16:36:21 +08:00