ipex-llm/python/llm/example/NPU
Yuwen Hu 828fa01ad3
[NPU] Add mixed_precision for Qwen2 7B (#12098)
* Add mixed_precision argument to control whether to use an INT8 lm_head for Qwen2-7B-Instruct

* Small fix

* Fix loading low-bit models with mixed precision

* Small fix

* Update example accordingly

* Update for default prompt

* Update based on comments

* Final fix
2024-09-20 16:36:21 +08:00
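The idea behind the `mixed_precision` flag described above can be sketched with a toy example: quantize most layers to INT4, but keep the `lm_head` at INT8 so the output logits stay more accurate. This is only an illustration of the concept, not the actual ipex-llm NPU implementation; the `quantize` and `quantize_model` helpers below are hypothetical.

```python
# Toy sketch (NOT the ipex-llm implementation) of what a mixed_precision
# flag means: INT4 for most weights, INT8 for the lm_head.

def quantize(values, bits):
    """Symmetric round-to-nearest quantize to `bits` bits, then dequantize."""
    qmax = 2 ** (bits - 1) - 1          # 7 for INT4, 127 for INT8
    scale = max(abs(v) for v in values) / qmax
    return [round(v / scale) * scale for v in values]

def quantize_model(weights_by_layer, mixed_precision):
    out = {}
    for name, weights in weights_by_layer.items():
        # With mixed precision, only the lm_head gets 8 bits; the rest get 4.
        bits = 8 if (mixed_precision and name == "lm_head") else 4
        out[name] = quantize(weights, bits)
    return out

def recon_error(a, b):
    """Total absolute reconstruction error between two weight lists."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Tiny fake model: one hidden layer plus an lm_head.
model = {"layer0": [0.11, -0.52, 0.33], "lm_head": [0.101, -0.099, 0.250]}

low   = quantize_model(model, mixed_precision=False)
mixed = quantize_model(model, mixed_precision=True)

# The INT8 lm_head reconstructs its weights more faithfully than the INT4 one.
assert recon_error(mixed["lm_head"], model["lm_head"]) \
     < recon_error(low["lm_head"], model["lm_head"])
```

The trade-off is the one the commit title implies: a slightly larger lm_head (8-bit instead of 4-bit) in exchange for better output quality on a 7B model.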
HF-Transformers-AutoModels