ipex-llm/python/llm/example
Latest commit 828fa01ad3 by Yuwen Hu: [NPU] Add mixed_precision for Qwen2 7B (#12098)
* Add mixed_precision argument to control whether to use an INT8 lm_head for Qwen2-7B-Instruct

* Small fix

* Fix loading low-bit models with mixed precision

* Small fix

* Update example accordingly

* Update for default prompt

* Update based on comments

* Final fix
2024-09-20 16:36:21 +08:00
CPU                             Add MiniCPM CPU examples (#12027)                2024-09-11 15:51:21 +08:00
GPU                             Add InternVL2 example (#12102)                   2024-09-20 16:31:54 +08:00
NPU/HF-Transformers-AutoModels  [NPU] Add mixed_precision for Qwen2 7B (#12098)  2024-09-20 16:36:21 +08:00