ipex-llm/python/llm/example/NPU/HF-Transformers-AutoModels
Yuwen Hu 828fa01ad3
[NPU] Add mixed_precision for Qwen2 7B (#12098)
* Add mix_precision argument to control whether to use an INT8 lm_head for Qwen2-7B-Instruct

* Small fix

* Fix low-bit loading with mixed precision

* Small fix

* Update example accordingly

* Update for default prompt

* Update based on comments

* Final fix
2024-09-20 16:36:21 +08:00
..
LLM [NPU] Add mixed_precision for Qwen2 7B (#12098) 2024-09-20 16:36:21 +08:00
Multimodal add lowbit_path for generate.py, fix npu_model (#12077) 2024-09-13 17:28:05 +08:00