ipex-llm

Author	SHA1	Message	Date
Yuwen Hu	828fa01ad3	[NPU] Add `mixed_precision` for Qwen2 7B (#12098) * Add `mixed_precision` argument to control whether to use INT8 lm_head for Qwen2-7B-Instruct * Small fix * Fix loading low-bit models with mixed precision * Small fix * Update example accordingly * Update default prompt * Update based on comments * Final fix	2024-09-20 16:36:21 +08:00
Ch1y0q	73a4360f3f	update lowbit path for baichuan2, qwen2, `generate.py` (#12051) * update lowbit path for baichuan2, qwen2, `generate.py` * update readme	2024-09-10 15:35:24 +08:00
binbin Deng	14b2c8dc32	Update qwen2-7b example script (#11961)	2024-08-29 18:25:17 +08:00
Zijie Li	6c3eb1e1e8	refactor from_pretrained API for NPU (#11927)	2024-08-27 09:50:30 +08:00
binbin Deng	72a7bf624b	Support qwen2-1.5b with fused decoderlayer optimization on NPU (#11888)	2024-08-22 11:09:12 +08:00