ipex-llm

History

Yuwen Hu ef4028ac2d [NPU] Support split `lm_head` for Qwen2 with CPP (#12491)	2024-12-04 14:41:08 +08:00
* Use split for Qwen2 lm_head instead of slice in optimize_pre
* Support split lm_head in Qwen2 python cpp backend
* Fit with Python acc lib pipeline
* Removed default mixed_precision=True in all-in-one and related examples
* Small fix
* Style fix
* Fix based on comments
* Fix based on comments
* Stype fix
CPU	remove nf4 unsupport comment in cpu finetuning (#12460)	2024-11-28 13:26:46 +08:00
GPU	update transformers version in example of glm4 (#12453)	2024-11-27 15:02:25 +08:00
NPU/HF-Transformers-AutoModels	[NPU] Support split `lm_head` for Qwen2 with CPP (#12491)	2024-12-04 14:41:08 +08:00