ipex-llm/python/llm/example
Yuwen Hu ef4028ac2d
[NPU] Support split lm_head for Qwen2 with CPP (#12491)
* Use split for Qwen2 lm_head instead of slice in optimize_pre

* Support split lm_head in Qwen2 python cpp backend

* Fit with Python acc lib pipeline

* Removed default mixed_precision=True in all-in-one and related examples

* Small fix

* Style fix

* Fix based on comments

* Fix based on comments

* Style fix
2024-12-04 14:41:08 +08:00
CPU                             remove nf4 unsupport comment in cpu finetuning (#12460)  2024-11-28 13:26:46 +08:00
GPU                             update transformers version in example of glm4 (#12453)  2024-11-27 15:02:25 +08:00
NPU/HF-Transformers-AutoModels  [NPU] Support split lm_head for Qwen2 with CPP (#12491)  2024-12-04 14:41:08 +08:00