ipex-llm

History

Ruonan Wang 49ab8974fa [NPU] initial support of `asym_int4_rtn` (#12484)	2024-12-05 17:40:36 +08:00
llm	[NPU] initial support of `asym_int4_rtn` (#12484)	2024-12-05 17:40:36 +08:00

[NPU] initial support of `asym_int4_rtn` (#12484)

* initial support of q4_1

* fix

* fix

* update

* update min to Z1

* update

* fix

* update

* fix style

* fix

* support qwen2 optimize_model=True mp version

* temp save

* fix

* fix style

* replace min with zero

* support split linear for q4_1

* fix lm_head with mixed_precision=True

* fix style

* revert test code

* add down proj back for q4_0

* remove print
