ipex-llm

Ruonan Wang 49ab8974fa [NPU] initial support of `asym_int4_rtn` (#12484)	2024-12-05 17:40:36 +08:00
* initial support of q4_1
* fix
* fix
* update
* update min to Z1
* update
* fix
* update
* fix style
* fix
* support qwen2 optimize_model=True mp version
* temp save
* fix
* fix style
* replace min with zero
* support split linear for q4_1
* fix lm_head with mixed_precision=True
* fix style
* revert test code
* add down proj back for q4_0
* remove print
..
model	Support imatrix-guided quantization for NPU CW (#12468)	2024-12-02 11:31:26 +08:00
__init__.py	Refactor bigdl.llm to ipex_llm (#24)	2024-03-22 15:41:21 +08:00
convert.py	Remove chatglm_C Module to Eliminate LGPL Dependency (#11178)	2024-05-31 17:03:11 +08:00
convert_model.py	Remove chatglm_C Module to Eliminate LGPL Dependency (#11178)	2024-05-31 17:03:11 +08:00
quantize.py	[NPU] initial support of `asym_int4_rtn` (#12484)	2024-12-05 17:40:36 +08:00