ipex-llm

Yuwen Hu ef4028ac2d [NPU] Support split `lm_head` for Qwen2 with CPP (#12491)	2024-12-04 14:41:08 +08:00
* Use split for Qwen2 lm_head instead of slice in optimize_pre
* Support split lm_head in Qwen2 python cpp backend
* Fit with Python acc lib pipeline
* Removed default mixed_precision=True in all-in-one and related examples
* Small fix
* Style fix
* Fix based on comments
* Fix based on comments
* Style fix
cli	Refactor bigdl.llm to ipex_llm (#24)	2024-03-22 15:41:21 +08:00
ggml	Support imatrix-guided quantization for NPU CW (#12468)	2024-12-02 11:31:26 +08:00
gptq	Refactor bigdl.llm to ipex_llm (#24)	2024-03-22 15:41:21 +08:00
langchain	Remove chatglm_C Module to Eliminate LGPL Dependency (#11178)	2024-05-31 17:03:11 +08:00
llamaindex	Llamaindex: add tokenizer_id and support chat (#10590)	2024-04-07 13:51:34 +08:00
serving	Upgrade to vllm 0.6.2 (#12338)	2024-11-12 20:35:34 +08:00
transformers	[NPU] Support split `lm_head` for Qwen2 with CPP (#12491)	2024-12-04 14:41:08 +08:00
utils	fix ipex 2.3 bug (#12366)	2024-11-08 13:29:15 +08:00
vllm	add vLLM glm4 fix (#12474)	2024-12-02 14:05:16 +08:00
__init__.py	IPEX Duplicate importer V2 (#11310)	2024-06-19 16:29:19 +08:00
convert_model.py	Refactor bigdl.llm to ipex_llm (#24)	2024-03-22 15:41:21 +08:00
format.sh	Refactor bigdl.llm to ipex_llm (#24)	2024-03-22 15:41:21 +08:00
llm_patching.py	Upgrade Peft version to 0.10.0 for LLM finetune (#10886)	2024-05-07 15:09:14 +08:00
models.py	Remove chatglm_C Module to Eliminate LGPL Dependency (#11178)	2024-05-31 17:03:11 +08:00
optimize.py	fix and optimize sd (#12436)	2024-11-25 14:09:48 +08:00