ipex-llm

Ruonan Wang 4a61f7d20d update mlp of llama (#11897) * update mlp of llama * relax threshold of mlp test * revert code	2024-08-22 20:34:53 +08:00
cli
ggml	Init NPU quantize method and support q8_0_rtn (#11452)	2024-07-01 13:45:07 +08:00
gptq
langchain	Remove chatglm_C Module to Eliminate LGPL Dependency (#11178)	2024-05-31 17:03:11 +08:00
llamaindex	Llamaindex: add tokenizer_id and support chat (#10590)	2024-04-07 13:51:34 +08:00
serving	Add lightweight-serving whisper asr example (#11847)	2024-08-22 15:46:28 +08:00
transformers	update mlp of llama (#11897)	2024-08-22 20:34:53 +08:00
utils	Add benchmark util for transformers 4.42 (#11725)	2024-08-07 08:48:07 +08:00
vllm	fix vllm qwen2 models (#11879)	2024-08-21 11:05:24 +08:00
__init__.py	IPEX Duplicate importer V2 (#11310)	2024-06-19 16:29:19 +08:00
convert_model.py
format.sh
llm_patching.py	Upgrade Peft version to 0.10.0 for LLM finetune (#10886)	2024-05-07 15:09:14 +08:00
models.py	Remove chatglm_C Module to Eliminate LGPL Dependency (#11178)	2024-05-31 17:03:11 +08:00
optimize.py	Update tests for transformers 4.36 (#10858)	2024-05-24 10:26:38 +08:00