ipex-llm/python/llm/src/bigdl/llm
Latest commit: 30795bdfbc by Xin Qiu, 2024-02-23 10:07:24 +08:00
Gemma optimization: rms_norm, kv_cache, fused_rope, fused_rope+qkv (#10212)
* gemma optimization
* update
* update
* fix style
* meet code review
Name                Last commit                                                                    Date
cli                 [LLM] fix chatglm main choice (#9073)                                          2023-09-28 11:23:37 +08:00
ggml                LLM: add GGUF-IQ2 examples (#10207)                                            2024-02-22 14:18:45 +08:00
gptq                gptq2ggml: support loading safetensors model. (#8401)                          2023-06-27 11:19:33 +08:00
langchain           LLM: modify transformersembeddings.embed() in langchain (#10051)               2024-02-05 10:42:10 +08:00
serving             [Serving] Add vllm_worker to fastchat serving framework (#9934)                2024-01-18 21:33:36 +08:00
transformers        Gemma optimization: rms_norm, kv_cache, fused_rope, fused_rope+qkv (#10212)    2024-02-23 10:07:24 +08:00
utils               change xmx condition (#9896)                                                   2024-01-12 19:51:48 +08:00
vllm                add mistral and chatglm support to vllm (#9879)                                2024-01-10 15:38:42 +08:00
__init__.py         [LLM] IPEX auto importer set on by default (#9832)                             2024-01-04 13:33:29 +08:00
convert_model.py    LLM: add chatglm native int4 transformers API (#8695)                          2023-08-07 17:52:47 +08:00
format.sh           Integrate vllm (#9310)                                                         2023-11-23 16:46:45 +08:00
models.py           LLM: add chatglm native int4 transformers API (#8695)                          2023-08-07 17:52:47 +08:00
optimize.py         [LLM] Improve LLM doc regarding windows gpu related info (#9880)               2024-01-11 14:37:16 +08:00
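
For context on how these pieces fit together: the fused kernels tracked under transformers/ (rms_norm, kv_cache, fused_rope) and the helpers behind optimize.py are applied when a model is loaded through BigDL-LLM's documented loading API. The following is a minimal usage sketch, assuming the bigdl-llm package as of this snapshot; the model id google/gemma-7b-it is an illustrative example, not something taken from this listing.

# Minimal sketch, assuming the bigdl-llm package as of this snapshot.
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "google/gemma-7b-it"  # hypothetical example model id

# load_in_4bit=True quantizes linear layers to INT4 and swaps in the
# optimized forward paths provided by the transformers/ subpackage above.
model = AutoModelForCausalLM.from_pretrained(
    model_path, load_in_4bit=True, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

inputs = tokenizer("What does RMSNorm do?", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Alternatively, an already-loaded Hugging Face model can be wrapped in place with the documented optimize_model helper (from bigdl.llm import optimize_model), which corresponds to the optimize.py entry above.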