ipex-llm/python/llm/src/ipex_llm/transformers/models
Yang Wang 5a1f446d3c
support fp8 in xetla (#10555)
* support fp8 in xetla
* change name
* adjust model file
* support convert back to cpu
* factor
* fix bug
* fix style
2024-04-08 13:22:09 -07:00
__init__.py      Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
aquila.py        Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
baichuan.py      Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
baichuan2.py     Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
bert.py          Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
bloom.py         Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
chatglm.py       Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
chatglm2.py      LLM: support int4 fp16 chatglm2-6b 8k input. (#10648)  2024-04-07 09:39:21 +08:00
chatglm2_32k.py  Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
decilm.py        Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
falcon.py        Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
gemma.py         enable fp4 fused mlp and qkv (#10531)                  2024-03-26 08:34:00 +08:00
gptbigcode.py    Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
gptj.py          Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
gptneox.py       Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
internlm.py      Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
llama.py         support fp8 in xetla (#10555)                          2024-04-08 13:22:09 -07:00
mistral.py       support fp8 in xetla (#10555)                          2024-04-08 13:22:09 -07:00
mixtral.py       Fix vllm print error message issue (#10664)            2024-04-05 15:08:13 -07:00
mpt.py           Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
phixtral.py      Refactor bigdl.llm to ipex_llm (#24)                   2024-03-22 15:41:21 +08:00
qwen.py          support fp8 in xetla (#10555)                          2024-04-08 13:22:09 -07:00
qwen2.py         Fix vllm print error message issue (#10664)            2024-04-05 15:08:13 -07:00
qwen_vl.py       support fp8 in xetla (#10555)                          2024-04-08 13:22:09 -07:00
rwkv4.py         fix rwkv with pip installer (#10591)                   2024-03-29 17:56:45 +08:00
rwkv5.py         fix rwkv with pip installer (#10591)                   2024-03-29 17:56:45 +08:00
stablelm.py      stablelm fp8 kv cache (#10672)                         2024-04-08 15:16:46 +08:00
starcoder2.py    optimize starcoder normal kv cache (#10642)            2024-04-03 15:27:02 +08:00
utils.py         fix stablelm logits diff (#10636)                      2024-04-03 15:08:12 +08:00
yuan.py          support fp8 in xetla (#10555)                          2024-04-08 13:22:09 -07:00