ipex-llm/python/llm/src/ipex_llm/transformers
Zhicun b4147a97bb
Fix dtype mismatch error (#10609)
* fix llama

* fix

* fix code style

* add torch type in model.py

---------

Co-authored-by: arda <arda@arda-arc19.sh.intel.com>
2024-04-09 17:50:33 +08:00
awq                   Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
gguf                  Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
layers                Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
models                fix llama2 (#10710)  2024-04-09 17:28:37 +08:00
__init__.py           Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
bmm.py                Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
convert.py            LLM: support baichuan2-13b using AutoTP (#10691)  2024-04-09 14:06:01 +08:00
convert_ipex.py       LLM: Fix no return_last_logit running bigdl_ipex chatglm3 (#10678)  2024-04-07 15:27:58 +08:00
embedding.py          Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
kv.py                 optimize starcoder normal kv cache (#10642)  2024-04-03 15:27:02 +08:00
load_config.yaml      Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
loader.py             Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
low_bit_linear.py     support fp8 in xetla (#10555)  2024-04-08 13:22:09 -07:00
model.py              Fix dtype mismatch error (#10609)  2024-04-09 17:50:33 +08:00
modelling_bigdl.py    Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
qlora.py              Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
relora.py             Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
speculative.py        LLM: Fix no return_last_logit running bigdl_ipex chatglm3 (#10678)  2024-04-07 15:27:58 +08:00
training_patch.py     Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00
utils.py              LLM: support iq1s for llama2-70b-hf (#10596)  2024-04-01 13:13:13 +08:00
xpu_customize_fwd.py  Refactor bigdl.llm to ipex_llm (#24)  2024-03-22 15:41:21 +08:00