ipex-llm/python/llm/src/ipex_llm/transformers/models
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| __init__.py | Refactor bigdl.llm to ipex_llm (#24) | 2024-03-22 15:41:21 +08:00 |
| aquila.py | refactor attention_softmax (#12295) | 2024-10-30 13:20:50 +08:00 |
| baichuan.py | refactor baichuan, glm4 and minicpm3 (#12600) | 2024-12-24 14:16:30 +08:00 |
| bert.py | Refactor bigdl.llm to ipex_llm (#24) | 2024-03-22 15:41:21 +08:00 |
| bloom.py | refactor qwen2 and llama3 (#12587) | 2024-12-20 13:25:25 +08:00 |
| chatglm.py | add glm_sdpa back to fix chatglm-6b (#11313) | 2024-06-14 10:31:43 +08:00 |
| chatglm2.py | refactor chatglm2, internlm, stablelm and qwen (#12604) | 2024-12-24 18:18:00 +08:00 |
| chatglm4.py | refactor baichuan, glm4 and minicpm3 (#12600) | 2024-12-24 14:16:30 +08:00 |
| chatglm4v.py | refactor baichuan, glm4 and minicpm3 (#12600) | 2024-12-24 14:16:30 +08:00 |
| cohere.py | fix llama related import (#12611) | 2024-12-25 16:23:52 +08:00 |
| common.py | refactor mistral and phi3 (#12605) | 2024-12-24 17:52:32 +08:00 |
| decilm.py | fix llama related import (#12611) | 2024-12-25 16:23:52 +08:00 |
| falcon.py | LLM: fix get env KV_CACHE_ALLOC_BLOCK_LENGTH type. (#10771) | 2024-04-16 09:32:30 +08:00 |
| gemma.py | refactor attention_softmax (#12295) | 2024-10-30 13:20:50 +08:00 |
| gemma2.py | optimize minicpm3 again (#12047) | 2024-09-10 14:19:57 +08:00 |
| glm.py | refactor glm edge (#12588) | 2024-12-20 15:36:57 +08:00 |
| gpt2.py | refactor mllama, gpt2 and internvl (#12602) | 2024-12-24 14:18:31 +08:00 |
| gptbigcode.py | Fix Starcoder issue on CPU on transformers 4.36+ (#11190) | 2024-06-04 10:05:40 -07:00 |
| gptj.py | LLM: fix get env KV_CACHE_ALLOC_BLOCK_LENGTH type. (#10771) | 2024-04-16 09:32:30 +08:00 |
| gptneox.py | refactor to remove old rope usage (#12224) | 2024-10-17 17:06:09 +08:00 |
| internlm.py | refactor chatglm2, internlm, stablelm and qwen (#12604) | 2024-12-24 18:18:00 +08:00 |
| internvl.py | refactor mllama, gpt2 and internvl (#12602) | 2024-12-24 14:18:31 +08:00 |
| llama.py | rewrite llama optimization (#12609) | 2024-12-25 17:04:32 +08:00 |
| minicpm.py | refactor yuan2 and starcoder2 and fix (#12589) | 2024-12-20 16:41:50 +08:00 |
| minicpm3.py | refactor baichuan, glm4 and minicpm3 (#12600) | 2024-12-24 14:16:30 +08:00 |
| minicpmv.py | refactor sd 1.5 and qwen2-vl and fix (#12590) | 2024-12-20 17:34:55 +08:00 |
| mistral.py | add compresskv back for mistral (#12607) | 2024-12-25 11:06:08 +08:00 |
| mixtral.py | add compresskv back for mistral (#12607) | 2024-12-25 11:06:08 +08:00 |
| mllama.py | refactor mllama, gpt2 and internvl (#12602) | 2024-12-24 14:18:31 +08:00 |
| mpt.py | LLM: fix get env KV_CACHE_ALLOC_BLOCK_LENGTH type. (#10771) | 2024-04-16 09:32:30 +08:00 |
| phi.py | refactor attention_softmax (#12295) | 2024-10-30 13:20:50 +08:00 |
| phi3.py | refactor mistral and phi3 (#12605) | 2024-12-24 17:52:32 +08:00 |
| phixtral.py | refactor to reduce old rope usage (#12219) | 2024-10-17 14:45:09 +08:00 |
| qwen.py | refactor chatglm2, internlm, stablelm and qwen (#12604) | 2024-12-24 18:18:00 +08:00 |
| qwen2.py | refactor mistral and phi3 (#12605) | 2024-12-24 17:52:32 +08:00 |
| qwen2_moe.py | refactor merge_qkv and attention_softmax (#12213) | 2024-10-16 15:58:14 +08:00 |
| qwen2_vl.py | refactor sd 1.5 and qwen2-vl and fix (#12590) | 2024-12-20 17:34:55 +08:00 |
| qwen_vl.py | Support pipeline parallel for qwen-vl (#11503) | 2024-07-04 18:03:57 +08:00 |
| rwkv4.py | Divide core-xe packages (#11131) | 2024-05-28 12:00:18 +08:00 |
| rwkv5.py | Divide core-xe packages (#11131) | 2024-05-28 12:00:18 +08:00 |
| sd.py | refactor sd 1.5 and qwen2-vl and fix (#12590) | 2024-12-20 17:34:55 +08:00 |
| stablelm.py | refactor chatglm2, internlm, stablelm and qwen (#12604) | 2024-12-24 18:18:00 +08:00 |
| starcoder2.py | refactor yuan2 and starcoder2 and fix (#12589) | 2024-12-20 16:41:50 +08:00 |
| utils.py | refactor qwen2 and llama3 (#12587) | 2024-12-20 13:25:25 +08:00 |
| yuan.py | refactor yuan2 and starcoder2 and fix (#12589) | 2024-12-20 16:41:50 +08:00 |