ipex-llm/python/llm
Wang, Jian4 209c3501e6
LLM: Optimize qwen1.5 moe model (#10706)
* update moe block

* fix style

* enable optimized MLP

* enable kv_cache

* enable fuse rope

* enable fused qkv

* enable flash_attention

* error sdp quantize

* use old api

* use fuse

* use xetla

* fix python style

* update moe_blocks num

* fix output error

* add cpu sdpa

* update

* update

* update
2024-04-18 14:54:05 +08:00
dev Merge pull request #10697 from MargarettMao/ceval 2024-04-12 14:37:47 +08:00
example LISA Finetuning Example (#10743) 2024-04-18 13:48:10 +08:00
portable-zip Fix baichuan-13b issue on portable zip under transformers 4.36 (#10746) 2024-04-12 16:27:01 -07:00
scripts Update Env check Script (#10709) 2024-04-10 15:06:00 +08:00
src/ipex_llm LLM: Optimize qwen1.5 moe model (#10706) 2024-04-18 14:54:05 +08:00
test edit 'ppl_result does not exist' issue, delete useless code (#10767) 2024-04-16 18:11:56 +08:00
.gitignore [LLM] add chatglm pybinding binary file release (#8677) 2023-08-04 11:45:27 +08:00
setup.py Update setup.py for bigdl-core-xe-esimd-21 on Windows (#10705) 2024-04-09 18:21:21 +08:00
version.txt Update setup.py and add new actions and add compatible mode (#25) 2024-03-22 15:44:59 +08:00