ipex-llm/python/llm

Latest commit: Support compress KV with quantize KV (#11812)
Author: Yina Chen (3cd4e87168)
* update llama
* support llama 4.41
* fix style
* support minicpm
* support qwen2
* support minicpm & update
* support chatglm4
* support chatglm
* remove print
* add DynamicCompressFp8Cache & support qwen
* support llama
* support minicpm phi3
* update chatglm2/4
* small fix & support qwen 4.42
* remove print

Committed: 2024-08-19 15:32:32 +08:00
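The commit above pairs KV-cache compression with quantized (fp8) cache storage, and the message mentions a `DynamicCompressFp8Cache`. As a rough, hypothetical illustration only, the sketch below shows the general idea of symmetric per-tensor quantization of KV-cache values; it uses an int8 stand-in for fp8, and the function names are invented for this example, not ipex-llm's actual API:

```python
def quantize_kv(values, num_bits=8):
    """Symmetric per-tensor quantization of a list of KV-cache floats.

    Hypothetical sketch: int8 is used as a stand-in for the fp8 storage
    the commit refers to; this is not ipex-llm's implementation.
    """
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    # Scale so the largest magnitude maps to qmax (avoid divide-by-zero).
    scale = max(abs(v) for v in values) / qmax or 1.0
    quantized = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return quantized, scale

def dequantize_kv(quantized, scale):
    """Recover approximate float values from the quantized cache."""
    return [q * scale for q in quantized]
```

Storing the cache in a low-bit format like this trades a small rounding error (bounded by half the scale per element) for a large memory reduction, which is what makes combining it with compression of less-important KV entries attractive.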
Name          Last commit                                                        Date
dev           Fixes regarding utf-8 in all-in-one benchmark (#11839)             2024-08-19 10:38:00 +08:00
example       Codegeex2 tokenization fix (#11831)                                2024-08-16 15:48:47 +08:00
portable-zip  Fix null pointer dereferences error. (#11125)                      2024-05-30 16:16:10 +08:00
scripts       fix typo in python/llm/scripts/README.md (#11536)                  2024-07-09 09:53:14 +08:00
src/ipex_llm  Support compress KV with quantize KV (#11812)                      2024-08-19 15:32:32 +08:00
test          Remove gemma-2-9b-it 3k input from igpu-perf (#11834)              2024-08-17 13:10:05 +08:00
tpp           OSPDT: add tpp licenses (#11165)                                   2024-06-06 10:59:06 +08:00
.gitignore    [LLM] add chatglm pybinding binary file release (#8677)            2023-08-04 11:45:27 +08:00
setup.py      update doc/setup to use onednn gemm for cpp (#11598)               2024-07-18 13:04:38 +08:00
version.txt   Update setup.py and add new actions and add compatible mode (#25)  2024-03-22 15:44:59 +08:00