ipex-llm/python/llm

Latest commit: f4537798c1 by Qiyuan Gong (2024-03-29 09:43:42 +08:00)
Enable kv cache quantization by default for flex when 1 < batch <= 8 (#10584)
* Enable kv cache quantization by default for flex when 1 < batch <= 8.
* Change the upper bound from <8 to <=8.
Name          Last commit message                                                            Date
dev           LLM: Set different env based on different Linux kernels (#10566)               2024-03-27 17:56:33 +08:00
example       Replace ipex with ipex-llm (#10554)                                            2024-03-28 13:54:40 +08:00
portable-zip  Update_document by heyang (#30)                                                2024-03-25 10:06:02 +08:00
scripts       Update_document by heyang (#30)                                                2024-03-25 10:06:02 +08:00
src/ipex_llm  Enable kv cache quantization by default for flex when 1 < batch <= 8 (#10584)  2024-03-29 09:43:42 +08:00
test          Refactor bigdl.llm to ipex_llm (#24)                                           2024-03-22 15:41:21 +08:00
.gitignore    [LLM] add chatglm pybinding binary file release (#8677)                        2023-08-04 11:45:27 +08:00
setup.py      Update pip install to use --extra-index-url for ipex package (#10557)          2024-03-28 09:56:23 +08:00
version.txt   Update setup.py and add new actions and add compatible mode (#25)              2024-03-22 15:44:59 +08:00