ipex-llm/python
Qiyuan Gong f4537798c1
Enable kv cache quantization by default for flex when 1 < batch <= 8 (#10584)
* Enable kv cache quantization by default for flex when 1 < batch <= 8.
* Change upper bound from < 8 to <= 8.
2024-03-29 09:43:42 +08:00
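The batch-size condition in this commit can be read as a simple predicate. The following is a minimal sketch only, not the ipex-llm implementation: the function name `kv_cache_quantization_default` and the device label string `"flex"` are hypothetical, chosen for illustration.

```python
def kv_cache_quantization_default(device: str, batch_size: int) -> bool:
    """Hypothetical helper (not part of ipex-llm).

    Returns True when KV cache quantization would be enabled by default
    under the rule stated in the commit message: flex devices with
    1 < batch <= 8.
    """
    # Assumption: the device is identified by the plain string "flex".
    is_flex = device == "flex"
    # The upper bound is inclusive, matching the "<8 to <=8" change.
    return is_flex and 1 < batch_size <= 8
```

Under this reading, a batch of 8 on flex qualifies, while a batch of 1 or 9 does not.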
llm Enable kv cache quantization by default for flex when 1 < batch <= 8 (#10584) 2024-03-29 09:43:42 +08:00