intel/ipex-llm - Accelerate local LLM inference and finetuning on Intel XPUs
https://github.com/intel/ipex-llm/
Quantization object support:
1. quantize a single value given max and min;
2. quantize an array;
3. quantize a Tensor[Float].
For testing, there are corresponding dequantize methods; see the sketch below.
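The following is a minimal, self-contained Scala sketch of the min/max quantization scheme described above. The `Quantization` object name, the method signatures, and the 8-bit signed target range are illustrative assumptions rather than the library's actual API, and the `Tensor[Float]` overload is omitted for brevity.

```scala
// Hypothetical sketch of affine min/max quantization to 8-bit signed values.
// Not ipex-llm's real API; names and ranges are assumptions for illustration.
object Quantization {

  /** Quantize a single Float into a Byte given the range [min, max]. */
  def quantize(value: Float, max: Float, min: Float): Byte = {
    val range = math.max(max - min, 1e-8f)
    // Map [min, max] linearly onto [-128, 127].
    val q = math.round((value - min) / range * 255f) - 128
    math.max(-128, math.min(127, q)).toByte
  }

  /** Quantize an array of Floats, deriving max/min from the data itself. */
  def quantize(values: Array[Float]): (Array[Byte], Float, Float) = {
    val max = values.max
    val min = values.min
    (values.map(v => quantize(v, max, min)), max, min)
  }

  /** Dequantize a single Byte back to an approximate Float (used in tests). */
  def dequantize(q: Byte, max: Float, min: Float): Float = {
    val range = math.max(max - min, 1e-8f)
    (q + 128).toFloat / 255f * range + min
  }

  /** Dequantize an array of Bytes. */
  def dequantize(qs: Array[Byte], max: Float, min: Float): Array[Float] =
    qs.map(q => dequantize(q, max, min))
}

// Usage: round-tripping through quantize/dequantize should reproduce the
// input up to quantization error, which is what the dequantize methods
// make easy to verify in tests.
object QuantizationExample extends App {
  val data = Array(-1.5f, 0.0f, 0.75f, 2.0f)
  val (q, max, min) = Quantization.quantize(data)
  val restored = Quantization.dequantize(q, max, min)
  println(restored.mkString(", ")) // approximately -1.5, 0.0, 0.75, 2.0
}
```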