intel/ipex-llm - Accelerate local LLM inference and finetuning on Intel XPUs
https://github.com/intel/ipex-llm/
* feat: serialization for quantized modules. All quantized modules extend QuantModule, which holds an empty Tensor as a gradient placeholder, and each object mixes in QuantSerializer for protobuf support (see the sketch below).
* refactor: serialization API changes
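The commit note is terse; read literally, it describes a base class plus a serialization mixin. The following is a minimal Python sketch of that shape, not ipex-llm's actual code: `QuantModule` and `QuantSerializer` are taken from the note, but the shapes, method names, and the use of `torch.save` in place of the real protobuf encoding are all assumptions for illustration.

```python
import io

import torch
import torch.nn as nn


class QuantSerializer:
    """Mixin giving a module a byte-level (de)serialization API.

    Illustrative stand-in: the commit describes protobuf support;
    torch.save/torch.load are used here only to keep the sketch
    self-contained and runnable.
    """

    def serialize(self) -> bytes:
        buf = io.BytesIO()
        torch.save(self.state_dict(), buf)
        return buf.getvalue()

    def deserialize(self, data: bytes) -> None:
        self.load_state_dict(torch.load(io.BytesIO(data)))


class QuantModule(nn.Module, QuantSerializer):
    """Hypothetical base class for quantized modules.

    Per the commit note, the module keeps an empty Tensor as a
    gradient placeholder: quantized weights are inference-only,
    so no real gradient is stored.
    """

    def __init__(self, out_features: int, in_features: int):
        super().__init__()
        # int8 buffer standing in for the packed quantized weight.
        self.register_buffer(
            "qweight", torch.zeros(out_features, in_features, dtype=torch.int8)
        )
        # Empty tensor as the gradient placeholder the note mentions.
        self.register_buffer("grad_placeholder", torch.empty(0))


if __name__ == "__main__":
    # Round-trip a module through the byte-level API.
    m = QuantModule(4, 8)
    blob = m.serialize()
    m2 = QuantModule(4, 8)
    m2.deserialize(blob)
    assert torch.equal(m.qweight, m2.qweight)
```

Designing serialization as a mixin rather than baking it into the base class keeps the wire format decoupled from the module hierarchy, which matches the follow-up "serialization API changes" refactor the note mentions.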