ipex-llm/python
Ruonan Wang 3288acb8de LLM : Support embedding quantization (only q2k now) (#10170)
* basic logic added

* basic support

* support save&load, update mixed strategy

* fix style

* use int8 for lm_head

* add check for xpu
2024-02-20 16:56:57 +08:00
llm LLM : Support embedding quantization (only q2k now) (#10170) 2024-02-20 16:56:57 +08:00