ipex-llm/python/llm/example
Guancheng Fu 4eed0c7d99
initial implementation for low_bit_loader vLLM (#12838)
* initial
* add logic for handling tensor parallel models
* fix
* Add some comments
* add doc
* fix done
2025-02-19 19:45:34 +08:00
CPU                              Add DeepSeek V3/R1 CPU example (#12836)                       2025-02-18 12:45:49 +08:00
GPU                              initial implementation for low_bit_loader vLLM (#12838)       2025-02-19 19:45:34 +08:00
NPU/HF-Transformers-AutoModels   common.h -> npu/npu_common.h (#12800)                         2025-02-10 14:38:22 +08:00