ipex-llm/python/llm/src/ipex_llm
Ruonan Wang 2b2cb9c693
[NPU pipeline] Support save & load and update examples (#12293)
* support save & load, update llama examples

* update baichuan2 example

* update readme
2024-10-30 10:02:00 +08:00
cli
ggml              Init NPU quantize method and support q8_0_rtn (#11452)           2024-07-01 13:45:07 +08:00
gptq
langchain
llamaindex
serving           Support lightweight-serving glm-4v-9b (#11994)                   2024-09-05 09:25:08 +08:00
transformers      [NPU pipeline] Support save & load and update examples (#12293)  2024-10-30 10:02:00 +08:00
utils             Add benchmark_util for transformers >= 4.44.0 (#12171)           2024-10-14 15:40:12 +08:00
vllm              Enable vllm multimodal minicpm-v-2-6 (#12074)                    2024-09-13 13:28:35 +08:00
__init__.py
convert_model.py
format.sh
llm_patching.py
models.py
optimize.py       support passing None to low_bit in optimize_model (#12121)       2024-09-26 11:09:35 +08:00