ipex-llm/docker/llm/serving/xpu/docker/start-lightweight_serving-service.sh
Wang, Jian4 1083fe5508
Reenable pp and lightweight-serving serving on 0.6.6 (#12814)
* reenable pp and lightweight serving on 0.6.6

* update readme

* update

* update tag
2025-02-13 10:16:00 +08:00


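# The transformers version must be updated before running this script, for example:
# pip install transformers==4.37.0
cd /llm/lightweight_serving || exit 1
# Tell ipex-llm not to use vLLM for this service
export IPEX_LLM_NOT_USE_VLLM=True
# Model directory and low-bit format passed to lightweight_serving.py
model_path="/llm/models/Llama-2-7b-chat-hf"
low_bit="sym_int4"
python lightweight_serving.py --repo-id-or-model-path "$model_path" --low-bit "$low_bit"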
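
The launch script above hardcodes the model path and low-bit format. A minimal sketch of a parameterized variant follows, assuming you want to override both without editing the file. The environment variable names `MODEL_PATH`, `LOW_BIT`, and `RUN` and the `launch` function are hypothetical and not part of the original script. The sketch only prints the resolved settings by default, so it can be tried outside the container.

```shell
# Hypothetical overrides: MODEL_PATH and LOW_BIT are illustrative names.
# When unset, the values fall back to the defaults from the original script.
model_path="${MODEL_PATH:-/llm/models/Llama-2-7b-chat-hf}"
low_bit="${LOW_BIT:-sym_int4}"

# Same steps as the original script, wrapped in a function.
launch() {
    cd /llm/lightweight_serving || return 1
    export IPEX_LLM_NOT_USE_VLLM=True
    python lightweight_serving.py --repo-id-or-model-path "$model_path" --low-bit "$low_bit"
}

# Show what would be used; start the server only when RUN=1 (hypothetical flag).
echo "model_path=$model_path low_bit=$low_bit"
if [ "${RUN:-0}" = "1" ]; then
    launch
fi
```

Saved under a name of your choice (here the hypothetical `start.sh`), it could be started with `RUN=1 MODEL_PATH=/llm/models/Llama-2-7b-chat-hf bash start.sh`. Any value passed through `LOW_BIT` must be a format that `lightweight_serving.py` accepts for `--low-bit`.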