From ae3b577537ba309c247949b7c22561357d79d144 Mon Sep 17 00:00:00 2001
From: Guancheng Fu <110874468+gc-fu@users.noreply.github.com>
Date: Mon, 22 Apr 2024 11:07:10 +0800
Subject: [PATCH] Update README.md (#10833)

---
 python/llm/example/GPU/vLLM-Serving/README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/python/llm/example/GPU/vLLM-Serving/README.md b/python/llm/example/GPU/vLLM-Serving/README.md
index 8afc4b51..8228a14f 100644
--- a/python/llm/example/GPU/vLLM-Serving/README.md
+++ b/python/llm/example/GPU/vLLM-Serving/README.md
@@ -136,6 +136,8 @@ Currently, for vLLM-v2, we support the following models:
 Install the dependencies for vLLM-v2 as follows:
 
 ```bash
+# This directory may change depending on where you install oneAPI-basekit
+source /opt/intel/oneapi/setvars.sh
 # First create an conda environment
 conda create -n ipex-vllm python=3.11
 conda activate ipex-vllm
@@ -200,4 +202,4 @@ Then you can access the api server as follows:
         "max_tokens": 128,
         "temperature": 0
       }' &
-```
\ No newline at end of file
+```
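
For reviewers trying the updated instructions, the sketch below shows one way to confirm that the oneAPI environment was actually sourced before continuing with the conda steps from the README. The `/opt/intel/oneapi/setvars.sh` path comes from the patch itself; the `ONEAPI_ROOT` check is an assumption based on a default oneAPI Base Toolkit install and may need adjusting on other systems.

```bash
# Source the oneAPI environment (path assumes a default Base Toolkit install;
# adjust if the toolkit was installed elsewhere, as the patch's comment notes).
source /opt/intel/oneapi/setvars.sh

# setvars.sh is expected to export ONEAPI_ROOT on a default install;
# if it is empty, the environment was probably not set up correctly.
if [ -z "${ONEAPI_ROOT:-}" ]; then
    echo "oneAPI environment not detected; re-check the setvars.sh path" >&2
    exit 1
fi

# Then continue with the steps already shown in the README
conda create -n ipex-vllm python=3.11
conda activate ipex-vllm
```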