diff --git a/docker/llm/inference/cpu/docker/Dockerfile b/docker/llm/inference/cpu/docker/Dockerfile
index 80747466..4cc05c2d 100644
--- a/docker/llm/inference/cpu/docker/Dockerfile
+++ b/docker/llm/inference/cpu/docker/Dockerfile
@@ -22,7 +22,8 @@ RUN env DEBIAN_FRONTEND=noninteractive apt-get update && \
     pip install --pre --upgrade bigdl-llm[all] && \
     pip install --pre --upgrade bigdl-nano && \
     # Download chat.py script
-    wget -P /root https://raw.githubusercontent.com/intel-analytics/BigDL/main/python/llm/portable-executable/chat.py && \
+    pip install --upgrade colorama && \
+    wget -P /root https://raw.githubusercontent.com/intel-analytics/BigDL/main/python/llm/portable-zip/chat.py && \
     export PYTHONUNBUFFERED=1
 
 ENTRYPOINT ["/bin/bash"]
\ No newline at end of file
diff --git a/docker/llm/inference/cpu/docker/README.md b/docker/llm/inference/cpu/docker/README.md
index 805cfc07..589fe32c 100644
--- a/docker/llm/inference/cpu/docker/README.md
+++ b/docker/llm/inference/cpu/docker/README.md
@@ -32,3 +32,13 @@ sudo docker run -itd \
 After the container is booted, you could get into the container through `docker exec`.
 
 To run inference using `BigDL-LLM` using cpu, you could refer to this [documentation](https://github.com/intel-analytics/BigDL/tree/main/python/llm#cpu-int4).
+
+### Use chat.py
+
+chat.py can be used to start an interactive chat session with a specified model. The script is located in the `/root` directory.
+
+To run chat.py:
+```
+cd /root
+python chat.py --model-path YOUR_MODEL_PATH
+```
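
For a quick end-to-end check of this change, the image can be built and chat.py exercised inside the container. This is a minimal sketch, assuming the diff is applied at the repository root: the image tag and container name are illustrative placeholders, the `docker run` invocation here omits the flags shown in the README, and in practice you would typically mount a directory containing your model into the container.

```bash
# Build the image from the updated Dockerfile (tag is illustrative)
docker build -t bigdl-llm-cpu:dev docker/llm/inference/cpu/docker

# Start a container and open a shell in it (name is illustrative;
# the README's full set of `docker run` flags is omitted for brevity)
docker run -itd --name bigdl-llm-dev bigdl-llm-cpu:dev
docker exec -it bigdl-llm-dev bash

# Inside the container: the Dockerfile downloads chat.py to /root,
# and this change also installs colorama alongside it
cd /root
python chat.py --model-path YOUR_MODEL_PATH
```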