From 84239d0bd34d34f58ffe805ffd198befb72d88c6 Mon Sep 17 00:00:00 2001
From: Shaojun Liu <61072813+liu-shaojun@users.noreply.github.com>
Date: Fri, 17 May 2024 11:06:11 +0800
Subject: [PATCH] Update docker image tags in Docker Quickstart (#11061)

* update docker image tag to latest

* add note

* simplify note

* add link in reStructuredText

* minor fix

* update tag
---
 .../DockerGuides/docker_pytorch_inference_gpu.md | 13 ++++++++++---
 .../doc/LLM/DockerGuides/docker_windows_gpu.md   | 14 +++++++-------
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md
index e3245d41..7a3d3550 100644
--- a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md
+++ b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md
@@ -2,6 +2,13 @@
 
 We can run PyTorch Inference Benchmark, Chat Service and PyTorch Examples on Intel GPUs within Docker (on Linux or WSL).
 
+```eval_rst
+.. note::
+
+   The current Windows + WSL + Docker solution only supports Arc series dGPU. For Windows users with MTL iGPU, it is recommended to install directly via pip install in Anaconda Prompt. Refer to `this guide `_.
+
+```
+
 ## Install Docker
 
 Follow the [Docker installation Guide](./docker_windows_gpu.html#install-docker) to install docker on either Linux or Windows.
@@ -10,7 +17,7 @@ Follow the [Docker installation Guide](./docker_windows_gpu.html#install-docker)
 Prepare ipex-llm-xpu Docker Image:
 
 ```bash
-docker pull intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
+docker pull intelanalytics/ipex-llm-xpu:latest
 ```
 
 Start ipex-llm-xpu Docker Container:
@@ -21,7 +28,7 @@ Start ipex-llm-xpu Docker Container:
 
       .. code-block:: bash
 
-         export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
+         export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
          export CONTAINER_NAME=my_container
          export MODEL_PATH=/llm/models[change to your model path]
 
@@ -39,7 +46,7 @@ Start ipex-llm-xpu Docker Container:
 
      .. code-block:: bash

       #/bin/bash
-       export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
+       export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
       export CONTAINER_NAME=my_container
       export MODEL_PATH=/llm/models[change to your model path]
diff --git a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md
index 6014229b..f0b2d6e4 100644
--- a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md
+++ b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md
@@ -80,13 +80,13 @@ We have several docker images available for running LLMs on Intel GPUs. The foll
 
 | Image Name | Description | Use Case |
 |------------|-------------|----------|
-| intelanalytics/ipex-llm-cpu:2.1.0-SNAPSHOT | CPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
-| intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT | GPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
-| intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT | CPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
-| intelanalytics/ipex-llm-serving-xpu:2.1.0-SNAPSHOT | GPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
-| intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.1.0-SNAPSHOT | CPU Finetuning via Docker|For fine-tuning LLMs using QLora/Lora, etc. |
-|intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.1.0-SNAPSHOT|CPU Finetuning via Kubernetes|For fine-tuning LLMs using QLora/Lora, etc. |
-| intelanalytics/ipex-llm-finetune-qlora-xpu:2.1.0-SNAPSHOT| GPU Finetuning|For fine-tuning LLMs using QLora/Lora, etc.|
+| intelanalytics/ipex-llm-cpu:latest | CPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
+| intelanalytics/ipex-llm-xpu:latest | GPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
+| intelanalytics/ipex-llm-serving-cpu:latest | CPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
+| intelanalytics/ipex-llm-serving-xpu:latest | GPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
+| intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:latest | CPU Finetuning via Docker|For fine-tuning LLMs using QLora/Lora, etc. |
+|intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:latest|CPU Finetuning via Kubernetes|For fine-tuning LLMs using QLora/Lora, etc. |
+| intelanalytics/ipex-llm-finetune-qlora-xpu:latest| GPU Finetuning|For fine-tuning LLMs using QLora/Lora, etc.|
 
 We have also provided several quickstarts for various usage scenarios:
 - [Run and develop LLM applications in PyTorch](./docker_pytorch_inference_gpu.html)
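Note: the hunks above only touch the `export DOCKER_IMAGE=...` lines; the `docker run` invocation that consumes these variables sits outside the diff context and is not shown. Below is a minimal sketch of how such a container is typically started. The flag set (`--device=/dev/dri` for Intel GPU passthrough, the `/llm/models` mount target, `--shm-size`) is an assumption for illustration, not quoted from the patched docs:

```bash
#!/bin/bash
# Hypothetical usage sketch -- these flags are assumptions, not taken from the patch.
export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
export CONTAINER_NAME=my_container
export MODEL_PATH=/llm/models   # change to your model path

# `latest` is a moving tag and `docker run` only pulls when no local copy
# exists, so pull explicitly to pick up the newest build.
docker pull $DOCKER_IMAGE

# Start the container detached, passing the Intel GPU render device through
# and mounting the local model directory at /llm/models inside the container.
docker run -itd \
    --net=host \
    --device=/dev/dri \
    --name=$CONTAINER_NAME \
    --shm-size="16g" \
    -v $MODEL_PATH:/llm/models \
    $DOCKER_IMAGE

# Open a shell in the running container.
docker exec -it $CONTAINER_NAME bash
```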