From 84239d0bd34d34f58ffe805ffd198befb72d88c6 Mon Sep 17 00:00:00 2001
From: Shaojun Liu <61072813+liu-shaojun@users.noreply.github.com>
Date: Fri, 17 May 2024 11:06:11 +0800
Subject: [PATCH] Update docker image tags in Docker Quickstart (#11061)

* update docker image tag to latest

* add note

* simplify note

* add link in reStructuredText

* minor fix

* update tag
---
 .../DockerGuides/docker_pytorch_inference_gpu.md | 13 ++++++++++---
 .../doc/LLM/DockerGuides/docker_windows_gpu.md   | 14 +++++++-------
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md
index e3245d41..7a3d3550 100644
--- a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md
+++ b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_pytorch_inference_gpu.md
@@ -2,6 +2,13 @@
 
 We can run PyTorch Inference Benchmark, Chat Service and PyTorch Examples on Intel GPUs within Docker (on Linux or WSL).
 
+```eval_rst
+.. note::
+
+   The current Windows + WSL + Docker solution only supports Arc series dGPU. For Windows users with MTL iGPU, it is recommended to install directly via pip install in Anaconda Prompt. Refer to `this guide `_.
+
+```
+
 ## Install Docker
 
 Follow the [Docker installation Guide](./docker_windows_gpu.html#install-docker) to install docker on either Linux or Windows.
@@ -10,7 +17,7 @@ Follow the [Docker installation Guide](./docker_windows_gpu.html#install-docker)
 Prepare ipex-llm-xpu Docker Image:
 
 ```bash
-docker pull intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
+docker pull intelanalytics/ipex-llm-xpu:latest
 ```
 
 Start ipex-llm-xpu Docker Container:
@@ -21,7 +28,7 @@ Start ipex-llm-xpu Docker Container:
 
       .. code-block:: bash
 
-         export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
+         export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
          export CONTAINER_NAME=my_container
          export MODEL_PATH=/llm/models[change to your model path]
 
@@ -39,7 +46,7 @@ Start ipex-llm-xpu Docker Container:
 
      .. code-block:: bash

       #/bin/bash
-       export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
+       export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
       export CONTAINER_NAME=my_container
       export MODEL_PATH=/llm/models[change to your model path]
diff --git a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md
index 6014229b..f0b2d6e4 100644
--- a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md
+++ b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_windows_gpu.md
@@ -80,13 +80,13 @@ We have several docker images available for running LLMs on Intel GPUs. The foll
 
 | Image Name | Description | Use Case |
 |------------|-------------|----------|
-| intelanalytics/ipex-llm-cpu:2.1.0-SNAPSHOT | CPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
-| intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT | GPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
-| intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT | CPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
-| intelanalytics/ipex-llm-serving-xpu:2.1.0-SNAPSHOT | GPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
-| intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.1.0-SNAPSHOT | CPU Finetuning via Docker|For fine-tuning LLMs using QLora/Lora, etc. |
-|intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.1.0-SNAPSHOT|CPU Finetuning via Kubernetes|For fine-tuning LLMs using QLora/Lora, etc. |
-| intelanalytics/ipex-llm-finetune-qlora-xpu:2.1.0-SNAPSHOT| GPU Finetuning|For fine-tuning LLMs using QLora/Lora, etc.|
+| intelanalytics/ipex-llm-cpu:latest | CPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
+| intelanalytics/ipex-llm-xpu:latest | GPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
+| intelanalytics/ipex-llm-serving-cpu:latest | CPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
+| intelanalytics/ipex-llm-serving-xpu:latest | GPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
+| intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:latest | CPU Finetuning via Docker|For fine-tuning LLMs using QLora/Lora, etc. |
+|intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:latest|CPU Finetuning via Kubernetes|For fine-tuning LLMs using QLora/Lora, etc. |
+| intelanalytics/ipex-llm-finetune-qlora-xpu:latest| GPU Finetuning|For fine-tuning LLMs using QLora/Lora, etc.|
 
 We have also provided several quickstarts for various usage scenarios:
 - [Run and develop LLM applications in PyTorch](./docker_pytorch_inference_gpu.html)
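Note: the hunks above only touch the `export DOCKER_IMAGE=...` lines; the `docker run` invocation that consumes these variables sits outside the diff context and is not shown. Below is a minimal sketch of how such a container is typically started. The flag set (`--device=/dev/dri` for Intel GPU passthrough, the `/llm/models` mount target, `--shm-size`) is an assumption for illustration, not quoted from the patched docs:

```bash
#!/bin/bash
# Hypothetical usage sketch -- these flags are assumptions, not taken from the patch.
export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
export CONTAINER_NAME=my_container
export MODEL_PATH=/llm/models   # change to your model path

# `latest` is a moving tag and `docker run` only pulls when no local copy
# exists, so pull explicitly to pick up the newest build.
docker pull $DOCKER_IMAGE

# Start the container detached, passing the Intel GPU render device through
# and mounting the local model directory at /llm/models inside the container.
docker run -itd \
    --net=host \
    --device=/dev/dri \
    --name=$CONTAINER_NAME \
    --shm-size="16g" \
    -v $MODEL_PATH:/llm/models \
    $DOCKER_IMAGE

# Open a shell in the running container.
docker exec -it $CONTAINER_NAME bash
```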