diff --git a/docs/readthedocs/source/_templates/sidebar_quicklinks.html b/docs/readthedocs/source/_templates/sidebar_quicklinks.html
index b71cf4e4..87d2fe47 100644
--- a/docs/readthedocs/source/_templates/sidebar_quicklinks.html
+++ b/docs/readthedocs/source/_templates/sidebar_quicklinks.html
@@ -83,6 +83,9 @@
Run PyTorch Inference on an Intel GPU via Docker
+
+ Run/Develop PyTorch in VSCode with Docker on Intel GPU
+
Run llama.cpp/Ollama/Open-WebUI on an Intel GPU via Docker
diff --git a/docs/readthedocs/source/_toc.yml b/docs/readthedocs/source/_toc.yml
index 75540171..3d3a5245 100644
--- a/docs/readthedocs/source/_toc.yml
+++ b/docs/readthedocs/source/_toc.yml
@@ -21,6 +21,7 @@ subtrees:
- entries:
- file: doc/LLM/DockerGuides/docker_windows_gpu
- file: doc/LLM/DockerGuides/docker_pytorch_inference_gpu
+ - file: doc/LLM/DockerGuides/docker_run_pytorch_inference_in_vscode
- file: doc/LLM/DockerGuides/docker_cpp_xpu_quickstart
- file: doc/LLM/Quickstart/index
title: "Quickstart"
diff --git a/docs/readthedocs/source/doc/LLM/DockerGuides/docker_run_pytorch_inference_in_vscode.md b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_run_pytorch_inference_in_vscode.md
new file mode 100644
index 00000000..b625ac6b
--- /dev/null
+++ b/docs/readthedocs/source/doc/LLM/DockerGuides/docker_run_pytorch_inference_in_vscode.md
@@ -0,0 +1,139 @@
+# Run/Develop PyTorch in VSCode with Docker on Intel GPU
+
+An IPEX-LLM container is a pre-configured environment that includes all necessary dependencies for running LLMs on Intel GPUs.
+
+This guide provides steps to run/develop PyTorch examples in VSCode with Docker on Intel GPUs.
+
+```eval_rst
+.. note::
+
+ This guide assumes you have already installed VSCode in your environment.
+
+ To run/develop on Windows, install VSCode and then follow the steps below.
+
+   To run/develop on Linux, you can open VSCode locally and connect to a remote Linux machine over SSH, then proceed with the following steps.
+
+```
+
+
+## Install Docker
+
+Follow the [Docker installation Guide](./docker_windows_gpu.html#install-docker) to install Docker on either Linux or Windows.
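+
+Once Docker is installed, you can optionally confirm that the installation works before moving on. The commands below are an illustrative check; on Linux, `/dev/dri` is the GPU device path that will later be passed into the container.
+
+```bash
+# Check that the Docker CLI and daemon are available
+docker --version
+docker info
+
+# On Linux, list the Intel GPU device nodes that will be mapped into the container
+ls /dev/dri
+```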
+
+## Install Extensions for VSCode
+
+#### Install Dev Containers Extension
+For both Linux and Windows, you will need to install the Dev Containers extension.
+
+Open the Extensions view in VSCode (you can use the shortcut `Ctrl+Shift+X`), then search for and install the `Dev Containers` extension.
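+
+Alternatively, if the `code` command-line tool is on your PATH, you can install the extension from a terminal. The extension ID below is the one published by Microsoft for Dev Containers; adjust it if your marketplace lists a different ID.
+
+```bash
+# Install the Dev Containers extension from the command line
+code --install-extension ms-vscode-remote.remote-containers
+```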
+
+
+
+
+
+
+#### Install WSL Extension for Windows
+
+For Windows, you will need to install the WSL extension to connect VSCode to the WSL environment. Open the Extensions view in VSCode (you can use the shortcut `Ctrl+Shift+X`), then search for and install the `WSL` extension.
+
+Press F1 to bring up the Command Palette, type in `WSL: Connect to WSL Using Distro...` and select it, then choose a specific WSL distro such as `Ubuntu`.
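+
+Similarly, the WSL extension can be installed from a terminal, and you can list your installed distros to confirm that `Ubuntu` is available. These commands are meant to be run from a Windows terminal (PowerShell or Command Prompt) and assume the `code` command is on your PATH.
+
+```bash
+# Install the WSL extension from the command line
+code --install-extension ms-vscode-remote.remote-wsl
+
+# List installed WSL distros and their versions
+wsl --list --verbose
+```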
+
+
+
+
+
+
+
+## Launch Container
+
+Open the Terminal in VSCode (you can use the shortcut `` Ctrl+Shift+` ``), then pull the ipex-llm-xpu Docker image:
+
+```bash
+docker pull intelanalytics/ipex-llm-xpu:latest
+```
+
+Start the ipex-llm-xpu Docker container:
+
+```eval_rst
+.. tabs::
+ .. tab:: Linux
+
+ .. code-block:: bash
+
+ export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
+ export CONTAINER_NAME=my_container
+ export MODEL_PATH=/llm/models[change to your model path]
+
+ docker run -itd \
+ --net=host \
+ --device=/dev/dri \
+ --memory="32G" \
+ --name=$CONTAINER_NAME \
+ --shm-size="16g" \
+ -v $MODEL_PATH:/llm/models \
+ $DOCKER_IMAGE
+
+ .. tab:: Windows WSL
+
+ .. code-block:: bash
+
+         #!/bin/bash
+ export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:latest
+ export CONTAINER_NAME=my_container
+ export MODEL_PATH=/llm/models[change to your model path]
+
+ sudo docker run -itd \
+ --net=host \
+ --privileged \
+ --device /dev/dri \
+ --memory="32G" \
+ --name=$CONTAINER_NAME \
+ --shm-size="16g" \
+                 -v $MODEL_PATH:/llm/models \
+ -v /usr/lib/wsl:/usr/lib/wsl \
+ $DOCKER_IMAGE
+```
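+
+After the container starts, you can check that it is running and, optionally, that the Intel GPU is visible inside it. The `sycl-ls` check below assumes the image ships the oneAPI `sycl-ls` utility; if it is not present, you can skip that step.
+
+```bash
+# Confirm the container is up
+docker ps --filter "name=my_container"
+
+# Optionally, list the devices visible to SYCL inside the container
+docker exec -it my_container sycl-ls
+```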
+
+
+## Run/Develop PyTorch Examples
+
+Press F1 to bring up the Command Palette, type in `Dev Containers: Attach to Running Container...` and select it, then select `my_container`.
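+
+Once attached, you can use the integrated terminal inside the container to confirm that your model directory was mounted as expected (the path below matches the `-v $MODEL_PATH:/llm/models` mount used earlier):
+
+```bash
+# List the mounted model folder inside the container
+ls /llm/models
+```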
+
+Now you are inside a running Docker container. Open the folder `/ipex-llm/python/llm/example/GPU/HF-Transformers-AutoModels/Model/`.
+
+
+
+
+
+This folder contains several PyTorch examples that apply IPEX-LLM INT4 optimizations to models on Intel GPUs.
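+
+To see which model examples are available, you can list the folder from the terminal; the `llama2` directory used below is one of them:
+
+```bash
+# List the per-model example directories
+ls /ipex-llm/python/llm/example/GPU/HF-Transformers-AutoModels/Model/
+```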
+
+For example, if your model is Llama-2-7b-chat-hf and is mounted under `/llm/models`, you can navigate to the `llama2` directory and execute the following command to run the example:
+
+```bash
+cd llama2
+python ./generate.py --repo-id-or-model-path /llm/models/Llama-2-7b-chat-hf --prompt PROMPT --n-predict N_PREDICT
+```
+
+
+Arguments info:
+- `--repo-id-or-model-path REPO_ID_OR_MODEL_PATH`: argument defining the Hugging Face repo id of the Llama 2 model (e.g. `meta-llama/Llama-2-7b-chat-hf` or `meta-llama/Llama-2-13b-chat-hf`) to be downloaded, or the path to a Hugging Face checkpoint folder. The default value is `'meta-llama/Llama-2-7b-chat-hf'`.
+- `--prompt PROMPT`: argument defining the prompt to be inferred (with an integrated prompt format for chat). The default value is `'What is AI?'`.
+- `--n-predict N_PREDICT`: argument defining the maximum number of tokens to predict. The default value is `32`.
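+
+For instance, a run that spells out these arguments explicitly (using the default prompt and the model path from the mount above) might look like this:
+
+```bash
+cd /ipex-llm/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2
+python ./generate.py \
+    --repo-id-or-model-path /llm/models/Llama-2-7b-chat-hf \
+    --prompt "What is AI?" \
+    --n-predict 32
+```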
+
+**Sample Output**
+```log
+Inference time: xxxx s
+-------------------- Prompt --------------------
+[INST] <<SYS>>
+
+<</SYS>>
+
+What is AI? [/INST]
+-------------------- Output --------------------
+[INST] <<SYS>>
+
+<</SYS>>
+
+What is AI? [/INST] Artificial intelligence (AI) is the broader field of research and development aimed at creating machines that can perform tasks that typically require human intelligence,
+```
+
+You can develop your own PyTorch example based on these examples.
diff --git a/docs/readthedocs/source/doc/LLM/DockerGuides/index.rst b/docs/readthedocs/source/doc/LLM/DockerGuides/index.rst
index c4b24d7e..9e9f02fd 100644
--- a/docs/readthedocs/source/doc/LLM/DockerGuides/index.rst
+++ b/docs/readthedocs/source/doc/LLM/DockerGuides/index.rst
@@ -6,6 +6,7 @@ In this section, you will find guides related to using IPEX-LLM with Docker, cov
* `Overview of IPEX-LLM Containers for Intel GPU <./docker_windows_gpu.html>`_
* `Run PyTorch Inference on an Intel GPU via Docker <./docker_pytorch_inference_gpu.html>`_
+* `Run/Develop PyTorch in VSCode with Docker on Intel GPU <./docker_run_pytorch_inference_in_vscode.html>`_
* `Run llama.cpp/Ollama/open-webui with Docker on Intel GPU <./docker_cpp_xpu_quickstart.html>`_
* `Run IPEX-LLM integrated FastChat with Docker on Intel GPU <./fastchat_docker_quickstart>`_
-* `Run IPEX-LLM integrated vLLM with Docker on Intel GPU <./vllm_docker_quickstart>`_
\ No newline at end of file
+* `Run IPEX-LLM integrated vLLM with Docker on Intel GPU <./vllm_docker_quickstart>`_