From 86b81c09d90fe90f005afe931330d4c0fff33491 Mon Sep 17 00:00:00 2001
From: SichengStevenLi <144295301+SichengStevenLi@users.noreply.github.com>
Date: Fri, 28 Jun 2024 10:41:00 +0800
Subject: [PATCH] Table of Contents in Quickstart Files (#11437)

* fixed a minor grammar mistake
* added table of contents
* added table of contents
* changed table of contents indexing
* added table of contents
* added table of contents, changed grammar
* added table of contents
* added table of contents
* added table of contents
* added table of contents
* added table of contents
* added table of contents, modified chapter numbering
* fixed troubleshooting section redirection path
* added table of contents
* added table of contents, modified section numbering
* added table of contents, modified section numbering
* added table of contents
* added table of contents, changed title size, modified numbering
* added table of contents, changed section title size and capitalization
* added table of contents, modified section numbering
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents syntax
* changed table of contents capitalization issue
* changed table of contents capitalization issue
* changed table of contents location
* changed table of contents
* changed table of contents
* changed section capitalization
* removed comments
* removed comments
* removed comments
---
 docs/mddocs/Quickstart/README.md | 2 +-
 docs/mddocs/Quickstart/axolotl_quickstart.md | 9 +++++++++
 docs/mddocs/Quickstart/benchmark_quickstart.md | 13 ++++++++++---
 docs/mddocs/Quickstart/bigdl_llm_migration.md | 5 +++++
 docs/mddocs/Quickstart/chatchat_quickstart.md | 10 +++++++++-
 docs/mddocs/Quickstart/continue_quickstart.md | 7 +++++++
 .../deepspeed_autotp_fastapi_quickstart.md | 5 +++++
 docs/mddocs/Quickstart/dify_quickstart.md | 7 +++++++
 docs/mddocs/Quickstart/fastchat_quickstart.md | 5 +++++
 docs/mddocs/Quickstart/install_linux_gpu.md | 9 +++++++++
 docs/mddocs/Quickstart/install_windows_gpu.md | 8 ++++++++
 .../llama3_llamacpp_ollama_quickstart.md | 6 ++++++
 docs/mddocs/Quickstart/llama_cpp_quickstart.md | 15 ++++++++++++---
 docs/mddocs/Quickstart/ollama_quickstart.md | 15 +++++++++++----
 .../open_webui_with_ollama_quickstart.md | 11 +++++++++--
 docs/mddocs/Quickstart/privateGPT_quickstart.md | 6 ++++++
 docs/mddocs/Quickstart/ragflow_quickstart.md | 12 ++++++++++--
 docs/mddocs/Quickstart/vLLM_quickstart.md | 17 ++++++++++++-----
 docs/mddocs/Quickstart/webui_quickstart.md | 14 +++++++++++---
 19 files changed, 152 insertions(+), 24 deletions(-)

diff --git a/docs/mddocs/Quickstart/README.md b/docs/mddocs/Quickstart/README.md
index 3294761c..2f76c59b 100644
--- a/docs/mddocs/Quickstart/README.md
+++ b/docs/mddocs/Quickstart/README.md
@@ -1,7 +1,7 @@
 # IPEX-LLM Quickstart
 > [!NOTE]
-> We are adding more Quickstart guide.
+> We are adding more Quickstart guides.
 This section includes efficient guide to show you how to:
diff --git a/docs/mddocs/Quickstart/axolotl_quickstart.md b/docs/mddocs/Quickstart/axolotl_quickstart.md
index c0654cd2..e50b9f8e 100644
--- a/docs/mddocs/Quickstart/axolotl_quickstart.md
+++ b/docs/mddocs/Quickstart/axolotl_quickstart.md
@@ -13,6 +13,15 @@ See the demo of finetuning LLaMA2-7B on Intel Arc GPU below.
+## Table of Contents
+- [Prerequisites](./axolotl_quickstart.md#0-prerequisites)
+- [Install IPEX-LLM for Axolotl](./axolotl_quickstart.md#1-install-ipex-llm-for-axolotl)
+- [Example: Finetune Llama-2-7B with Axolotl](./axolotl_quickstart.md#2-example-finetune-llama-2-7b-with-axolotl)
+- [Finetune Llama-3-8B (Experimental)](./axolotl_quickstart.md#3-finetune-llama-3-8b-experimental)
+- [Troubleshooting](./axolotl_quickstart.md#troubleshooting)
+
+
+
 ## Quickstart
 ### 0. Prerequisites
diff --git a/docs/mddocs/Quickstart/benchmark_quickstart.md b/docs/mddocs/Quickstart/benchmark_quickstart.md
index a677398e..fc5ce949 100644
--- a/docs/mddocs/Quickstart/benchmark_quickstart.md
+++ b/docs/mddocs/Quickstart/benchmark_quickstart.md
@@ -2,7 +2,14 @@
 We can perform benchmarking for IPEX-LLM on Intel CPUs and GPUs using the benchmark scripts we provide.
-## Prepare The Environment
+## Table of Contents
+- [Prepare the Environment](./benchmark_quickstart.md#prepare-the-environment)
+- [Prepare the Scripts](./benchmark_quickstart.md#prepare-the-scripts)
+- [Run on Windows](./benchmark_quickstart.md#run-on-windows)
+- [Run on Linux](./benchmark_quickstart.md#run-on-linux)
+- [Result](./benchmark_quickstart.md#result)
+
+## Prepare the Environment
 You can refer to [here](../Overview/install.md) to install IPEX-LLM in your environment. The following dependencies are also needed to run the benchmark scripts.
@@ -11,7 +18,7 @@ pip install pandas
 pip install omegaconf
 ```
-## Prepare The Scripts
+## Prepare the Scripts
 Navigate to your local workspace and then download IPEX-LLM from GitHub. Modify the `config.yaml` under `all-in-one` folder for your benchmark configurations.
@@ -21,7 +28,7 @@ git clone https://github.com/intel-analytics/ipex-llm.git
 cd ipex-llm/python/llm/dev/benchmark/all-in-one/
 ```
-## config.yaml
+### config.yaml
 ```yaml
diff --git a/docs/mddocs/Quickstart/bigdl_llm_migration.md b/docs/mddocs/Quickstart/bigdl_llm_migration.md
index 0b7643e1..f6a76f34 100644
--- a/docs/mddocs/Quickstart/bigdl_llm_migration.md
+++ b/docs/mddocs/Quickstart/bigdl_llm_migration.md
@@ -2,6 +2,11 @@
 This guide helps you migrate your `bigdl-llm` application to use `ipex-llm`.
+## Table of Contents
+- [Upgrade `bigdl-llm` package to `ipex-llm`](./bigdl_llm_migration.md#upgrade-bigdl-llm-package-to-ipex-llm)
+- [Migrate `bigdl-llm` code to `ipex-llm`](./bigdl_llm_migration.md#migrate-bigdl-llm-code-to-ipex-llm)
+
+
 ## Upgrade `bigdl-llm` package to `ipex-llm`
 > [!NOTE]
diff --git a/docs/mddocs/Quickstart/chatchat_quickstart.md b/docs/mddocs/Quickstart/chatchat_quickstart.md
index 217d199c..8aaa4307 100644
--- a/docs/mddocs/Quickstart/chatchat_quickstart.md
+++ b/docs/mddocs/Quickstart/chatchat_quickstart.md
@@ -21,12 +21,20 @@
 > [!NOTE]
 > You can change the UI language in the left-side menu. We currently support **English** and **简体中文** (see video demos below).
+## Table of Contents
+- [Langchain-Chatchat Architecture](./chatchat_quickstart.md#langchain-chatchat-architecture)
+- [Install and Run](./chatchat_quickstart.md#install-and-run)
+- [How to Use RAG](./chatchat_quickstart.md#how-to-use-rag)
+- [Troubleshooting & Tips](./chatchat_quickstart.md#troubleshooting--tips)
+
+
 ## Langchain-Chatchat Architecture
 See the Langchain-Chatchat architecture below ([source](https://github.com/chatchat-space/Langchain-Chatchat/blob/master/docs/img/langchain%2Bchatglm.png)).
+
 ## Quickstart
 ### Install and Run
@@ -72,7 +80,7 @@ You can now click `Dialogue` on the left-side menu to return to the chat UI. The
 For more information about how to use Langchain-Chatchat, refer to Official Quickstart guide in [English](https://github.com/chatchat-space/Langchain-Chatchat/blob/master/README_en.md#), [Chinese](https://github.com/chatchat-space/Langchain-Chatchat/blob/master/README.md#), or the [Wiki](https://github.com/chatchat-space/Langchain-Chatchat/wiki/).
-### Trouble Shooting & Tips
+### Troubleshooting & Tips
 #### 1. Version Compatibility
diff --git a/docs/mddocs/Quickstart/continue_quickstart.md b/docs/mddocs/Quickstart/continue_quickstart.md
index 9bfbd1b1..d3feb289 100644
--- a/docs/mddocs/Quickstart/continue_quickstart.md
+++ b/docs/mddocs/Quickstart/continue_quickstart.md
@@ -14,6 +14,13 @@ Below is a demo of using `Continue` with [CodeQWen1.5-7B](https://huggingface.co
+## Table of Contents
+- [Install and Run Ollama Serve](./continue_quickstart.md#1-install-and-run-ollama-serve)
+- [Pull and Prepare the Model](./continue_quickstart.md#2-pull-and-prepare-the-model)
+- [Install `Continue` Extension](./continue_quickstart.md#3-install-continue-extension)
+- [`Continue` Configuration](./continue_quickstart.md#4-continue-configuration)
+- [How to Use `Continue`](./continue_quickstart.md#5-how-to-use-continue)
+
 ## Quickstart
 This guide walks you through setting up and running **Continue** within _Visual Studio Code_, empowered by local large language models served via [Ollama](./ollama_quickstart.md) with `ipex-llm` optimizations.
diff --git a/docs/mddocs/Quickstart/deepspeed_autotp_fastapi_quickstart.md b/docs/mddocs/Quickstart/deepspeed_autotp_fastapi_quickstart.md
index 17e51dca..0fa9888b 100644
--- a/docs/mddocs/Quickstart/deepspeed_autotp_fastapi_quickstart.md
+++ b/docs/mddocs/Quickstart/deepspeed_autotp_fastapi_quickstart.md
@@ -2,6 +2,11 @@
 This example demonstrates how to run IPEX-LLM serving on multiple [Intel GPUs](../../../python/llm/example/GPU/README.md) by leveraging DeepSpeed AutoTP.
+## Table of Contents
+- [Requirements](./deepspeed_autotp_fastapi_quickstart.md#requirements)
+- [Example](./deepspeed_autotp_fastapi_quickstart.md#example)
+
+
 ## Requirements
 To run this example with IPEX-LLM on Intel GPUs, we have some recommended requirements for your machine, please refer to [here](../../../python/llm/example/GPU/README.md#requirements) for more information. For this particular example, you will need at least two GPUs on your machine.
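The anchor fragments used by the tables of contents added in this patch (for example `#troubleshooting--tips`, `#3-offline-inferenceservice`, or `#1-install-ipex-llm-for-llamacpp`) are the anchors GitHub generates automatically from the section headings: the heading text is lowercased, backticks and punctuation such as `&`, `.` and `/` are dropped, and spaces become hyphens. The sketch below is only a rough approximation for sanity-checking an entry — the helper name is made up here and GitHub's real slugger handles more edge cases (duplicate headings, Unicode, and so on):

```python
import re

def github_anchor(heading: str) -> str:
    """Approximate the anchor GitHub generates for a Markdown heading.

    Rough rules: lowercase, drop punctuation other than hyphens
    (so '&', '.', '/' and backticks disappear), then turn spaces
    into hyphens. Illustrative only -- not GitHub's exact algorithm.
    """
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)  # keep letters, digits, '_', '-' and spaces
    return text.replace(" ", "-")

# Reproduces the style of anchor used by the added tables of contents:
print(github_anchor("Troubleshooting & Tips"))             # troubleshooting--tips
print(github_anchor("3. Offline Inference/Service"))       # 3-offline-inferenceservice
print(github_anchor("1. Install IPEX-LLM for llama.cpp"))  # 1-install-ipex-llm-for-llamacpp
```

Checking an entry against this rule is a quick way to catch links whose anchor no longer matches its heading after a renumbering or capitalization change.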
diff --git a/docs/mddocs/Quickstart/dify_quickstart.md b/docs/mddocs/Quickstart/dify_quickstart.md
index d507f9bd..68c6544d 100644
--- a/docs/mddocs/Quickstart/dify_quickstart.md
+++ b/docs/mddocs/Quickstart/dify_quickstart.md
@@ -15,6 +15,13 @@
+## Table of Contents
+- [Install and Start Ollama Service on Intel GPU](./dify_quickstart.md#1-install-and-start-ollama-service-on-intel-gpu)
+- [Install and Start Dify](./dify_quickstart.md#2-install-and-start-dify)
+- [How to Use Dify](./dify_quickstart.md#3-how-to-use-dify)
+
+
+
 ## Quickstart
 ### 1. Install and Start `Ollama` Service on Intel GPU
diff --git a/docs/mddocs/Quickstart/fastchat_quickstart.md b/docs/mddocs/Quickstart/fastchat_quickstart.md
index e89c64f6..43145739 100644
--- a/docs/mddocs/Quickstart/fastchat_quickstart.md
+++ b/docs/mddocs/Quickstart/fastchat_quickstart.md
@@ -4,6 +4,11 @@ FastChat is an open platform for training, serving, and evaluating large languag
 IPEX-LLM can be easily integrated into FastChat so that user can use `IPEX-LLM` as a serving backend in the deployment.
+## Table of Contents
+- [Install IPEX-LLM with FastChat](./fastchat_quickstart.md#1-install-ipex-llm-with-fastchat)
+- [Start the Service](./fastchat_quickstart.md#2-start-the-service)
+
+
 ## Quick Start
 This quickstart guide walks you through installing and running `FastChat` with `ipex-llm`.
diff --git a/docs/mddocs/Quickstart/install_linux_gpu.md b/docs/mddocs/Quickstart/install_linux_gpu.md
index afb64e6f..8fc0c8ff 100644
--- a/docs/mddocs/Quickstart/install_linux_gpu.md
+++ b/docs/mddocs/Quickstart/install_linux_gpu.md
@@ -4,6 +4,15 @@ This guide demonstrates how to install IPEX-LLM on Linux with Intel GPUs. It app
 IPEX-LLM currently supports the Ubuntu 20.04 operating system and later, and supports PyTorch 2.0 and PyTorch 2.1 on Linux. This page demonstrates IPEX-LLM with PyTorch 2.1. Check the [Installation](../Overview/install_gpu.md#linux) page for more details.
+
+## Table of Contents
+- [Install Prerequisites](./install_linux_gpu.md#install-prerequisites)
+- [Install ipex-llm](./install_linux_gpu.md#install-ipex-llm)
+- [Verify Installation](./install_linux_gpu.md#verify-installation)
+- [Runtime Configurations](./install_linux_gpu.md#runtime-configurations)
+- [A Quick Example](./install_linux_gpu.md#a-quick-example)
+- [Tips & Troubleshooting](./install_linux_gpu.md#tips--troubleshooting)
+
 ## Install Prerequisites
 ### Install GPU Driver
diff --git a/docs/mddocs/Quickstart/install_windows_gpu.md b/docs/mddocs/Quickstart/install_windows_gpu.md
index eb7fa9f9..a77d855b 100644
--- a/docs/mddocs/Quickstart/install_windows_gpu.md
+++ b/docs/mddocs/Quickstart/install_windows_gpu.md
@@ -4,6 +4,14 @@
 This guide demonstrates how to install IPEX-LLM on Windows with Intel GPUs. It applies to Intel Core Ultra and Core 11 - 14 gen integrated GPUs (iGPUs), as well as Intel Arc Series GPU.
+## Table of Contents
+- [Install Prerequisites](./install_windows_gpu.md#install-prerequisites)
+- [Install ipex-llm](./install_windows_gpu.md#install-ipex-llm)
+- [Verify Installation](./install_windows_gpu.md#verify-installation)
+- [Monitor GPU Status](./install_windows_gpu.md#monitor-gpu-status)
+- [A Quick Example](./install_windows_gpu.md#a-quick-example)
+- [Tips & Troubleshooting](./install_windows_gpu.md#tips--troubleshooting)
+
 ## Install Prerequisites
 ### (Optional) Update GPU Driver
diff --git a/docs/mddocs/Quickstart/llama3_llamacpp_ollama_quickstart.md b/docs/mddocs/Quickstart/llama3_llamacpp_ollama_quickstart.md
index 8ab22500..5f2dabe7 100644
--- a/docs/mddocs/Quickstart/llama3_llamacpp_ollama_quickstart.md
+++ b/docs/mddocs/Quickstart/llama3_llamacpp_ollama_quickstart.md
@@ -15,6 +15,12 @@ See the demo of running Llama-3-8B-Instruct on Intel Arc GPU using `Ollama` belo
+## Table of Contents
+- [Run Llama 3 using llama.cpp](./llama3_llamacpp_ollama_quickstart.md#1-run-llama-3-using-llamacpp)
+- [Run Llama3 using Ollama](./llama3_llamacpp_ollama_quickstart.md#2-run-llama3-using-ollama)
+
+
+
 ## Quick Start
 This quickstart guide walks you through how to run Llama 3 on Intel GPU using `llama.cpp` / `Ollama` with IPEX-LLM.
diff --git a/docs/mddocs/Quickstart/llama_cpp_quickstart.md b/docs/mddocs/Quickstart/llama_cpp_quickstart.md
index 1297f474..adcccd39 100644
--- a/docs/mddocs/Quickstart/llama_cpp_quickstart.md
+++ b/docs/mddocs/Quickstart/llama_cpp_quickstart.md
@@ -18,6 +18,15 @@ See the demo of running LLaMA2-7B on Intel Arc GPU below.
 >
 > Our latest version is consistent with [62bfef5](https://github.com/ggerganov/llama.cpp/commit/62bfef5194d5582486d62da3db59bf44981b7912) of llama.cpp.
+## Table of Contents
+- [Prerequisites](./llama_cpp_quickstart.md#0-prerequisites)
+- [Install IPEX-LLM for llama.cpp](./llama_cpp_quickstart.md#1-install-ipex-llm-for-llamacpp)
+- [Setup for running llama.cpp](./llama_cpp_quickstart.md#2-setup-for-running-llamacpp)
+- [Example: Running community GGUF models with IPEX-LLM](./llama_cpp_quickstart.md#3-example-running-community-gguf-models-with-ipex-llm)
+- [Troubleshooting](./llama_cpp_quickstart.md#troubleshooting)
+
+
+
 ## Quick Start
 This quickstart guide walks you through installing and running `llama.cpp` with `ipex-llm`.
@@ -35,7 +44,7 @@ IPEX-LLM backend for llama.cpp only supports the more recent GPU drivers. Please
 If you have lower GPU driver version, visit the [Install IPEX-LLM on Windows with Intel GPU Guide](./install_windows_gpu.md), and follow [Update GPU driver](./install_windows_gpu.md#optional-update-gpu-driver).
-### 1 Install IPEX-LLM for llama.cpp
+### 1. Install IPEX-LLM for llama.cpp
 To use `llama.cpp` with IPEX-LLM, first ensure that `ipex-llm[cpp]` is installed.
@@ -59,7 +68,7 @@ To use `llama.cpp` with IPEX-LLM, first ensure that `ipex-llm[cpp]` is installed
 **After the installation, you should have created a conda environment, named `llm-cpp` for instance, for running `llama.cpp` commands with IPEX-LLM.**
-### 2 Setup for running llama.cpp
+### 2. Setup for running llama.cpp
 First you should create a directory to use `llama.cpp`, for instance, use following command to create a `llama-cpp` directory and enter it.
 ```cmd
@@ -127,7 +136,7 @@ To use GPU acceleration, several environment variables are required or recommend
 > export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 > ```
-### 3 Example: Running community GGUF models with IPEX-LLM
+### 3. Example: Running community GGUF models with IPEX-LLM
 Here we provide a simple example to show how to run a community GGUF model with IPEX-LLM.
diff --git a/docs/mddocs/Quickstart/ollama_quickstart.md b/docs/mddocs/Quickstart/ollama_quickstart.md
index 98a8be98..4846f82c 100644
--- a/docs/mddocs/Quickstart/ollama_quickstart.md
+++ b/docs/mddocs/Quickstart/ollama_quickstart.md
@@ -18,9 +18,16 @@ See the demo of running LLaMA2-7B on Intel Arc GPU below.
 >
 > Our current version is consistent with [v0.1.39](https://github.com/ollama/ollama/releases/tag/v0.1.39) of ollama.
+## Table of Contents
+- [Install IPEX-LLM for Ollama](./ollama_quickstart.md#1-install-ipex-llm-for-ollama)
+- [Initialize Ollama](./ollama_quickstart.md#2-initialize-ollama)
+- [Run Ollama Serve](./ollama_quickstart.md#3-run-ollama-serve)
+- [Pull Model](./ollama_quickstart.md#4-pull-model)
+- [Using Ollama](./ollama_quickstart.md#5-using-ollama)
+
 ## Quickstart
-### 1 Install IPEX-LLM for Ollama
+### 1. Install IPEX-LLM for Ollama
 IPEX-LLM's support for `ollama` now is available for Linux system and Windows system.
@@ -53,7 +60,7 @@ Activate the `llm-cpp` conda environment and initialize Ollama by executing the
 **Now you can use this executable file by standard ollama's usage.**
-### 3 Run Ollama Serve
+### 3. Run Ollama Serve
 You may launch the Ollama service as below:
@@ -102,7 +109,7 @@ The console will display messages similar to the following:
-### 4 Pull Model
+### 4. Pull Model
 Keep the Ollama service on and open another terminal and run `./ollama pull <model_name>` in Linux (`ollama.exe pull <model_name>` in Windows) to automatically pull a model. e.g. `dolphin-phi:latest`:
@@ -110,7 +117,7 @@ Keep the Ollama service on and open another terminal and run `./ollama pull
-### 5 Using Ollama
+### 5. Using Ollama
 #### Using Curl
diff --git a/docs/mddocs/Quickstart/open_webui_with_ollama_quickstart.md b/docs/mddocs/Quickstart/open_webui_with_ollama_quickstart.md
index 6981b464..be975a4b 100644
--- a/docs/mddocs/Quickstart/open_webui_with_ollama_quickstart.md
+++ b/docs/mddocs/Quickstart/open_webui_with_ollama_quickstart.md
@@ -13,16 +13,23 @@
+## Table of Contents
+- [Run Ollama with Intel GPU](./open_webui_with_ollama_quickstart.md#1-run-ollama-with-intel-gpu)
+- [Install the Open-Webui](./open_webui_with_ollama_quickstart.md#2-install-the-open-webui)
+- [Start the Open-WebUI](./open_webui_with_ollama_quickstart.md#3-start-the-open-webui)
+- [Using the Open-Webui](./open_webui_with_ollama_quickstart.md#4-using-the-open-webui)
+- [Troubleshooting](./open_webui_with_ollama_quickstart.md#5-troubleshooting)
+
 ## Quickstart
 This quickstart guide walks you through setting up and using [Open WebUI](https://github.com/open-webui/open-webui) with Ollama (using the C++ interface of [`ipex-llm`](https://github.com/intel-analytics/ipex-llm) as an accelerated backend).
-### 1 Run Ollama with Intel GPU
+### 1. Run Ollama with Intel GPU
 Follow the instructions on the [Run Ollama with Intel GPU](./ollama_quickstart.md) to install and run "Ollama Serve". Please ensure that the Ollama server continues to run while you're using the Open WebUI.
-### 2 Install the Open-Webui
+### 2. Install the Open-Webui
 #### Install Node.js & npm
diff --git a/docs/mddocs/Quickstart/privateGPT_quickstart.md b/docs/mddocs/Quickstart/privateGPT_quickstart.md
index b95599d5..b7a53fc3 100644
--- a/docs/mddocs/Quickstart/privateGPT_quickstart.md
+++ b/docs/mddocs/Quickstart/privateGPT_quickstart.md
@@ -13,6 +13,12 @@
+## Table of Contents
+- [Install and Start `Ollama` Service on Intel GPU](./privateGPT_quickstart.md#1-install-and-start-ollama-service-on-intel-gpu)
+- [Install PrivateGPT](./privateGPT_quickstart.md#2-install-privategpt)
+- [Start PrivateGPT](./privateGPT_quickstart.md#3-start-privategpt)
+- [Using PrivateGPT](./privateGPT_quickstart.md#4-using-privategpt)
+
 ## Quickstart
 ### 1. Install and Start `Ollama` Service on Intel GPU
diff --git a/docs/mddocs/Quickstart/ragflow_quickstart.md b/docs/mddocs/Quickstart/ragflow_quickstart.md
index 254aa372..22251831 100644
--- a/docs/mddocs/Quickstart/ragflow_quickstart.md
+++ b/docs/mddocs/Quickstart/ragflow_quickstart.md
@@ -14,9 +14,17 @@
+
+## Table of Contents
+- [Prerequisites](./ragflow_quickstart.md#0-prerequisites)
+- [Install and Start Ollama Service on Intel GPU](./ragflow_quickstart.md#1-install-and-start-ollama-service-on-intel-gpu)
+- [Pull Model](./ragflow_quickstart.md#2-pull-model)
+- [Start `RAGFlow` Service](./ragflow_quickstart.md#3-start-ragflow-service)
+- [Using `RAGFlow`](./ragflow_quickstart.md#4-using-ragflow)
+
 ## Quickstart
-### 0 Prerequisites
+### 0. Prerequisites
 - CPU >= 4 cores
 - RAM >= 16 GB
@@ -95,7 +103,7 @@ To make the change permanent and ensure it persists after a reboot, add or updat
 vm.max_map_count=262144
 ```
-### 3.3 Start the `RAGFlow` server using Docker
+#### 3.3 Start the `RAGFlow` server using Docker
 Build the pre-built Docker images and start up the server:
diff --git a/docs/mddocs/Quickstart/vLLM_quickstart.md b/docs/mddocs/Quickstart/vLLM_quickstart.md
index 155fd321..764b35c1 100644
--- a/docs/mddocs/Quickstart/vLLM_quickstart.md
+++ b/docs/mddocs/Quickstart/vLLM_quickstart.md
@@ -11,6 +11,13 @@ Currently, IPEX-LLM integrated vLLM only supports the following models:
 - ChatGLM series models
 - Baichuan series models
+## Table of Contents
+- [Install IPEX-LLM for vLLM](./vLLM_quickstart.md#1-install-ipex-llm-for-vllm)
+- [Install vLLM](./vLLM_quickstart.md#2-install-vllm)
+- [Offline Inference/Service](./vLLM_quickstart.md#3-offline-inferenceservice)
+- [About Tensor Parallel](./vLLM_quickstart.md#4-about-tensor-parallel)
+- [Performing Benchmark](./vLLM_quickstart.md#5-performing-benchmark)
+
 ## Quick Start
@@ -48,9 +55,9 @@ pip install transformers_stream_generator einops tiktoken
 **Now you are all set to use vLLM with IPEX-LLM**
-## 3. Offline inference/Service
+### 3. Offline Inference/Service
-### Offline inference
+#### Offline inference
 To run offline inference using vLLM for a quick impression, use the following example.
@@ -87,7 +94,7 @@ Prompt: 'The capital of France is', Generated text: ' Paris.\nThe capital of Fra
 Prompt: 'The future of AI is', Generated text: " bright, but it's not without challenges. As AI continues to evolve,"
-### Service
+#### Service
 > [!NOTE]
 > Because of using JIT compilation for kernels. We recommend to send a few requests for warmup before using the service for the best performance.
@@ -170,7 +177,7 @@ Below shows an example output using `Qwen1.5-7B-Chat` with low-bit format `sym_i
 > export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 > ```
-## 4. About Tensor parallel
+### 4. About Tensor Parallel
 > [!NOTE]
 > We recommend to use docker for tensor parallel deployment. Check our serving docker image `intelanalytics/ipex-llm-serving-xpu`.
@@ -223,7 +230,7 @@ If the service have booted successfully, you should see the output similar to th
-## 5.Performing benchmark
+### 5. Performing Benchmark
 To perform benchmark, you can use the **benchmark_throughput** script that is originally provided by vLLM repo.
diff --git a/docs/mddocs/Quickstart/webui_quickstart.md b/docs/mddocs/Quickstart/webui_quickstart.md
index 2775605f..6600c6c0 100644
--- a/docs/mddocs/Quickstart/webui_quickstart.md
+++ b/docs/mddocs/Quickstart/webui_quickstart.md
@@ -13,6 +13,14 @@ See the demo of running LLaMA2-7B on an Intel Core Ultra laptop below.
+## Table of Contents
+- [Install IPEX-LLM](./webui_quickstart.md#1-install-ipex-llm)
+- [Install the WebUI](./webui_quickstart.md#2-install-the-webui)
+- [Start the WebUI Server](./webui_quickstart.md#3-start-the-webui-server)
+- [Using the WebUI](./webui_quickstart.md#4-using-the-webui)
+- [Advanced Usage](./webui_quickstart.md#5-advanced-usage)
+- [Troubleshooting](./webui_quickstart.md#troubleshooting)
+
 ## Quickstart
 This quickstart guide walks you through setting up and using the [Text Generation WebUI](https://github.com/intel-analytics/text-generation-webui) with `ipex-llm`.
@@ -23,13 +31,13 @@ A preview of the WebUI in action is shown below:
-### 1 Install IPEX-LLM
+### 1. Install IPEX-LLM
 To use the WebUI, first ensure that IPEX-LLM is installed. Follow the instructions on the [IPEX-LLM Installation Quickstart for Windows with Intel GPU](./install_windows_gpu.md).
 **After the installation, you should have created a conda environment, named `llm` for instance, for running `ipex-llm` applications.**
-### 2 Install the WebUI
+### 2. Install the WebUI
 #### Download the WebUI
 Download the `text-generation-webui` with IPEX-LLM integrations from [this link](https://github.com/intel-analytics/text-generation-webui/archive/refs/heads/ipex-llm.zip). Unzip the content into a directory, e.g.,`C:\text-generation-webui`.
@@ -50,7 +58,7 @@ pip install -r extensions/openai/requirements.txt
 > [!NOTE]
 > `extensions/openai/requirements.txt` is for API service. If you don't need the API service, you can omit this command.
-### 3 Start the WebUI Server
+### 3. Start the WebUI Server
 #### Set Environment Variables
 Configure oneAPI variables by running the following command in **Miniforge Prompt**: