Update part of Overview guide in mddocs (1/2) (#11378)

* Create install.md
* Update install_cpu.md
* Delete original docs/mddocs/Overview/install_cpu.md
* Update install_cpu.md
* Update install_gpu.md
* update llm.md and install.md
* Update docs in KeyFeatures
* Review and fix typos
* Fix on folded NOTE
* Small fix
* Small fix
* Remove empty known_issue.md
* Small fix
* Small fix
* Further fix
* Fixes
* Fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-06-21 10:45:17 +08:00
commit 33b9a9c4c9
parent 4ba82191f2
11 changed files with 441 additions and 521 deletions
--- a/docs/mddocs/Overview/KeyFeatures/multi_gpus_selection.md
+++ b/docs/mddocs/Overview/KeyFeatures/multi_gpus_selection.md
@@ -6,24 +6,22 @@ In [Inference on GPU](inference_on_gpu.md) and [Finetune (QLoRA)](finetune.md),

 The `sycl-ls` tool enumerates the devices available in the system. You can use it after you set up the oneAPI environment:

-```eval_rst
-.. tabs::
-   .. tab:: Windows
+- For **Windows users**:

-      Please make sure you are using CMD (Miniforge Prompt if using conda):
+   Please make sure you are using CMD (Miniforge Prompt if using conda):

-      .. code-block:: cmd
+   ```cmd
+   call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
+   sycl-ls
+   ```

-        call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-        sycl-ls
+- For **Linux users**:

-   .. tab:: Linux
+   ```bash
+   source /opt/intel/oneapi/setvars.sh
+   sycl-ls
+   ```

-      .. code-block:: bash
-
-         source /opt/intel/oneapi/setvars.sh
-         sycl-ls
-```

 If you have two Arc A770 GPUs, the output looks something like this:
 ```
@@ -40,7 +38,7 @@ This output shows there are two Arc A770 GPUs as well as an Intel iGPU on this m

 ## Devices selection
 To enable xpu, move your model and input to xpu with the code below:
-```
+```python
 model = model.to('xpu')
 input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
 ```
@@ -50,7 +48,7 @@ To select the desired devices, there are two ways: one is changing the code, ano
 To specify an xpu device, change `to('xpu')` to `to('xpu:[device_id]')`; `device_id` is counted from zero.

 If you want to use the second device, change the code like this:
-```
+```python
 model = model.to('xpu:1')
 input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu:1')
 ```
@@ -59,28 +57,23 @@ input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu:1')
 The device selection environment variable `ONEAPI_DEVICE_SELECTOR` can be used to limit the choice of Intel GPU devices. As the `sycl-ls` output above shows, the last three lines are three Level Zero GPU devices, so we can use `ONEAPI_DEVICE_SELECTOR=level_zero:[gpu_id]` to select devices.
 For example, to use the second A770 GPU, you can run the Python script like this:

-```eval_rst
-.. tabs::
-   .. tab:: Windows
+- For **Windows users**:

-      .. code-block:: cmd
+   ```cmd
+   set ONEAPI_DEVICE_SELECTOR=level_zero:1 
+   python generate.py
+   ```
+   Through ``set ONEAPI_DEVICE_SELECTOR=level_zero:1``, only the second A770 GPU will be available for the current environment.

-         set ONEAPI_DEVICE_SELECTOR=level_zero:1 
-         python generate.py
+- For **Linux users**:

-      Through ``set ONEAPI_DEVICE_SELECTOR=level_zero:1``, only the second A770 GPU will be available for the current environment.
+   ```bash
+   ONEAPI_DEVICE_SELECTOR=level_zero:1 python generate.py
+   ```

-   .. tab:: Linux
+   ``ONEAPI_DEVICE_SELECTOR=level_zero:1`` in the command above only affects the current Python program. Alternatively, you can export the environment variable and then run your Python script:

-      .. code-block:: bash
-
-         ONEAPI_DEVICE_SELECTOR=level_zero:1 python generate.py
-
-      ``ONEAPI_DEVICE_SELECTOR=level_zero:1`` in upon command only affect in current python program. Also, you can export the environment variable, then run your python:
-
-      .. code-block:: bash
-
-         export ONEAPI_DEVICE_SELECTOR=level_zero:1
-         python generate.py
-
-```
+   ```bash
+   export ONEAPI_DEVICE_SELECTOR=level_zero:1
+   python generate.py
+   ```
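
The two selection routes described in this file (changing the device string in code, or setting `ONEAPI_DEVICE_SELECTOR`) can be sketched in Python as follows. This is a minimal illustration and not part of the commit: `xpu_device` and `selector_env` are hypothetical helper names, and the sketch only builds the device string and the environment mapping. It does not import `torch` or touch any GPU.

```python
import os


def xpu_device(device_id=None):
    """Return the string passed to `.to(...)`; device ids count from zero."""
    if device_id is None:
        return "xpu"
    if device_id < 0:
        raise ValueError("device_id must be non-negative")
    return f"xpu:{device_id}"


def selector_env(gpu_id, base_env=None):
    """Return a copy of the environment with ONEAPI_DEVICE_SELECTOR set.

    Mirrors `ONEAPI_DEVICE_SELECTOR=level_zero:<gpu_id> python generate.py`:
    when the result is passed as `env=` to subprocess.run, only that child
    process sees the restriction.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ONEAPI_DEVICE_SELECTOR"] = f"level_zero:{gpu_id}"
    return env


if __name__ == "__main__":
    print(xpu_device(1))
    print(selector_env(1, base_env={})["ONEAPI_DEVICE_SELECTOR"])
```

Passing `selector_env(1)` to `subprocess.run(["python", "generate.py"], env=...)` would correspond to the per-command Linux form, whereas mutating `os.environ` before launching corresponds to the `export` form.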
--- a/docs/mddocs/Overview/KeyFeatures/native_format.md
+++ b/docs/mddocs/Overview/KeyFeatures/native_format.md
@@ -2,17 +2,15 @@

 You may also convert Hugging Face *Transformers* models into native INT4 format for maximum performance as follows.

-```eval_rst
-.. note::
+> [!NOTE]
+> Currently only llama/bloom/gptneox/starcoder/chatglm model families are supported; you may use the corresponding API to load the converted model. (For other models, you can use the Hugging Face ``transformers`` format as described [here](./hugging_face_format.md))

-   Currently only llama/bloom/gptneox/starcoder/chatglm model families are supported; you may use the corresponding API to load the converted model. (For other models, you can use the Hugging Face ``transformers`` format as described `here <./hugging_face_format.html>`_).
-```

 ```python
 # convert the model
 from ipex_llm import llm_convert
 ipex_llm_path = llm_convert(model='/path/to/model/',
-       outfile='/path/to/output/', outtype='int4', model_family="llama")
+                            outfile='/path/to/output/', outtype='int4', model_family="llama")

 # load the converted model
 # switch to ChatGLMForCausalLM/GptneoxForCausalLM/BloomForCausalLM/StarcoderForCausalLM to load other models
@@ -25,8 +23,5 @@ output_ids = llm.generate(input_ids, ...)
 output = llm.batch_decode(output_ids)
 ```

-```eval_rst
-.. seealso::
-   
-   See the complete example `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models>`_
-```
+> [!NOTE] 
+> See the complete example [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models)
--- a/docs/mddocs/Overview/KeyFeatures/optimize_model.md
+++ b/docs/mddocs/Overview/KeyFeatures/optimize_model.md
@@ -60,10 +60,7 @@ model = load_low_bit(model, saved_dir) # Load the optimized model
 ```


-```eval_rst
-.. seealso::
+> [!NOTE]
+> - Please refer to the [API documentation](https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/optimize.html) for more details.
+> - We also provide detailed examples on how to run PyTorch models (e.g., Openai Whisper, LLaMA2, ChatGLM2, Falcon, MPT, Baichuan2, etc.) using IPEX-LLM. See the complete CPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models) and GPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models)

-   * Please refer to the `API documentation <https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/optimize.html>`_ for more details.
-
-   * We also provide detailed examples on how to run PyTorch models (e.g., Openai Whisper, LLaMA2, ChatGLM2, Falcon, MPT, Baichuan2, etc.) using IPEX-LLM. See the complete CPU examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models>`_ and GPU examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models>`_.
-```
--- a/docs/mddocs/Overview/KeyFeatures/transformers_style_api.md
+++ b/docs/mddocs/Overview/KeyFeatures/transformers_style_api.md
@@ -0,0 +1,6 @@
+# `transformers`-style API
+
+You may run the LLMs using `transformers`-style API in `ipex-llm`.
+
+* [Hugging Face `transformers` Format](./hugging_face_format.md)
+* [Native Format](./native_format.md)
--- a/docs/mddocs/Overview/KeyFeatures/transformers_style_api.rst
+++ b/docs/mddocs/Overview/KeyFeatures/transformers_style_api.rst
@@ -1,10 +0,0 @@
-``transformers``-style API
-================================
-
-You may run the LLMs using ``transformers``-style API in ``ipex-llm``.
-
-* |hugging_face_transformers_format|_
-* `Native Format <./native_format.html>`_
-
-.. |hugging_face_transformers_format| replace:: Hugging Face ``transformers`` Format
-.. _hugging_face_transformers_format: ./hugging_face_format.html
--- a/docs/mddocs/Overview/install.md
+++ b/docs/mddocs/Overview/install.md
@@ -0,0 +1,6 @@
+# IPEX-LLM Installation
+
+Here, we provide instructions on how to install `ipex-llm` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
+
+- [CPU](./install_cpu.md)
+- [GPU](./install_gpu.md)
--- a/docs/mddocs/Overview/install.rst
+++ b/docs/mddocs/Overview/install.rst
@@ -1,7 +0,0 @@
-IPEX-LLM Installation
-================================
-
-Here, we provide instructions on how to install ``ipex-llm`` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
-
-* `CPU <./install_cpu.html>`_
-* `GPU <./install_gpu.html>`_
--- a/docs/mddocs/Overview/install_cpu.md
+++ b/docs/mddocs/Overview/install_cpu.md
@@ -4,33 +4,26 @@

 Install IPEX-LLM with CPU support using pip:

-```eval_rst	
-.. tabs::
+- For **Linux users**:

-   .. tab:: Linux
+  ```bash
+  pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+  ```

-      .. code-block:: bash
+- For **Windows users**:

-         pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
-
-   .. tab:: Windows
-
-      .. code-block:: cmd
-
-         pip install --pre --upgrade ipex-llm[all]
-```
+  ```cmd
+  pip install --pre --upgrade ipex-llm[all]
+  ```

 Please refer to [Environment Setup](#environment-setup) for more information.

-```eval_rst
-.. note::
+> [!NOTE]
+> `all` option will trigger installation of all the dependencies for common LLM application development.

-   ``all`` option will trigger installation of all the dependencies for common LLM application development.
+> [!IMPORTANT]
+> `ipex-llm` is tested with Python 3.9, 3.10 and 3.11; Python 3.11 is recommended for best practices.

-.. important::
-
-   ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11; Python 3.11 is recommended for best practices.
-```

 ## Recommended Requirements

@@ -53,48 +46,39 @@ For optimal performance with LLM models using IPEX-LLM optimizations on Intel CP

 First, we recommend using [Conda](https://conda-forge.org/download/) to create a Python 3.11 environment:

-```eval_rst	
-.. tabs::
+- For **Linux users**:

-   .. tab:: Linux
+  ```bash
+  conda create -n llm python=3.11
+  conda activate llm

-      .. code-block:: bash
+  pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+  ```

-         conda create -n llm python=3.11
-         conda activate llm
+- For **Windows users**:
+```cmd
+conda create -n llm python=3.11
+conda activate llm

-         pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
-
-   .. tab:: Windows
-
-      .. code-block:: cmd
-
-         conda create -n llm python=3.11
-         conda activate llm
-
-         pip install --pre --upgrade ipex-llm[all]
+pip install --pre --upgrade ipex-llm[all]
 ```

 Then, to run an LLM with IPEX-LLM optimizations (taking `example.py` as an example):

-```eval_rst	
-.. tabs::
+- For **running on Client**:

-   .. tab:: Client
+  It is recommended to run directly with full utilization of all CPU cores:

-      It is recommended to run directly with full utilization of all CPU cores:
+  ```bash
+  python example.py
+  ```

-      .. code-block:: bash
+- For **running on Server**:

-         python example.py
+  It is recommended to run with all the physical cores of a single socket:

-   .. tab:: Server
-
-      It is recommended to run with all the physical cores of a single socket:
-
-      .. code-block:: bash
-
-         # e.g. for a server with 48 cores per socket
-         export OMP_NUM_THREADS=48
-         numactl -C 0-47 -m 0 python example.py
-```
+  ```bash
+  # e.g. for a server with 48 cores per socket
+  export OMP_NUM_THREADS=48
+  numactl -C 0-47 -m 0 python example.py
+  ```
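
The server recommendation above (one thread per physical core, pinned to a single socket) can be expressed as a small Python helper. This is a sketch under stated assumptions rather than anything in the commit: `server_launch` is a hypothetical name, and it assumes the physical cores of socket `n` are numbered contiguously, as in the 48-core example. Core numbering varies between machines, so it is worth confirming with `lscpu` before relying on the computed range.

```python
import shlex


def server_launch(cores_per_socket, script="example.py", socket=0):
    """Build the environment and numactl command for a single-socket run.

    Assumes core ids for `socket` run from socket * cores_per_socket
    upward without gaps (an assumption, not guaranteed on every host).
    """
    if cores_per_socket < 1:
        raise ValueError("cores_per_socket must be at least 1")
    first = socket * cores_per_socket
    last = first + cores_per_socket - 1
    env = {"OMP_NUM_THREADS": str(cores_per_socket)}
    cmd = ["numactl", "-C", f"{first}-{last}", "-m", str(socket), "python", script]
    return env, cmd


if __name__ == "__main__":
    env, cmd = server_launch(48)
    print(f"export OMP_NUM_THREADS={env['OMP_NUM_THREADS']}")
    print(shlex.join(cmd))
```

For 48 cores per socket this reproduces the two shell lines shown in the diff.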
--- a/docs/mddocs/Overview/install_gpu.md
+++ b/docs/mddocs/Overview/install_gpu.md
@@ -6,21 +6,15 @@

 IPEX-LLM on Windows supports Intel iGPU and dGPU.

-```eval_rst
-.. important::
-
-    IPEX-LLM on Windows only supports PyTorch 2.1.
-```
+> [!IMPORTANT]
+> IPEX-LLM on Windows only supports PyTorch 2.1.

 To apply Intel GPU acceleration, please first verify your GPU driver version.

-```eval_rst
-.. note::
+> [!NOTE]
+> The GPU driver version of your device can be checked in the "Task Manager" -> GPU 0 (or GPU 1, etc.) -> Driver version.

-   The GPU driver version of your device can be checked in the "Task Manager" -> GPU 0 (or GPU 1, etc.) -> Driver version.
-```
-
-If you have driver version lower than `31.0.101.5122`, it is recommended to [**update your GPU driver to the latest**](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html):
+If you have driver version lower than `31.0.101.5122`, it is recommended to [**update your GPU driver to the latest**](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html).

 <!-- Intel® oneAPI Base Toolkit 2024.0 installation methods:

@@ -47,34 +41,28 @@ If you have driver version lower than `31.0.101.5122`, it is recommended to [**u

 We recommend using [Miniforge](https://conda-forge.org/download/) to create a Python 3.11 environment.

-```eval_rst
-.. important::
-
-   ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.
-```
+> [!IMPORTANT]
+> ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.

 The easiest way to install `ipex-llm` is with the following commands, choosing either the US or CN website for `extra-index-url`:

-```eval_rst
-.. tabs::
-   .. tab:: US
+- For **US**:

-      .. code-block:: cmd
+   ```cmd
+   conda create -n llm python=3.11 libuv
+   conda activate llm

-         conda create -n llm python=3.11 libuv
-         conda activate llm
+   pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+   ```

-         pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+- For **CN**:

-   .. tab:: CN
+   ```cmd
+   conda create -n llm python=3.11 libuv
+   conda activate llm

-      .. code-block:: cmd
-
-         conda create -n llm python=3.11 libuv
-         conda activate llm
-
-         pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
-```
+   pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
+   ```

 #### Install IPEX-LLM From Wheel

@@ -98,11 +86,9 @@ pip install intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-win_amd64.whl
 pip install --pre --upgrade ipex-llm[xpu]
 ```

-```eval_rst
-.. note::
+> [!NOTE]
+> All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for ``torch``, ``torchvision``, and ``intel_extension_for_pytorch`` by replacing ``cp311`` with ``cp39`` or ``cp310``, respectively.

-   All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for ``torch``, ``torchvision``, and ``intel_extension_for_pytorch`` by replacing ``cp11`` with ``cp39`` or ``cp310``, respectively.
-```
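
The wheel-name substitution described in the note can be illustrated with a short Python sketch. It is not part of the commit: `cpython_tag` and `retag_wheel` are hypothetical helpers, and the sketch assumes the wheel name carries the Python 3.11 tag `cp311`, as in the `intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-win_amd64.whl` name shown in the hunk context. It only rewrites the file name; whether a wheel for the other Python version is actually published is a separate question.

```python
import sys


def cpython_tag(version_info=None):
    """Return the CPython wheel tag, e.g. 'cp39' for Python 3.9."""
    major, minor = (version_info or sys.version_info)[:2]
    return f"cp{major}{minor}"


def retag_wheel(filename, version_info=None):
    """Replace the Python 3.11 tag in a wheel file name with another tag."""
    return filename.replace("cp311", cpython_tag(version_info))


if __name__ == "__main__":
    name = "intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-win_amd64.whl"
    print(retag_wheel(name, (3, 10)))
```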

 ### Runtime Configuration

@@ -116,27 +102,20 @@ call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"

 Please also set the following environment variable if you would like to run LLMs on: -->

-```eval_rst
-.. tabs::
-   .. tab:: Intel iGPU
+- For **Intel iGPU**:
+   ```cmd
+   set SYCL_CACHE_PERSISTENT=1
+   set BIGDL_LLM_XMX_DISABLED=1
+   ```

-      .. code-block:: cmd
+- For **Intel Arc™ A-Series Graphics**:
+   ```cmd
+   set SYCL_CACHE_PERSISTENT=1
+   ```

-         set SYCL_CACHE_PERSISTENT=1
-         set BIGDL_LLM_XMX_DISABLED=1
+> [!NOTE]
+> For **the first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

-   .. tab:: Intel Arc™ A-Series Graphics
-
-      .. code-block:: cmd
-
-         set SYCL_CACHE_PERSISTENT=1
-```
-
-```eval_rst
-.. note::
-
-   For **the first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-```

 ### Troubleshooting

@@ -173,458 +152,438 @@ IPEX-LLM GPU support on Linux has been verified on:
 * Intel Data Center GPU Flex Series
 * Intel Data Center GPU Max Series

-```eval_rst
-.. important::
+> [!IMPORTANT]
+> IPEX-LLM on Linux supports PyTorch 2.0 and PyTorch 2.1.
+> 
+> **Warning**
+> 
+> IPEX-LLM support for PyTorch 2.0 is deprecated as of ``ipex-llm >= 2.1.0b20240511``.

-    IPEX-LLM on Linux supports PyTorch 2.0 and PyTorch 2.1. 
+> [!IMPORTANT]
+> We currently support the Ubuntu 20.04 operating system and later.

-    .. warning::
+- For **PyTorch 2.1**:

-       IPEX-LLM support for Pytorch 2.0 is deprecated as of ``ipex-llm >= 2.1.0b20240511``.
-```
-
-```eval_rst
-.. important::
-
-    We currently support the Ubuntu 20.04 operating system and later.
-```
-
-```eval_rst
-.. tabs::
-   .. tab:: PyTorch 2.1
-
-      To enable IPEX-LLM for Intel GPUs with PyTorch 2.1, here are several prerequisite steps for tools installation and environment preparation:
+   To enable IPEX-LLM for Intel GPUs with PyTorch 2.1, here are several prerequisite steps for tools installation and environment preparation:


-      * Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.
+   - Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.

-        .. seealso::
+      > **Tip**:
+      >
+      > Please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
+      >
+      > See [release page](https://dgpu-docs.intel.com/releases/index.html) for latest version.

-           Please refer to our `driver installation <https://dgpu-docs.intel.com/driver/installation.html>`_ for general purpose GPU capabilities.
+      > **Note**:
+      >
+      > For Intel Core™ Ultra integrated GPU, please make sure the level_zero version is >= 1.3.28717. The level_zero version can be checked with ``sycl-ls``; it is shown in brackets at the end of the line tagged ``[ext_oneapi_level_zero:gpu]``.
+      > ```
+      > [opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2023.16.12.0.12_195853.xmain-hotfix]
+      > [opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2023.16.12.0.12_195853.xmain-hotfix]
+      > [opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO  [24.09.28717.12]
+      > [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.28717]
+      > ```
+      >
+      > If you have level_zero version < 1.3.28717, you could update as follows:
+      >
+      > ```bash
+      > wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-core_1.0.16238.4_amd64.deb
+      > wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-opencl_1.0.16238.4_amd64.deb
+      > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu-dbgsym_1.3.28717.12_amd64.ddeb
+      > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu_1.3.28717.12_amd64.deb
+      > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd-dbgsym_24.09.28717.12_amd64.ddeb
+      > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd_24.09.28717.12_amd64.deb
+      > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/libigdgmm12_22.3.17_amd64.deb
+      > sudo dpkg -i *.deb
+      > ```
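
The version check described in this note can be automated with a small parser over `sycl-ls` output. The following is a sketch only and not part of the commit: `level_zero_versions` and `meets_minimum` are hypothetical names, and the regular expression assumes the line layout shown in the sample output above (the tag at the start of the line, the version in the final pair of brackets).

```python
import re

# Minimum level_zero version named in the note above.
MIN_LEVEL_ZERO = (1, 3, 28717)

# Matches e.g. "[ext_oneapi_level_zero:gpu:0] ... [1.3.28717]".
_LEVEL_ZERO_LINE = re.compile(r"^\[ext_oneapi_level_zero:gpu:\d+\].*\[([\d.]+)\]\s*$")


def level_zero_versions(sycl_ls_output):
    """Return one version tuple per Level Zero GPU line in the output."""
    versions = []
    for line in sycl_ls_output.splitlines():
        match = _LEVEL_ZERO_LINE.match(line.strip())
        if match:
            parts = [p for p in match.group(1).split(".") if p]
            versions.append(tuple(int(p) for p in parts))
    return versions


def meets_minimum(version, minimum=MIN_LEVEL_ZERO):
    """Compare version tuples component by component."""
    return version >= minimum


if __name__ == "__main__":
    sample = (
        "[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO  [24.09.28717.12]\n"
        "[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.28717]\n"
    )
    for version in level_zero_versions(sample):
        print(version, meets_minimum(version))
```

In practice the input string would come from running `sycl-ls` after sourcing the oneAPI environment, for example through `subprocess.run(["sycl-ls"], capture_output=True, text=True).stdout`.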

-           See `release page <https://dgpu-docs.intel.com/releases/index.html>`_ for latest version.
-
-        .. note::
-
-           For Intel Core™ Ultra integrated GPU, please make sure level_zero version >= 1.3.28717. The level_zero version can be checked with ``sycl-ls``, and verison will be tagged be ``[ext_oneapi_level_zero:gpu]``.
-            
-           .. code-block::
-
-               [opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2023.16.12.0.12_195853.xmain-hotfix]
-               [opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2023.16.12.0.12_195853.xmain-hotfix]
-               [opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO  [24.09.28717.12]
-               [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.28717]
-
-           If you have level_zero version < 1.3.28717, you could update as follows:
-
-           .. code-block:: bash
-
-               wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-core_1.0.16238.4_amd64.deb
-               wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-opencl_1.0.16238.4_amd64.deb
-               wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu-dbgsym_1.3.28717.12_amd64.ddeb
-               wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu_1.3.28717.12_amd64.deb
-               wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd-dbgsym_24.09.28717.12_amd64.ddeb
-               wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd_24.09.28717.12_amd64.deb
-               wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/libigdgmm12_22.3.17_amd64.deb
-               sudo dpkg -i *.deb
-
-      * Step 2: Download and install `Intel® oneAPI Base Toolkit <https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html>`_ with version 2024.0. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.
+   - Step 2: Download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html) with version 2024.0. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.

      Intel® oneAPI Base Toolkit 2024.0 installation methods:
+      <details>
+      <summary> For <b>APT installer</b> </summary>

-      .. tabs::
+      - Step 1: Set up repository

-         .. tab:: APT installer
+         ```bash
+         wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
+         echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
+         sudo apt update
+         ```

-            Step 1: Set up repository
+      - Step 2: Install the package

-            .. code-block:: bash
+         ```bash
+         sudo apt install intel-oneapi-common-vars=2024.0.0-49406 \
+            intel-oneapi-common-oneapi-vars=2024.0.0-49406 \
+            intel-oneapi-diagnostics-utility=2024.0.0-49093 \
+            intel-oneapi-compiler-dpcpp-cpp=2024.0.2-49895 \
+            intel-oneapi-dpcpp-ct=2024.0.0-49381 \
+            intel-oneapi-mkl=2024.0.0-49656 \
+            intel-oneapi-mkl-devel=2024.0.0-49656 \
+            intel-oneapi-mpi=2021.11.0-49493 \
+            intel-oneapi-mpi-devel=2021.11.0-49493 \
+            intel-oneapi-dal=2024.0.1-25 \
+            intel-oneapi-dal-devel=2024.0.1-25 \
+            intel-oneapi-ippcp=2021.9.1-5 \
+            intel-oneapi-ippcp-devel=2021.9.1-5 \
+            intel-oneapi-ipp=2021.10.1-13 \
+            intel-oneapi-ipp-devel=2021.10.1-13 \
+            intel-oneapi-tlt=2024.0.0-352 \
+            intel-oneapi-ccl=2021.11.2-5 \
+            intel-oneapi-ccl-devel=2021.11.2-5 \
+            intel-oneapi-dnnl-devel=2024.0.0-49521 \
+            intel-oneapi-dnnl=2024.0.0-49521 \
+            intel-oneapi-tcm-1.0=1.0.0-435
+         ```

-               wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
-               echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
-               sudo apt update
+         > **Note**:
+         >
+         > You can uninstall the package by running the following command:
+         >
+         > ```bash
+         > sudo apt autoremove intel-oneapi-common-vars
+         > ```
+      </details>

-            Step 2: Install the package
+      <details>
+      <summary> For <b>PIP installer</b> </summary>

-            .. code-block:: bash
+      - Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``.

-               sudo apt install intel-oneapi-common-vars=2024.0.0-49406 \
-                  intel-oneapi-common-oneapi-vars=2024.0.0-49406 \
-                  intel-oneapi-diagnostics-utility=2024.0.0-49093 \
-                  intel-oneapi-compiler-dpcpp-cpp=2024.0.2-49895 \
-                  intel-oneapi-dpcpp-ct=2024.0.0-49381 \
-                  intel-oneapi-mkl=2024.0.0-49656 \
-                  intel-oneapi-mkl-devel=2024.0.0-49656 \
-                  intel-oneapi-mpi=2021.11.0-49493 \
-                  intel-oneapi-mpi-devel=2021.11.0-49493 \
-                  intel-oneapi-dal=2024.0.1-25 \
-                  intel-oneapi-dal-devel=2024.0.1-25 \
-                  intel-oneapi-ippcp=2021.9.1-5 \
-                  intel-oneapi-ippcp-devel=2021.9.1-5 \
-                  intel-oneapi-ipp=2021.10.1-13 \
-                  intel-oneapi-ipp-devel=2021.10.1-13 \
-                  intel-oneapi-tlt=2024.0.0-352 \
-                  intel-oneapi-ccl=2021.11.2-5 \
-                  intel-oneapi-ccl-devel=2021.11.2-5 \
-                  intel-oneapi-dnnl-devel=2024.0.0-49521 \
-                  intel-oneapi-dnnl=2024.0.0-49521 \
-                  intel-oneapi-tcm-1.0=1.0.0-435
+         ```bash
+         export PYTHONUSERBASE=~/intel/oneapi
+         pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0 --user
+         ```

-            .. note::
+         > **Note**:
+         >
+         > The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.

-               You can uninstall the package by running the following command:
+      - Step 2: Configure your working conda environment (e.g. with name ``llm``) to append oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.

-               .. code-block:: bash
-               
-                  sudo apt autoremove intel-oneapi-common-vars
+         ```bash
+         conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
+         ```

-         .. tab:: PIP installer
+         > **Note**:
+         >
+         > You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
+         > You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.

-            Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``.
+         > **Note**:
+         >
+         > You are recommended not to install other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
+         > You can uninstall the oneAPI package by simply deleting the package folder, and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
+         >
+         > ```bash
+         > rm -r ~/intel/oneapi
+         > conda env config vars unset LD_LIBRARY_PATH -n llm
+         > ```
+      </details>

-            .. code-block:: bash
+      <details>
+      <summary> For <b>Offline installer</b> </summary>
+      
+      Using the offline installer allows you to customize the installation path.

-               export PYTHONUSERBASE=~/intel/oneapi
-               pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0 --user
+      ```bash      
+      wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/20f4e6a1-6b0b-4752-b8c1-e5eacba10e01/l_BaseKit_p_2024.0.0.49564_offline.sh
+      sudo sh ./l_BaseKit_p_2024.0.0.49564_offline.sh
+      ```
+      > **Note**:
+      >
+      > You can also modify the installation or uninstall the package by running the following commands:
+      >
+      > ```bash
+      > cd /opt/intel/oneapi/installer
+      > sudo ./installer
+      > ```
+      </details>

-            .. note::
+- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

-               The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.
-
-            Step 2: Configure your working conda environment (e.g. with name ``llm``) to append oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.
-
-            .. code-block:: bash
-
-               conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
-
-            .. note::
-               You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
-               You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.
-
-            .. note::
-
-               You are recommended not to install other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
-               You can uninstall the oneAPI package by simply deleting the package folder, and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
-               
-               .. code-block:: bash
-               
-                  rm -r ~/intel/oneapi
-                  conda env config vars unset LD_LIBRARY_PATH -n llm
-
-         .. tab:: Offline installer
-         
-            Using the offline installer allows you to customize the installation path.
-
-            .. code-block:: bash
-            
-               wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/20f4e6a1-6b0b-4752-b8c1-e5eacba10e01/l_BaseKit_p_2024.0.0.49564_offline.sh
-               sudo sh ./l_BaseKit_p_2024.0.0.49564_offline.sh
-
-            .. note::
-
-                  You can also modify the installation or uninstall the package by running the following commands:
-
-                  .. code-block:: bash
-
-                     cd /opt/intel/oneapi/installer
-                     sudo ./installer
-
-   .. tab:: PyTorch 2.0 (deprecated for versions ``ipex-llm >= 2.1.0b20240511``)
-
-      To enable IPEX-LLM for Intel GPUs with PyTorch 2.0, here're several prerequisite steps for tools installation and environment preparation:
+   To enable IPEX-LLM for Intel GPUs with PyTorch 2.0, here are several prerequisite steps for tools installation and environment preparation:


-      * Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. Highly recommend installing the latest version of intel-i915-dkms using apt.
+   - Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.

-        .. seealso::
+      > **Tip**:
+      >
+      >   Please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
+      >
+      >   See [release page](https://dgpu-docs.intel.com/releases/index.html) for latest version.

-           Please refer to our `driver installation <https://dgpu-docs.intel.com/driver/installation.html>`_ for general purpose GPU capabilities.
-
-           See `release page <https://dgpu-docs.intel.com/releases/index.html>`_ for latest version.
-
-      * Step 2: Download and install `Intel® oneAPI Base Toolkit <https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html>`_ with version 2023.2. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.
+   - Step 2: Download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html) with version 2023.2. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.

      Intel® oneAPI Base Toolkit 2023.2 installation methods:

-      .. tabs::
-         .. tab:: APT installer
+      <details>
+      <summary> For <b>APT installer</b> </summary>

-            Step 1: Set up repository
+      - Step 1: Set up repository

-            .. code-block:: bash
+         ```bash
+         wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
+         echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
+         sudo apt update
+         ```

-               wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
-               echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
-               sudo apt update
+      - Step 2: Install the packages

-            Step 2: Install the packages
+         ```bash
+         sudo apt install -y intel-oneapi-common-vars=2023.2.0-49462 \
+            intel-oneapi-compiler-cpp-eclipse-cfg=2023.2.0-49495 intel-oneapi-compiler-dpcpp-eclipse-cfg=2023.2.0-49495 \
+            intel-oneapi-diagnostics-utility=2022.4.0-49091 \
+            intel-oneapi-compiler-dpcpp-cpp=2023.2.0-49495 \
+            intel-oneapi-mkl=2023.2.0-49495 intel-oneapi-mkl-devel=2023.2.0-49495 \
+            intel-oneapi-mpi=2021.10.0-49371 intel-oneapi-mpi-devel=2021.10.0-49371 \
+            intel-oneapi-tbb=2021.10.0-49541 intel-oneapi-tbb-devel=2021.10.0-49541 \
+            intel-oneapi-ccl=2021.10.0-49084 intel-oneapi-ccl-devel=2021.10.0-49084 \
+            intel-oneapi-dnnl-devel=2023.2.0-49516 intel-oneapi-dnnl=2023.2.0-49516
+         ```

-            .. code-block:: bash
+         > **Note**:
+         >
+         > You can uninstall the package by running the following command:
+         >
+         > ```bash
+         > sudo apt autoremove intel-oneapi-common-vars
+         > ```
+      </details>

-               sudo apt install -y intel-oneapi-common-vars=2023.2.0-49462 \
-                  intel-oneapi-compiler-cpp-eclipse-cfg=2023.2.0-49495 intel-oneapi-compiler-dpcpp-eclipse-cfg=2023.2.0-49495 \
-                  intel-oneapi-diagnostics-utility=2022.4.0-49091 \
-                  intel-oneapi-compiler-dpcpp-cpp=2023.2.0-49495 \
-                  intel-oneapi-mkl=2023.2.0-49495 intel-oneapi-mkl-devel=2023.2.0-49495 \
-                  intel-oneapi-mpi=2021.10.0-49371 intel-oneapi-mpi-devel=2021.10.0-49371 \
-                  intel-oneapi-tbb=2021.10.0-49541 intel-oneapi-tbb-devel=2021.10.0-49541\
-                  intel-oneapi-ccl=2021.10.0-49084 intel-oneapi-ccl-devel=2021.10.0-49084\
-                  intel-oneapi-dnnl-devel=2023.2.0-49516 intel-oneapi-dnnl=2023.2.0-49516
+      <details>
+      <summary> For <b>PIP installer</b> </summary>

-            .. note::
+      - Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``

-               You can uninstall the package by running the following command:
+         ```bash
+         export PYTHONUSERBASE=~/intel/oneapi
+         pip install dpcpp-cpp-rt==2023.2.0 mkl-dpcpp==2023.2.0 onednn-cpu-dpcpp-gpu-dpcpp==2023.2.0 --user
+         ```

-               .. code-block:: bash
-               
-                  sudo apt autoremove intel-oneapi-common-vars
+         > **Note**:
+         >
+         > The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.

-         .. tab:: PIP installer
+      - Step 2: Configure your working conda environment (e.g. with name ``llm``) to append oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.

-            Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``
+         ```bash
+         conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
+         ```

-            .. code-block:: bash
+         > **Note**:
+         >
+         >   You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
+         >   You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.

-               export PYTHONUSERBASE=~/intel/oneapi
-               pip install dpcpp-cpp-rt==2023.2.0 mkl-dpcpp==2023.2.0 onednn-cpu-dpcpp-gpu-dpcpp==2023.2.0 --user
+         > **Note**:
+         >   
+         >   We recommend not installing other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
+         >   You can uninstall the oneAPI package by deleting the package folder and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
+         >
+         > ```bash
+         > rm -r ~/intel/oneapi
+         > conda env config vars unset LD_LIBRARY_PATH -n llm
+         > ```
+      </details>

-            .. note::
+      <details>
+      <summary> For <b>Offline installer</b> </summary>
+      
+      Using the offline installer allows you to customize the installation path.

-               The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.
-
-            Step 2: Configure your working conda environment (e.g. with name ``llm``) to append oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.
-
-            .. code-block:: bash
-
-               conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
-
-            .. note::
-               You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
-               You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.
-
-            .. note::
-
-               You are recommended not to install other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
-               You can uninstall the oneAPI package by simply deleting the package folder, and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
-               
-               .. code-block:: bash
-               
-                  rm -r ~/intel/oneapi
-                  conda env config vars unset LD_LIBRARY_PATH -n llm
-
-         .. tab:: Offline installer
-         
-            Using the offline installer allows you to customize the installation path.
-
-            .. code-block:: bash
-            
-               wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/992857b9-624c-45de-9701-f6445d845359/l_BaseKit_p_2023.2.0.49397_offline.sh
-               sudo sh ./l_BaseKit_p_2023.2.0.49397_offline.sh
-
-            .. note::
-
-               You can also modify the installation or uninstall the package by running the following commands:
-
-               .. code-block:: bash
-
-                  cd /opt/intel/oneapi/installer
-                  sudo ./installer
-```
+      ```bash
+      wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/992857b9-624c-45de-9701-f6445d845359/l_BaseKit_p_2023.2.0.49397_offline.sh
+      sudo sh ./l_BaseKit_p_2023.2.0.49397_offline.sh
+      ```
+      > **Note**:
+      >
+      > You can also modify the installation or uninstall the package by running the following commands:
+      >
+      > ```bash
+      > cd /opt/intel/oneapi/installer
+      > sudo ./installer
+      > ```
+      </details>

 ### Install IPEX-LLM
 #### Install IPEX-LLM From PyPI

-We recommend using [Miniforge](https://conda-forge.org/download/ to create a python 3.11 enviroment:
+We recommend using [Miniforge](https://conda-forge.org/download/) to create a Python 3.11 environment:

-```eval_rst
-.. important::
+> [!IMPORTANT]
+> ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.

-   ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.
-```

-```eval_rst
-.. important::
-   Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with Pytorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with Pytorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.
-```
+> [!IMPORTANT]
+>   Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.

-```eval_rst
-.. tabs::
-   .. tab:: PyTorch 2.1
-      Choose either US or CN website for ``extra-index-url``:

-      .. tabs::
-         .. tab:: US
+- For **PyTorch 2.1**:

-            .. code-block:: bash
+   Choose either US or CN website for ``extra-index-url``:
+   
+   - For **US**:

-               conda create -n llm python=3.11
-               conda activate llm
+      ```bash
+      conda create -n llm python=3.11
+      conda activate llm

-               pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+      pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+      ```

-            .. note::
+      > **Note**:
+      >
+      > The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
+      >
+      > ```bash
+      > pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+      > ```

-               The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
+   - For **CN**:

-               .. code-block:: bash
+      ```bash
+      conda create -n llm python=3.11
+      conda activate llm

-                  pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+      pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
+      ```

-         .. tab:: CN
+      > **Note**:
+      >
+      > The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
+      >
+      > ```bash
+      > pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
+      > ```

-            .. code-block:: bash
+- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

-               conda create -n llm python=3.11
-               conda activate llm
+   Choose either US or CN website for ``extra-index-url``:
+   
+   - For **US**:

-               pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
+      ```bash
+      conda create -n llm python=3.11
+      conda activate llm

-            .. note::
+      pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+      ```

-               The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
+   - For **CN**:

-               .. code-block:: bash
+      ```bash
+      conda create -n llm python=3.11
+      conda activate llm

-                  pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
-            
+      pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
+      ```

-   .. tab:: PyTorch 2.0 (deprecated for versions ``ipex-llm >= 2.1.0b20240511``)
-      Choose either US or CN website for ``extra-index-url``:
-
-      .. tabs::
-         .. tab:: US
-
-            .. code-block:: bash
-
-               conda create -n llm python=3.11
-               conda activate llm
-
-               pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
-
-         .. tab:: CN
-
-            .. code-block:: bash
-
-               conda create -n llm python=3.11
-               conda activate llm
-
-               pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
-
-```

 #### Install IPEX-LLM From Wheel

 If you encounter network issues when installing IPEX, you can also install IPEX-LLM dependencies for Intel XPU from source archives. First you need to download and install torch/torchvision/ipex from wheels listed below before installing `ipex-llm`.

-```eval_rst
-.. tabs::
-   .. tab:: PyTorch 2.1

-      .. code-block:: bash
+- For **PyTorch 2.1**:

-         # get the wheels on Linux system for IPEX 2.1.10+xpu
-         wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.1.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
-         wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.16.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
-         wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.1.10%2Bxpu-cp311-cp311-linux_x86_64.whl
+   ```bash
+   # get the wheels on Linux system for IPEX 2.1.10+xpu
+   wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.1.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
+   wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.16.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
+   wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.1.10%2Bxpu-cp311-cp311-linux_x86_64.whl
+   ```

-      Then you may install directly from the wheel archives using following commands:
+   Then you may install directly from the wheel archives using the following commands:

-      .. code-block:: bash
+   ```bash
+   # install the packages from the wheels
+   pip install torch-2.1.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
+   pip install torchvision-0.16.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
+   pip install intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-linux_x86_64.whl

-         # install the packages from the wheels
-         pip install torch-2.1.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
-         pip install torchvision-0.16.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
-         pip install intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-linux_x86_64.whl
+   # install ipex-llm for Intel GPU
+   pip install --pre --upgrade ipex-llm[xpu]
+   ```

-         # install ipex-llm for Intel GPU
-         pip install --pre --upgrade ipex-llm[xpu]
+- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

-   .. tab:: PyTorch 2.0 (deprecated for versions ``ipex-llm >= 2.1.0b20240511``)
+   ```bash
+   # get the wheels on Linux system for IPEX 2.0.110+xpu
+   wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.0.1a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
+   wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.15.2a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
+   wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.0.110%2Bxpu-cp311-cp311-linux_x86_64.whl
+   ```

-      .. code-block:: bash
+   Then you may install directly from the wheel archives using the following commands:

-         # get the wheels on Linux system for IPEX 2.0.110+xpu
-         wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.0.1a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
-         wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.15.2a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
-         wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.0.110%2Bxpu-cp311-cp311-linux_x86_64.whl
+   ```bash
+   # install the packages from the wheels
+   pip install torch-2.0.1a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
+   pip install torchvision-0.15.2a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
+   pip install intel_extension_for_pytorch-2.0.110+xpu-cp311-cp311-linux_x86_64.whl

-      Then you may install directly from the wheel archives using following commands:
+   # install ipex-llm for Intel GPU
+   pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510
+   ```

-      .. code-block:: bash
+> [!NOTE]
+> All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for ``torch``, ``torchvision``, and ``intel_extension_for_pytorch`` by replacing ``cp311`` with ``cp39`` or ``cp310``, respectively.

-         # install the packages from the wheels
-         pip install torch-2.0.1a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
-         pip install torchvision-0.15.2a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
-         pip install intel_extension_for_pytorch-2.0.110+xpu-cp311-cp311-linux_x86_64.whl
-
-         # install ipex-llm for Intel GPU
-         pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510
-
-```
-
-```eval_rst
-.. note::
-
-   All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for ``torch``, ``torchvision``, and ``intel_extension_for_pytorch`` by replacing ``cp11`` with ``cp39`` or ``cp310``, respectively.
-```

 ### Runtime Configuration

 To use GPU acceleration on Linux, several environment variables are required or recommended before running a GPU example.

-```eval_rst
-.. tabs::
-   .. tab:: Intel Arc™ A-Series and Intel Data Center GPU Flex
+
+   - For **Intel Arc™ A-Series and Intel Data Center GPU Flex**:

      For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series, we recommend:

-      .. code-block:: bash
+      ```bash
+      # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
+      # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
+      source /opt/intel/oneapi/setvars.sh

-         # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
-         # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
-         source /opt/intel/oneapi/setvars.sh
+      # Recommended Environment Variables for optimal performance
+      export USE_XETLA=OFF
+      export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+      export SYCL_CACHE_PERSISTENT=1
+      ```

-         # Recommended Environment Variables for optimal performance
-         export USE_XETLA=OFF
-         export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
-         export SYCL_CACHE_PERSISTENT=1
-
-   .. tab:: Intel Data Center GPU Max
+   - For **Intel Data Center GPU Max**:

      For Intel Data Center GPU Max Series, we recommend:

-      .. code-block:: bash
+      ```bash
+      # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
+      # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
+      source /opt/intel/oneapi/setvars.sh

-         # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
-         # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
-         source /opt/intel/oneapi/setvars.sh
-
-         # Recommended Environment Variables for optimal performance
-         export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
-         export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
-         export SYCL_CACHE_PERSISTENT=1
-         export ENABLE_SDP_FUSION=1
+      # Recommended Environment Variables for optimal performance
+      export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
+      export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+      export SYCL_CACHE_PERSISTENT=1
+      export ENABLE_SDP_FUSION=1
+      ```

      Please note that ``libtcmalloc.so`` can be installed by ``conda install -c conda-forge -y gperftools=2.10``

-   .. tab:: Intel iGPU
+   - For **Intel iGPU**:

-      .. code-block:: bash
+      ```bash
+      # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
+      # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
+      source /opt/intel/oneapi/setvars.sh

-         # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
-         # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
-         source /opt/intel/oneapi/setvars.sh
+      export SYCL_CACHE_PERSISTENT=1
+      export BIGDL_LLM_XMX_DISABLED=1
+      ```

-         export SYCL_CACHE_PERSISTENT=1
-         export BIGDL_LLM_XMX_DISABLED=1
-
-```
-
-```eval_rst
-.. note::
-
-   For **the first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-```
+> [!NOTE]
+> The **first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### Known issues

@ -662,5 +621,5 @@ Error: libmkl_sycl_blas.so.4: cannot open shared object file: No such file or di
 The reason for such errors is that oneAPI has not been initialized properly before running IPEX-LLM code or before importing IPEX package.

 * For oneAPI installed using APT or Offline Installer, make sure you execute `setvars.sh` of oneAPI Base Toolkit before running IPEX-LLM.
-* For PIP-installed oneAPI, activate your working environment and run ``echo $LD_LIBRARY_PATH`` to check if the installation path is properly configured for the environment. If the output does not contain oneAPI path (e.g. ``~/intel/oneapi/lib``), check [Prerequisites](#id1) to re-install oneAPI with PIP installer.
+* For PIP-installed oneAPI, activate your working environment and run ``echo $LD_LIBRARY_PATH`` to check if the installation path is properly configured for the environment. If the output does not contain oneAPI path (e.g. ``~/intel/oneapi/lib``), check [Prerequisites](#prerequisites-1) to re-install oneAPI with PIP installer.
 * Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.
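
The `LD_LIBRARY_PATH` check described in the PIP-installed oneAPI bullet above could also be scripted. The following is a minimal sketch, not part of this commit or of `ipex-llm`: the helper name `has_oneapi_path` is hypothetical, and the default path `~/intel/oneapi/lib` is simply the example folder used in the docs.

```python
import os


def has_oneapi_path(ld_library_path: str, oneapi_lib: str = "~/intel/oneapi/lib") -> bool:
    """Return True if oneapi_lib is one of the entries in an LD_LIBRARY_PATH-style string.

    Hypothetical helper for illustration; the default folder is the example
    path from the docs and should be adjusted to your own install location.
    """
    # Expand "~" and normalize both sides so "~/x/lib" and "/home/me/x/lib/" compare equal.
    target = os.path.normpath(os.path.expanduser(oneapi_lib))
    entries = [entry for entry in ld_library_path.split(":") if entry]
    return any(os.path.normpath(os.path.expanduser(entry)) == target for entry in entries)


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    if has_oneapi_path(current):
        print("oneAPI library path found in LD_LIBRARY_PATH")
    else:
        print("oneAPI library path missing; re-check the PIP installer steps")
```

Run it inside the activated conda environment (e.g. ``llm``) so that the variables set through `conda env config vars set` are visible to the process.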
--- a/docs/mddocs/Overview/known_issues.md
+++ b/docs/mddocs/Overview/known_issues.md
@ -1 +0,0 @@
-# IPEX-LLM Known Issues
--- a/docs/mddocs/Overview/llm.md
+++ b/docs/mddocs/Overview/llm.md
@ -17,13 +17,11 @@ model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path="open
                                             load_in_4bit=True)
 ```

-```eval_rst
-.. tip::
+> [!TIP]
+> [open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2) is a pretrained large language model hosted on Hugging Face. `openlm-research/open_llama_3b_v2` is its Hugging Face model id. `from_pretrained` will automatically download the model from Hugging Face to a local cache path (e.g. ``~/.cache/huggingface``), load the model, and convert it to `ipex-llm` INT4 format.
+>
+> It may take a long time to download the model using the API. You can also download the model yourself, and set `pretrained_model_name_or_path` to the local path of the downloaded model. This way, `from_pretrained` will load and convert directly from the local path without downloading.

-   `open_llama_3b_v2 <https://huggingface.co/openlm-research/open_llama_3b_v2>`_ is a pretrained large language model hosted on Hugging Face. ``openlm-research/open_llama_3b_v2`` is its Hugging Face model id. ``from_pretrained`` will automatically download the model from Hugging Face to a local cache path (e.g. ``~/.cache/huggingface``), load the model, and converted it to ``ipex-llm`` INT4 format.
-
-   It may take a long time to download the model using API. You can also download the model yourself, and set ``pretrained_model_name_or_path`` to the local path of the downloaded model. This way, ``from_pretrained`` will load and convert directly from local path without download.
-```
 ## Load Tokenizer

 You also need a tokenizer for inference. Just use the official `transformers` API to load `LlamaTokenizer`: