Update part of Overview guide in mddocs (1/2) (#11378)

* Create install.md

* Update install_cpu.md

* Delete original docs/mddocs/Overview/install_cpu.md

* Update install_cpu.md

* Update install_gpu.md

* update llm.md and install.md

* Update docs in KeyFeatures

* Review and fix typos

* Fix on folded NOTE

* Small fix

* Small fix

* Remove empty known_issue.md

* Small fix

* Small fix

* Further fix

* Fixes

* Fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
Zijie Li 2024-06-21 10:45:17 +08:00 committed by GitHub
parent 4ba82191f2
commit 33b9a9c4c9
11 changed files with 441 additions and 521 deletions


@@ -6,24 +6,22 @@ In [Inference on GPU](inference_on_gpu.md) and [Finetune (QLoRA)](finetune.md),

The `sycl-ls` tool enumerates a list of devices available in the system. You can use it after you set up the oneAPI environment:

- For **Windows users**:

  Please make sure you are using CMD (Miniforge Prompt if using conda):

  ```cmd
  call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
  sycl-ls
  ```

- For **Linux users**:

  ```bash
  source /opt/intel/oneapi/setvars.sh
  sycl-ls
  ```

If you have two Arc A770 GPUs, you can get something like below:

```
@@ -40,7 +38,7 @@ This output shows there are two Arc A770 GPUs as well as an Intel iGPU on this machine.

## Devices selection

To enable xpu, you should convert your model and input to xpu using the code below:

```python
model = model.to('xpu')
input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
```
@@ -50,7 +48,7 @@ To select the desired devices, there are two ways: one is changing the code, ano

To specify an xpu, you can change `to('xpu')` to `to('xpu:[device_id]')`; this device_id is counted from zero.
If you want to use the second device, you can change the code like this:

```python
model = model.to('xpu:1')
input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu:1')
```
@@ -59,28 +57,23 @@ input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu:1')

The device selection environment variable, `ONEAPI_DEVICE_SELECTOR`, can be used to limit the choice of Intel GPU devices. As the `sycl-ls` output above shows, the last three lines are three Level Zero GPU devices. So we can use `ONEAPI_DEVICE_SELECTOR=level_zero:[gpu_id]` to select devices.
For example, if you want to use the second A770 GPU, you can run python like this:

- For **Windows users**:

  ```cmd
  set ONEAPI_DEVICE_SELECTOR=level_zero:1
  python generate.py
  ```

  Through `set ONEAPI_DEVICE_SELECTOR=level_zero:1`, only the second A770 GPU will be available for the current environment.

- For **Linux users**:

  ```bash
  ONEAPI_DEVICE_SELECTOR=level_zero:1 python generate.py
  ```

  `ONEAPI_DEVICE_SELECTOR=level_zero:1` in the above command only affects the current python program. Alternatively, you can export the environment variable and then run your python script:

  ```bash
  export ONEAPI_DEVICE_SELECTOR=level_zero:1
  python generate.py
  ```
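To double-check which XPU devices end up visible to PyTorch after setting the selector, a small sketch like the following can help (illustrative only; it assumes `intel_extension_for_pytorch` is installed as described in the GPU installation guide):

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  registers the 'xpu' backend

# With ONEAPI_DEVICE_SELECTOR=level_zero:1 exported, only one device should be listed.
print("XPU available:", torch.xpu.is_available())
for i in range(torch.xpu.device_count()):
    print(f"xpu:{i} ->", torch.xpu.get_device_name(i))
```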


@@ -2,17 +2,15 @@

You may also convert Hugging Face *Transformers* models into native INT4 format for maximum performance as follows.

> [!NOTE]
> Currently only llama/bloom/gptneox/starcoder/chatglm model families are supported; you may use the corresponding API to load the converted model. (For other models, you can use the Hugging Face `transformers` format as described [here](./hugging_face_format.md).)

```python
# convert the model
from ipex_llm import llm_convert
ipex_llm_path = llm_convert(model='/path/to/model/',
                            outfile='/path/to/output/', outtype='int4', model_family="llama")

# load the converted model
# switch to ChatGLMForCausalLM/GptneoxForCausalLM/BloomForCausalLM/StarcoderForCausalLM to load other models
```

@@ -25,8 +23,5 @@ output_ids = llm.generate(input_ids, ...)

```python
output = llm.batch_decode(output_ids)
```

> [!NOTE]
> See the complete example [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models).


@@ -60,10 +60,7 @@ model = load_low_bit(model, saved_dir) # Load the optimized model
```

> [!NOTE]
> - Please refer to the [API documentation](https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/optimize.html) for more details.
> - We also provide detailed examples on how to run PyTorch models (e.g., Openai Whisper, LLaMA2, ChatGLM2, Falcon, MPT, Baichuan2, etc.) using IPEX-LLM. See the complete CPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models) and GPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models).
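For orientation, the `optimize_model` workflow that this note refers to looks roughly like the sketch below. This is illustrative only: the model path, save directory, and the import locations are assumptions based on the optimize API rather than content of this diff, so consult the linked API documentation for the exact signatures.

```python
# A rough sketch of the PyTorch-model optimization flow referenced above.
from transformers import AutoModelForCausalLM
from ipex_llm import optimize_model
from ipex_llm.optimize import load_low_bit  # assumed import location

model = AutoModelForCausalLM.from_pretrained("/path/to/model", torch_dtype="auto")
model = optimize_model(model)           # apply IPEX-LLM low-bit optimizations

saved_dir = "/path/to/optimized_model"
model.save_low_bit(saved_dir)           # persist the optimized weights
model = load_low_bit(model, saved_dir)  # reload them later, as in the diff context above
```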


@@ -0,0 +1,6 @@
# `transformers`-style API
You may run the LLMs using `transformers`-style API in `ipex-llm`.
* [Hugging Face `transformers` Format](./hugging_face_format.md)
* [Native Format](./native_format.md)


@@ -1,10 +0,0 @@
``transformers``-style API
================================
You may run the LLMs using ``transformers``-style API in ``ipex-llm``.
* |hugging_face_transformers_format|_
* `Native Format <./native_format.html>`_
.. |hugging_face_transformers_format| replace:: Hugging Face ``transformers`` Format
.. _hugging_face_transformers_format: ./hugging_face_format.html


@@ -0,0 +1,6 @@
# IPEX-LLM Installation
Here, we provide instructions on how to install `ipex-llm` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
- [CPU](./install_cpu.md)
- [GPU](./install_gpu.md)


@@ -1,7 +0,0 @@
IPEX-LLM Installation
================================
Here, we provide instructions on how to install ``ipex-llm`` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
* `CPU <./install_cpu.html>`_
* `GPU <./install_gpu.html>`_


@@ -4,33 +4,26 @@

Install IPEX-LLM for CPU support using pip through:

- For **Linux users**:

  ```bash
  pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
  ```

- For **Windows users**:

  ```cmd
  pip install --pre --upgrade ipex-llm[all]
  ```

Please refer to [Environment Setup](#environment-setup) for more information.

> [!NOTE]
> The `all` option will trigger installation of all the dependencies for common LLM application development.

> [!IMPORTANT]
> `ipex-llm` is tested with Python 3.9, 3.10 and 3.11; Python 3.11 is recommended for best practices.

## Recommended Requirements
@@ -53,48 +46,39 @@ For optimal performance with LLM models using IPEX-LLM optimizations on Intel CP

First we recommend using [Conda](https://conda-forge.org/download/) to create a python 3.11 environment:

- For **Linux users**:

  ```bash
  conda create -n llm python=3.11
  conda activate llm

  pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
  ```

- For **Windows users**:

  ```cmd
  conda create -n llm python=3.11
  conda activate llm

  pip install --pre --upgrade ipex-llm[all]
  ```

Then, for running an LLM model with IPEX-LLM optimizations (taking `example.py` as an example; a minimal sketch of such a script appears after the options below):
- For **running on Client**:

  It is recommended to run directly with full utilization of all CPU cores:

  ```bash
  python example.py
  ```

- For **running on Server**:

  It is recommended to run with all the physical cores of a single socket:

  ```bash
  # e.g. for a server with 48 cores per socket
  export OMP_NUM_THREADS=48
  numactl -C 0-47 -m 0 python example.py
  ```
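For reference, a minimal `example.py` could look like the sketch below. This is illustrative only: it borrows the `open_llama_3b_v2` model and the `ipex-llm` `transformers`-style API shown later in this changeset, and the prompt and generation settings are placeholders.

```python
# example.py -- a minimal, illustrative IPEX-LLM run on CPU
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import LlamaTokenizer

model_path = "openlm-research/open_llama_3b_v2"  # or a local model path
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = LlamaTokenizer.from_pretrained(model_path)

input_ids = tokenizer.encode("Once upon a time, ", return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```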


@@ -6,21 +6,15 @@

IPEX-LLM on Windows supports Intel iGPU and dGPU.

> [!IMPORTANT]
> IPEX-LLM on Windows only supports PyTorch 2.1.

To apply Intel GPU acceleration, please first verify your GPU driver version.

> [!NOTE]
> The GPU driver version of your device can be checked in the "Task Manager" -> GPU 0 (or GPU 1, etc.) -> Driver version.

If you have a driver version lower than `31.0.101.5122`, it is recommended to [**update your GPU driver to the latest**](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html).

<!-- Intel® oneAPI Base Toolkit 2024.0 installation methods:
@@ -47,34 +41,28 @@ If you have driver version lower than `31.0.101.5122`, it is recommended to [**u

We recommend using [Miniforge](https://conda-forge.org/download/) to create a python 3.11 environment.

> [!IMPORTANT]
> `ipex-llm` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.

The easiest way to install `ipex-llm` is through the following commands, choosing either the US or CN website for `extra-index-url`:

- For **US**:

  ```cmd
  conda create -n llm python=3.11 libuv
  conda activate llm

  pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
  ```

- For **CN**:

  ```cmd
  conda create -n llm python=3.11 libuv
  conda activate llm

  pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
  ```

#### Install IPEX-LLM From Wheel

@@ -98,11 +86,9 @@ pip install intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-win_amd64.whl

pip install --pre --upgrade ipex-llm[xpu]
```

> [!NOTE]
> All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for `torch`, `torchvision`, and `intel_extension_for_pytorch` by replacing `cp311` with `cp39` or `cp310`, respectively.

### Runtime Configuration
@@ -116,27 +102,20 @@ call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"

Please also set the following environment variable if you would like to run LLMs on: -->

- For **Intel iGPU**:

  ```cmd
  set SYCL_CACHE_PERSISTENT=1
  set BIGDL_LLM_XMX_DISABLED=1
  ```

- For **Intel Arc™ A-Series Graphics**:

  ```cmd
  set SYCL_CACHE_PERSISTENT=1
  ```

> [!NOTE]
> For **the first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### Troubleshooting
@@ -173,458 +152,438 @@ IPEX-LLM GPU support on Linux has been verified on:

* Intel Data Center GPU Flex Series
* Intel Data Center GPU Max Series

> [!IMPORTANT]
> IPEX-LLM on Linux supports PyTorch 2.0 and PyTorch 2.1.
>
> **Warning**
>
> IPEX-LLM support for PyTorch 2.0 is deprecated as of ``ipex-llm >= 2.1.0b20240511``.

> [!IMPORTANT]
> We currently support the Ubuntu 20.04 operating system and later.

- For **PyTorch 2.1**:

  To enable IPEX-LLM for Intel GPUs with PyTorch 2.1, here are several prerequisite steps for tools installation and environment preparation:
  - Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.

    > **Tip**:
    >
    > Please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
    >
    > See [release page](https://dgpu-docs.intel.com/releases/index.html) for latest version.

    > **Note**:
    >
    > For Intel Core™ Ultra integrated GPU, please make sure level_zero version >= 1.3.28717. The level_zero version can be checked with ``sycl-ls``, and the version will be tagged as ``[ext_oneapi_level_zero:gpu]``.
    > ```
    > [opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2023.16.12.0.12_195853.xmain-hotfix]
    > [opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2023.16.12.0.12_195853.xmain-hotfix]
    > [opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO [24.09.28717.12]
    > [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.28717]
    > ```
    >
    > If you have level_zero version < 1.3.28717, you could update as follows:
    >
    > ```bash
    > wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-core_1.0.16238.4_amd64.deb
    > wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-opencl_1.0.16238.4_amd64.deb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu-dbgsym_1.3.28717.12_amd64.ddeb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu_1.3.28717.12_amd64.deb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd-dbgsym_24.09.28717.12_amd64.ddeb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd_24.09.28717.12_amd64.deb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/libigdgmm12_22.3.17_amd64.deb
    > sudo dpkg -i *.deb
    > ```

  - Step 2: Download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html) with version 2024.0. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.
  Intel® oneAPI Base Toolkit 2024.0 installation methods:

  <details>
  <summary> For <b>APT installer</b> </summary>

  - Step 1: Set up repository

    ```bash
    wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
    echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
    sudo apt update
    ```

  - Step 2: Install the package

    ```bash
    sudo apt install intel-oneapi-common-vars=2024.0.0-49406 \
      intel-oneapi-common-oneapi-vars=2024.0.0-49406 \
      intel-oneapi-diagnostics-utility=2024.0.0-49093 \
      intel-oneapi-compiler-dpcpp-cpp=2024.0.2-49895 \
      intel-oneapi-dpcpp-ct=2024.0.0-49381 \
      intel-oneapi-mkl=2024.0.0-49656 \
      intel-oneapi-mkl-devel=2024.0.0-49656 \
      intel-oneapi-mpi=2021.11.0-49493 \
      intel-oneapi-mpi-devel=2021.11.0-49493 \
      intel-oneapi-dal=2024.0.1-25 \
      intel-oneapi-dal-devel=2024.0.1-25 \
      intel-oneapi-ippcp=2021.9.1-5 \
      intel-oneapi-ippcp-devel=2021.9.1-5 \
      intel-oneapi-ipp=2021.10.1-13 \
      intel-oneapi-ipp-devel=2021.10.1-13 \
      intel-oneapi-tlt=2024.0.0-352 \
      intel-oneapi-ccl=2021.11.2-5 \
      intel-oneapi-ccl-devel=2021.11.2-5 \
      intel-oneapi-dnnl-devel=2024.0.0-49521 \
      intel-oneapi-dnnl=2024.0.0-49521 \
      intel-oneapi-tcm-1.0=1.0.0-435
    ```

    > **Note**:
    >
    > You can uninstall the package by running the following command:
    >
    > ```bash
    > sudo apt autoremove intel-oneapi-common-vars
    > ```

  </details>

  <details>
  <summary> For <b>PIP installer</b> </summary>

  - Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``.

    ```bash
    export PYTHONUSERBASE=~/intel/oneapi
    pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0 --user
    ```

    > **Note**:
    >
    > The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.

  - Step 2: Configure your working conda environment (e.g. with name ``llm``) to append the oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.

    ```bash
    conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
    ```

    > **Note**:
    >
    > You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
    > You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.

    > **Note**:
    >
    > You are recommended not to install other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
    > You can uninstall the oneAPI package by simply deleting the package folder, and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
    >
    > ```bash
    > rm -r ~/intel/oneapi
    > conda env config vars unset LD_LIBRARY_PATH -n llm
    > ```

  </details>

  <details>
  <summary> For <b>Offline installer</b> </summary>

  Using the offline installer allows you to customize the installation path.

  ```bash
  wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/20f4e6a1-6b0b-4752-b8c1-e5eacba10e01/l_BaseKit_p_2024.0.0.49564_offline.sh
  sudo sh ./l_BaseKit_p_2024.0.0.49564_offline.sh
  ```

  > **Note**:
  >
  > You can also modify the installation or uninstall the package by running the following commands:
  >
  > ```bash
  > cd /opt/intel/oneapi/installer
  > sudo ./installer
  > ```

  </details>
- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

  To enable IPEX-LLM for Intel GPUs with PyTorch 2.0, here are several prerequisite steps for tools installation and environment preparation:

  - Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.

    > **Tip**:
    >
    > Please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
    >
    > See [release page](https://dgpu-docs.intel.com/releases/index.html) for latest version.

  - Step 2: Download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html) with version 2023.2. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.
  Intel® oneAPI Base Toolkit 2023.2 installation methods:

  <details>
  <summary> For <b>APT installer</b> </summary>

  - Step 1: Set up repository

    ```bash
    wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
    echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
    sudo apt update
    ```

  - Step 2: Install the packages

    ```bash
    sudo apt install -y intel-oneapi-common-vars=2023.2.0-49462 \
      intel-oneapi-compiler-cpp-eclipse-cfg=2023.2.0-49495 intel-oneapi-compiler-dpcpp-eclipse-cfg=2023.2.0-49495 \
      intel-oneapi-diagnostics-utility=2022.4.0-49091 \
      intel-oneapi-compiler-dpcpp-cpp=2023.2.0-49495 \
      intel-oneapi-mkl=2023.2.0-49495 intel-oneapi-mkl-devel=2023.2.0-49495 \
      intel-oneapi-mpi=2021.10.0-49371 intel-oneapi-mpi-devel=2021.10.0-49371 \
      intel-oneapi-tbb=2021.10.0-49541 intel-oneapi-tbb-devel=2021.10.0-49541 \
      intel-oneapi-ccl=2021.10.0-49084 intel-oneapi-ccl-devel=2021.10.0-49084 \
      intel-oneapi-dnnl-devel=2023.2.0-49516 intel-oneapi-dnnl=2023.2.0-49516
    ```

    > **Note**:
    >
    > You can uninstall the package by running the following command:
    >
    > ```bash
    > sudo apt autoremove intel-oneapi-common-vars
    > ```

  </details>

  <details>
  <summary> For <b>PIP installer</b> </summary>

  - Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``

    ```bash
    export PYTHONUSERBASE=~/intel/oneapi
    pip install dpcpp-cpp-rt==2023.2.0 mkl-dpcpp==2023.2.0 onednn-cpu-dpcpp-gpu-dpcpp==2023.2.0 --user
    ```

    > **Note**:
    >
    > The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.

  - Step 2: Configure your working conda environment (e.g. with name ``llm``) to append the oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.

    ```bash
    conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
    ```

    > **Note**:
    >
    > You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
    > You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.

    > **Note**:
    >
    > You are recommended not to install other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
    > You can uninstall the oneAPI package by simply deleting the package folder, and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
    >
    > ```bash
    > rm -r ~/intel/oneapi
    > conda env config vars unset LD_LIBRARY_PATH -n llm
    > ```

  </details>

  <details>
  <summary> For <b>Offline installer</b> </summary>

  Using the offline installer allows you to customize the installation path.

  ```bash
  wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/992857b9-624c-45de-9701-f6445d845359/l_BaseKit_p_2023.2.0.49397_offline.sh
  sudo sh ./l_BaseKit_p_2023.2.0.49397_offline.sh
  ```

  > **Note**:
  >
  > You can also modify the installation or uninstall the package by running the following commands:
  >
  > ```bash
  > cd /opt/intel/oneapi/installer
  > sudo ./installer
  > ```

  </details>
### Install IPEX-LLM

#### Install IPEX-LLM From PyPI

We recommend using [Miniforge](https://conda-forge.org/download/) to create a python 3.11 environment:

> [!IMPORTANT]
> ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.

> [!IMPORTANT]
> Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.

- For **PyTorch 2.1**:

  Choose either US or CN website for ``extra-index-url``:

  - For **US**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
    ```

    > **Note**:
    >
    > The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
    >
    > ```bash
    > pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
    > ```

  - For **CN**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
    ```

    > **Note**:
    >
    > The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
    >
    > ```bash
    > pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
    > ```

- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

  Choose either US or CN website for ``extra-index-url``:

  - For **US**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
    ```

  - For **CN**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
    ```
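As a quick sanity check that the matching versions called out above are in place (an illustrative snippet, not part of the installation guide itself), you can print the installed versions from Python:

```python
# Print the installed torch / IPEX versions to confirm they match the
# oneAPI Base Toolkit you installed (2.1.x with 2024.0, 2.0.x with 2023.2).
import torch
import intel_extension_for_pytorch as ipex

print("torch:", torch.__version__)
print("intel_extension_for_pytorch:", ipex.__version__)
```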
#### Install IPEX-LLM From Wheel

If you encounter network issues when installing IPEX, you can also install IPEX-LLM dependencies for Intel XPU from source archives. First you need to download and install torch/torchvision/ipex from the wheels listed below before installing `ipex-llm`.

- For **PyTorch 2.1**:

  ```bash
  # get the wheels on Linux system for IPEX 2.1.10+xpu
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.1.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.16.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.1.10%2Bxpu-cp311-cp311-linux_x86_64.whl
  ```

  Then you may install directly from the wheel archives using the following commands:

  ```bash
  # install the packages from the wheels
  pip install torch-2.1.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install torchvision-0.16.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-linux_x86_64.whl

  # install ipex-llm for Intel GPU
  pip install --pre --upgrade ipex-llm[xpu]
  ```

- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

  ```bash
  # get the wheels on Linux system for IPEX 2.0.110+xpu
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.0.1a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.15.2a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.0.110%2Bxpu-cp311-cp311-linux_x86_64.whl
  ```

  Then you may install directly from the wheel archives using the following commands:

  ```bash
  # install the packages from the wheels
  pip install torch-2.0.1a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install torchvision-0.15.2a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install intel_extension_for_pytorch-2.0.110+xpu-cp311-cp311-linux_x86_64.whl

  # install ipex-llm for Intel GPU
  pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510
  ```

> [!NOTE]
> All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for ``torch``, ``torchvision``, and ``intel_extension_for_pytorch`` by replacing ``cp311`` with ``cp39`` or ``cp310``, respectively.
### Runtime Configuration

To use GPU acceleration on Linux, several environment variables are required or recommended before running a GPU example.

- For **Intel Arc™ A-Series and Intel Data Center GPU Flex**:

  For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series, we recommend:

  ```bash
  # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
  # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
  source /opt/intel/oneapi/setvars.sh

  # Recommended Environment Variables for optimal performance
  export USE_XETLA=OFF
  export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
  export SYCL_CACHE_PERSISTENT=1
  ```

- For **Intel Data Center GPU Max**:

  For Intel Data Center GPU Max Series, we recommend:

  ```bash
  # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
  # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
  source /opt/intel/oneapi/setvars.sh

  # Recommended Environment Variables for optimal performance
  export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
  export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
  export SYCL_CACHE_PERSISTENT=1
  export ENABLE_SDP_FUSION=1
  ```

  Please note that ``libtcmalloc.so`` can be installed by ``conda install -c conda-forge -y gperftools=2.10``.

- For **Intel iGPU**:

  ```bash
  # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
  # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
  source /opt/intel/oneapi/setvars.sh

  export SYCL_CACHE_PERSISTENT=1
  export BIGDL_LLM_XMX_DISABLED=1
  ```

> [!NOTE]
> For **the first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### Known issues

@@ -662,5 +621,5 @@ Error: libmkl_sycl_blas.so.4: cannot open shared object file: No such file or di

The reason for such errors is that oneAPI has not been initialized properly before running IPEX-LLM code or before importing the IPEX package.

* For oneAPI installed using APT or Offline Installer, make sure you execute `setvars.sh` of oneAPI Base Toolkit before running IPEX-LLM.
* For PIP-installed oneAPI, activate your working environment and run ``echo $LD_LIBRARY_PATH`` to check if the installation path is properly configured for the environment. If the output does not contain the oneAPI path (e.g. ``~/intel/oneapi/lib``), check [Prerequisites](#prerequisites-1) to re-install oneAPI with the PIP installer.
* Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.


@@ -1 +0,0 @@
# IPEX-LLM Known Issues


@@ -17,13 +17,11 @@ model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path="open

                                             load_in_4bit=True)
```

> [!TIP]
> [open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2) is a pretrained large language model hosted on Hugging Face. `openlm-research/open_llama_3b_v2` is its Hugging Face model id. `from_pretrained` will automatically download the model from Hugging Face to a local cache path (e.g. `~/.cache/huggingface`), load the model, and convert it to `ipex-llm` INT4 format.
>
> It may take a long time to download the model using the API. You can also download the model yourself, and set `pretrained_model_name_or_path` to the local path of the downloaded model. This way, `from_pretrained` will load and convert directly from the local path without downloading.

## Load Tokenizer

You also need a tokenizer for inference. Just use the official `transformers` API to load `LlamaTokenizer`:
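The tokenizer-loading code itself falls outside this hunk; a typical line would look like the following sketch (illustrative, using the standard Hugging Face `transformers` API and the same model id as above):

```python
from transformers import LlamaTokenizer

# load the tokenizer that matches the model converted above
tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_3b_v2")
```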