Change quickstart documentation to use oneapi offline installer (#10350)
* Change to oneapi offline installer
* Fixes
* Add "call"
* Fixes
parent 9026c08633
commit 6829efd350
2 changed files with 41 additions and 27 deletions
@@ -4,7 +4,7 @@ This guide demonstrates how to install BigDL-LLM on Windows with Intel GPUs.
 It applies to Intel Core Ultra and 12th-14th Gen Intel Core integrated GPUs (iGPUs), as well as Intel Arc Series GPUs.
 
-## Install GPU Driver
+## Install Visual Studio 2022
 
 * Download and install Visual Studio 2022 Community Edition from the [official Microsoft Visual Studio website](https://visualstudio.microsoft.com/downloads/). Ensure you select the **Desktop development with C++ workload** during the installation process.
 
@@ -13,6 +13,8 @@ It applies to Intel Core Ultra and 12th-14th Gen Intel Core integrated GPUs (iGPUs), as
 >
 > <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_1.png" alt="image-20240221102252560" width=100%; />
 
+## Install GPU Driver
+
 * Download and install the latest GPU driver from the [official Intel download page](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html). A system reboot is necessary to apply the changes after the installation is complete.
 
 > Note: The process could take around 10 minutes. After reboot, check for the **Intel Arc Control** application to verify the driver has been installed correctly. If the installation was successful, you should see the **Arc Control** interface similar to the figure below.
@@ -22,6 +24,17 @@ It applies to Intel Core Ultra and 12th-14th Gen Intel Core integrated GPUs (iGPUs), as
 * To monitor your GPU's performance and status, you can use either the **Windows Task Manager** (see the left side of the figure below) or the **Arc Control** application (see the right side of the figure below):
 > <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_4.png" width=70%; />
 
+## Install oneAPI
+
+<!-- * With the `llm` environment active, use `pip` to install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html):
+```cmd
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+``` -->
+
+* Download and install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=window&distributions=offline). During installation, you can continue with the default installation settings.
+
+> <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_oneapi_offline_installer.png" width=90%; />
+
 ## Setup Python Environment
 
 * Visit the [Miniconda installation page](https://docs.anaconda.com/free/miniconda/), download the **Miniconda installer for Windows**, and follow the instructions to complete the installation.
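As an optional sanity check for the oneAPI step added above (a suggestion, not part of the original guide), you can confirm the offline installer placed `setvars.bat` in the default location that later steps call, assuming the default installation settings were kept:

```cmd
dir "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```

If the file is listed, the `call "...setvars.bat"` command used later in this guide should work unchanged.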
@@ -29,31 +42,24 @@ It applies to Intel Core Ultra and 12th-14th Gen Intel Core integrated GPUs (iGPUs), as
 > <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_5.png" width=50%; />
 
 * After installation, open the **Anaconda Prompt** and create a new Python environment named `llm`:
-```bash
+```cmd
 conda create -n llm python=3.9 libuv
 ```
 * Activate the newly created environment `llm`:
-```bash
+```cmd
 conda activate llm
 ```
-
-## Install oneAPI
-
-* With the `llm` environment active, use `pip` to install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html):
-```bash
-pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
-```
 
 ## Install `bigdl-llm`
 
 * With the `llm` environment active, use `pip` to install `bigdl-llm` for GPU.
 Choose either the US or CN website for `extra-index-url`:
 * US:
-```bash
+```cmd
 pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 * CN:
-```bash
+```cmd
 pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
 ```
 > Note: If you encounter network issues while installing IPEX, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#install-bigdl-llm-from-wheel) for troubleshooting advice.
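As a quick optional check (an addition, not part of the original steps), you can confirm the wheel landed in the `llm` environment by asking `pip` for the package metadata:

```cmd
pip show bigdl-llm
```

The output should report the installed version and its location inside the `llm` environment.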
@@ -68,17 +74,21 @@ It applies to Intel Core Ultra and 12th-14th Gen Intel Core integrated GPUs (iGPUs), as
 Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface.co/microsoft/phi-1_5) model, a 1.3-billion-parameter LLM, for this demonstration. Follow the steps below to set up and run the model, and observe how it responds to the prompt "What is AI?".
 
 * Step 1: Open the **Anaconda Prompt** and activate the Python environment `llm` you previously created:
-```bash
+```cmd
 conda activate llm
 ```
-* Step 2: If you're running on an iGPU, set some environment variables by running the commands below:
-> For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
-```bash
+* Step 2: Configure oneAPI variables by running the following command:
+> For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
+```cmd
+call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
+```
+If you're running on an iGPU, set additional environment variables by running the following commands:
+```cmd
 set SYCL_CACHE_PERSISTENT=1
 set BIGDL_LLM_XMX_DISABLED=1
 ```
 * Step 3: To ensure compatibility with `phi-1.5`, update the transformers library to version 4.37.0:
-```bash
+```cmd
 pip install -U transformers==4.37.0
 ```
 * Step 4: Create a new file named `demo.py` and insert the code snippet below.
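Once `setvars.bat` has been called in the current prompt (Step 2 above), one optional way to confirm that the GPU is visible to the oneAPI runtime is the `sycl-ls` tool that ships with the Base Toolkit (a suggested check, not part of the original guide):

```cmd
sycl-ls
```

Your Intel iGPU or Arc GPU should appear among the listed SYCL devices; if it does not, recheck the driver and oneAPI installations.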
@@ -111,7 +121,7 @@ Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface
 > This will allow the memory-intensive embedding layer to utilize the CPU instead of the GPU.
 
 * Step 5: Run `demo.py` within the activated Python environment using the following command:
-```bash
+```cmd
 python demo.py
 ```
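The body of `demo.py` falls outside this diff's hunks. For orientation only, here is a minimal sketch consistent with the `bigdl-llm` transformers-style API this quickstart describes; details such as `max_new_tokens=32` are illustrative choices, not taken from the original file:

```python
# demo.py - illustrative sketch, not the original file from the quickstart
import torch
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "microsoft/phi-1_5"

# Load phi-1.5 with BigDL-LLM 4-bit optimizations and move it to the Intel GPU.
# On an iGPU, adding cpu_embedding=True here keeps the memory-intensive
# embedding layer on the CPU (see the note in the hunk above).
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to('xpu')

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = "What is AI?"
with torch.inference_mode():
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Running `python demo.py` as in Step 5 should then print the model's completion of the prompt.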
@@ -29,7 +29,7 @@ Open **Anaconda Prompt** and activate the conda environment you have created in
 conda activate llm
 ```
 Then, change to the directory of the WebUI (e.g., `C:\text-generation-webui`) and install the necessary dependencies:
-```bash
+```cmd
 cd C:\text-generation-webui
 pip install -r requirements_cpu_only.txt
 ```
@@ -37,23 +37,27 @@ pip install -r requirements_cpu_only.txt
 ## 3 Start the WebUI Server
 
 ### Set Environment Variables
-If you're running on an iGPU, set some environment variables by running the commands below in **Anaconda Prompt**:
-> Note: For more details about runtime configurations, refer to [this link](../Overview/install_gpu.html#runtime-configuration):
-```bash
-set SYCL_CACHE_PERSISTENT=1
-set BIGDL_LLM_XMX_DISABLED=1
-```
+Configure oneAPI variables by running the following command in **Anaconda Prompt**:
+> Note: For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
+```cmd
+call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
+```
+If you're running on an iGPU, set additional environment variables by running the following commands:
+```cmd
+set SYCL_CACHE_PERSISTENT=1
+set BIGDL_LLM_XMX_DISABLED=1
+```
 
 ### Launch the Server
 In **Anaconda Prompt**, with the conda environment `llm` activated, navigate to the text-generation-webui folder and start the server using the following command:
 > Note: With the `--load-in-4bit` option, the models will be optimized and run at 4-bit precision. For configuration of other formats and precisions, refer to [this link](https://github.com/intel-analytics/text-generation-webui?tab=readme-ov-file#32-optimizations-for-other-percisions).
-```bash
+```cmd
 python server.py --load-in-4bit
 ```
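If the server needs to be reachable from other machines, upstream text-generation-webui also accepts networking flags; the example below uses the upstream flag names `--listen` and `--listen-port`, which are worth verifying against the intel-analytics fork linked above before relying on them:

```cmd
python server.py --load-in-4bit --listen --listen-port 7860
```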
 ### Access the WebUI
 Upon successful launch, URLs to access the WebUI will be displayed in the terminal as shown below. Open the provided local URL in your browser to interact with the WebUI.
-<!-- ```bash
+<!-- ```cmd
 Running on local URL: http://127.0.0.1:7860
 ``` -->
 <img src="https://llm-assets.readthedocs.io/en/latest/_images/webui_quickstart_launch_server.png" width=80%; />