Change quickstart documentation to use oneAPI offline installer (#10350)

* Change to oneAPI offline installer

* Fixes

* Add "call"

* Fixes
Cheen Hau, 俊豪 2024-03-08 19:24:00 +08:00 committed by GitHub
parent 9026c08633
commit 6829efd350
2 changed files with 41 additions and 27 deletions


@@ -4,7 +4,7 @@ This guide demonstrates how to install BigDL-LLM on Windows with Intel GPUs.
It applies to Intel Core Ultra and 12th-14th gen Intel Core integrated GPUs (iGPUs), as well as Intel Arc Series GPUs.
## Install GPU Driver
## Install Visual Studio 2022
* Download and Install Visual Studio 2022 Community Edition from the [official Microsoft Visual Studio website](https://visualstudio.microsoft.com/downloads/). Ensure you select the **Desktop development with C++ workload** during the installation process.
@@ -13,6 +13,8 @@ It applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs (iGPUs), as
>
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_1.png" alt="image-20240221102252560" width=100%; />
## Install GPU Driver
* Download and install the latest GPU driver from the [official Intel download page](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html). A system reboot is necessary to apply the changes after the installation is complete.
> Note: The process could take around 10 minutes. After the reboot, check for the **Intel Arc Control** application to verify that the driver has been installed correctly. If the installation was successful, you should see an **Arc Control** interface similar to the figure below.
@@ -22,6 +24,17 @@ It applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs (iGPUs), as
* To monitor your GPU's performance and status, you can use either the **Windows Task Manager** (see the left side of the figure below) or the **Arc Control** application (see the right side of the figure below):
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_4.png" width=70%; />
## Install oneAPI
<!-- * With the `llm` environment active, use `pip` to install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html):
```cmd
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
``` -->
* Download and install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=window&distributions=offline). During installation, you can keep the default settings.
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_oneapi_offline_installer.png" width=90%; />
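If you prefer an unattended setup, the offline installer can also be run from the command line. The sketch below assumes Intel's documented silent-install flags and uses a placeholder file name; substitute the installer you actually downloaded and double-check the flags against Intel's installation guide for your version:
```cmd
:: Hypothetical example: silent install of the oneAPI Base Toolkit.
:: "w_BaseKit_offline.exe" is a placeholder for the downloaded installer file.
w_BaseKit_offline.exe -s -a --silent --eula accept
```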
## Setup Python Environment
* Visit [Miniconda installation page](https://docs.anaconda.com/free/miniconda/), download the **Miniconda installer for Windows**, and follow the instructions to complete the installation.
@@ -29,31 +42,24 @@ It applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs (iGPUs), as
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/quickstart_windows_gpu_5.png" width=50%; />
* After installation, open the **Anaconda Prompt** and create a new Python environment `llm`:
```bash
```cmd
conda create -n llm python=3.9 libuv
```
* Activate the newly created environment `llm`:
```bash
```cmd
conda activate llm
```
## Install oneAPI
* With the `llm` environment active, use `pip` to install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html):
```bash
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
```
## Install `bigdl-llm`
* With the `llm` environment active, use `pip` to install `bigdl-llm` for GPU:
Choose either the US or CN website for `extra-index-url`:
* US:
```bash
```cmd
pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
* CN:
```bash
```cmd
pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
```
> Note: If you encounter network issues while installing IPEX, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#install-bigdl-llm-from-wheel) for troubleshooting advice.
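As an optional sanity check (an addition to the original steps), you can confirm that the packages import cleanly. This is a sketch; on some setups the XPU libraries only load correctly after oneAPI's `setvars.bat` has been called (see the runtime configuration step below):
```cmd
:: Optional: verify that bigdl-llm and the XPU-enabled PyTorch stack import.
python -c "import torch; import intel_extension_for_pytorch as ipex; from bigdl.llm.transformers import AutoModelForCausalLM; print(torch.__version__, ipex.__version__)"
```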
@@ -68,17 +74,21 @@ It applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs (iGPUs), as
Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface.co/microsoft/phi-1_5) model, a 1.3 billion parameter LLM, for this demonstration. Follow the steps below to set up and run the model, and observe how it responds to the prompt "What is AI?".
* Step 1: Open the **Anaconda Prompt** and activate the Python environment `llm` you previously created:
```bash
```cmd
conda activate llm
```
* Step 2: If you're running on iGPU, set some environment variables by running below commands:
> For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
```bash
* Step 2: Configure oneAPI variables by running the following command:
> For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
If you're running on an iGPU, set additional environment variables by running the following commands:
```cmd
set SYCL_CACHE_PERSISTENT=1
set BIGDL_LLM_XMX_DISABLED=1
```
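To confirm that the oneAPI environment is active and the runtime can see your GPU, you can optionally run `sycl-ls`, which ships with the Base Toolkit and lists the available SYCL devices; your Intel GPU should appear in its output. (This check is a suggestion, not part of the original guide.)
```cmd
:: Optional: list the SYCL devices visible to the oneAPI runtime.
sycl-ls
```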
* Step 3: To ensure compatibility with `phi-1.5`, update the transformers library to version 4.37.0:
```bash
```cmd
pip install -U transformers==4.37.0
```
* Step 4: Create a new file named `demo.py` and insert the code snippet below.
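The snippet itself falls outside this diff's context. For orientation, here is a minimal sketch of what `demo.py` might look like, assuming `bigdl-llm`'s Hugging Face-style `AutoModelForCausalLM` loader with its `load_in_4bit` and `cpu_embedding` options; the prompt format and generation parameters are illustrative, not the exact elided snippet:
```python
# Hedged sketch of demo.py, not the exact snippet from the guide.
import torch
from transformers import AutoTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM  # BigDL-LLM's optimized loader

model_path = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# load_in_4bit applies BigDL-LLM's 4-bit optimization;
# cpu_embedding=True keeps the memory-intensive embedding layer on the CPU.
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             cpu_embedding=True,
                                             trust_remote_code=True)
model = model.to("xpu")  # move the model to the Intel GPU

prompt = "Question: What is AI?\n\nAnswer:"
with torch.inference_mode():
    input_ids = tokenizer.encode(prompt, return_tensors="pt").to("xpu")
    output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```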
@@ -111,7 +121,7 @@ Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface
> This will allow the memory-intensive embedding layer to utilize the CPU instead of GPU.
* Step 5: Run `demo.py` within the activated Python environment using the following command:
```bash
```cmd
python demo.py
```


@@ -29,7 +29,7 @@ Open **Anaconda Prompt** and activate the conda environment you have created in
conda activate llm
```
Then, change to the WebUI directory (e.g., `C:\text-generation-webui`) and install the necessary dependencies:
```bash
```cmd
cd C:\text-generation-webui
pip install -r requirements_cpu_only.txt
```
@@ -37,23 +37,27 @@ pip install -r requirements_cpu_only.txt
## 3 Start the WebUI Server
### Set Environment Variables
If you're running on iGPUs, set some environment variables by running below commands in **Anaconda Prompt**:
> Note: For more details about runtime configurations, refer to [this link](../Overview/install_gpu.html#runtime-configuration):
```bash
set SYCL_CACHE_PERSISTENT=1
set BIGDL_LLM_XMX_DISABLED=1
```
Configure oneAPI variables by running the following command in **Anaconda Prompt**:
> Note: For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
If you're running on an iGPU, set additional environment variables by running the following commands:
```cmd
set SYCL_CACHE_PERSISTENT=1
set BIGDL_LLM_XMX_DISABLED=1
```
### Launch the Server
In **Anaconda Prompt** with the conda environment `llm` activated, navigate to the text-generation-webui folder and start the server using the following command:
> Note: With the `--load-in-4bit` option, the models will be optimized and run at 4-bit precision. To configure other formats and precisions, refer to [this link](https://github.com/intel-analytics/text-generation-webui?tab=readme-ov-file#32-optimizations-for-other-percisions).
```bash
```cmd
python server.py --load-in-4bit
```
### Access the WebUI
Upon successful launch, URLs to access the WebUI will be displayed in the terminal as shown below. Open the provided local URL in your browser to interact with the WebUI.
<!-- ```bash
<!-- ```cmd
Running on local URL: http://127.0.0.1:7860
``` -->
<img src="https://llm-assets.readthedocs.io/en/latest/_images/webui_quickstart_launch_server.png" width=80%; />