From 04a6b0040c24582f125db3a5a43d869149daaef6 Mon Sep 17 00:00:00 2001
From: Shengsheng Huang
Date: Tue, 27 Feb 2024 13:14:39 +0800
Subject: [PATCH] Windows GPU Install Quickstart update (#10240)

* Update install_windows_gpu.md

* Update install_windows_gpu.md

* Update install_windows_gpu.md

* fix numbering

* Update install_windows_gpu.md

* Update install_windows_gpu.md
---
 .../doc/LLM/Quickstart/install_windows_gpu.md | 65 +++++++++++++------
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/docs/readthedocs/source/doc/LLM/Quickstart/install_windows_gpu.md b/docs/readthedocs/source/doc/LLM/Quickstart/install_windows_gpu.md
index 7b896bb5..c68d4776 100644
--- a/docs/readthedocs/source/doc/LLM/Quickstart/install_windows_gpu.md
+++ b/docs/readthedocs/source/doc/LLM/Quickstart/install_windows_gpu.md
@@ -1,6 +1,8 @@
 # Install BigDL-LLM on Windows for Intel GPU
 
-This guide applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs, as well as Intel Arc Series GPU.
+This guide demonstrates how to install BigDL-LLM on Windows with Intel GPUs.
+
+This process applies to Intel Core Ultra and 12th-14th gen Intel Core integrated GPUs (iGPUs), as well as Intel Arc Series GPUs.
 
 ## Install GPU driver
@@ -37,17 +39,23 @@ This guide applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs, as
 ## Install oneAPI
 
-* With the `llm` environment active, use `pip` to install the **OneAPI Base Toolkit**:
+* With the `llm` environment active, use `pip` to install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html):
   ```bash
   pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
   ```
 
 ## Install `bigdl-llm`
 
-* With the `llm` environment active, use `pip` to install `bigdl-llm` for GPU:
-  ```bash
-  pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
-  ```
+* With the `llm` environment active, use `pip` to install `bigdl-llm` for GPU.
+  Choose either the US or CN website for the extra index URL:
+  * US:
+    ```bash
+    pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+    ```
+  * CN:
+    ```bash
+    pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
+    ```
 
 > Note: If there are network issues when installing IPEX, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#install-bigdl-llm-from-wheel) for more details.
 
 * You can verify that `bigdl-llm` is successfully installed by simply importing a few classes from the library. For example, in the Python interactive shell, execute the following import command:
@@ -56,15 +64,26 @@ This guide applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs, as
   ```
 
 ## A quick example
-* Next step you can start play with a real LLM. We use [phi-1.5](https://huggingface.co/microsoft/phi-1_5) (an 1.3B model) for demostration. You can copy/paste the following code in a python script and run it.
-> Note: to use phi-1.5, you may need to update your transformer version to 4.37.0.
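The verification step above can be sketched as a tiny standalone script. This is only an illustrative sketch: `has_module` is a hypothetical helper, and `bigdl.llm` is the module name implied by the guide's import example.

```python
# Sketch: check whether a package is importable without fully importing it.
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be found on the current Python path."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # find_spec imports parent packages; a missing parent raises instead
        return False


if has_module("bigdl.llm"):
    print("bigdl-llm looks installed")
else:
    print("bigdl-llm not found - re-check the pip install step")
```

Running this after the `pip install` step gives a quick sanity check without loading any heavyweight model code.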
-> ```
-> pip install -U transformers==4.37.0
-> ```
-> Note: when running LLMs on Intel iGPUs for Windows users, we recommend setting `cpu_embedding=True` in the from_pretrained function.
-> This will allow the memory-intensive embedding layer to utilize the CPU instead of iGPU.
+Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface.co/microsoft/phi-1_5) model, a 1.3 billion parameter LLM, for this demonstration. Follow the steps below to set up and run the model, and observe how it responds to the prompt "What is AI?".
+
+* Step 1: Open the **Anaconda Prompt** and activate the Python environment `llm` you previously created:
+  ```bash
+  conda activate llm
+  ```
+* Step 2: If you're running on an integrated GPU, set some environment variables by running the commands below:
+  > For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
+  ```bash
+  set SYCL_CACHE_PERSISTENT=1
+  set BIGDL_LLM_XMX_DISABLED=1
+  ```
+* Step 3: To ensure compatibility with `phi-1.5`, update the transformers library to version 4.37.0:
+  ```bash
+  pip install -U transformers==4.37.0
+  ```
+* Step 4: Create a new file named `demo.py` and insert the code snippet below.
   ```python
+  # Copy/Paste the contents to a new file demo.py
   import torch
   from bigdl.llm.transformers import AutoModelForCausalLM
   from transformers import AutoTokenizer, GenerationConfig
 
@@ -86,11 +105,19 @@ This guide applies to Intel Core Ultra and Core 12 - 14 gen integrated GPUs, as
     output_str = tokenizer.decode(output[0], skip_special_tokens=True)
     print(output_str)
   ```
+  > Note: when running LLMs on Intel iGPUs with limited memory size, we recommend setting `cpu_embedding=True` in the `from_pretrained` function.
+  > This will allow the memory-intensive embedding layer to utilize the CPU instead of the GPU.
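The two `set` commands in Step 2 can equally be applied from inside the Python script itself, which is convenient if you forget to run them in the prompt. This is a sketch under the assumption that they are assigned before the GPU runtime is first initialized; the variable names come from the guide.

```python
import os

# Mirror of the Step 2 `set` commands, applied in-process.
# These must be assigned before any GPU runtime initialization.
os.environ["SYCL_CACHE_PERSISTENT"] = "1"
os.environ["BIGDL_LLM_XMX_DISABLED"] = "1"

print(os.environ["SYCL_CACHE_PERSISTENT"], os.environ["BIGDL_LLM_XMX_DISABLED"])
```

Note that environment variables set this way affect only the current process and its children, not the surrounding shell.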
 
-* An example output on the laptop equipped with i7 11th Gen Intel Core CPU and Iris Xe Graphics iGPU looks like below.
-
-```
-Question:What is AI?
-Answer: AI stands for Artificial Intelligence, which is the simulation of human intelligence in machines.
-```
+* Step 5: Run `demo.py` within the activated Python environment using the following command:
+  ```bash
+  python demo.py
+  ```
+
+  ### Example output
+
+  Example output on a system equipped with an 11th Gen Intel Core i7 CPU and Iris Xe Graphics iGPU:
+  ```
+  Question:What is AI?
+  Answer: AI stands for Artificial Intelligence, which is the simulation of human intelligence in machines.
+  ```
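The example output shows the demo wrapping the user's question in a fixed Question/Answer template before tokenization. A minimal pure-Python sketch of that formatting step is below; the exact template string is an assumption for illustration, chosen only to match the `Question:`/`Answer:` shape of the output above.

```python
# Hypothetical prompt template mirroring the Question/Answer shape of the demo output.
PROMPT_TEMPLATE = "Question:{prompt}\n\nAnswer:"


def build_prompt(question: str) -> str:
    # Fill the user's question into the template before passing it to the tokenizer
    return PROMPT_TEMPLATE.format(prompt=question)


print(build_prompt("What is AI?"))
```

Ending the prompt with `Answer:` nudges a base (non-chat) model like phi-1.5 to continue with an answer rather than rephrasing the question.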