Windows GPU Install Quickstart update (#10240)
* Update install_windows_gpu.md
* fix numbering
This commit is contained in:
parent 38ae4b372f
commit 04a6b0040c
1 changed file with 46 additions and 19 deletions
@@ -1,6 +1,8 @@

# Install BigDL-LLM on Windows for Intel GPU
This guide demonstrates how to install BigDL-LLM on Windows with Intel GPUs.

This process applies to Intel Core Ultra and 12th to 14th gen Intel Core integrated GPUs (iGPUs), as well as Intel Arc Series GPUs.
## Install GPU driver

@@ -37,17 +39,23 @@
## Install oneAPI
* With the `llm` environment active, use `pip` to install the [**Intel oneAPI Base Toolkit**](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html):

  ```bash
  pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
  ```
## Install `bigdl-llm`
* With the `llm` environment active, use `pip` to install `bigdl-llm` for GPU:

  Choose either the US or CN website for the extra index URL:

  * US:

    ```bash
    pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
    ```

  * CN:

    ```bash
    pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
    ```
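The US and CN commands differ only in the final path segment of the extra index URL. As an illustration (a hypothetical helper, not part of bigdl-llm; the URLs are copied verbatim from the commands above), the mapping can be made explicit:

```python
# Extra index URLs copied from the install commands above;
# the region key is the only difference between them.
EXTRA_INDEX = {
    "us": "https://pytorch-extension.intel.com/release-whl/stable/xpu/us/",
    "cn": "https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/",
}

def pip_command(region: str) -> str:
    """Build the bigdl-llm GPU install command for a given region."""
    return ("pip install --pre --upgrade bigdl-llm[xpu] "
            f"--extra-index-url {EXTRA_INDEX[region]}")

print(pip_command("us"))
```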
  > Note: If there are network issues when installing IPEX, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#install-bigdl-llm-from-wheel) for more details.
* You can verify whether `bigdl-llm` is installed successfully by importing a few classes from the library. For example, in the Python interactive shell, execute the following import command:

@@ -56,15 +64,26 @@
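The exact verification import sits in a hunk this diff does not show. As a library-agnostic sketch (standard library only; `is_importable` is a hypothetical helper name), checking that a package can be located before importing it looks like:

```python
import importlib.util

def is_importable(package: str) -> bool:
    """Return True if the import machinery can locate `package`."""
    return importlib.util.find_spec(package) is not None

# "json" ships with Python, so this always holds; after installation,
# the same check with "bigdl" confirms the package is visible.
print(is_importable("json"))
```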
## A quick example
Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface.co/microsoft/phi-1_5) model, a 1.3 billion parameter LLM, for this demonstration. Follow the steps below to set up and run the model, and observe how it responds to the prompt "What is AI?".
* Step 1: Open the **Anaconda Prompt** and activate the Python environment `llm` you previously created:

  ```bash
  conda activate llm
  ```
* Step 2: If you're running on an integrated GPU, set the following environment variables:

  > For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration).

  ```bash
  set SYCL_CACHE_PERSISTENT=1
  set BIGDL_LLM_XMX_DISABLED=1
  ```
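The `set` commands above apply only to the current prompt session. A quick stdlib-only check (the helper below is hypothetical; the variable names come from Step 2) confirms they are visible to a Python process started from that session:

```python
import os

# Variable names from Step 2 of the guide
REQUIRED_VARS = ("SYCL_CACHE_PERSISTENT", "BIGDL_LLM_XMX_DISABLED")

def missing_vars(env=os.environ):
    """Return the Step 2 variable names that are not set in `env`."""
    return [name for name in REQUIRED_VARS if name not in env]

# In a correctly configured session, nothing is missing:
print(missing_vars({"SYCL_CACHE_PERSISTENT": "1", "BIGDL_LLM_XMX_DISABLED": "1"}))  # → []
```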
* Step 3: To ensure compatibility with `phi-1.5`, update the transformers library to version 4.37.0:

  ```bash
  pip install -U transformers==4.37.0
  ```
* Step 4: Create a new file named `demo.py` and insert the code snippet below:

  ```python
  # Copy/Paste the contents to a new file demo.py
  import torch
  from bigdl.llm.transformers import AutoModelForCausalLM
  from transformers import AutoTokenizer, GenerationConfig
  ```

@@ -86,11 +105,19 @@

  ```python
  output_str = tokenizer.decode(output[0], skip_special_tokens=True)
  print(output_str)
  ```
  > Note: when running LLMs on Intel iGPUs with limited memory size, we recommend setting `cpu_embedding=True` in the `from_pretrained` function. This will allow the memory-intensive embedding layer to utilize the CPU instead of the GPU.
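The note above translates into keyword arguments passed at load time. The sketch below is hypothetical: `cpu_embedding` is the parameter the note names, while `trust_remote_code` is an assumed extra option included only for illustration:

```python
# Hypothetical load-time options for iGPU runs; cpu_embedding=True keeps
# the memory-intensive embedding layer on the CPU, as the note recommends.
IGPU_LOAD_KWARGS = {
    "cpu_embedding": True,       # named in the note above
    "trust_remote_code": True,   # assumption: often required by custom models
}

def with_igpu_defaults(**overrides):
    """Merge caller overrides onto the iGPU-friendly defaults."""
    merged = dict(IGPU_LOAD_KWARGS)
    merged.update(overrides)
    return merged

print(with_igpu_defaults())
```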
* Step 5: Run `demo.py` within the activated Python environment using the following command:

  ```bash
  python demo.py
  ```

### Example output

Example output on a system equipped with an 11th Gen Intel Core i7 CPU and Iris Xe Graphics iGPU:

```
Question:What is AI?
Answer: AI stands for Artificial Intelligence, which is the simulation of human intelligence in machines.
```