update linux quickstart and formats of migration (#10530)

* update linux quickstart and formats of migration
* update quickstart
* update format

Parent c8a1c76304, commit de5bbf83de, 3 changed files with 106 additions and 117 deletions
# `bigdl-llm` Migration Guide

This guide helps you migrate your `bigdl-llm` application to use `ipex-llm`.

## Upgrade `bigdl-llm` package to `ipex-llm`

First uninstall `bigdl-llm` and install `ipex-llm`.

```eval_rst
.. note::

   This step assumes you have already installed `bigdl-llm`.
```

With your `bigdl-llm` conda environment activated, execute the following command according to your device type and location:

### For CPU

```bash
pip uninstall -y bigdl-llm
pip install --pre --upgrade ipex-llm[all] # for cpu
```

### For GPU

```eval_rst
.. tabs::

   .. tab:: US

      .. code-block:: cmd

         pip uninstall -y bigdl-llm
         pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

   .. tab:: CN

      .. code-block:: cmd

         pip uninstall -y bigdl-llm
         pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
```
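
After reinstalling, a quick import check can confirm the new package is in place. This is a minimal sketch, not part of the original guide; it assumes the same conda environment is still active:

```bash
# Should print the message without an ImportError if ipex-llm installed correctly
python -c "import ipex_llm; print('ipex-llm is installed')"
```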

## Migrate `bigdl-llm` code to `ipex-llm`

There are two options to migrate `bigdl-llm` code to `ipex-llm`.

To upgrade `bigdl-llm` code to `ipex-llm`, simply replace all `bigdl.llm` with `ipex_llm`:

```python
# from bigdl.llm.transformers import AutoModelForCausalLM   # Original line
from ipex_llm.transformers import AutoModelForCausalLM      # Updated line
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
```
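
If your project spans many files, you can apply the rename in one pass with a recursive find-and-replace. Below is a minimal sketch, not part of the original guide; it assumes a Unix-like shell with GNU `sed` and a hypothetical `src/` project directory:

```bash
# List the files that still import bigdl.llm (hypothetical src/ directory)
grep -rl "bigdl.llm" src/

# Replace every occurrence of bigdl.llm with ipex_llm in place
# (commit or back up your code first)
grep -rl "bigdl.llm" src/ | xargs sed -i 's/bigdl\.llm/ipex_llm/g'
```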

To run in the compatible mode, simply add `import ipex_llm` at the beginning of the existing `bigdl-llm` code:

```python
import ipex_llm  # Add this line before any bigdl.llm imports
from bigdl.llm.transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
```

This guide demonstrates how to install IPEX-LLM on Linux with Intel GPUs.

IPEX-LLM currently supports the Ubuntu 20.04 operating system and later, and supports PyTorch 2.0 and PyTorch 2.1 on Linux. This page demonstrates IPEX-LLM with PyTorch 2.1. Check the [Installation](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux) page for more details.

## Install Prerequisites

### Install GPU Driver

#### For Linux kernel 6.2

* Install Arc driver

```bash
sudo apt-get update

sudo apt-get -y install \
    gawk \
    dkms \
    linux-headers-$(uname -r) \
    libc6-dev

sudo apt install intel-i915-dkms intel-fw-gpu

sudo apt-get install -y gawk libc6-dev udev \
    intel-opencl-icd intel-level-zero-gpu level-zero \
    intel-media-va-driver-non-free libmfx1 libmfxgen1 libvpl2 \
```
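
Before moving on, you can optionally confirm that the driver is active. This check is not part of the original guide; it assumes standard DRM device naming on Ubuntu:

```bash
sudo reboot    # reboot so the newly installed i915 driver is loaded
# ...after logging back in:
ls /dev/dri    # expect entries such as card0 and renderD128
```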

### Install oneAPI

```bash
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
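
# The rest of this block is a hedged sketch of the usual follow-up steps
# (register the oneAPI APT repository, then install the Base Toolkit).
# The original guide may pin specific oneAPI package versions; verify the
# exact package names against Intel's official oneAPI instructions.
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list

sudo apt update

sudo apt install intel-basekit
```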

> <img src="https://llm-assets.readthedocs.io/en/latest/_images/basekit.png" alt="image-20240221102252565" width=100%; />

### Setup Python Environment

Download and install Miniconda as follows if you don't have conda installed on your machine:

```bash
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
```

You can use `conda --version` to verify your conda installation.

After installation, create a new Python environment `llm`:

```bash
conda create -n llm python=3.9 libuv
```

Activate the newly created environment `llm`:

```bash
conda activate llm
```

## Install `ipex-llm`

* With the `llm` environment active, use `pip` to install `ipex-llm` for GPU:

```bash
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://developer.intel.com/ipex-whl-stable-xpu
```

> <img src="https://llm-assets.readthedocs.io/en/latest/_images/create_conda_env.png" alt="image-20240221102252564" width=100%; />

## Verify Installation

* You can verify if `ipex-llm` is successfully installed by simply importing a few classes from the library. For example, execute the following import command in the terminal:

```bash
source /opt/intel/oneapi/setvars.sh

python

> from ipex_llm.transformers import AutoModel, AutoModelForCausalLM
```

> <img src="https://llm-assets.readthedocs.io/en/latest/_images/verify_ipex_import.png" alt="image-20240221102252562" width=100%; />

## Runtime Configurations

To use GPU acceleration on Linux, several environment variables are required or recommended before running a GPU example.

```eval_rst
.. tabs::

   .. tab:: Intel Arc™ A-Series and Intel Data Center GPU Flex

      For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series, we recommend:

      .. code-block:: bash

         # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
         # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
         source /opt/intel/oneapi/setvars.sh

         # Recommended Environment Variables for optimal performance
         export USE_XETLA=OFF
         export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

   .. tab:: Intel Data Center GPU Max

      For Intel Data Center GPU Max Series, we recommend:

      .. code-block:: bash

         # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
         # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
         source /opt/intel/oneapi/setvars.sh

         # Recommended Environment Variables for optimal performance
         export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
         export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
         export ENABLE_SDP_FUSION=1

      Please note that ``libtcmalloc.so`` can be installed by ``conda install -c conda-forge -y gperftools=2.10``.
```

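If you run GPU examples regularly, it can be convenient to collect the variables for your device in a small helper script and source it before each session. Below is a minimal sketch for an Arc A-Series setup; the `arc-env.sh` file name is just an illustration, not part of the official guide:

```bash
# arc-env.sh -- illustrative helper; adjust for your device per the tabs above
# Configure oneAPI (skip for PIP-installed oneAPI)
source /opt/intel/oneapi/setvars.sh

# Recommended variables for Arc A-Series / Flex, as listed above
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```

Then run `source arc-env.sh` in each new shell before launching your script.
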
## A Quick Example

Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface.co/microsoft/phi-1_5) model for this demonstration.

* Step 1: Activate the Python environment `llm` you previously created:

```bash
conda activate llm
```

* Step 2: Follow the [Runtime Configurations Section](#runtime-configurations) above to prepare your runtime environment.

* Step 3: Create a new file named `demo.py` and insert the code snippet below.

```python
# Copy/Paste the contents to a new file demo.py
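
# NOTE: the lines below are a minimal hedged sketch, assuming the ipex-llm
# transformers-style API used elsewhere in this guide: load phi-1.5 in 4-bit,
# move it to the Intel GPU ("xpu"), and generate a short completion.
import torch
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "microsoft/phi-1_5"  # assumption: model id or a local path
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to('xpu')  # move the optimized model to the Intel GPU
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

with torch.inference_mode():
    input_ids = tokenizer.encode("What is AI?", return_tensors="pt").to('xpu')
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```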

## Verify Installation

You can verify if `ipex-llm` is successfully installed by following the steps below.

### Step 1: Runtime Configurations

* Open the **Anaconda Prompt** and activate the Python environment `llm` you previously created:

```cmd
conda activate llm
```

* Configure oneAPI variables by running the following command:

```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```

* Set the following environment variables according to your device:

```eval_rst
.. tabs::

   .. tab:: Intel iGPU

      .. code-block:: cmd

         set SYCL_CACHE_PERSISTENT=1
         set BIGDL_LLM_XMX_DISABLED=1

   .. tab:: Intel Arc™ A770

      There is no need to set further environment variables.

.. seealso::

   For other Intel dGPU Series, please refer to `this guide <https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration>`_ for more details regarding runtime configuration.
```

### Step 2: Run Python Code

* Launch the Python interactive shell by typing `python` in the Anaconda prompt window and then press Enter.

* Copy the following code to the Anaconda prompt **line by line** and press Enter **after copying each line**.

```python
import torch
from ipex_llm.transformers import AutoModel, AutoModelForCausalLM
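
# The rest of the snippet is a minimal hedged sketch (assuming the usual
# ipex-llm verification pattern): run a small matmul on the Intel GPU ("xpu")
# to confirm the device is usable.
tensor_1 = torch.randn(1, 1, 40, 128).to('xpu')
tensor_2 = torch.randn(1, 1, 128, 40).to('xpu')
print(torch.matmul(tensor_1, tensor_2).size())
```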

## A Quick Example

Now let's play with a real LLM. We'll be using the [Qwen-1.8B-Chat](https://huggingface.co/Qwen/Qwen-1_8B-Chat) model, a 1.8 billion parameter LLM for this demonstration. Follow the steps below to set up and run the model, and observe how it responds to a prompt "What is AI?".

* Step 1: Follow the [Runtime Configurations Section](#step-1-runtime-configurations) above to prepare your runtime environment.

* Step 2: Install the additional packages required by Qwen-1.8B-Chat:

```cmd
pip install tiktoken transformers_stream_generator einops
```

* Step 3: Create the code file. IPEX-LLM supports loading models from Hugging Face or ModelScope. Please choose according to your requirements.

```eval_rst
.. tabs::

   .. tab:: Hugging Face

      This will allow the memory-intensive embedding layer to utilize the CPU instead of the GPU.
```

* Step 4: Run `demo.py` within the activated Python environment using the following command:

```cmd
python demo.py
```