GPU configuration update for examples (windows pip installer, etc.) (#10762)

* renew chatglm3-6b gpu example readme

fix

fix

fix

* fix for comments

* fix

* fix

* fix

* fix

* fix

* apply on HF-Transformers-AutoModels

* apply on PyTorch-Models

* fix

* fix
Author: Jin Qiao, 2024-04-15 17:42:52 +08:00 (committed by GitHub)
Parent: 1bd431976d
Commit: 73a67804a4
Signature: GPG key ID B5690EEEBB952194 (no known key found for this signature in database)
74 changed files with 2253 additions and 1532 deletions

@@ -21,27 +21,30 @@ conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
 #### 1.2 Installation on Windows
 We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -52,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 </details>
@@ -63,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
+
+<details>
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
 
 #### 3.2 Configurations for Windows
 <details>
@@ -82,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 <details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -90,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1
 </details>
 
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
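The recurring Linux hunk above adds `SYCL_CACHE_PERSISTENT=1` to the recommended Arc/Flex settings. As an illustration only (this script is not part of the commit), the resulting full set of variables can be collected into one sourceable snippet:

```shell
# Illustrative sketch, not part of the commit: the complete set of Linux
# runtime variables the updated READMEs recommend for Intel Arc A-Series
# Graphics or Intel Data Center GPU Flex Series.
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1  # newly added by this commit

# These only affect processes started from this shell, so source the
# snippet before launching Python.
env | grep SYCL_CACHE_PERSISTENT
```

Sourcing the snippet (rather than running it as a subprocess) is what makes the variables visible to the Python process launched afterwards.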

@@ -27,22 +27,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -53,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 </details>
@@ -64,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
+
+<details>
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
 
 #### 3.2 Configurations for Windows
 <details>
@@ -83,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 <details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -91,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1
 </details>
 
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
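The same per-device table of recommended variables repeats across every README in this commit. A small Python sketch (the helper name and mapping structure are hypothetical, not part of ipex-llm) can make the device-to-variables mapping explicit; note that these must be set before `torch`/`ipex-llm` is imported for them to take effect:

```python
import os

# Hypothetical helper mirroring the Linux recommendations in the updated
# READMEs (section 3.1). Not part of ipex-llm; for illustration only.
RECOMMENDED_ENV = {
    "arc_or_flex": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",  # newly added by this commit
    },
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def apply_recommended_env(device: str) -> None:
    """Set the recommended variables for `device` in the current process."""
    for key, value in RECOMMENDED_ENV[device].items():
        os.environ[key] = value

apply_recommended_env("igpu")
print(os.environ["BIGDL_LLM_XMX_DISABLED"])  # prints "1"
```

In practice the READMEs set these in the shell (or CMD) before launching Python, which has the same effect as setting `os.environ` at the very top of the script.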

@@ -14,6 +14,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
 ```
@@ -22,23 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 </details>
@@ -60,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
+
+<details>
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
 
 #### 3.2 Configurations for Windows
 <details>
@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 <details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 </details>
 
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```

@@ -14,6 +14,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 pip install transformers_stream_generator # additional package required for Baichuan-7B-Chat to conduct generation
 ```
@@ -22,23 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 pip install transformers_stream_generator # additional package required for Baichuan-7B-Chat to conduct generation
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 </details>
@@ -60,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
+
+<details>
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
 
 #### 3.2 Configurations for Windows
 <details>
@@ -79,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 <details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 </details>
 
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
``` ```

@@ -21,22 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -47,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 </details>
@@ -58,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
+
+<details>
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
 
 #### 3.2 Configurations for Windows
 <details>
@@ -77,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 <details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -85,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 </details>
 
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```bash

@@ -17,25 +17,29 @@ conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
 #### 1.2 Installation on Windows
 We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
@@ -47,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 </details>
@@ -58,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
+
+<details>
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
 
 #### 3.2 Configurations for Windows
 <details>
@@ -77,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 <details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -85,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
 </details>
 
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
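A detail worth noticing in the repeated Windows hunks: the commit pins three specific oneAPI 2024.0 components rather than installing the whole Base Toolkit. A tiny sketch (the parsing helper is hypothetical, written here only to spell out the pins) shows the exact name/version pairs being installed:

```python
# The exact pip requirement strings the updated READMEs add for the
# pip-installed oneAPI components on Windows (from the diff above).
ONEAPI_PINS = [
    "dpcpp-cpp-rt==2024.0.2",
    "mkl-dpcpp==2024.0.0",
    "onednn==2024.0.0",
]

def parse_pin(req: str) -> tuple:
    """Split a `name==version` requirement string (illustrative helper)."""
    name, _, version = req.partition("==")
    return name, version

versions = dict(map(parse_pin, ONEAPI_PINS))
print(versions["dpcpp-cpp-rt"])  # prints "2024.0.2"
```

Pinning exact versions keeps the pip-installed runtime libraries in step with the `intel_extension_for_pytorch==2.1.10+xpu` build that the same commands install.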

@ -22,31 +22,35 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
<details> <details>
<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary> <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -58,11 +62,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@ -77,7 +93,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@ -85,15 +101,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
``` ```
@ -149,20 +158,23 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
@ -174,6 +186,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
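The Linux variants above differ only in which variables they export. A small helper like the following can keep them straight; this is a hypothetical convenience of our own, and the device keywords (`arc`, `max`, `igpu`) are shorthand for this sketch, not official names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: export the recommended variables for a given device class.

set_ipex_llm_env() {
  case "$1" in
    arc)   # discrete Arc-style GPU
      export USE_XETLA=OFF
      export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
      export SYCL_CACHE_PERSISTENT=1
      ;;
    max)   # data-center-class GPU with tcmalloc preload
      export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
      export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
      export SYCL_CACHE_PERSISTENT=1
      export ENABLE_SDP_FUSION=1
      ;;
    igpu)  # Intel integrated GPU
      export SYCL_CACHE_PERSISTENT=1
      export BIGDL_LLM_XMX_DISABLED=1
      ;;
    *)
      echo "usage: set_ipex_llm_env {arc|max|igpu}" >&2
      return 1
      ;;
  esac
}
```

Source the file and call, e.g., `set_ipex_llm_env igpu` before launching an example.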
#### 3.2 Configurations for Windows
<details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
**Stream Chat using `stream_chat()` API**:

#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```

```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher versions of transformers
```
#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher versions of transformers
```
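Since the examples pin `transformers==4.34.1` specifically for `CodeLlamaTokenizer`, a quick post-install check can confirm the class is importable. This is a sketch of our own; it assumes only that the pin above was installed.

```python
# Sanity check: the transformers pin above should make CodeLlamaTokenizer importable.

def has_code_llama_tokenizer() -> bool:
    """Return True when the installed transformers release ships CodeLlamaTokenizer."""
    try:
        from transformers import CodeLlamaTokenizer  # noqa: F401
        return True
    except ImportError:
        return False

print("CodeLlamaTokenizer available:", has_code_llama_tokenizer())
```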
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```

In the example [generate.py](./generate.py), we show a basic use case for a DeciLM-7B model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
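The flow that example implements can be sketched roughly as follows. This is an illustrative sketch, not the shipped code: the `ipex_llm.transformers` class and the `load_in_4bit`/`"xpu"` usage follow the IPEX-LLM transformers-style API, and `build_prompt` is a hypothetical helper with a made-up template.

```python
# Illustrative sketch of an INT4 generate() flow on an Intel GPU (not the shipped example).

def build_prompt(user_message: str) -> str:
    """Wrap a user message in a simple instruction template (hypothetical format)."""
    return f"### Instruction:\n{user_message}\n\n### Response:\n"

def generate_on_xpu(model_path: str, question: str, n_predict: int = 32) -> str:
    """Load an INT4-quantized model onto the Intel GPU and predict the next N tokens."""
    # Heavy imports live inside the function so the sketch can be read without a GPU.
    import torch
    from transformers import AutoTokenizer
    from ipex_llm.transformers import AutoModelForCausalLM  # assumed IPEX-LLM API

    model = AutoModelForCausalLM.from_pretrained(
        model_path, load_in_4bit=True, trust_remote_code=True
    ).to("xpu")  # move the quantized model to the Intel GPU
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to("xpu")
    with torch.inference_mode():
        output = model.generate(**inputs, max_new_tokens=n_predict)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Something like `generate_on_xpu("Deci/DeciLM-7B", "What is AI?")` would then run on a machine with a working XPU setup.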
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.35.2 # required by DeciLM-7B
```
#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.35.2 # required by DeciLM-7B
```
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples

We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples

In the example [recognize.py](./recognize.py), we show a basic use case for a Distil-Whisper model to conduct transcription using `pipeline()` API for long audio input, with IPEX-LLM INT4 optimizations.
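The long-audio transcription flow described above can be sketched roughly as follows. This is an illustrative sketch, not the shipped example: the `ipex_llm.transformers` class name and the `load_in_4bit`/`"xpu"` usage follow the IPEX-LLM transformers-style API, and `pick_chunk_length` is a hypothetical helper.

```python
# Illustrative sketch of long-audio transcription with pipeline() (not the shipped example).

def pick_chunk_length(audio_seconds: float, max_chunk_s: int = 15) -> int:
    """Return 0 (no chunking) for short clips, else a fixed chunk length in seconds."""
    return 0 if audio_seconds <= max_chunk_s else max_chunk_s

def transcribe(audio_path: str, audio_seconds: float,
               model_path: str = "distil-whisper/distil-large-v2") -> str:
    # Heavy imports live inside the function so the sketch can be read without a GPU.
    from transformers import AutoProcessor, pipeline
    from ipex_llm.transformers import AutoModelForSpeechSeq2Seq  # assumed IPEX-LLM API

    model = AutoModelForSpeechSeq2Seq.from_pretrained(
        model_path, load_in_4bit=True
    ).to("xpu")  # INT4-quantized model on the Intel GPU
    processor = AutoProcessor.from_pretrained(model_path)
    pipe = pipeline(
        "automatic-speech-recognition",
        model=model,
        tokenizer=processor.tokenizer,
        feature_extractor=processor.feature_extractor,
        chunk_length_s=pick_chunk_length(audio_seconds),  # 0 disables chunked inference
    )
    return pipe(audio_path)["text"]
```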
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install datasets soundfile librosa # required by audio processing
```
#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install datasets soundfile librosa # required by audio processing
```
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```

#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```

#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
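Because a mistyped `export` fails silently, a tiny pre-flight check in a launch script can catch missing variables before a long model load. This is an optional sketch, not part of the example code: the variable list is just the Arc-series recommendation quoted above, and you would swap in whichever set matches your device.

```python
import os

# Variable names copied from the Linux recommendations above for
# Intel Arc A-Series; substitute the list for your own device class.
RECOMMENDED_VARS = [
    "USE_XETLA",
    "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS",
    "SYCL_CACHE_PERSISTENT",
]

def missing_vars(env=None, names=RECOMMENDED_VARS):
    """Return the recommended variable names that are not set in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if name not in env]

if __name__ == "__main__":
    for name in missing_vars():
        print(f"warning: {name} is not set; performance may be suboptimal")
```

An empty result means every recommended variable is visible to the process.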
#### 3.2 Configurations for Windows
<details>
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
```
@@ -15,6 +15,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for falcon-7b-instruct to conduct generation
```
@@ -23,8 +24,12 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for falcon-7b-instruct to conduct generation
```
@@ -53,18 +58,17 @@ print(f'tiiuae/falcon-7b-instruct checkpoint is downloaded to {model_path}')
#### 2.2 Replace `modelling_RW.py`
For `tiiuae/falcon-7b-instruct`, you should replace the `modelling_RW.py` with [falcon-7b-instruct/modelling_RW.py](./falcon-7b-instruct/modelling_RW.py).
### 3. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline-installed oneAPI. Skip this step for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 4. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 4.1 Configurations for Linux
@@ -75,6 +79,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -86,11 +91,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 4.2 Configurations for Windows
<details>
@@ -105,7 +122,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -113,15 +130,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 5. Running examples
```
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Flan-t5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline-installed oneAPI. Skip this step for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
```bash
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Gemma model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -36,17 +36,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
pip install transformers==4.38.1
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline-installed oneAPI. Skip this step for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -57,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -68,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -87,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -95,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
```bash
@@ -21,22 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline-installed oneAPI. Skip this step for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -47,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -58,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -77,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -85,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
```
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline-installed oneAPI. Skip this step for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
```
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline-installed oneAPI. Skip this step for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
```
@@ -21,20 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline-installed oneAPI. Skip this step for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -45,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -56,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -75,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -83,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
```
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Mistral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -36,17 +36,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
pip install transformers==4.34.0
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
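If one launcher script should serve both APT/offline and pip installs of oneAPI, the sourcing step can be guarded on the presence of `setvars.sh`. This is a hedged sketch of ours (the function name is not from the README):

```shell
# Source oneAPI's setvars.sh only when a system-wide (APT/offline)
# install is present; a pip-installed oneAPI needs no extra step.
maybe_source_oneapi() {
    if [ -f /opt/intel/oneapi/setvars.sh ]; then
        # shellcheck disable=SC1091
        . /opt/intel/oneapi/setvars.sh
    fi
}

maybe_source_oneapi
```

On a machine without a system-wide install the function is a no-op, so the same script stays portable.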
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -57,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -68,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -87,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -95,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Mixtral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -36,17 +36,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
pip install transformers==4.36.0
```
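Because this example pins `transformers==4.36.0`, a mismatched environment is a common source of confusing errors. A small check can fail fast before loading the model; this is a hedged sketch of ours (the helper name is not from the README), and it uses only the standard library:

```python
from importlib.metadata import PackageNotFoundError, version

def has_pinned(package: str, expected: str) -> bool:
    """True only if `package` is installed at exactly version `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # Not installed at all counts as a mismatch.
        return False

# Warn (rather than crash) when the pin from this README is not satisfied.
if not has_pinned("transformers", "4.36.0"):
    print("warning: this example expects `pip install transformers==4.36.0`")
```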
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -57,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -68,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -87,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -95,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for mpt-7b-chat and mpt-30b-chat to conduct generation
```
@@ -22,21 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for mpt-7b-chat and mpt-30b-chat to conduct generation
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -47,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -58,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -77,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -85,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phi-1_5 to conduct generation
```
@@ -22,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phi-1_5 to conduct generation
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -14,27 +14,34 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phi-2 to conduct generation
```
#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phi-2 to conduct generation
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
@@ -46,7 +53,9 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
<details>
@@ -56,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -74,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -82,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [chat.py](./chat.py), we show a basic use case for a Qwen-VL model to start a multimodal chat using `chat()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
```
@@ -24,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -61,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
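The Linux per-device recommendations above can be gathered into one small helper. This is an illustrative sketch only: the device keys (`arc`, `max`, `igpu`) and the grouping are our assumptions for the sketch, not names used by IPEX-LLM.

```python
import os

# Recommended runtime env vars per device class, mirroring section 3.1.
# LD_PRELOAD is omitted on purpose: it only takes effect if set before the
# Python process starts, so it cannot be applied from inside the process.
RECOMMENDED = {
    "arc": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
    "max": {
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
        "ENABLE_SDP_FUSION": "1",
    },
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}


def apply_runtime_env(device: str) -> dict:
    """Set the recommended variables for `device` and return them."""
    env = RECOMMENDED[device]
    os.environ.update(env)
    return env
```

For example, `apply_runtime_env("igpu")` sets `SYCL_CACHE_PERSISTENT=1` and `BIGDL_LLM_XMX_DISABLED=1` for the current process and any model it loads afterwards.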
#### 3.2 Configurations for Windows

<details>

@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
View file
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install tiktoken einops transformers_stream_generator # additional packages required for Qwen-7B-Chat to conduct generation
```

@@ -22,22 +23,26 @@ We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install tiktoken einops transformers_stream_generator # additional packages required for Qwen-7B-Chat to conduct generation
```

### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>
<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
View file
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.37.0 # install transformers which supports Qwen2
```

@@ -22,22 +23,26 @@ We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.37.2 # install transformers which supports Qwen2
```
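The `transformers` pin above exists because Qwen2 support first shipped in `transformers` 4.37. A minimal illustrative check of that constraint (the helper name is ours, not part of the example):

```python
def transformers_supports_qwen2(version: str) -> bool:
    """True when `version` is at least 4.37, where the Qwen2 classes landed."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (4, 37)


print(transformers_supports_qwen2("4.37.2"))  # True: meets the pin above
print(transformers_supports_qwen2("4.35.2"))  # False: predates Qwen2 support
```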
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>
<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
View file
@@ -8,9 +8,7 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requirements
In the example [generate.py](./generate.py), we show a basic use case for a RedPajama model to predict the next N tokens using the `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm

@@ -23,21 +21,24 @@ We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -48,9 +49,9 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

<details>

@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>
<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
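The example's generate.py loads the model through ipex-llm's `AutoModelForCausalLM` with the INT4 optimization enabled at load time. A sketch of the load step follows; the helper itself is illustrative, and the commented calls assume the `ipex-llm[xpu]` install from section 1:

```python
def int4_load_kwargs(trust_remote_code: bool = True) -> dict:
    """Keyword arguments for from_pretrained() enabling IPEX-LLM INT4."""
    return {
        "load_in_4bit": True,            # quantize weights to INT4 on load
        "trust_remote_code": trust_remote_code,
    }


# On a machine with ipex-llm[xpu] installed (see section 1), usage looks like:
#   from ipex_llm.transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, **int4_load_kwargs())
#   model = model.to("xpu")   # move the quantized model to the Intel GPU
```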
View file
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requirements
In the example [generate.py](./generate.py), we show a basic use case for a Replit model to predict the next N tokens using the `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install "transformers<4.35"
```

@@ -24,21 +23,24 @@ We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -49,9 +51,9 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

<details>

@@ -61,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>
<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@@ -80,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -88,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
View file
@@ -17,25 +17,29 @@ conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

#### 1.2 Installation on Windows
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
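The rule in section 2 (source `setvars.sh` only for APT or offline installed oneAPI on Linux, and nothing extra for pip-installed oneAPI or for Windows with the pip packages above) can be sketched as a small decision helper. The function and its name are illustrative, not part of the examples:

```python
import platform
from typing import Optional


def oneapi_activation_command(pip_installed: bool) -> Optional[str]:
    """Return the shell command (if any) needed before running the examples."""
    if pip_installed:
        return None  # pip-installed oneAPI is picked up automatically
    if platform.system() == "Linux":
        return "source /opt/intel/oneapi/setvars.sh"
    return None  # on Windows these instructions rely on pip-installed oneAPI
```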
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

@@ -47,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -58,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>
<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@@ -77,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -85,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
View file
@@ -17,25 +17,30 @@ conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

#### 1.2 Installation on Windows
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -46,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -57,10 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>
<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@@ -75,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -83,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

```
python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --n-predict N_PREDICT
```
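The invocation above can also be driven from a script (for example, a small benchmark harness). The builder below is an illustrative sketch: the flag names come from the command shown, while the default prompt and token count are our placeholders:

```python
def build_generate_cmd(model_path: str,
                       prompt: str = "What is AI?",
                       n_predict: int = 32) -> list:
    """Argument vector for generate.py using the flags shown above."""
    return [
        "python", "./generate.py",
        "--repo-id-or-model-path", model_path,
        "--prompt", prompt,
        "--n-predict", str(n_predict),
    ]


# e.g. subprocess.run(build_generate_cmd("REPO_ID_OR_MODEL_PATH"))
```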
View file
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.35.2 # required by SOLAR
```
@@ -22,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.35.2 # required by SOLAR
```

-### 2. Configures OneAPI environment variables
-
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

-#### 2.2 Configurations for Windows
-
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux
@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
#### 3.2 Configurations for Windows

<details>

@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
---
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ

In the example [generate.py](./generate.py), we show a basic use case for a StableLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install

#### 1.1 Installation on Linux

-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:

```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -27,6 +24,9 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -34,17 +34,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
pip install transformers==4.38.0
```

-### 2. Configures OneAPI environment variables
-
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

-#### 2.2 Configurations for Windows
-
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux
@@ -55,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -66,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
#### 3.2 Configurations for Windows

<details>

@@ -85,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -93,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
---
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

-### 2. Configures OneAPI environment variables
-
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

-#### 2.2 Configurations for Windows
-
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
#### 3.2 Configurations for Windows

<details>

@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
---
@@ -23,22 +23,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

-### 2. Configures OneAPI environment variables
-
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

-#### 2.2 Configurations for Windows
-
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux
@@ -49,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -60,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
#### 3.2 Configurations for Windows

<details>

@@ -79,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
---
@@ -16,6 +16,7 @@ conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install librosa soundfile datasets
pip install accelerate
pip install SpeechRecognition sentencepiece colorama
@@ -28,25 +29,29 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install librosa soundfile datasets
pip install accelerate
pip install SpeechRecognition sentencepiece colorama
pip install PyAudio inquirer
```

-### 2. Configures OneAPI environment variables
-
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

-#### 2.2 Configurations for Windows
-
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux
@@ -57,6 +62,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -68,11 +74,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
#### 3.2 Configurations for Windows

<details>

@@ -87,7 +105,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -95,15 +113,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
---
@@ -15,6 +15,7 @@ conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install datasets soundfile librosa # required by audio processing
```
@@ -23,23 +24,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install datasets soundfile librosa # required by audio processing
```

-### 2. Configures OneAPI environment variables
-
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

-#### 2.2 Configurations for Windows
-
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux
@@ -50,6 +54,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -61,11 +66,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
#### 3.2 Configurations for Windows

<details>

@@ -80,7 +97,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +105,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

```
python ./recognize.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
```
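The runtime configurations above can also be staged from Python before the model libraries are imported, since the SYCL runtime reads these variables at initialization. A minimal sketch using the variable names documented above — the helper name is ours, not part of ipex-llm:

```python
import os

def apply_igpu_runtime_config() -> dict:
    """Set the recommended Intel iGPU environment variables if not already set.

    Must run before torch / ipex-llm are imported; setdefault keeps any value
    the user already exported in the shell.
    """
    defaults = {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    }
    applied = {}
    for name, value in defaults.items():
        applied[name] = os.environ.setdefault(name, value)
    return applied

config = apply_igpu_runtime_config()
print(config)
```

Shell-exported values still take precedence, so this does not override an existing configuration.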
---
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ

In the example [generate.py](./generate.py), we show a basic use case for a Yi model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install

#### 1.1 Installation on Linux

-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:

```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yi-6B to conduct generation
```
@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yi-6B to conduct generation
```

-### 2. Configures OneAPI environment variables
-
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

-#### 2.2 Configurations for Windows
-
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux
@@ -51,9 +53,9 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```

</details>

<details>
@@ -63,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
#### 3.2 Configurations for Windows

<details>

@@ -82,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -90,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
---
@@ -15,31 +15,37 @@ We suggest using conda to manage environment:
conda create -n llm python=3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
-pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
+pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yuan2 to conduct generation pip install einops # additional package required for Yuan2 to conduct generation
pip install pandas # additional package required for Yuan2 to conduct generation pip install pandas # additional package required for Yuan2 to conduct generation
``` ```
#### 1.2 Installation on Windows #### 1.2 Installation on Windows
We suggest using conda to manage environment: We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yuan2 to conduct generation pip install einops # additional package required for Yuan2 to conduct generation
``` ```
-### 2. Configures OneAPI environment variables
+### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
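For APT or offline oneAPI installs it is easy to forget to source `setvars.sh`. A stdlib-only sanity check sketch; it treats the presence of `ONEAPI_ROOT` as the signal that `setvars.sh` has run in this shell (the exact variable name is an assumption, not an official contract):

```python
import os

def oneapi_env_active(env=None) -> bool:
    """Best-effort check that oneAPI's setvars.sh has been sourced."""
    env = os.environ if env is None else env
    # setvars.sh exports ONEAPI_ROOT among other variables (assumption).
    return "ONEAPI_ROOT" in env

# A shell where setvars.sh was sourced vs. a bare one:
print(oneapi_env_active({"ONEAPI_ROOT": "/opt/intel/oneapi"}))  # True
print(oneapi_env_active({}))                                    # False
```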
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -50,6 +56,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -61,10 +68,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -79,7 +99,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +107,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
View file
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Aquila2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Aquila2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
-### 2. Configures OneAPI environment variables
+### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
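The per-device recommendations above can be encoded as plain data for setup scripts. A sketch; the device keys and groupings mirror the `<details>` sections of this README and are illustrative, not an official API (the tcmalloc `LD_PRELOAD` step for Max-series GPUs is omitted because it is not a simple key/value variable):

```python
# Recommended Linux runtime variables per device class, as data.
RECOMMENDED_ENV = {
    "arc_flex": {  # Intel Arc A-Series / Data Center GPU Flex (assumed grouping)
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
    "max": {  # Intel Data Center GPU Max (assumed grouping)
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
        "ENABLE_SDP_FUSION": "1",
    },
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def env_for(device: str) -> dict:
    """Return a copy of the recommended variables for one device class."""
    return dict(RECOMMENDED_ENV[device])

print(sorted(env_for("igpu")))
```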
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
View file
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Baichuan model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Baichuan model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
``` ```
@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
``` ```
-### 2. Configures OneAPI environment variables
+### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -62,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
View file
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Baichuan2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Baichuan2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers_stream_generator # additional package required for Baichuan2-7B-Chat to conduct generation pip install transformers_stream_generator # additional package required for Baichuan2-7B-Chat to conduct generation
``` ```
@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers_stream_generator # additional package required for Baichuan2-7B-Chat to conduct generation pip install transformers_stream_generator # additional package required for Baichuan2-7B-Chat to conduct generation
``` ```
-### 2. Configures OneAPI environment variables
+### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -62,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
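When scripting repeated runs, the same variables can be passed per-process instead of exported globally in the shell. A sketch; `generate.py` is the example script referenced by this README, and the helper name is illustrative:

```python
import os
import subprocess
import sys

def build_cmd_and_env(extra_env: dict):
    """Merge extra variables into a copy of the environment for one run."""
    env = dict(os.environ)  # copy, so the parent process is untouched
    env.update(extra_env)
    cmd = [sys.executable, "generate.py"]  # example script from this README
    return cmd, env

cmd, env = build_cmd_and_env({"SYCL_CACHE_PERSISTENT": "1"})
# On a machine with an Intel GPU and ipex-llm installed, launch it with:
# subprocess.run(cmd, env=env, check=True)
print(cmd[-1], env["SYCL_CACHE_PERSISTENT"])
```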
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -81,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -89,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
View file
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
In the example [synthesize_speech.py](./synthesize_speech.py), we show a basic use case for Bark model to synthesize speech based on the given text, with IPEX-LLM INT4 optimizations. In the example [synthesize_speech.py](./synthesize_speech.py), we show a basic use case for Bark model to synthesize speech based on the given text, with IPEX-LLM INT4 optimizations.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install scipy pip install scipy
``` ```
@@ -25,23 +23,26 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install scipy pip install scipy
``` ```
-### 2. Configures OneAPI environment variables
+### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -52,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -63,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -82,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -90,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
View file
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a BlueLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a BlueLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
-### 2. Configures OneAPI environment variables
+### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
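Because the Linux and Windows instructions diverge at this point, a wrapper script that automates setup can branch on the platform first. A minimal stdlib-only sketch:

```python
import platform

def config_section() -> str:
    """Pick which runtime-configuration section of this README applies."""
    system = platform.system()
    if system == "Linux":
        return "3.1 Configurations for Linux"
    if system == "Windows":
        return "3.2 Configurations for Windows"
    return "unsupported"

print(config_section())
```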
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
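For readers of this commit, here is a condensed, hypothetical sketch of what a `generate.py`-style example in these READMEs does. The prompt template and model id are illustrative assumptions, and the model-loading calls (the ipex-llm `AutoModel.from_pretrained(..., load_in_4bit=True)` pattern) are left commented out because they require the GPU environment prepared above:

```python
# Minimal sketch of a generate.py-style script; NOT the exact example code.
# The commented lines follow the ipex-llm transformers-style API these
# READMEs describe and need an Intel GPU environment to run.

def build_prompt(question: str) -> str:
    """Wrap a user question in a simple chat template (illustrative only --
    the real examples use each model's own prompt format)."""
    return f"Question: {question}\nAnswer:"

def main() -> str:
    prompt = build_prompt("What is AI?")
    # from ipex_llm.transformers import AutoModel   # assumed import path
    # from transformers import AutoTokenizer
    # model = AutoModel.from_pretrained("THUDM/chatglm2-6b",      # hypothetical model id
    #                                   load_in_4bit=True,        # IPEX-LLM INT4
    #                                   trust_remote_code=True).to("xpu")
    # tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b",
    #                                           trust_remote_code=True)
    # ids = tokenizer(prompt, return_tensors="pt").input_ids.to("xpu")
    # output = model.generate(ids, max_new_tokens=32)
    return prompt

if __name__ == "__main__":
    print(main())
```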
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a ChatGLM2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
+<details>
+<summary>For Intel iGPU</summary>
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+</details>
#### 3.2 Configurations for Windows
<details>
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash
@@ -132,13 +138,10 @@ Inference time: xxxx s
In the example [streamchat.py](./streamchat.py), we show a basic use case for a ChatGLM2 model to stream chat, with IPEX-LLM INT4 optimizations.
### 1. Install
#### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
@@ -148,21 +151,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -173,6 +179,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -184,10 +191,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
+<details>
+<summary>For Intel iGPU</summary>
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+</details>
#### 3.2 Configurations for Windows
<details>
@@ -202,7 +222,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -210,15 +230,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
**Stream Chat using `stream_chat()` API**:
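The `stream_chat()` flow that this section's example exercises prints each newly generated fragment as it arrives. A minimal sketch of that printing loop, with a stand-in generator in place of the real model call (which needs the GPU environment above); the generator and its canned text are purely illustrative:

```python
from typing import Iterator

def fake_stream_chat(prompt: str) -> Iterator[str]:
    """Stand-in for a stream_chat() call: yields the growing response text,
    the way ChatGLM-style streaming APIs do."""
    words = ["AI", " stands", " for", " Artificial", " Intelligence."]
    partial = ""
    for w in words:
        partial += w
        yield partial

def run_stream(prompt: str) -> str:
    printed = ""
    for response in fake_stream_chat(prompt):
        new_text = response[len(printed):]  # emit only the newly generated part
        print(new_text, end="", flush=True)
        printed = response
    print()
    return printed
```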
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a ChatGLM3 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
+<details>
+<summary>For Intel iGPU</summary>
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+</details>
#### 3.2 Configurations for Windows
<details>
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash
@@ -131,13 +137,10 @@ AI stands for Artificial Intelligence. It refers to the development of computer
In the example [streamchat.py](./streamchat.py), we show a basic use case for a ChatGLM3 model to stream chat, with IPEX-LLM INT4 optimizations.
### 1. Install
#### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
@@ -147,21 +150,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -172,6 +178,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -183,10 +190,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
+<details>
+<summary>For Intel iGPU</summary>
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+</details>
#### 3.2 Configurations for Windows
<details>
@@ -201,7 +221,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -209,15 +229,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
**Stream Chat using `stream_chat()` API**:
```
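Taken together, the Linux runtime hunks this commit touches amount to three exports for an Arc A-Series dGPU. A sketch that applies them from Python before importing ipex-llm (the iGPU case would set `BIGDL_LLM_XMX_DISABLED=1` instead, as the added `<details>` blocks show):

```python
import os

# The Linux runtime settings these READMEs recommend for an Intel Arc
# A-Series dGPU, applied programmatically before importing ipex-llm.
ARC_RUNTIME_ENV = {
    "USE_XETLA": "OFF",
    "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
    "SYCL_CACHE_PERSISTENT": "1",  # the variable this commit adds throughout
}

def apply_runtime_env(env: dict = ARC_RUNTIME_ENV) -> None:
    """Copy the recommended settings into the process environment."""
    os.environ.update(env)

apply_runtime_env()
```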
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a CodeLlama model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher version of transformers
```
@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher version of transformers
```
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -62,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
+<details>
+<summary>For Intel iGPU</summary>
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+</details>
#### 3.2 Configurations for Windows
<details>
@@ -81,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -89,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash
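Since the Windows install path now pulls the oneAPI runtime in through pip as well, a quick way to confirm what actually landed in the environment is to query package metadata. A small standard-library-only sketch (package names taken from the pip commands above; which ones are present will vary by platform):

```python
from importlib import metadata

def installed_version(pkg: str) -> str:
    """Return the installed version of pkg, or a hint when it is absent."""
    try:
        return metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return "not installed"

# Distributions referenced by the install commands in these READMEs.
for pkg in ("ipex-llm", "intel-extension-for-pytorch", "dpcpp-cpp-rt", "transformers"):
    print(f"{pkg}: {installed_version(pkg)}")
```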
@@ -8,17 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a DeciLM-7B model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm
-# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
-# you can install specific ipex/torch version for your need
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.35.2 # required by DeciLM-7B
```
@@ -27,20 +23,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -51,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -62,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
+<details>
+<summary>For Intel iGPU</summary>
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+</details>
#### 3.2 Configurations for Windows
<details>
@@ -81,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -89,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
-<details>
-<summary>For other Intel dGPU Series</summary>
-There is no need to set further environment variables.
-</details>
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
@@ -8,15 +8,11 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Deepseek model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
conda activate llm
-# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
-# you can install specific ipex/torch version for your need
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
@@ -25,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
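After the pip step above, a one-line import check confirms the package resolved in the active environment. This is purely illustrative; "ok" requires that `ipex-llm[xpu]` installed cleanly, and the check does not validate GPU drivers:

```shell
#!/bin/sh
# Verify that ipex-llm imports in the active conda environment.
if python -c "import ipex_llm" 2>/dev/null; then
    status=ok
else
    status=missing
fi
echo "ipex-llm: $status"
```

If this prints `missing`, re-activate the `llm` environment and repeat the install command before moving on to the runtime configuration.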
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
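Sourcing `setvars.sh` normally defines `ONEAPI_ROOT` in the current shell, which gives a cheap way to tell whether the step succeeded. The check below is illustrative (the variable name is the toolkit's convention, and a pip-installed oneAPI does not need `setvars.sh` at all):

```shell
#!/bin/sh
# After `source /opt/intel/oneapi/setvars.sh`, ONEAPI_ROOT is normally set.
if [ -n "${ONEAPI_ROOT:-}" ]; then
    oneapi_state="active ($ONEAPI_ROOT)"
else
    oneapi_state="not configured"
fi
echo "oneAPI environment: $oneapi_state"
```

Because `setvars.sh` modifies the current shell, remember to source it (not execute it) in every new terminal before running the examples.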
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@ -50,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -61,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@ -80,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@ -88,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash


@ -9,14 +9,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [recognize.py](./recognize.py), we show a basic use case for a Distil-Whisper model to conduct transcription using `pipeline()` API for long audio input, with IPEX-LLM INT4 optimizations. In the example [recognize.py](./recognize.py), we show a basic use case for a Distil-Whisper model to conduct transcription using `pipeline()` API for long audio input, with IPEX-LLM INT4 optimizations.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install datasets soundfile librosa # required by audio processing pip install datasets soundfile librosa # required by audio processing
``` ```
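The audio examples depend on all three extra packages from the pip command above, and a partial install fails only at runtime. A short report script can surface that earlier; this is a sketch that assumes a `python` (or `python3`) interpreter on `PATH`:

```shell
#!/bin/sh
# Report which audio-processing dependencies currently import;
# re-run the pip install step for any module marked "missing".
PYBIN=$(command -v python || command -v python3)
audio_report=$("$PYBIN" - <<'PY'
for mod in ("datasets", "soundfile", "librosa"):
    try:
        __import__(mod)
        print(mod, "ok")
    except Exception:
        print(mod, "missing")
PY
)
echo "$audio_report"
```

Each line names one dependency followed by `ok` or `missing`, so the output doubles as a checklist against the install command.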
@ -25,22 +24,26 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install datasets soundfile librosa # required by audio processing pip install datasets soundfile librosa # required by audio processing
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@ -51,6 +54,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -62,11 +66,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@ -81,7 +97,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@ -89,15 +105,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
``` ```


@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Dolly v1 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Dolly v1 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash


@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Dolly v2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Dolly v2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash


@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Flan-t5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Flan-t5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash


@ -21,21 +21,24 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
``` ```


@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Llama2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Llama2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
@ -24,30 +21,35 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
<details> <details>
<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary> <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -59,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -78,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -86,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a LLaVA model to start a multi-turn chat centered around an image using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a LLaVA model to start a multi-turn chat centered around an image using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # install dependencies required by llava pip install einops # install dependencies required by llava
pip install transformers==4.36.2 pip install transformers==4.36.2
@@ -31,8 +29,12 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # install dependencies required by llava pip install einops # install dependencies required by llava
pip install transformers==4.36.2 pip install transformers==4.36.2
@@ -40,29 +42,30 @@ git clone https://github.com/haotian-liu/LLaVA.git # clone the llava library git clone https://github.com/haotian-liu/LLaVA.git # clone the llava library
copy generate.py .\LLaVA\ # copy our example to the LLaVA folder copy generate.py .\LLaVA\ # copy our example to the LLaVA folder
cd LLaVA # change the working directory to the LLaVA folder cd LLaVA # change the working directory to the LLaVA folder
git checkout tags/v1.2.0 -b 1.2.0 # Get the branch which is compatible with transformers 4.36 git checkout tags/v1.2.0 -b 1.2.0 # Get the branch which is compatible with transformers 4.36
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
<details> <details>
<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary> <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -74,11 +77,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -93,7 +108,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -101,15 +116,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
@@ -7,33 +7,107 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
## Example: Predict Tokens using `generate()` API ## Example: Predict Tokens using `generate()` API
In the example [generate.py](./generate.py), we show a basic use case for a Mamba model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Mamba model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). #### 1.1 Installation on Linux
We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
# you can install specific ipex/torch version for your need
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # package required by Mamba pip install einops # package required by Mamba
``` ```
### 2. Configures OneAPI environment variables #### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # package required by Mamba
```
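Before moving on, it can help to verify that the packages installed above actually landed in the active conda environment. The helper below is a hypothetical sketch (not part of ipex-llm); the package names are taken from the install commands above:

```python
import importlib.util

# Hypothetical sanity-check helper (not part of ipex-llm): the package
# names below are the ones installed by the commands above.
REQUIRED = ("torch", "intel_extension_for_pytorch", "ipex_llm", "einops")

def missing_packages(packages=REQUIRED):
    """Return the subset of `packages` that cannot be imported."""
    return [p for p in packages if importlib.util.find_spec(p) is None]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

If any package is reported missing, re-run the corresponding `pip install` command in the activated `llm` environment before continuing.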
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
### 3. Run ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
<details>
For optimal performance on Arc, it is recommended to set several environment variables. <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details>
<details>
<summary>For Intel Data Center GPU Max Series</summary>
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
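The per-device variables above can be collected in one place. The sketch below is a hypothetical helper: the device keys are invented for this example, while the variable names and values come from the sections above. `LD_PRELOAD` for the Max Series is omitted because its value depends on the conda prefix, and note that these variables must be set before the GPU runtime is initialized (e.g. before importing torch):

```python
import os

# Hypothetical helper summarising the Linux runtime settings above.
# The device keys ("arc_or_flex", "max_series", "igpu") are made up for
# this sketch; the variable names come from the README sections above.
RUNTIME_ENV = {
    "arc_or_flex": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
    "max_series": {
        # LD_PRELOAD of libtcmalloc.so is omitted: its path depends on
        # the conda prefix, so set it in the shell instead.
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
        "ENABLE_SDP_FUSION": "1",
    },
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def apply_runtime_env(device: str) -> dict:
    """Set the recommended variables for `device` and return them."""
    env = RUNTIME_ENV[device]
    os.environ.update(env)
    return env
```

Calling `apply_runtime_env("igpu")` at the very top of a script (before importing torch or ipex_llm) has the same effect as exporting the variables in the shell beforehand.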
#### 3.2 Configurations for Windows
<details>
<summary>For Intel iGPU</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
set BIGDL_LLM_XMX_DISABLED=1
```
</details>
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
```
</details>
> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash ```bash
python ./generate.py python ./generate.py
``` ```
@@ -45,7 +119,7 @@ In the example, several arguments can be passed to satisfy your requirements:
- `--prompt PROMPT`: argument defining the prompt to be inferred (with integrated prompt format for chat). It defaults to `'What is AI?'`. - `--prompt PROMPT`: argument defining the prompt to be inferred (with integrated prompt format for chat). It defaults to `'What is AI?'`.
- `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It defaults to `32`. - `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It defaults to `32`.
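For illustration, the two arguments above could be wired with `argparse` roughly as follows; this is only a sketch, and the actual parser in generate.py may differ:

```python
import argparse

def build_parser():
    # Hypothetical reconstruction of the CLI described above; the real
    # generate.py may define additional arguments.
    parser = argparse.ArgumentParser(
        description="Predict the next N tokens with the generate() API")
    parser.add_argument("--prompt", type=str, default="What is AI?",
                        help="prompt to be inferred")
    parser.add_argument("--n-predict", type=int, default=32,
                        help="max number of tokens to predict")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"prompt={args.prompt!r}, n_predict={args.n_predict}")
```

`argparse` maps `--n-predict` to the attribute `args.n_predict`, so the defaults shown above are what the script uses when no flags are passed.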
#### 2.3 Sample Output #### 4.3 Sample Output
#### [state-spaces/mamba-1.4b](https://huggingface.co/state-spaces/mamba-1.4b) #### [state-spaces/mamba-1.4b](https://huggingface.co/state-spaces/mamba-1.4b)
```log ```log
Inference time: xxxx s Inference time: xxxx s
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Mistral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Mistral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,21 +26,27 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
# Refer to https://huggingface.co/mistralai/Mistral-7B-v0.1#troubleshooting, please make sure you are using a stable version of Transformers, 4.34.0 or newer.
pip install transformers==4.34.0 pip install transformers==4.34.0
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -54,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -65,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -84,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -92,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Mixtral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a Mixtral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -36,26 +36,28 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
pip install transformers==4.36.0 pip install transformers==4.36.0
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
<details> <details>
<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary> <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -67,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -86,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -94,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
```bash ```bash
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
In the example [generate.py](./generate.py), we show a basic use case for a phi-1_5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs. In the example [generate.py](./generate.py), we show a basic use case for a phi-1_5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install ### 1. Install
#### 1.1 Installation on Linux #### 1.1 Installation on Linux
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#). We suggest using conda to manage environment:
After installing conda, create a Python environment for IPEX-LLM:
```bash ```bash
conda create -n llm python=3.11 # recommend to use Python 3.11 conda create -n llm python=3.11
conda activate llm conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phi-1_5 to conduct generation pip install einops # additional package required for phi-1_5 to conduct generation
``` ```
@@ -24,22 +23,26 @@ We suggest using conda to manage environment:
```bash ```bash
conda create -n llm python=3.11 libuv conda create -n llm python=3.11 libuv
conda activate llm conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phi-1_5 to conduct generation pip install einops # additional package required for phi-1_5 to conduct generation
``` ```
### 2. Configures OneAPI environment variables ### 2. Configures OneAPI environment variables for Linux
#### 2.1 Configurations for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
```bash ```bash
source /opt/intel/oneapi/setvars.sh source /opt/intel/oneapi/setvars.sh
``` ```
#### 2.2 Configurations for Windows
```cmd
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
```
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations ### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device. For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux #### 3.1 Configurations for Linux
@@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash ```bash
export USE_XETLA=OFF export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
``` ```
</details> </details>
@@ -61,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash ```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1 export ENABLE_SDP_FUSION=1
``` ```
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`. > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details> </details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows #### 3.2 Configurations for Windows
<details> <details>
@@ -79,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details> <details>
<summary>For Intel Arc™ A300-Series or Pro A60</summary> <summary>For Intel Arc™ A-Series Graphics</summary>
```cmd ```cmd
set SYCL_CACHE_PERSISTENT=1 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details> </details>
<details> > [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
<summary>For other Intel dGPU Series</summary>
There is no need to set further environment variables.
</details>
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples ### 4. Running examples
``` ```
@@ -8,36 +8,39 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
In the example [generate.py](./generate.py), we show a basic use case for a phi-2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phi-2 to conduct generation
```
#### 1.2 Installation on Windows
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
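The `setvars.sh` step only applies when oneAPI was installed system-wide. As a minimal sketch (our own illustration, not part of this README; the `ONEAPI_SETVARS` and `ONEAPI_SOURCE` names are assumptions), the sourcing can be guarded so the same shell profile also works on machines where the oneAPI runtime came in through pip wheels:

```shell
# Minimal sketch: only source the oneAPI script when a system-wide
# (APT/offline) install is actually present.
ONEAPI_SETVARS="/opt/intel/oneapi/setvars.sh"   # default APT/offline install path

if [ -f "$ONEAPI_SETVARS" ]; then
    # shellcheck disable=SC1090
    . "$ONEAPI_SETVARS"
    ONEAPI_SOURCE="system"   # environment configured by setvars.sh
else
    ONEAPI_SOURCE="pip"      # rely on pip-installed oneAPI runtime libraries
fi
echo "oneAPI environment source: $ONEAPI_SOURCE"
```

Sourcing this from `~/.bashrc` (or a per-project activation script) avoids remembering which kind of oneAPI install a given machine has.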
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -48,7 +51,9 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
<details>
@@ -58,10 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
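Across the device-specific blocks above, only the set of exported variables differs. As an illustrative sketch (the function name and device labels are our own assumptions, not part of this README), the exports can be folded into one helper that is sourced once:

```shell
# Hypothetical helper: pick the runtime variables by device type instead of
# retyping the export lines. The exported variables come from the README
# sections above; the labels arc/flex/max/igpu are our own shorthand.
set_ipex_llm_env() {
    case "$1" in
        arc|flex)  # Intel Arc A-Series / Data Center GPU Flex Series
            export USE_XETLA=OFF
            export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
            export SYCL_CACHE_PERSISTENT=1
            ;;
        max)       # assumed to map to the tcmalloc/SDP-fusion block above
            export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
            export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
            export SYCL_CACHE_PERSISTENT=1
            export ENABLE_SDP_FUSION=1
            ;;
        igpu)      # Intel integrated GPU
            export SYCL_CACHE_PERSISTENT=1
            export BIGDL_LLM_XMX_DISABLED=1
            ;;
        *)
            echo "usage: set_ipex_llm_env {arc|flex|max|igpu}" >&2
            return 1
            ;;
    esac
}

set_ipex_llm_env igpu
echo "SYCL_CACHE_PERSISTENT=$SYCL_CACHE_PERSISTENT BIGDL_LLM_XMX_DISABLED=$BIGDL_LLM_XMX_DISABLED"
```

Calling `set_ipex_llm_env arc` (for example) before launching an example then replaces the manual `export` lines for that device.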
#### 3.2 Configurations for Windows
<details>
@@ -76,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> The first time each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
In the example [generate.py](./generate.py), we show a basic use case for a phixtral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phixtral to conduct generation
```
@@ -24,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for phixtral to conduct generation
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -61,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -79,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> The first time each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [chat.py](./chat.py), we show a basic use case for a Qwen-VL model to start a multimodal chat using `chat()` API, with IPEX-LLM `optimize_model` API on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional packages required for Qwen-VL-Chat to conduct generation
```
@@ -24,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional packages required for Qwen-VL-Chat to conduct generation
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -61,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> The first time each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.37.0 # install transformers which supports Qwen2
```
@@ -22,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.37.2 # install transformers which supports Qwen2
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> The first time each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Replit model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install "transformers<4.35"
```
@@ -25,21 +23,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -47,13 +48,10 @@ For optimal performance, it is recommended to set several environment variables.
<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -65,10 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -83,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -91,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> The first time each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a SOLAR model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.35.2 # required by SOLAR
```
@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.35.2 # required by SOLAR
```
### 2. Configures OneAPI environment variables for Linux
> [!NOTE]
> Skip this step if you are running on Windows.
This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.
```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
@@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```
</details>
@@ -62,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>
<details>
<summary>For Intel iGPU</summary>
```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```
</details>
#### 3.2 Configurations for Windows
<details>
@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
<details>
<summary>For Intel Arc™ A-Series Graphics</summary>
```cmd
set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
</details>
> [!NOTE]
> The first time each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### 4. Running examples
```bash
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
In the example [synthesize_speech.py](./synthesize_speech.py), we show a basic use case for SpeechT5 model to synthesize speech based on the given text, with IPEX-LLM INT4 optimizations.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install "datasets<2.18" soundfile # additional packages required for SpeechT5 to conduct generation
```
@@ -25,23 +23,26 @@ We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install "datasets<2.18" soundfile # additional packages required for SpeechT5 to conduct generation
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux only for oneAPI installed via APT or the offline installer; skip it for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
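Since sourcing is only needed for APT/offline oneAPI installs, a launch script can guard the call instead of sourcing unconditionally. A sketch, assuming bash and the default `/opt/intel/oneapi` install path (the `maybe_source_oneapi` helper name is ours):

```shell
#!/usr/bin/env bash
# Source oneAPI setvars.sh only when it exists and has not already run;
# pip-installed oneAPI needs no sourcing at all.
maybe_source_oneapi() {
    if [ -n "${SETVARS_COMPLETED:-}" ]; then
        echo "already-sourced"
    elif [ -f /opt/intel/oneapi/setvars.sh ]; then
        # shellcheck disable=SC1091
        source /opt/intel/oneapi/setvars.sh && echo "sourced"
    else
        echo "pip-install-or-missing"
    fi
}
RESULT="$(maybe_source_oneapi)"
echo "$RESULT"
```

`SETVARS_COMPLETED` is set by `setvars.sh` itself after a successful run, which makes the guard idempotent across nested scripts.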
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -52,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -63,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>
<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
#### 3.2 Configurations for Windows

<details>

@@ -82,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -90,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

---
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ

In the example [generate.py](./generate.py), we show a basic use case for a StableLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install

#### 1.1 Installation on Linux

We suggest using conda to manage the environment:

```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -27,6 +24,9 @@ We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -34,17 +34,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte

pip install transformers==4.38.0
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux only for oneAPI installed via APT or the offline installer; skip it for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
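After installation and environment setup, it can help to verify that the XPU device is actually visible before running the examples. A sketch that degrades gracefully when the XPU stack is absent (the `probe_xpu` name is ours, not an ipex-llm API):

```python
def probe_xpu():
    """Return True when a PyTorch XPU device is usable, False otherwise."""
    try:
        # intel_extension_for_pytorch registers the 'xpu' backend on import;
        # recent torch builds may also ship torch.xpu natively.
        import torch
    except ImportError:
        return False
    xpu = getattr(torch, "xpu", None)
    return bool(xpu is not None and xpu.is_available())

print("XPU available:", probe_xpu())
```

A `False` result on a machine with an Intel GPU usually points at a missing driver or an environment where `setvars.sh` has not been sourced.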
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -55,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -66,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>
<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
#### 3.2 Configurations for Windows

<details>

@@ -85,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -93,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

---
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ

In the example [generate.py](./generate.py), we show a basic use case for a StarCoder model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install

#### 1.1 Installation on Linux

We suggest using conda to manage the environment:

```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux only for oneAPI installed via APT or the offline installer; skip it for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
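Because Linux shells use `export` while Windows CMD uses `set`, any tooling that emits these settings has to pick the right syntax for the target shell. A small illustrative generator (the `env_line` helper is ours, not part of ipex-llm):

```python
import platform

def env_line(key, value, system=None):
    """Render one environment-variable assignment for the target shell."""
    system = system or platform.system()
    if system == "Windows":
        return f"set {key}={value}"     # cmd.exe syntax
    return f"export {key}={value}"      # POSIX shell syntax

print(env_line("SYCL_CACHE_PERSISTENT", "1", system="Linux"))
# -> export SYCL_CACHE_PERSISTENT=1
```

When `system` is omitted, the helper falls back to the current platform, so the same script works in both environments.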
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>
<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
#### 3.2 Configurations for Windows

<details>

@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

---
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ

In the example [generate.py](./generate.py), we show a basic use case for a Yi model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install

#### 1.1 Installation on Linux

We suggest using conda to manage the environment:

```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yi-6B to conduct generation
```
@@ -25,31 +23,37 @@ We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yi-6B to conduct generation
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux only for oneAPI installed via APT or the offline installer; skip it for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

<details>

<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>
@@ -61,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>
<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
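The Linux recommendations above differ by device class, so launch scripts can collect them into a single lookup to stay in sync with this README. The device keys and the `settings_for` helper below are illustrative, not an ipex-llm API:

```python
# Per-device Linux settings as listed in this README (illustrative only).
LINUX_SETTINGS = {
    "arc_or_flex": {  # Intel Arc A-Series / Data Center GPU Flex Series
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
    "igpu": {  # Intel integrated GPU
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def settings_for(device):
    """Look up the recommended settings, failing loudly on unknown devices."""
    try:
        return LINUX_SETTINGS[device]
    except KeyError:
        raise ValueError(f"unknown device class: {device!r}") from None

print(settings_for("igpu"))
```

Failing loudly on an unknown key is deliberate: silently running without the recommended variables is exactly the misconfiguration the README is trying to prevent.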
#### 3.2 Configurations for Windows

<details>

@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

---
@@ -10,38 +10,42 @@ In addition, you need to modify some files in Yuan2-2B-hf folder, since Flash at

In the example [generate.py](./generate.py), we show a basic use case for a Yuan2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install

#### 1.1 Installation on Linux

We suggest using conda to manage the environment:

```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yuan2 to conduct generation
pip install pandas # additional package required for Yuan2 to conduct generation
```
#### 1.2 Installation on Windows

We suggest using conda to manage the environment:

```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install einops # additional package required for Yuan2 to conduct generation
```
### 2. Configure OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux only for oneAPI installed via APT or the offline installer; skip it for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@@ -49,12 +53,12 @@ For optimal performance, it is recommended to set several environment variables.

<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

<details>
@@ -64,10 +68,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```

> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>
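The `LD_PRELOAD` line above appends `libtcmalloc.so` from the active conda environment; the same value can be computed in Python when composing a launcher script. A sketch, assuming `CONDA_PREFIX` is set by conda (the `tcmalloc_preload` helper is ours):

```python
import os

def tcmalloc_preload(existing=None, prefix=None):
    """Build the LD_PRELOAD value the README sets via shell.

    Unlike the raw shell line, this skips the leading colon when
    LD_PRELOAD starts out empty.
    """
    existing = os.environ.get("LD_PRELOAD", "") if existing is None else existing
    prefix = os.environ.get("CONDA_PREFIX", "") if prefix is None else prefix
    lib = os.path.join(prefix, "lib", "libtcmalloc.so")
    return f"{existing}:{lib}" if existing else lib

print(tcmalloc_preload(existing="", prefix="/opt/conda"))
# -> /opt/conda/lib/libtcmalloc.so
```

Note that the computed value only takes effect for child processes launched after it is written into their environment; `LD_PRELOAD` cannot retroactively affect the already-running interpreter.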
<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
#### 3.2 Configurations for Windows

<details>

@@ -82,7 +99,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -90,15 +107,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples