GPU configuration update for examples (windows pip installer, etc.) (#10762)
* renew chatglm3-6b gpu example readme fix fix fix * fix for comments * fix * fix * fix * fix * fix * apply on HF-Transformers-AutoModels * apply on PyTorch-Models * fix * fix
This commit is contained in:
parent
1bd431976d
commit
73a67804a4
74 changed files with 2253 additions and 1532 deletions
|
|
@ -21,27 +21,30 @@ conda activate llm
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 1.2 Installation on Windows
|
#### 1.2 Installation on Windows
|
||||||
We suggest using conda to manage environment:
|
We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
||||||
#### 3.1 Configurations for Linux
|
#### 3.1 Configurations for Linux
|
||||||
|
|
@ -52,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.
|
||||||
```bash
|
```bash
|
||||||
export USE_XETLA=OFF
|
export USE_XETLA=OFF
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
```
|
```
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
@ -63,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
```bash
|
```bash
|
||||||
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
export ENABLE_SDP_FUSION=1
|
export ENABLE_SDP_FUSION=1
|
||||||
```
|
```
|
||||||
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
<summary>For Intel iGPU</summary>
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
|
export BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
```
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
#### 3.2 Configurations for Windows
|
#### 3.2 Configurations for Windows
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
|
|
@ -82,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
<summary>For Intel Arc™ A300-Series or Pro A60</summary>
|
<summary>For Intel Arc™ A-Series Graphics</summary>
|
||||||
|
|
||||||
```cmd
|
```cmd
|
||||||
set SYCL_CACHE_PERSISTENT=1
|
set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
@ -90,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -27,22 +27,24 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
||||||
#### 3.1 Configurations for Linux
|
#### 3.1 Configurations for Linux
|
||||||
|
|
@ -53,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.
|
||||||
```bash
|
```bash
|
||||||
export USE_XETLA=OFF
|
export USE_XETLA=OFF
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
```
|
```
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
@ -64,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
```bash
|
```bash
|
||||||
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
export ENABLE_SDP_FUSION=1
|
export ENABLE_SDP_FUSION=1
|
||||||
```
|
```
|
||||||
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
<summary>For Intel iGPU</summary>
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
|
export BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
```
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
#### 3.2 Configurations for Windows
|
#### 3.2 Configurations for Windows
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
|
|
@ -83,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
<summary>For Intel Arc™ A300-Series or Pro A60</summary>
|
<summary>For Intel Arc™ A-Series Graphics</summary>
|
||||||
|
|
||||||
```cmd
|
```cmd
|
||||||
set SYCL_CACHE_PERSISTENT=1
|
set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
@ -91,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -14,6 +14,7 @@ conda create -n llm python=3.11
|
||||||
conda activate llm
|
conda activate llm
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
|
|
||||||
pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
|
pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -22,23 +23,26 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
|
|
||||||
pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
|
pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
||||||
#### 3.1 Configurations for Linux
|
#### 3.1 Configurations for Linux
|
||||||
|
|
@ -49,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
|
||||||
```bash
|
```bash
|
||||||
export USE_XETLA=OFF
|
export USE_XETLA=OFF
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
```
|
```
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
@ -60,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
```bash
|
```bash
|
||||||
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
export ENABLE_SDP_FUSION=1
|
export ENABLE_SDP_FUSION=1
|
||||||
```
|
```
|
||||||
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
<summary>For Intel iGPU</summary>
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
|
export BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
```
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
#### 3.2 Configurations for Windows
|
#### 3.2 Configurations for Windows
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
|
|
@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
<summary>For Intel Arc™ A300-Series or Pro A60</summary>
|
<summary>For Intel Arc™ A-Series Graphics</summary>
|
||||||
|
|
||||||
```cmd
|
```cmd
|
||||||
set SYCL_CACHE_PERSISTENT=1
|
set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -14,6 +14,7 @@ conda create -n llm python=3.11
|
||||||
conda activate llm
|
conda activate llm
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
|
|
||||||
pip install transformers_stream_generator # additional package required for Baichuan-7B-Chat to conduct generation
|
pip install transformers_stream_generator # additional package required for Baichuan-7B-Chat to conduct generation
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -22,23 +23,26 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
|
|
||||||
pip install transformers_stream_generator # additional package required for Baichuan-7B-Chat to conduct generation
|
pip install transformers_stream_generator # additional package required for Baichuan-7B-Chat to conduct generation
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
||||||
#### 3.1 Configurations for Linux
|
#### 3.1 Configurations for Linux
|
||||||
|
|
@ -49,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
|
||||||
```bash
|
```bash
|
||||||
export USE_XETLA=OFF
|
export USE_XETLA=OFF
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
```
|
```
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
@ -60,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
```bash
|
```bash
|
||||||
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
export ENABLE_SDP_FUSION=1
|
export ENABLE_SDP_FUSION=1
|
||||||
```
|
```
|
||||||
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
<summary>For Intel iGPU</summary>
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
|
export BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
```
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
#### 3.2 Configurations for Windows
|
#### 3.2 Configurations for Windows
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
|
|
@ -79,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
<summary>For Intel Arc™ A300-Series or Pro A60</summary>
|
<summary>For Intel Arc™ A-Series Graphics</summary>
|
||||||
|
|
||||||
```cmd
|
```cmd
|
||||||
set SYCL_CACHE_PERSISTENT=1
|
set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
@ -87,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -21,22 +21,24 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
||||||
#### 3.1 Configurations for Linux
|
#### 3.1 Configurations for Linux
|
||||||
|
|
@ -47,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
|
||||||
```bash
|
```bash
|
||||||
export USE_XETLA=OFF
|
export USE_XETLA=OFF
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
```
|
```
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
@ -58,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
```bash
|
```bash
|
||||||
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
export ENABLE_SDP_FUSION=1
|
export ENABLE_SDP_FUSION=1
|
||||||
```
|
```
|
||||||
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
<summary>For Intel iGPU</summary>
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
|
export BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
```
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
#### 3.2 Configurations for Windows
|
#### 3.2 Configurations for Windows
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
|
|
@ -77,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
<summary>For Intel Arc™ A300-Series or Pro A60</summary>
|
<summary>For Intel Arc™ A-Series Graphics</summary>
|
||||||
|
|
||||||
```cmd
|
```cmd
|
||||||
set SYCL_CACHE_PERSISTENT=1
|
set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
@ -85,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|
|
||||||
|
|
@ -17,25 +17,29 @@ conda activate llm
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 1.2 Installation on Windows
|
#### 1.2 Installation on Windows
|
||||||
We suggest using conda to manage environment:
|
We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
||||||
|
|
@ -47,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
|
||||||
```bash
|
```bash
|
||||||
export USE_XETLA=OFF
|
export USE_XETLA=OFF
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -58,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -77,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -85,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
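The same runtime settings recur throughout these hunks: `USE_XETLA`, `SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS`, `SYCL_CACHE_PERSISTENT`, and, for iGPU, `BIGDL_LLM_XMX_DISABLED`. As a rough illustration of how the per-device `export` blocks above could be applied from a launcher script, here is a hedged Python sketch — the grouping and helper are ours for illustration, not part of the ipex-llm API, and the variables must be set before the XPU runtime is loaded:

```python
import os

# Recommended runtime settings from the README hunks above, keyed by
# device class. The grouping mirrors the documented advice; it is a
# sketch, not an official IPEX-LLM interface.
RUNTIME_ENV = {
    "arc_or_flex": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def apply_runtime_env(device_class: str) -> None:
    """Set the suggested variables before torch/ipex are imported."""
    for key, value in RUNTIME_ENV[device_class].items():
        os.environ[key] = value

apply_runtime_env("igpu")
print(os.environ["SYCL_CACHE_PERSISTENT"])  # -> 1
```

Setting the variables in the parent process (shell `export`/`set`, as the READMEs do) is equivalent; this form is only convenient when one wrapper script launches examples on different device classes.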
@@ -22,31 +22,35 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
-
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 <details>
 
 <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>
 
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -58,11 +62,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -77,7 +93,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -85,15 +101,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 
 ```
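The revised section 2 distinguishes two oneAPI setups on Linux: an APT or offline install under `/opt/intel/oneapi`, which needs `source /opt/intel/oneapi/setvars.sh`, and a pip-installed runtime (`dpcpp-cpp-rt`, `mkl-dpcpp`, `onednn`), which needs no activation script. A minimal sketch of that decision, assuming the standard install prefix (the helper name and heuristic are ours, not part of ipex-llm):

```python
from pathlib import Path

def oneapi_needs_setvars(prefix: str = "/opt/intel/oneapi") -> bool:
    # APT/offline oneAPI installs ship an activation script under the
    # install prefix; a pip-installed runtime lives inside the active
    # Python environment and has no setvars.sh to source.
    return (Path(prefix) / "setvars.sh").is_file()

# On a machine without an APT/offline oneAPI install this returns False,
# i.e. no `source setvars.sh` step is required before running examples.
```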
@@ -149,20 +158,23 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 
@@ -174,6 +186,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -185,11 +198,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -204,7 +229,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -212,15 +237,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 
 **Stream Chat using `stream_chat()` API**:
@@ -21,20 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 
@@ -45,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -56,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -75,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -83,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 
 ```
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+
 pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher version of transformers
 ```
 
@@ -22,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher version of transformers
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 
@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 
 ```
@@ -8,38 +8,41 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a DeciLM-7B model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
-
+We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11
 conda activate llm
-# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
-# you can install specific ipex/torch version for your need
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install transformers==4.35.2 # required by DeciLM-7B
 ```
 
 #### 1.2 Installation on Windows
 We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install transformers==4.35.2 # required by DeciLM-7B
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 
@@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -61,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 
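Several of the example READMEs in this commit pin a specific `transformers` version after installing `ipex-llm` (4.34.1 for the CodeLlama tokenizer, 4.35.2 for DeciLM-7B). A tiny sketch restating those pins as data — the mapping and helper are illustrative only, not part of any package:

```python
# transformers versions pinned by the example READMEs in this diff;
# the dict itself is illustrative, not an ipex-llm API.
TRANSFORMERS_PINS = {
    "codellama": "4.34.1",  # CodeLlamaTokenizer is supported from newer transformers
    "decilm-7b": "4.35.2",  # required by DeciLM-7B
}

def pip_pin_command(model: str) -> str:
    """Build the pip command shown in the corresponding README."""
    return f"pip install transformers=={TRANSFORMERS_PINS[model]}"

print(pip_pin_command("decilm-7b"))  # -> pip install transformers==4.35.2
```

Keeping the pins in one place like this makes it easier to see that the per-model version requirements differ, which is why each README installs its own `transformers`.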
@ -12,8 +12,7 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11
|
conda create -n llm python=3.11
|
||||||
conda activate llm
|
conda activate llm
|
||||||
# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
# you can install specific ipex/torch version for your need
|
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -22,20 +21,24 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
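The Linux exports added above have to be set in every shell that launches an example. A small sourceable file keeps them together; this is a sketch of our own (the file name `ipex-llm-env.sh` is not part of the repo) for the Arc A-Series settings listed in the hunk:

```shell
# Collect the Arc A-Series runtime settings from the diff above into one
# sourceable file, so each new shell can pick them up with a single command.
cat > ipex-llm-env.sh <<'EOF'
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
EOF

# Source it before running any example script.
. ./ipex-llm-env.sh
echo "USE_XETLA=$USE_XETLA SYCL_CACHE_PERSISTENT=$SYCL_CACHE_PERSISTENT"
```

Sourcing (rather than executing) the file is what makes the exports visible to the example process started afterwards.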

@@ -9,14 +9,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [recognize.py](./recognize.py), we show a basic use case for a Distil-Whisper model to conduct transcription using `pipeline()` API for long audio input, with IPEX-LLM INT4 optimizations.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
+We suggest using conda to manage environment:
 
-After installing conda, create a Python environment for IPEX-LLM:
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install datasets soundfile librosa # required by audio processing
 ```
@@ -25,22 +24,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install datasets soundfile librosa # required by audio processing
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -51,6 +54,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -62,11 +66,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -81,7 +97,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -89,15 +105,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
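The reworded step 2 above now applies only to APT or offline oneAPI installs on Linux. A guard like the following (our own sketch; the path is the default APT/offline install location) avoids an error when oneAPI came from pip instead:

```shell
# Source oneAPI's setvars.sh only when it exists at the default APT/offline
# location; a pip-installed oneAPI needs no sourcing at all.
ONEAPI_SETVARS="/opt/intel/oneapi/setvars.sh"
if [ -f "$ONEAPI_SETVARS" ]; then
    . "$ONEAPI_SETVARS"
    echo "oneAPI environment sourced"
else
    echo "setvars.sh not found; assuming pip-installed oneAPI"
fi
```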

@@ -23,22 +23,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -60,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -79,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
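The Linux variable sets introduced above differ between Arc A-Series and iGPU. A tiny selector (our own convention; the `DEVICE` variable is hypothetical, not something the examples read) keeps the choice in one place:

```shell
# Export the Linux runtime variables from the hunks above according to a
# DEVICE selector: "arc" for Arc A-Series, "igpu" for integrated graphics.
DEVICE="${DEVICE:-igpu}"
if [ "$DEVICE" = "igpu" ]; then
    export SYCL_CACHE_PERSISTENT=1
    export BIGDL_LLM_XMX_DISABLED=1
else
    export USE_XETLA=OFF
    export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
    export SYCL_CACHE_PERSISTENT=1
fi
echo "configured for $DEVICE"
```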

@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```
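The Max-Series instructions above preload `libtcmalloc.so` from the conda environment. This sketch of ours verifies the library is actually present before exporting `LD_PRELOAD`, so a missing `gperftools` install fails loudly instead of silently preloading nothing:

```shell
# Check that libtcmalloc.so (installed via `conda install -c conda-forge -y
# gperftools=2.10`, per the note above) exists before preloading it.
LIBTCMALLOC="${CONDA_PREFIX:-/usr/local}/lib/libtcmalloc.so"
if [ -f "$LIBTCMALLOC" ]; then
    export LD_PRELOAD="${LD_PRELOAD}:${LIBTCMALLOC}"
    echo "tcmalloc will be preloaded"
else
    echo "libtcmalloc.so not found at $LIBTCMALLOC; install gperftools first"
fi
```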

@@ -15,6 +15,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install einops # additional package required for falcon-7b-instruct to conduct generation
 ```
@@ -23,8 +24,12 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install einops # additional package required for falcon-7b-instruct to conduct generation
 ```
@@ -53,18 +58,17 @@ print(f'tiiuae/falcon-7b-instruct checkpoint is downloaded to {model_path}')
 #### 2.2 Replace `modelling_RW.py`
 For `tiiuae/falcon-7b-instruct`, you should replace the `modelling_RW.py` with [falcon-7b-instruct/modelling_RW.py](./falcon-7b-instruct/modelling_RW.py).
 
-### 3. Configures OneAPI environment variables
-#### 3.1 Configurations for Linux
+### 3. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 3.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 4. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 4.1 Configurations for Linux
@@ -75,6 +79,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -86,11 +91,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 4.2 Configurations for Windows
 <details>
@@ -105,7 +122,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -113,15 +130,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 5. Running examples
 ```
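Step 2.2 in the falcon diff above replaces `modelling_RW.py` inside the downloaded checkpoint. A guarded copy like this sketch keeps a backup of the original file first; `MODEL_PATH` is a placeholder of ours, not a path the example defines — point it at the directory printed by the download step:

```shell
# Back up the checkpoint's original modelling_RW.py, then install the patched
# copy shipped with the example. MODEL_PATH is a hypothetical placeholder for
# the snapshot directory reported by the download step above.
MODEL_PATH="${MODEL_PATH:-./falcon-7b-instruct-checkpoint}"
PATCH="./falcon-7b-instruct/modelling_RW.py"
if [ -f "$PATCH" ] && [ -d "$MODEL_PATH" ]; then
    cp "$MODEL_PATH/modelling_RW.py" "$MODEL_PATH/modelling_RW.py.bak"
    cp "$PATCH" "$MODEL_PATH/modelling_RW.py"
    echo "modelling_RW.py replaced (backup kept as .bak)"
else
    echo "checkpoint or patch not found; nothing replaced"
fi
```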

@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Flan-t5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 
 ### 4. Running examples
 ```bash
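After either install path above, it is worth confirming which wheels actually landed in the environment before moving on to the runtime configuration. This read-only check (our own sketch; the package names match the install commands above) falls back to a message when run outside the activated conda env:

```shell
# List the relevant wheels (if any) that the install commands above produced;
# print a hint instead when the env is not set up yet.
pip list 2>/dev/null | grep -Ei 'ipex-llm|intel-extension-for-pytorch|^torch ' \
    || echo "ipex-llm packages not found; activate the llm env and install first"
```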

@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Gemma model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -36,17 +36,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
 pip install transformers==4.38.1
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -57,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -68,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
export ENABLE_SDP_FUSION=1
|
export ENABLE_SDP_FUSION=1
|
||||||
```
|
```
|
||||||
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
<summary>For Intel iGPU</summary>
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
|
export BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
```
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
#### 3.2 Configurations for Windows
|
#### 3.2 Configurations for Windows
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
|
|
@ -87,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
<summary>For Intel Arc™ A300-Series or Pro A60</summary>
|
<summary>For Intel Arc™ A-Series Graphics</summary>
|
||||||
|
|
||||||
```cmd
|
```cmd
|
||||||
set SYCL_CACHE_PERSISTENT=1
|
set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
@ -95,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|
|
||||||
|
|
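Across all the files touched by this commit, the runtime-configuration hunks reduce to a small per-device lookup. The sketch below is purely illustrative and is not part of the repository: it collects the Linux `export` sets shown in the diffs, with the `<summary>` labels for the first two `<details>` blocks assumed (the chunk shows their variable sets but not their headings); only the iGPU label appears verbatim in the added lines.

```python
# Illustrative helper (not from the repo): the recommended Linux environment
# variables per device family, as they appear in the updated READMEs.
RECOMMENDED_ENV = {
    # Assumed label; the diff shows this variable set under an unnamed <details> block.
    "arc_or_flex": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",  # line newly added by this commit
    },
    # Assumed label; the tcmalloc/SDP-fusion set from the second <details> block.
    "max": {
        "LD_PRELOAD": "${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",  # line newly added by this commit
        "ENABLE_SDP_FUSION": "1",
    },
    # "For Intel iGPU" block, newly added by this commit.
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def export_lines(device):
    """Render `export VAR=value` shell lines for one device family."""
    return ["export {}={}".format(k, v) for k, v in RECOMMENDED_ENV[device].items()]
```

For example, `export_lines("igpu")` reproduces the two-line block the commit inserts into each README's Linux runtime-configuration section.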
@@ -21,22 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.

 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -47,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -58,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -77,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -85,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
@@ -21,20 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -45,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -56,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -75,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -83,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
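The recurring install-section change above is the point of the "windows pip installer" part of this commit: Windows environments now pip-install the oneAPI 2024.0 runtime alongside `ipex-llm[xpu]`, while Linux keeps sourcing `setvars.sh` only for APT or offline oneAPI installs. The snippet below is an illustrative sketch, not repository code, that assembles those per-OS commands as plain strings.

```python
# Illustrative sketch (not from the repo): per-OS install commands that the
# updated READMEs converge on after this change.
ONEAPI_PIP_PKGS = [
    "dpcpp-cpp-rt==2024.0.2",
    "mkl-dpcpp==2024.0.0",
    "onednn==2024.0.0",
]
IPEX_LLM_INSTALL = (
    "pip install --pre --upgrade ipex-llm[xpu] "
    "--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/"
)

def install_commands(os_name):
    """Return the ordered shell commands for a fresh conda env on each OS.

    On Windows the oneAPI runtime is pip-installed first (no setvars.bat
    needed afterwards); on Linux only the ipex-llm install is issued here,
    with setvars.sh sourced separately for APT/offline oneAPI installs.
    """
    cmds = []
    if os_name == "windows":
        cmds.append("pip install " + " ".join(ONEAPI_PIP_PKGS))
    cmds.append(IPEX_LLM_INSTALL)
    return cmds
```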
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Mistral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -36,17 +36,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
 pip install transformers==4.34.0
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -57,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -68,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -87,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -95,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```bash
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Mixtral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -36,17 +36,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
 pip install transformers==4.36.0
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -57,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -68,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
|
```bash
|
||||||
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
|
||||||
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
export ENABLE_SDP_FUSION=1
|
export ENABLE_SDP_FUSION=1
|
||||||
```
|
```
|
||||||
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
|
<details>
|
||||||
|
|
||||||
|
<summary>For Intel iGPU</summary>
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export SYCL_CACHE_PERSISTENT=1
|
||||||
|
export BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
```
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
#### 3.2 Configurations for Windows
|
#### 3.2 Configurations for Windows
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
|
|
@ -87,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1
|
||||||
|
|
||||||
<details>
|
<details>
|
||||||
|
|
||||||
<summary>For Intel Arc™ A300-Series or Pro A60</summary>
|
<summary>For Intel Arc™ A-Series Graphics</summary>
|
||||||
|
|
||||||
```cmd
|
```cmd
|
||||||
set SYCL_CACHE_PERSISTENT=1
|
set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
@ -95,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|
|
||||||
|
|
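The same Linux runtime recipe recurs across these READMEs. As a sketch (the variable names and values come from the diff above; the helper script itself is hypothetical, not part of ipex-llm), the Intel Arc A-Series settings could be collected into one snippet to `source` before running an example:

```shell
# Hypothetical helper: recommended Linux runtime settings for Intel Arc A-Series
# GPUs, taken from the README above. SYCL_CACHE_PERSISTENT=1 is the newly added line.
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1

# Echo what was set so a user can confirm the environment before launching generate.py.
echo "USE_XETLA=${USE_XETLA}"
echo "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=${SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS}"
echo "SYCL_CACHE_PERSISTENT=${SYCL_CACHE_PERSISTENT}"
```

Sourcing (rather than executing) such a file keeps the exports in the current shell, which is what the examples need.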
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install einops # additional package required for mpt-7b-chat and mpt-30b-chat to conduct generation
 ```
 
@@ -22,21 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
+pip install einops # additional package required for mpt-7b-chat and mpt-30b-chat to conduct generation
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 
@@ -47,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -58,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -77,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
 
@@ -85,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```
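The READMEs split the Linux runtime settings into per-device `<details>` blocks (iGPU vs. Arc A-Series). A hypothetical selector script could mirror that split; `DEVICE_TYPE` is an assumption for illustration, not an ipex-llm variable:

```shell
# Hypothetical selector: choose the settings block by device type, mirroring the
# per-device <details> sections in the README. DEVICE_TYPE is an assumed knob.
DEVICE_TYPE="igpu"   # set to "igpu" or "arc" before sourcing
if [ "$DEVICE_TYPE" = "igpu" ]; then
  export SYCL_CACHE_PERSISTENT=1
  export BIGDL_LLM_XMX_DISABLED=1   # iGPU-only setting from the README
else
  export USE_XETLA=OFF
  export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
  export SYCL_CACHE_PERSISTENT=1
fi
echo "SYCL_CACHE_PERSISTENT=${SYCL_CACHE_PERSISTENT}"
echo "BIGDL_LLM_XMX_DISABLED=${BIGDL_LLM_XMX_DISABLED:-unset}"
```

Note that `SYCL_CACHE_PERSISTENT=1` appears in both branches, which is exactly the change this diff makes: it is now recommended for every device type.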
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install einops # additional package required for phi-1_5 to conduct generation
 ```
 
@@ -22,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install einops # additional package required for phi-1_5 to conduct generation
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 
@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
 
@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```
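For the Data Center GPU Max Series block, the `LD_PRELOAD` line assumes an activated conda environment. A guarded sketch (the fallback path and the file check are assumptions added here; `libtcmalloc.so` itself comes from `conda install -c conda-forge -y gperftools=2.10` as the README notes) avoids preloading a path that does not exist:

```shell
# Sketch of the Max-series settings above, with a guarded tcmalloc preload.
# CONDA_PREFIX is only set inside an activated conda env; the fallback is an assumption.
TCMALLOC="${CONDA_PREFIX:-/opt/conda/envs/llm}/lib/libtcmalloc.so"
if [ -f "$TCMALLOC" ]; then
  # Append to any existing LD_PRELOAD instead of clobbering it.
  export LD_PRELOAD="${LD_PRELOAD:+${LD_PRELOAD}:}${TCMALLOC}"
fi
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
echo "ENABLE_SDP_FUSION=${ENABLE_SDP_FUSION}"
```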
@@ -14,27 +14,34 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install einops # additional package required for phi-2 to conduct generation
 ```
 
 #### 1.2 Installation on Windows
 We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
+pip install einops # additional package required for phi-2 to conduct generation
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 
@@ -46,7 +53,9 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
 <details>
 
@@ -56,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -74,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
 
@@ -82,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```
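The Windows install path added throughout this change pins three oneAPI 2024.0 runtime packages. As a small sketch (the variable and the echo-instead-of-run choice are illustrative only; the package pins are exactly those from the diff), keeping the pins in one place makes them easy to audit or bump together:

```shell
# Sketch: the pinned oneAPI 2024.0 runtime packages this change installs via pip
# on Windows, collected into one variable. Versions are taken from the diff above.
ONEAPI_PIP_PKGS="dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0"

# Print the command rather than running it, so the sketch works without network access.
echo "pip install ${ONEAPI_PIP_PKGS}"
```

Installing the runtime via pip is what makes the separate `setvars.bat` step unnecessary on Windows, which is why the diff deletes the old "Configurations for Windows" oneAPI section.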
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
 
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```
@ -8,14 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
|
||||||
In the example [chat.py](./chat.py), we show a basic use case for a Qwen-VL model to start a multimodal chat using `chat()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
|
In the example [chat.py](./chat.py), we show a basic use case for a Qwen-VL model to start a multimodal chat using `chat()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
|
||||||
### 1. Install
|
### 1. Install
|
||||||
#### 1.1 Installation on Linux
|
#### 1.1 Installation on Linux
|
||||||
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
|
We suggest using conda to manage environment:
|
||||||
|
|
||||||
After installing conda, create a Python environment for IPEX-LLM:
|
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 # recommend to use Python 3.11
|
conda create -n llm python=3.11
|
||||||
conda activate llm
|
conda activate llm
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
|
|
||||||
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
|
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -24,22 +23,26 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
|
|
||||||
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
|
pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
### 3. Runtime Configurations

For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

#### 3.1 Configurations for Linux

@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -61,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A-Series Graphics, it may take several minutes to compile.

### 4. Running examples
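The Linux variables above are plain `export`s; when an example is started from a launcher script rather than an interactive shell, the same settings can be applied programmatically, before the GPU runtime initializes. A minimal sketch, with the preset values copied from the Arc block above (the helper name `configure_arc_linux` is our own, not part of ipex-llm):

```python
import os

# Recommended Linux runtime settings for Intel Arc A-Series,
# copied from the README block above.
ARC_LINUX_ENV = {
    "USE_XETLA": "OFF",
    "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
    "SYCL_CACHE_PERSISTENT": "1",
}


def configure_arc_linux(env=os.environ):
    # Fill in each setting only if the user has not already set it,
    # so explicit overrides in the shell still win.
    for key, value in ARC_LINUX_ENV.items():
        env.setdefault(key, value)


configure_arc_linux()
```

Note that these variables must be in place before the SYCL runtime is first touched, so a helper like this belongs at the very top of a launcher, ahead of any `torch`/`intel_extension_for_pytorch` import.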
@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install tiktoken einops transformers_stream_generator # additional package required for Qwen-7B-Chat to conduct generation
```

@ -22,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install tiktoken einops transformers_stream_generator # additional package required for Qwen-7B-Chat to conduct generation
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A-Series Graphics, it may take several minutes to compile.

### 4. Running examples
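Section 2 above distinguishes pip-installed oneAPI (no `setvars.sh` needed) from APT or offline installs (sourcing required). A rough, hypothetical way for a launcher to detect the pip case is to look for the `dpcpp-cpp-rt` wheel from the Windows install command; this is only a heuristic sketch, not an official ipex-llm or oneAPI API:

```python
from importlib import metadata


def oneapi_from_pip(package="dpcpp-cpp-rt"):
    """Return True if the named oneAPI runtime wheel is installed in
    the current Python environment; per the README, sourcing
    setvars.sh is then unnecessary."""
    try:
        metadata.version(package)
        return True
    except metadata.PackageNotFoundError:
        return False


# On an APT/offline oneAPI setup this stays True, reminding the user
# to `source /opt/intel/oneapi/setvars.sh` first.
needs_setvars = not oneapi_from_pip()
```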
@ -14,6 +14,7 @@ conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install transformers==4.37.0 # install transformers which supports Qwen2
```

@ -22,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
pip install transformers==4.37.0 # install transformers which supports Qwen2
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A-Series Graphics, it may take several minutes to compile.

### 4. Running examples
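The Qwen2 example above pins `transformers==4.37.0`. A small, hypothetical guard that fails fast with a clear message when the installed version does not match the pin can save a confusing stack trace later (the helper names are our own):

```python
def parse_version(v):
    # "4.37.0" -> (4, 37, 0); good enough for exact-pin checks,
    # not a full PEP 440 parser.
    return tuple(int(part) for part in v.split(".")[:3])


def check_pin(installed, required="4.37.0"):
    # Raise early, with the pin from the README, instead of letting
    # an incompatible transformers fail deep inside model loading.
    if parse_version(installed) != parse_version(required):
        raise RuntimeError(
            f"transformers=={required} is required for Qwen2, "
            f"found {installed}"
        )
```

In a real script one would pass `transformers.__version__` as `installed`.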
@ -8,9 +8,7 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a RedPajama model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm

@ -23,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -48,9 +49,9 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

<details>

@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A-Series Graphics, it may take several minutes to compile.

### 4. Running examples
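The `LD_PRELOAD` line in the Linux runtime configuration above appends `libtcmalloc.so` from the active conda environment to any entries already present. A small sketch of the same path construction (the helper name is hypothetical; only the variable names from the README are assumed):

```python
import os


def tcmalloc_ld_preload(conda_prefix=None, current=""):
    """Build the LD_PRELOAD value used in the README: the existing
    entries, plus libtcmalloc.so from the active conda env."""
    prefix = conda_prefix or os.environ.get("CONDA_PREFIX", "")
    lib = os.path.join(prefix, "lib", "libtcmalloc.so")
    # Mirror the shell's ${LD_PRELOAD}:${CONDA_PREFIX}/lib/... form,
    # without a leading ":" when LD_PRELOAD starts out empty.
    return f"{current}:{lib}" if current else lib
```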
@ -8,14 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Replit model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install "transformers<4.35"
```

@ -24,21 +23,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -49,9 +51,9 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

<details>

@ -61,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@ -80,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@ -88,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A-Series Graphics, it may take several minutes to compile.

### 4. Running examples
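The runtime tables above use `export KEY=VALUE` on Linux and `set KEY=VALUE` on Windows for the same settings. A tiny sketch that renders one preset for either shell, using the iGPU values from the README (the `emit_env` helper is our own):

```python
# iGPU settings from the README's runtime-configuration section.
IGPU = {"SYCL_CACHE_PERSISTENT": "1", "BIGDL_LLM_XMX_DISABLED": "1"}


def emit_env(preset, shell="bash"):
    """Render environment settings as shell commands: `export` lines
    for bash (Linux) or `set` lines for cmd (Windows)."""
    keyword = {"bash": "export", "cmd": "set"}[shell]
    # Sort for stable, diff-friendly output.
    return [f"{keyword} {k}={v}" for k, v in sorted(preset.items())]
```

For example, `emit_env(IGPU, "cmd")` yields the two `set` lines shown in the Windows iGPU block.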
@ -17,25 +17,29 @@ conda activate llm
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

#### 1.2 Installation on Windows
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.

@ -47,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -58,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows

<details>

@ -77,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@ -85,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A-Series Graphics, it may take several minutes to compile.

### 4. Running examples
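The note above says each model's first run on iGPU/Arc may spend several minutes compiling kernels, which is why benchmark scripts commonly do untimed warm-up calls before measuring. A generic sketch of that pattern (the `timed_runs` helper and the `run` callable are our own; in practice `run` would wrap a `model.generate(...)` call):

```python
import time


def timed_runs(run, n=3, warmup=1):
    """Call `run` `warmup` times untimed, absorbing one-off kernel
    compilation cost, then time `n` measured calls."""
    for _ in range(warmup):
        run()
    timings = []
    for _ in range(n):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return timings
```

Usage would look like `timed_runs(lambda: model.generate(input_ids, max_new_tokens=32))`, reporting only the post-warm-up timings.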
@@ -17,25 +17,30 @@ conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
 #### 1.2 Installation on Windows
 We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -46,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>

@@ -57,10 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -75,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -83,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+
 ### 4. Running examples
 ```
 python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --n-predict N_PREDICT

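Across these hunks the Linux-side instructions converge on one sequence: source `setvars.sh` only for an APT or offline oneAPI install, then export the recommended runtime variables. A minimal sketch of that sequence (the `-f` guard around `setvars.sh` is an addition of mine so the snippet degrades gracefully on machines with a pip-installed oneAPI, where no `setvars.sh` exists):

```shell
# Configure oneAPI only when an APT/offline install is present;
# pip-installed oneAPI runtimes need no setvars.sh step.
if [ -f /opt/intel/oneapi/setvars.sh ]; then
    . /opt/intel/oneapi/setvars.sh
fi

# Runtime settings the READMEs recommend for Intel Arc A-Series on Linux
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1

echo "USE_XETLA=$USE_XETLA SYCL_CACHE_PERSISTENT=$SYCL_CACHE_PERSISTENT"
```

These exports only affect the current shell session, which is why the READMEs repeat them before each run rather than persisting them.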
@@ -14,6 +14,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+
 pip install transformers==4.35.2 # required by SOLAR
 ```

@@ -22,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+
 pip install transformers==4.35.2 # required by SOLAR
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>

@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+
 ### 4. Running examples
 
 ```bash

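Several of these READMEs pin exact package versions on top of the common install (here `transformers==4.35.2` for SOLAR, `transformers==4.38.0` for StableLM below). A hedged way to confirm such a pin actually landed in the active environment (the `check_pin` helper is hypothetical, not part of the repository; it shells out to Python's standard `importlib.metadata`):

```shell
# Hypothetical post-install check: prints the installed version of a
# package, or "missing" when it is not importable in this environment.
check_pin() {
    python -c "import importlib.metadata as m; print(m.version('$1'))" \
        2>/dev/null || echo "missing"
}

check_pin transformers
```

Running it inside the `llm` conda env after the pinned `pip install` should print the pinned version rather than `missing`.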
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a StableLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -27,6 +24,9 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -34,17 +34,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
 pip install transformers==4.38.0
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -55,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>

@@ -66,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -85,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -93,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+
 ### 4. Running examples
 
 ```bash

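Every one of these example READMEs ends by invoking `generate.py` with the same three flags. A sketch of that shared invocation as a small wrapper (the `echo` makes it a dry run so the command can be inspected before execution; the default values are illustrative placeholders, not the repository's actual defaults):

```shell
# Dry-run wrapper around the generate.py CLI the examples share.
# MODEL_PATH is a placeholder: a Hugging Face repo id or a local path.
MODEL_PATH="${MODEL_PATH:-your-model-id-or-path}"
PROMPT="${PROMPT:-What is AI?}"
N_PREDICT="${N_PREDICT:-32}"

echo python ./generate.py \
    --repo-id-or-model-path "$MODEL_PATH" \
    --prompt "$PROMPT" \
    --n-predict "$N_PREDICT"
```

Dropping the `echo` runs the example for real; the wrapper simply makes the three knobs overridable from the environment.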
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>

@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+
 ### 4. Running examples
 
 ```

@@ -23,22 +23,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -49,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>

@@ -60,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -79,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -87,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+
 ### 4. Running examples
 
 ```

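After installing `ipex-llm[xpu]` by either route, a quick smoke check confirms whether an XPU device is actually visible before running an example. A hypothetical check (the `pick_device` helper is mine, not part of the repository; it falls back to `cpu` whenever `torch` or the XPU backend is not importable):

```shell
# Hypothetical XPU smoke check: prints "xpu" when intel_extension_for_pytorch
# exposes an available XPU device through torch, otherwise "cpu".
pick_device() {
    python - <<'PY' 2>/dev/null || echo cpu
import torch
print("xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cpu")
PY
}

pick_device
```

If this prints `cpu` on a machine with an Intel GPU, revisit the driver and oneAPI steps above before debugging the example scripts themselves.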
@@ -16,6 +16,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+
 pip install librosa soundfile datasets
 pip install accelerate
 pip install SpeechRecognition sentencepiece colorama

@@ -28,25 +29,29 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+
 pip install librosa soundfile datasets
 pip install accelerate
 pip install SpeechRecognition sentencepiece colorama
 pip install PyAudio inquirer
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -57,6 +62,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>

@@ -68,11 +74,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -87,7 +105,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1

@@ -95,15 +113,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
-
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+
 ### 4. Running examples
 
 ```

@@ -15,6 +15,7 @@ conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+
 pip install datasets soundfile librosa # required by audio processing
 ```
@@ -23,23 +24,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install datasets soundfile librosa # required by audio processing
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -50,6 +54,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -61,11 +66,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -80,7 +97,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 </details>

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +105,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples
 ```
 python ./recognize.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
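Since SYCL reads these variables when the runtime initializes, a Python launcher can set them in-process instead of a shell profile, as long as it does so before importing torch or intel_extension_for_pytorch. A sketch under that assumption (the variable values are the Linux Arc-series ones from the hunk above):

```python
import os

# Apply the Linux Arc-series runtime settings before any SYCL-backed import.
# setdefault keeps values the user already exported in the shell.
for key, value in {
    "USE_XETLA": "OFF",
    "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
    "SYCL_CACHE_PERSISTENT": "1",
}.items():
    os.environ.setdefault(key, value)

# ...only now import torch / intel_extension_for_pytorch in a real launcher.
```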
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Yi model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for Yi-6B to conduct generation
 ```
@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for Yi-6B to conduct generation
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -51,9 +53,9 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
-

 </details>

 <details>
@@ -63,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -82,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 </details>

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -90,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
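Because the Linux and Windows sections now point at the same `ipex-llm[xpu]` wheel from the Intel extra index, a quick pre-flight check is simply whether the expected distributions are installed. A stdlib-only sketch; the distribution names are taken from the pip commands above, and the helper itself is illustrative:

```python
from importlib import metadata


def installed_version(dist):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


# Usage sketch: check the packages the README's install step should have
# provided, e.g. installed_version("ipex-llm") and
# installed_version("intel-extension-for-pytorch").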
@@ -15,31 +15,37 @@ We suggest using conda to manage environment:
 conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
-pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
+pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for Yuan2 to conduct generation
 pip install pandas # additional package required for Yuan2 to conduct generation
 ```

 #### 1.2 Installation on Windows
 We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for Yuan2 to conduct generation
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -50,6 +56,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -61,10 +68,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -79,7 +99,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 </details>

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +107,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
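The retitled section 2 encodes a decision: `setvars.sh` is needed only on Linux, and only for APT or offline oneAPI installs (pip-installed oneAPI ships its runtime with the wheel). A launcher can express the same rule; a sketch, assuming the README's default oneAPI install path:

```python
import platform


def oneapi_setup_command(pip_installed_oneapi=False):
    """Return the shell command to source before running examples, or None.

    On Windows, and for pip-installed oneAPI on Linux, no setvars step is
    needed per the revised docs. The path below is the default oneAPI
    location and may differ on your system.
    """
    if platform.system() == "Windows" or pip_installed_oneapi:
        return None
    return "source /opt/intel/oneapi/setvars.sh"
```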
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Aquila2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 </details>

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
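Across these READMEs, the install recipe is the common `ipex-llm[xpu]` wheel plus a small model-specific package list (e.g. `einops` and `pandas` for Yuan2, `transformers_stream_generator` for Baichuan). A sketch of that matrix; the package lists are copied from the diffs above, while the helper itself is hypothetical:

```python
# Model-specific extras layered on top of the common xpu install.
EXTRA_PACKAGES = {
    "whisper": ["datasets", "soundfile", "librosa"],
    "yi": ["einops"],
    "yuan2": ["einops", "pandas"],
    "baichuan": ["transformers_stream_generator"],
}

XPU_INDEX = "https://pytorch-extension.intel.com/release-whl/stable/xpu/us/"


def pip_commands(model):
    """Return the pip commands these READMEs prescribe for a given example."""
    cmds = ["pip install --pre --upgrade ipex-llm[xpu] "
            f"--extra-index-url {XPU_INDEX}"]
    extras = EXTRA_PACKAGES.get(model)
    if extras:
        cmds.append("pip install " + " ".join(extras))
    return cmds
```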
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Baichuan model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
 ```
@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -62,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 </details>

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1
 ```
 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
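The new `[!NOTE]` blocks warn that the first run on iGPU or Arc compiles kernels and can take minutes; `SYCL_CACHE_PERSISTENT=1` exists precisely so later runs reuse that compiled cache. When benchmarking the examples, it helps to time warm-up and steady-state separately. A generic, model-agnostic sketch:

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Usage sketch: call timed() once for the warm-up generation (includes
# kernel compilation on first run) and again for a steady-state measurement.
```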
@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Baichuan2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install transformers_stream_generator # additional package required for Baichuan2-7B-Chat to conduct generation
```

@ -25,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install transformers_stream_generator # additional package required for Baichuan2-7B-Chat to conduct generation
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -62,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@ -81,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@ -89,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
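The runtime configurations above boil down to a small set of environment variables per device family. As an illustration only (the helper below is hypothetical and not part of ipex-llm), the Linux recommendations from section 3.1 can be captured in one table:

```python
# Hypothetical helper, NOT part of ipex-llm: the recommended Linux runtime
# environment variables from section 3.1, keyed by device family.
RECOMMENDED_ENV = {
    # Intel Arc A-Series graphics
    "arc": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
    # Intel iGPU
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def env_exports(device: str) -> str:
    """Render the variables for a device as shell `export` lines."""
    return "\n".join(f"export {k}={v}" for k, v in RECOMMENDED_ENV[device].items())
```

For example, `env_exports("igpu")` renders the two `export` lines shown in the iGPU block above.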
@ -8,15 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
In the example [synthesize_speech.py](./synthesize_speech.py), we show a basic use case for the Bark model to synthesize speech based on the given text, with IPEX-LLM INT4 optimizations.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install scipy
```

@ -25,23 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install scipy
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -52,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -63,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@ -82,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@ -90,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
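Putting the Linux steps above together, a minimal sketch of one shell session for an Intel Arc A-Series machine, assuming the conda environment from section 1.1 is already active (the example invocation is commented out because its arguments depend on your setup and are illustrative only):

```shell
# For APT/offline-installed oneAPI only (skip for pip-installed oneAPI):
#   source /opt/intel/oneapi/setvars.sh

# Recommended runtime variables for Intel Arc A-Series (section 3.1):
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1

# Then run the example in the same session, e.g. (arguments illustrative):
#   python ./synthesize_speech.py
```

Note that the variables only affect the shell session they are exported in, so run the example from the same terminal.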
@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a BlueLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples
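On Windows the `set` commands likewise only affect the current CMD session, so a forgotten variable is an easy mistake. As a hedged illustration (the function below is hypothetical, not part of ipex-llm), a pre-flight check for the iGPU settings in section 3.2 could look like:

```python
import os

# Windows iGPU settings from section 3.2:
#   set SYCL_CACHE_PERSISTENT=1
#   set BIGDL_LLM_XMX_DISABLED=1
# Hypothetical pre-flight check, not part of ipex-llm.
REQUIRED = {"SYCL_CACHE_PERSISTENT": "1", "BIGDL_LLM_XMX_DISABLED": "1"}

def missing_settings(env=None):
    """Return the required variables that are absent or mis-set."""
    env = os.environ if env is None else env
    return {k: v for k, v in REQUIRED.items() if env.get(k) != v}
```

If `missing_settings()` returns a non-empty dict, set the listed variables in the same CMD session before running the example.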
@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a ChatGLM2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

@ -24,21 +21,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

@ -132,13 +138,10 @@ Inference time: xxxx s
In the example [streamchat.py](./streamchat.py), we show a basic use case for a ChatGLM2 model to stream chat, with IPEX-LLM INT4 optimizations.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

@ -148,21 +151,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@ -173,6 +179,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@ -184,10 +191,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@ -202,7 +222,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@ -210,15 +230,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

**Stream Chat using `stream_chat()` API**:
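A stream-chat API of this shape yields the accumulated response at each step rather than individual tokens. A minimal sketch of how such a stream is typically consumed, printing only the newly generated part each time (`fake_stream_chat` below is a stand-in generator for illustration; it does not call the real ChatGLM2 API):

```python
def fake_stream_chat(prompt):
    """Stand-in for a streaming chat API: yields (response_so_far, history)."""
    text = ""
    for token in ["Stream", "ing ", "works"]:
        text += token
        yield text, []

def print_stream(stream):
    """Print only the delta at each step; return the final full response."""
    shown = ""
    for response, _history in stream:
        delta = response[len(shown):]
        print(delta, end="", flush=True)
        shown = response
    print()
    return shown
```

Swapping `fake_stream_chat(...)` for the real model's streaming call would give the familiar token-by-token console output.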
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a ChatGLM3 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
@@ -131,13 +137,10 @@ AI stands for Artificial Intelligence. It refers to the development of computer
 In the example [streamchat.py](./streamchat.py), we show a basic use case for a ChatGLM3 model to stream chat, with IPEX-LLM INT4 optimizations.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

@@ -147,21 +150,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -172,6 +178,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -183,10 +190,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -201,7 +221,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -209,15 +229,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 **Stream Chat using `stream_chat()` API**:
 ```
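The Windows install step in this diff now pulls the oneAPI runtime from pip (`dpcpp-cpp-rt`, `mkl-dpcpp`, `onednn`) instead of a separate Base Toolkit install. A hypothetical diagnostic — the helper name and script are ours, not part of the examples — can confirm those wheels are actually present in the active environment before a missing-DLL failure surfaces at model load time:

```python
# Hypothetical diagnostic: report which of the pip-installed oneAPI runtime
# packages from the install step are missing from the current environment.
from importlib.metadata import PackageNotFoundError, version

ONEAPI_PACKAGES = ("dpcpp-cpp-rt", "mkl-dpcpp", "onednn")

def missing_oneapi_packages(packages=ONEAPI_PACKAGES):
    """Return the names in `packages` that pip metadata cannot resolve."""
    missing = []
    for name in packages:
        try:
            version(name)
        except PackageNotFoundError:
            missing.append(name)
    return missing

if __name__ == "__main__":
    gone = missing_oneapi_packages()
    if gone:
        print("Missing oneAPI runtime packages:", ", ".join(gone))
    else:
        print("All oneAPI runtime packages found.")
```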
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a CodeLlama model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher version of transformers
 ```

@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install transformers==4.34.1 # CodeLlamaTokenizer is supported in higher version of transformers
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -62,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -81,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -89,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
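The CodeLlama README above pins `pip install transformers==4.34.1` because `CodeLlamaTokenizer` only exists from a certain transformers release onward (DeciLM-7B later pins 4.35.2 for the same reason). A small, hypothetical helper — the function names are ours — lets a script assert such a version floor numerically instead of comparing strings:

```python
# Hypothetical helper: compare dotted version strings numerically, so a script
# can fail fast when e.g. the installed transformers is older than the release
# that added the tokenizer class it needs.
def version_tuple(v):
    """'4.34.1' -> (4, 34, 1); stops at the first non-numeric segment (e.g. 'dev0')."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """True when `installed` is at least `minimum`, compared numerically."""
    return version_tuple(installed) >= version_tuple(minimum)
```

Tuple comparison handles multi-digit components correctly (`"4.9"` sorts below `"4.34"`), which naive string comparison would get wrong.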
@@ -8,17 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a DeciLM-7B model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:

 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
-# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
-# you can install specific ipex/torch version for your need
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install transformers==4.35.2 # required by DeciLM-7B
 ```

@@ -27,20 +23,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -51,6 +51,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -62,11 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -81,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -89,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

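Several Linux configurations in this diff prepend the conda environment's `libtcmalloc.so` to `LD_PRELOAD` (installed via the `gperftools` note above). For a launcher written in Python rather than shell, the same value can be built programmatically; this is a hedged sketch — the helper name is ours, and the path layout is assumed to match a standard conda `gperftools` install:

```python
# Hypothetical helper: mirror
#   export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
# in Python, for launchers that build the child-process environment themselves.
import os

def tcmalloc_preload(conda_prefix=None, current=""):
    """Return an LD_PRELOAD value with the conda env's libtcmalloc.so appended."""
    prefix = conda_prefix if conda_prefix is not None else os.environ.get("CONDA_PREFIX", "")
    lib = os.path.join(prefix, "lib", "libtcmalloc.so")
    return f"{current}:{lib}" if current else lib
```

Unlike the shell one-liner, this avoids emitting a leading `:` when `LD_PRELOAD` was previously unset.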
@@ -8,15 +8,11 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Deepseek model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
-# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
-# you can install specific ipex/torch version for your need
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

@@ -25,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -50,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>
@@ -61,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -80,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
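Section 2 of each README now sources `/opt/intel/oneapi/setvars.sh` only for APT or offline oneAPI installs; pip-installed oneAPI needs no such step. A hypothetical way for a launcher to detect which case applies — the helper is ours, and the path is the default APT/offline install root, not something these READMEs define:

```python
# Hypothetical check: APT/offline oneAPI installs ship setvars.sh under
# /opt/intel/oneapi, while pip-installed oneAPI wheels do not create that tree.
from pathlib import Path

def needs_setvars(oneapi_root="/opt/intel/oneapi"):
    """True when an APT/offline oneAPI install (with setvars.sh) is present."""
    return Path(oneapi_root, "setvars.sh").is_file()
```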
@@ -9,14 +9,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [recognize.py](./recognize.py), we show a basic use case for a Distil-Whisper model to conduct transcription using `pipeline()` API for long audio input, with IPEX-LLM INT4 optimizations.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install datasets soundfile librosa # required by audio processing
 ```
@@ -25,22 +24,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 
 pip install datasets soundfile librosa # required by audio processing
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -51,6 +54,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -62,11 +66,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -81,7 +97,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -89,15 +105,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```
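Across all of these READMEs, the Linux and Windows install steps now differ only in the extra pip-installed oneAPI 2024.0 wheels. A small sketch that prints the pip commands for each platform (the helper name is made up for illustration; the commands themselves are the README's own):

```shell
# Illustrative helper: print the README's pip commands for a platform.
# "windows" additionally needs the pip-installed oneAPI 2024.0 runtime wheels.
print_install_cmds() {
  if [ "$1" = "windows" ]; then
    echo 'pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0'
  fi
  echo 'pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/'
}

print_install_cmds windows
```

On Linux the pip wheels are not needed because the APT- or offline-installed oneAPI is activated via `setvars.sh` instead.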
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Dolly v1 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```bash
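The `LD_PRELOAD` lines in these READMEs unconditionally append `:${CONDA_PREFIX}/lib/libtcmalloc.so`, which leaves a leading colon when `LD_PRELOAD` starts out empty. A minimal sketch of a colon-safe variant, assuming `CONDA_PREFIX` points at the active conda env (the fallback path below is a placeholder for illustration only):

```shell
# Sketch: extend LD_PRELOAD with libtcmalloc.so from the active conda env.
# The fallback path is a placeholder assumption, not a real install location.
: "${CONDA_PREFIX:=/opt/conda/envs/llm}"
export LD_PRELOAD="${LD_PRELOAD:+${LD_PRELOAD}:}${CONDA_PREFIX}/lib/libtcmalloc.so"
echo "$LD_PRELOAD"
```

The `${LD_PRELOAD:+…}` expansion only emits the separator colon when the variable already has a value, so the result is valid whether or not something was preloaded before.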
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Dolly v2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```bash
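Since the runtime variables only take effect in the shell that later runs the example, a small preflight check before the "Running examples" step can catch a forgotten `export`. A sketch (the variable list mirrors the Arc recommendations above; the script itself is illustrative):

```shell
# Sketch: report which of the recommended runtime variables are still unset
# in the current shell before launching an example.
missing=""
for v in USE_XETLA SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS SYCL_CACHE_PERSISTENT; do
  eval "val=\${$v:-}"
  [ -n "$val" ] || missing="$missing $v"
done
if [ -n "$missing" ]; then
  echo "unset:$missing"
else
  echo "runtime variables OK"
fi
```

On iGPU the list would instead be `SYCL_CACHE_PERSISTENT` and `BIGDL_LLM_XMX_DISABLED`, per the corresponding section.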
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Flan-t5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
@@ -24,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```bash
@@ -21,21 +21,24 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
 
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```
 
-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
 
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -46,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```
 
 </details>
@@ -57,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>
 
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>
 
@@ -76,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1
 
 <details>
 
-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>
 
 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1
 
 </details>
 
-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples
 
 ```
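`SYCL_CACHE_PERSISTENT=1` is what turns the first-run kernel compilation mentioned in these notes into a one-time cost: compiled SYCL kernels are written to disk and reused on later runs. Assuming the DPC++ runtime also honors `SYCL_CACHE_DIR` for the on-disk cache location (an assumption worth verifying against your oneAPI version), the cache can additionally be pinned to an explicit directory:

```shell
# Sketch: persist compiled SYCL kernels across runs.
# Assumption: SYCL_CACHE_DIR is honored by your DPC++ runtime version;
# the directory chosen here is arbitrary.
export SYCL_CACHE_PERSISTENT=1
export SYCL_CACHE_DIR="${HOME}/.cache/sycl_kernels"
mkdir -p "$SYCL_CACHE_DIR"
echo "$SYCL_CACHE_DIR"
```

Pinning the directory can be useful on shared machines or CI runners where `$HOME` is wiped between jobs.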
@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
|
||||||
In the example [generate.py](./generate.py), we show a basic use case for a Llama2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
|
In the example [generate.py](./generate.py), we show a basic use case for a Llama2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
|
||||||
### 1. Install
|
### 1. Install
|
||||||
#### 1.1 Installation on Linux
|
#### 1.1 Installation on Linux
|
||||||
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
|
We suggest using conda to manage environment:
|
||||||
|
|
||||||
After installing conda, create a Python environment for IPEX-LLM:
|
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 # recommend to use Python 3.11
|
conda create -n llm python=3.11
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
@@ -24,30 +21,35 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 <details>

 <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

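The change above makes `source /opt/intel/oneapi/setvars.sh` a Linux-only step for APT or offline oneAPI installs. A sketch of how a launcher script could warn when that step was skipped — assuming the common convention that `setvars.sh` exports `SETVARS_COMPLETED=1` (an assumption about Intel's script, not something this diff states):

```python
import os

def oneapi_configured(env=None) -> bool:
    """Heuristic: Intel's setvars.sh conventionally exports SETVARS_COMPLETED=1."""
    env = os.environ if env is None else env
    return env.get("SETVARS_COMPLETED") == "1"

if not oneapi_configured():
    # Only relevant for APT/offline installs; pip-installed oneAPI needs no setvars.
    print("oneAPI environment not detected; run `source /opt/intel/oneapi/setvars.sh`.")
```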
@@ -59,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -78,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -86,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a LLaVA model to start a multi-turn chat centered around an image using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # install dependencies required by llava
 pip install transformers==4.36.2
@@ -31,8 +29,12 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # install dependencies required by llava
 pip install transformers==4.36.2
@@ -40,29 +42,30 @@ git clone https://github.com/haotian-liu/LLaVA.git # clone the llava libary
 copy generate.py .\LLaVA\ # copy our example to the LLaVA folder
 cd LLaVA # change the working directory to the LLaVA folder
 git checkout tags/v1.2.0 -b 1.2.0 # Get the branch which is compatible with transformers 4.36

 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 <details>

 <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

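The runtime-configuration sections in these READMEs assign a different variable set to each device family (Arc/Flex, Data Center Max, iGPU). A compact illustrative mapping of that table — values copied from the blocks in this diff; the `LD_PRELOAD` line for Max Series is path-dependent and handled separately in the README:

```python
import os

# Device family -> recommended environment variables from section 3 of the READMEs.
RUNTIME_ENV = {
    "arc_or_flex_linux": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
    "max_linux": {
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
        "ENABLE_SDP_FUSION": "1",
    },
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
}

def apply_runtime_env(device: str) -> None:
    """Export the recommended variables for the given device family."""
    os.environ.update(RUNTIME_ENV[device])

apply_runtime_env("igpu")
print(os.environ["SYCL_CACHE_PERSISTENT"])  # → 1
```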
@@ -74,11 +77,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -93,7 +108,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -101,15 +116,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
@@ -7,33 +7,107 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 ## Example: Predict Tokens using `generate()` API
 In the example [generate.py](./generate.py), we show a basic use case for a Mamba model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+#### 1.1 Installation on Linux
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
-# below command will install intel_extension_for_pytorch==2.0.110+xpu as default
-# you can install specific ipex/torch version for your need
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # package required by Mamba
 ```

-### 2. Configures OneAPI environment variables
+#### 1.2 Installation on Windows
+We suggest using conda to manage environment:
+```bash
+conda create -n llm python=3.11 libuv
+conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
+pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
+
+pip install einops # package required by Mamba
+```
+
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-### 3. Run
+### 3. Runtime Configurations
+For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
+#### 3.1 Configurations for Linux
+<details>

-For optimal performance on Arc, it is recommended to set several environment variables.
+<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

+</details>
+
+<details>
+
+<summary>For Intel Data Center GPU Max Series</summary>
+
+```bash
+export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
+export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
+export ENABLE_SDP_FUSION=1
+```
+> Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
+</details>
+
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
+#### 3.2 Configurations for Windows
+<details>
+
+<summary>For Intel iGPU</summary>
+
+```cmd
+set SYCL_CACHE_PERSISTENT=1
+set BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
+<details>
+
+<summary>For Intel Arc™ A-Series Graphics</summary>
+
+```cmd
+set SYCL_CACHE_PERSISTENT=1
+```
+
+</details>
+
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+### 4. Running examples

 ```bash
 python ./generate.py
 ```
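The sample outputs in these READMEs report an `Inference time: xxxx s` line. A minimal stdlib sketch of how such a measurement can be wrapped around a `generate()` call — the callable below is a stand-in, since a real run would invoke the XPU model:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn, print elapsed seconds in the examples' format, return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Inference time: {elapsed:.4f} s")
    return result, elapsed

# Stand-in for model.generate(); replace with the real call in practice.
out, secs = timed(lambda prompt: prompt.upper(), "what is ai?")
```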
@@ -45,7 +119,7 @@ In the example, several arguments can be passed to satisfy your requirements:
 - `--prompt PROMPT`: argument defining the prompt to be infered (with integrated prompt format for chat). It is default to be `'What is AI?'`.
 - `--n-predict N_PREDICT`: argument defining the max number of tokens to predict. It is default to be `32`.

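The two arguments above can be sketched with `argparse`; defaults are taken from the list, but the actual `generate.py` may define its CLI slightly differently:

```python
import argparse

# Hypothetical reconstruction of generate.py's CLI from the argument list above.
parser = argparse.ArgumentParser(description="Predict tokens with IPEX-LLM generate()")
parser.add_argument("--prompt", type=str, default="What is AI?",
                    help="prompt to be inferred (integrated chat prompt format)")
parser.add_argument("--n-predict", type=int, default=32,
                    help="max number of tokens to predict")

args = parser.parse_args([])  # no CLI args -> the documented defaults
print(args.prompt, args.n_predict)  # → What is AI? 32
```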
-#### 2.3 Sample Output
+#### 4.3 Sample Output
 #### [state-spaces/mamba-1.4b](https://huggingface.co/state-spaces/mamba-1.4b)
 ```log
 Inference time: xxxx s
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Mistral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,21 +26,27 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

+# Refer to https://huggingface.co/mistralai/Mistral-7B-v0.1#troubleshooting, please make sure you are using a stable version of Transformers, 4.34.0 or newer.
 pip install transformers==4.34.0
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```
-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
@@ -54,6 +57,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -65,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
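The Max Series block above appends `libtcmalloc.so` under `$CONDA_PREFIX` to `LD_PRELOAD`. A small sketch of composing that value the same way the `export` does (note that, like the shell expansion, an unset `LD_PRELOAD` yields a harmless leading colon; the paths are illustrative):

```python
import os

def with_tcmalloc(env: dict) -> str:
    """Mirror: export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so"""
    lib = os.path.join(env.get("CONDA_PREFIX", ""), "lib", "libtcmalloc.so")
    current = env.get("LD_PRELOAD", "")
    return f"{current}:{lib}"

# Illustrative env prefix; a real env would come from `conda activate llm`.
print(with_tcmalloc({"CONDA_PREFIX": "/opt/conda/envs/llm"}))
```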
 #### 3.2 Configurations for Windows
 <details>

@@ -84,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -92,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

 ### 4. Running examples

 ```bash
@@ -10,13 +10,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [generate.py](./generate.py), we show a basic use case for a Mixtral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -29,6 +26,9 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
@@ -36,26 +36,28 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
 pip install transformers==4.36.0
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+> [!NOTE]
+> Skip this step if you are running on Windows.

+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux
 <details>

 <summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

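These examples pin specific `transformers` releases (4.36.0 here for Mixtral; the Mistral README requires 4.34.0 or newer). A small pre-flight sketch of a minimum-version check using plain tuple comparison — a real script might read the installed version via `importlib.metadata.version("transformers")`, and this naive parser does not handle pre-release tags:

```python
def version_tuple(v: str) -> tuple:
    """'4.36.0' -> (4, 36, 0); numeric components only."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(installed: str, minimum: str) -> bool:
    """True when installed >= minimum under component-wise comparison."""
    return version_tuple(installed) >= version_tuple(minimum)

print(meets_minimum("4.36.0", "4.34.0"))  # → True: satisfies the Mistral floor
print(meets_minimum("4.33.1", "4.34.0"))  # → False: older than required
```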
@@ -67,11 +69,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -86,7 +100,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@ -94,15 +108,8 @@ set SYCL_CACHE_PERSISTENT=1
|
||||||
|
|
||||||
</details>
|
</details>
|
||||||
|
|
||||||
<details>
|
> [!NOTE]
|
||||||
|
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
||||||
<summary>For other Intel dGPU Series</summary>
|
|
||||||
|
|
||||||
There is no need to set further environment variables.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
|
|
||||||
### 4. Running examples
|
### 4. Running examples
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|
|
||||||
|
|
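The Linux runtime settings shown in these diffs must be exported before `torch`/`ipex_llm` are imported. They can equally be applied from Python at the top of a script; a minimal sketch, with the variable values copied from the README blocks above (the device labels and the helper name are our own, not part of ipex-llm):

```python
import os

# Recommended variables per device class, copied from the README blocks above.
# The keys ("igpu", "arc_a_series") are our own labels for illustration.
RUNTIME_ENV = {
    "igpu": {
        "SYCL_CACHE_PERSISTENT": "1",
        "BIGDL_LLM_XMX_DISABLED": "1",
    },
    "arc_a_series": {
        "USE_XETLA": "OFF",
        "SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS": "1",
        "SYCL_CACHE_PERSISTENT": "1",
    },
}

def apply_runtime_env(device: str) -> dict:
    """Export the suggested variables for `device` and return them."""
    env = RUNTIME_ENV[device]
    for key, value in env.items():
        # must run before importing torch / ipex_llm in the same process
        os.environ[key] = value
    return env
```

This only mirrors the shell `export` lines; it has no effect on processes started before the assignments run.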
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
 In the example [generate.py](./generate.py), we show a basic use case for a phi-1_5 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for phi-1_5 to conduct generation
 ```

@@ -24,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for phi-1_5 to conduct generation
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -61,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -79,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
@@ -8,36 +8,39 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
 In the example [generate.py](./generate.py), we show a basic use case for a phi-2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for phi-2 to conduct generation
 ```

 #### 1.2 Installation on Windows
 We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for phi-2 to conduct generation
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -48,7 +51,9 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

 <details>

@@ -58,10 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -76,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -84,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
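The reworded section 2 above distinguishes APT/offline-installed oneAPI, which needs `source setvars.sh`, from the pip-installed runtime added in step 1.2, which needs no activation step. That decision can be sketched as a tiny helper (the install-kind labels and the function itself are our own illustration, not an ipex-llm API):

```python
from typing import Optional

def oneapi_activation(install_kind: str) -> Optional[str]:
    """Return the shell command to run before launching an example, or None."""
    if install_kind in ("apt", "offline"):
        # APT/offline installs place oneAPI under /opt/intel/oneapi
        return "source /opt/intel/oneapi/setvars.sh"
    if install_kind == "pip":
        # pip-installed dpcpp-cpp-rt / mkl-dpcpp / onednn ship their
        # libraries inside the Python environment, so no setvars step
        return None
    raise ValueError(f"unknown oneAPI install kind: {install_kind!r}")
```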
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
 In the example [generate.py](./generate.py), we show a basic use case for a phixtral model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
+# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for phixtral to conduct generation
 ```

@@ -24,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install einops # additional package required for phixtral to conduct generation
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -61,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -79,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -87,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
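Across these READMEs the same variables appear as `export NAME=value` in the Linux `bash` blocks and `set NAME=value` in the Windows `cmd` blocks. A small formatter makes that correspondence explicit (the helper name is our own; variable names are from the README blocks above):

```python
def format_env_lines(variables: dict, platform: str) -> list:
    """Render NAME=value pairs as `export` (Linux) or `set` (Windows CMD) lines."""
    keyword = {"linux": "export", "windows": "set"}[platform]
    return [f"{keyword} {name}={value}" for name, value in variables.items()]
```

For example, `format_env_lines({"SYCL_CACHE_PERSISTENT": "1"}, "windows")` yields the `set SYCL_CACHE_PERSISTENT=1` line used in the Windows sections.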
@@ -8,14 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
 In the example [chat.py](./chat.py), we show a basic use case for a Qwen-VL model to start a multimodal chat using `chat()` API, with IPEX-LLM 'optimize_model' API on Intel GPUs.
 ### 1. Install
 #### 1.1 Installation on Linux
-We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
-
-After installing conda, create a Python environment for IPEX-LLM:
+We suggest using conda to manage environment:
 ```bash
-conda create -n llm python=3.11 # recommend to use Python 3.11
+conda create -n llm python=3.11
 conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
 ```

@@ -24,22 +23,26 @@ We suggest using conda to manage environment:
 ```bash
 conda create -n llm python=3.11 libuv
 conda activate llm
+# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
+pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
+
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

 pip install accelerate tiktoken einops transformers_stream_generator==0.0.4 scipy torchvision pillow tensorboard matplotlib # additional package required for Qwen-VL-Chat to conduct generation
 ```

-### 2. Configures OneAPI environment variables
-#### 2.1 Configurations for Linux
+### 2. Configures OneAPI environment variables for Linux
+
+> [!NOTE]
+> Skip this step if you are running on Windows.
+
+This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
+
 ```bash
 source /opt/intel/oneapi/setvars.sh
 ```

-#### 2.2 Configurations for Windows
-```cmd
-call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-```
-> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
 ### 3. Runtime Configurations
 For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
 #### 3.1 Configurations for Linux

@@ -50,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
 ```bash
 export USE_XETLA=OFF
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 ```

 </details>

@@ -61,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 ```bash
 export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
 export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+export SYCL_CACHE_PERSISTENT=1
 export ENABLE_SDP_FUSION=1
 ```
 > Note: Please note that `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.
 </details>

+<details>
+
+<summary>For Intel iGPU</summary>
+
+```bash
+export SYCL_CACHE_PERSISTENT=1
+export BIGDL_LLM_XMX_DISABLED=1
+```
+
+</details>
+
 #### 3.2 Configurations for Windows
 <details>

@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

 <details>

-<summary>For Intel Arc™ A300-Series or Pro A60</summary>
+<summary>For Intel Arc™ A-Series Graphics</summary>

 ```cmd
 set SYCL_CACHE_PERSISTENT=1
@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

 </details>

-<details>
-
-<summary>For other Intel dGPU Series</summary>
-
-There is no need to set further environment variables.
-
-</details>
-
-> Note: For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
+> [!NOTE]
+> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
 ### 4. Running examples

 ```
@ -14,6 +14,7 @@ conda create -n llm python=3.11
|
||||||
conda activate llm
|
conda activate llm
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
|
|
||||||
pip install transformers==4.37.0 # install transformers which supports Qwen2
|
pip install transformers==4.37.0 # install transformers which supports Qwen2
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -22,22 +23,26 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
pip install transformers==4.37.2 # install transformers which supports Qwen2
|
|
||||||
|
pip install transformers==4.37.0 # install transformers which supports Qwen2
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
|
|
||||||
### 3. Runtime Configurations
|
### 3. Runtime Configurations
|
||||||
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
|
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@@ -48,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -59,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@@ -78,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@@ -86,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
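The three Linux presets above differ only in which variables they export. As an illustrative sketch (the helper name and the `arc`/`max`/`igpu` labels are ours, not part of this README), they can be grouped into one shell function:

```bash
# Illustrative helper, not part of the example repo: group the recommended
# Linux runtime settings by device class so one call configures the shell.
apply_ipex_llm_env() {
  case "$1" in
    arc)   # Intel Arc A-Series Graphics / Data Center GPU Flex Series
      export USE_XETLA=OFF
      export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
      export SYCL_CACHE_PERSISTENT=1
      ;;
    max)   # the tcmalloc/SDP-fusion block above (an assumed label for that device class)
      export LD_PRELOAD="${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so"
      export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
      export SYCL_CACHE_PERSISTENT=1
      export ENABLE_SDP_FUSION=1
      ;;
    igpu)  # Intel integrated GPU
      export SYCL_CACHE_PERSISTENT=1
      export BIGDL_LLM_XMX_DISABLED=1
      ;;
    *)
      echo "usage: apply_ipex_llm_env {arc|max|igpu}" >&2
      return 1
      ;;
  esac
}

# Example: configure the current shell for an iGPU before launching a script.
apply_ipex_llm_env igpu
```

Keeping the per-device lists in one sourced function avoids retyping the exports in every new shell.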
### 4. Running examples

```
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a Replit model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install "transformers<4.35"
```

@@ -25,21 +23,24 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@@ -47,13 +48,10 @@ For optimal performance, it is recommended to set several environment variables.

<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -65,10 +63,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@@ -83,7 +94,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@@ -91,15 +102,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
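Once the environment from sections 2-3 is configured, the launch can be sketched as a small wrapper. The flag names below (`--repo-id-or-model-path`, `--prompt`, `--n-predict`) mirror other ipex-llm GPU examples and are an assumption here, not taken from this README; check `python ./generate.py --help` for the actual interface.

```bash
#!/usr/bin/env bash
# Hypothetical launch wrapper; flag names and the default model id are assumptions.
MODEL="${MODEL:-replit/replit-code-v1-3b}"      # local path or Hugging Face repo id
PROMPT="${PROMPT:-def print_hello_world():}"

# Build the command as an array so it can be inspected before running.
cmd=(python ./generate.py
     --repo-id-or-model-path "$MODEL"
     --prompt "$PROMPT"
     --n-predict 32)

echo "launching: ${cmd[*]}"
# "${cmd[@]}"   # uncomment to actually run once the environment is set up
```

Building the command as an array keeps quoting of the prompt intact when it is finally executed.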
### 4. Running examples

```bash
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a SOLAR model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install transformers==4.35.2 # required by SOLAR
```

@@ -25,22 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install transformers==4.35.2 # required by SOLAR
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@@ -51,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -62,10 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
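Because this example pins `transformers==4.35.2`, a small guard (illustrative, not part of the example; the `check_pin` name is ours) can fail fast when the active environment has drifted from the pin:

```bash
# Illustrative guard: verify that a pinned requirement such as
# transformers==4.35.2 is what the active Python environment actually provides.
check_pin() {  # usage: check_pin <package> <expected-version>
  local py
  py="$(command -v python || command -v python3)" || return 2
  "$py" - "$1" "$2" <<'PY'
import sys
import importlib.metadata as md

pkg, want = sys.argv[1], sys.argv[2]
try:
    got = md.version(pkg)
except md.PackageNotFoundError:
    sys.exit(f"{pkg} is not installed")
if got != want:
    sys.exit(f"{pkg}=={got}, but the example pins {want}")
print(f"{pkg}=={got} OK")
PY
}

# Example: SOLAR requires transformers 4.35.2 in this README.
check_pin transformers 4.35.2 || echo "fix with: pip install transformers==4.35.2"
```

The same guard works for any of the pinned packages in these examples by changing the arguments.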
### 4. Running examples

```bash
@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
In the example [synthesize_speech.py](./synthesize_speech.py), we show a basic use case for the SpeechT5 model to synthesize speech based on the given text, with IPEX-LLM INT4 optimizations.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install "datasets<2.18" soundfile # additional package required for SpeechT5 to conduct generation
```

@@ -25,23 +23,26 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install "datasets<2.18" soundfile # additional package required for SpeechT5 to conduct generation
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
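The rule in section 2 (source `setvars.sh` for APT or offline oneAPI installs, skip it for pip-installed oneAPI) can be sketched as a small guard; the function name and the parameterised prefix are illustrative, not part of the README:

```bash
# Illustrative sketch of section 2's rule: APT/offline oneAPI installs need
# setvars.sh sourced, while a pip-installed oneAPI runtime needs no extra step.
maybe_source_oneapi() {
  local prefix="${1:-/opt/intel/oneapi}"
  if [ -f "$prefix/setvars.sh" ]; then
    # shellcheck disable=SC1091
    source "$prefix/setvars.sh"
    echo "sourced $prefix/setvars.sh"
  else
    echo "no setvars.sh under $prefix; assuming pip-installed oneAPI"
  fi
}

# Example: check the default system location.
maybe_source_oneapi
```

Putting this at the top of a launch script makes the same script usable on both kinds of oneAPI installation.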
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@@ -52,6 +53,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -63,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@@ -82,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@@ -90,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

```bash
@@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
In the example [generate.py](./generate.py), we show a basic use case for a StableLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -27,6 +24,9 @@ We suggest using conda to manage environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

@@ -34,17 +34,17 @@ pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-exte
pip install transformers==4.38.0
```

### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```

### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@@ -55,6 +55,7 @@ For optimal performance, it is recommended to set several environment variables.
```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -66,11 +67,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed by `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>

#### 3.2 Configurations for Windows
<details>

@@ -85,7 +98,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1

@@ -93,15 +106,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> For the first time that each model runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### 4. Running examples

```bash
@ -8,13 +8,10 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ
|
||||||
In the example [generate.py](./generate.py), we show a basic use case for a StarCoder model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
|
In the example [generate.py](./generate.py), we show a basic use case for a StarCoder model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.
|
||||||
### 1. Install
|
### 1. Install
|
||||||
#### 1.1 Installation on Linux
|
#### 1.1 Installation on Linux
|
||||||
We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
|
We suggest using conda to manage environment:
|
||||||
|
|
||||||
After installing conda, create a Python environment for IPEX-LLM:
|
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 # recommend to use Python 3.11
|
conda create -n llm python=3.11
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
@ -24,21 +21,24 @@ We suggest using conda to manage environment:
|
||||||
```bash
|
```bash
|
||||||
conda create -n llm python=3.11 libuv
|
conda create -n llm python=3.11 libuv
|
||||||
conda activate llm
|
conda activate llm
|
||||||
|
# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
|
||||||
|
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0
|
||||||
|
|
||||||
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
|
||||||
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Configures OneAPI environment variables
|
### 2. Configures OneAPI environment variables for Linux
|
||||||
#### 2.1 Configurations for Linux
|
|
||||||
|
> [!NOTE]
|
||||||
|
> Skip this step if you are running on Windows.
|
||||||
|
|
||||||
|
This is a required step on Linux for APT or offline installed oneAPI. Skip this step for PIP-installed oneAPI.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
source /opt/intel/oneapi/setvars.sh
|
source /opt/intel/oneapi/setvars.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
#### 2.2 Configurations for Windows
|
|
||||||
```cmd
|
|
||||||
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
|
|
||||||
```
|
|
||||||
> Note: Please make sure you are using **CMD** (**Anaconda Prompt** if using conda) to run the command as PowerShell is not supported.
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@@ -49,6 +49,7 @@ For optimal performance, it is recommended to set several environment variables.

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -60,11 +61,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
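These exports only apply to the current shell session. One convenience, shown here as a sketch (the file name `ipex-llm-env.sh` is our own convention, not part of the project), is to collect the settings for your device, e.g. the Arc/Flex set, in a small file and source it before each run:

```shell
# Collect the recommended Arc/Flex runtime settings in a reusable file.
cat > ipex-llm-env.sh <<'EOF'
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
EOF

# Source it in the shell where the example will be launched.
. ./ipex-llm-env.sh
echo "USE_XETLA=${USE_XETLA} SYCL_CACHE_PERSISTENT=${SYCL_CACHE_PERSISTENT}"
# prints: USE_XETLA=OFF SYCL_CACHE_PERSISTENT=1
```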
#### 3.2 Configurations for Windows
<details>

@@ -79,7 +92,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -87,15 +100,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> The first time each model runs on an Intel iGPU or an Intel Arc™ A300-Series or Pro A60 GPU, it may take several minutes to compile.

### 4. Running examples
---

@@ -8,15 +8,13 @@ To run these examples with IPEX-LLM on Intel GPUs, we have some recommended requ

In the example [generate.py](./generate.py), we show a basic use case for a Yi model to predict the next N tokens using the `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install einops # additional package required for Yi-6B to conduct generation
```
@@ -25,31 +23,37 @@ We suggest using conda to manage environment:

```bash
conda create -n llm python=3.11 libuv
conda activate llm

# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install einops # additional package required for Yi-6B to conduct generation
```
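On either platform, you can confirm the pinned oneAPI runtime wheels actually landed in the active environment. A minimal sketch (package names are taken from the install command above; `pip show` exits non-zero for packages that are not installed):

```shell
# Report install status of the pinned oneAPI runtime wheels.
for pkg in dpcpp-cpp-rt mkl-dpcpp onednn; do
    if pip show "$pkg" >/dev/null 2>&1; then
        echo "$pkg: installed"
    else
        echo "$pkg: NOT installed"
    fi
done
```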
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux
<details>

<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

@@ -61,11 +65,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
#### 3.2 Configurations for Windows
<details>

@@ -80,7 +96,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -88,15 +104,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> The first time each model runs on an Intel iGPU or an Intel Arc™ A300-Series or Pro A60 GPU, it may take several minutes to compile.

### 4. Running examples
---

@@ -10,38 +10,42 @@ In addition, you need to modify some files in Yuan2-2B-hf folder, since Flash at

In the example [generate.py](./generate.py), we show a basic use case for a Yuan2 model to predict the next N tokens using the `generate()` API, with IPEX-LLM INT4 optimizations on Intel GPUs.

### 1. Install
#### 1.1 Installation on Linux
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11
conda activate llm

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install einops # additional package required for Yuan2 to conduct generation
pip install pandas # additional package required for Yuan2 to conduct generation
```

#### 1.2 Installation on Windows
We suggest using conda to manage the environment:
```bash
conda create -n llm python=3.11 libuv
conda activate llm

# below command will use pip to install the Intel oneAPI Base Toolkit 2024.0
pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0

# below command will install intel_extension_for_pytorch==2.1.10+xpu as default
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

pip install einops # additional package required for Yuan2 to conduct generation
```
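All of the `pip install` commands above are meant to run inside the `llm` environment, so it can be worth verifying which environment is active first. A minimal sketch using `CONDA_DEFAULT_ENV`, which conda exports on `conda activate`:

```shell
# conda exports CONDA_DEFAULT_ENV with the active environment's name;
# guard against installing into the wrong environment (sketch).
if [ "${CONDA_DEFAULT_ENV:-}" = "llm" ]; then
    echo "llm environment active"
else
    echo "warning: expected conda env 'llm', got '${CONDA_DEFAULT_ENV:-none}'"
fi
```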
### 2. Configures OneAPI environment variables for Linux

> [!NOTE]
> Skip this step if you are running on Windows.

This step is required on Linux when oneAPI was installed via APT or the offline installer; skip it for pip-installed oneAPI.

```bash
source /opt/intel/oneapi/setvars.sh
```
### 3. Runtime Configurations
For optimal performance, it is recommended to set several environment variables. Please check out the suggestions based on your device.
#### 3.1 Configurations for Linux

@@ -49,12 +53,12 @@ For optimal performance, it is recommended to set several environment variables.

<summary>For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series</summary>

```bash
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
```

</details>

<details>

@@ -64,10 +68,23 @@ export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

```bash
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export SYCL_CACHE_PERSISTENT=1
export ENABLE_SDP_FUSION=1
```
> Note: `libtcmalloc.so` can be installed with `conda install -c conda-forge -y gperftools=2.10`.

</details>

<details>

<summary>For Intel iGPU</summary>

```bash
export SYCL_CACHE_PERSISTENT=1
export BIGDL_LLM_XMX_DISABLED=1
```

</details>
#### 3.2 Configurations for Windows
<details>

@@ -82,7 +99,7 @@ set BIGDL_LLM_XMX_DISABLED=1

<details>

<summary>For Intel Arc™ A-Series Graphics</summary>

```cmd
set SYCL_CACHE_PERSISTENT=1
```

@@ -90,15 +107,8 @@ set SYCL_CACHE_PERSISTENT=1

</details>

> [!NOTE]
> The first time each model runs on an Intel iGPU or an Intel Arc™ A300-Series or Pro A60 GPU, it may take several minutes to compile.

### 4. Running examples