Update part of Overview guide in mddocs (1/2) (#11378)

* Create install.md

* Update install_cpu.md

* Delete original docs/mddocs/Overview/install_cpu.md

* Update install_cpu.md

* Update install_gpu.md

* update llm.md and install.md

* Update docs in KeyFeatures

* Review and fix typos

* Fix on folded NOTE

* Small fix

* Small fix

* Remove empty known_issue.md

* Small fix

* Small fix

* Further fix

* Fixes

* Fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
Zijie Li 2024-06-21 10:45:17 +08:00 committed by GitHub
parent 4ba82191f2
commit 33b9a9c4c9
11 changed files with 441 additions and 521 deletions


@@ -6,24 +6,22 @@ In [Inference on GPU](inference_on_gpu.md) and [Finetune (QLoRA)](finetune.md),

The `sycl-ls` tool enumerates a list of devices available in the system. You can use it after you set up the oneAPI environment:

- For **Windows users**:

  Please make sure you are using CMD (Miniforge Prompt if using conda):

  ```cmd
  call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
  sycl-ls
  ```

- For **Linux users**:

  ```bash
  source /opt/intel/oneapi/setvars.sh
  sycl-ls
  ```

If you have two Arc A770 GPUs, you can get something like below:

```
@@ -40,7 +38,7 @@ This output shows there are two Arc A770 GPUs as well as an Intel iGPU on this machine.

## Devices selection

To enable xpu, you should convert your model and input to xpu using the code below:

```python
model = model.to('xpu')
input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu')
```
@@ -50,7 +48,7 @@ To select the desired devices, there are two ways: one is changing the code, ano

To specify an xpu, you can change `to('xpu')` to `to('xpu:[device_id]')`; this device_id is counted from zero.
If you want to use the second device, you can change the code like this:

```python
model = model.to('xpu:1')
input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu:1')
```
@@ -59,28 +57,23 @@ input_ids = tokenizer.encode(prompt, return_tensors="pt").to('xpu:1')

The device selection environment variable, `ONEAPI_DEVICE_SELECTOR`, can be used to limit the choice of Intel GPU devices. As the `sycl-ls` output above shows, the last three lines are three Level Zero GPU devices. So we can use `ONEAPI_DEVICE_SELECTOR=level_zero:[gpu_id]` to select devices.
For example, if you want to use the second A770 GPU, you can run python like this:

- For **Windows users**:

  ```cmd
  set ONEAPI_DEVICE_SELECTOR=level_zero:1
  python generate.py
  ```

  Through `set ONEAPI_DEVICE_SELECTOR=level_zero:1`, only the second A770 GPU will be available for the current environment.

- For **Linux users**:

  ```bash
  ONEAPI_DEVICE_SELECTOR=level_zero:1 python generate.py
  ```

  `ONEAPI_DEVICE_SELECTOR=level_zero:1` in the above command only affects the current python program. Alternatively, you can export the environment variable and then run your python script:

  ```bash
  export ONEAPI_DEVICE_SELECTOR=level_zero:1
  python generate.py
  ```
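To double-check which XPU devices end up visible to PyTorch after setting the selector, a small sketch like the following can help (illustrative only; it assumes `intel_extension_for_pytorch` is installed as described in the GPU installation guide):

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  registers the 'xpu' backend

# With ONEAPI_DEVICE_SELECTOR=level_zero:1 exported, only one device should be listed.
print("XPU available:", torch.xpu.is_available())
for i in range(torch.xpu.device_count()):
    print(f"xpu:{i} ->", torch.xpu.get_device_name(i))
```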


@@ -2,17 +2,15 @@

You may also convert Hugging Face *Transformers* models into native INT4 format for maximum performance as follows.

> [!NOTE]
> Currently only llama/bloom/gptneox/starcoder/chatglm model families are supported; you may use the corresponding API to load the converted model. (For other models, you can use the Hugging Face `transformers` format as described [here](./hugging_face_format.md).)

```python
# convert the model
from ipex_llm import llm_convert
ipex_llm_path = llm_convert(model='/path/to/model/',
                            outfile='/path/to/output/', outtype='int4', model_family="llama")

# load the converted model
# switch to ChatGLMForCausalLM/GptneoxForCausalLM/BloomForCausalLM/StarcoderForCausalLM to load other models
```

@@ -25,8 +23,5 @@ output_ids = llm.generate(input_ids, ...)

```python
output = llm.batch_decode(output_ids)
```

> [!NOTE]
> See the complete example [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models).


@@ -60,10 +60,7 @@ model = load_low_bit(model, saved_dir) # Load the optimized model
```

> [!NOTE]
> - Please refer to the [API documentation](https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/optimize.html) for more details.
> - We also provide detailed examples on how to run PyTorch models (e.g., Openai Whisper, LLaMA2, ChatGLM2, Falcon, MPT, Baichuan2, etc.) using IPEX-LLM. See the complete CPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models) and GPU examples [here](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models).
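For orientation, the `optimize_model` workflow that this note refers to looks roughly like the sketch below. This is illustrative only: the model path, save directory, and the import locations are assumptions based on the optimize API rather than content of this diff, so consult the linked API documentation for the exact signatures.

```python
# A rough sketch of the PyTorch-model optimization flow referenced above.
from transformers import AutoModelForCausalLM
from ipex_llm import optimize_model
from ipex_llm.optimize import load_low_bit  # assumed import location

model = AutoModelForCausalLM.from_pretrained("/path/to/model", torch_dtype="auto")
model = optimize_model(model)           # apply IPEX-LLM low-bit optimizations

saved_dir = "/path/to/optimized_model"
model.save_low_bit(saved_dir)           # persist the optimized weights
model = load_low_bit(model, saved_dir)  # reload them later, as in the diff context above
```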


@@ -0,0 +1,6 @@
# `transformers`-style API
You may run the LLMs using `transformers`-style API in `ipex-llm`.
* [Hugging Face `transformers` Format](./hugging_face_format.md)
* [Native Format](./native_format.md)


@@ -1,10 +0,0 @@
``transformers``-style API
================================
You may run the LLMs using ``transformers``-style API in ``ipex-llm``.
* |hugging_face_transformers_format|_
* `Native Format <./native_format.html>`_
.. |hugging_face_transformers_format| replace:: Hugging Face ``transformers`` Format
.. _hugging_face_transformers_format: ./hugging_face_format.html


@@ -0,0 +1,6 @@
# IPEX-LLM Installation
Here, we provide instructions on how to install `ipex-llm` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
- [CPU](./install_cpu.md)
- [GPU](./install_gpu.md)


@@ -1,7 +0,0 @@
IPEX-LLM Installation
================================
Here, we provide instructions on how to install ``ipex-llm`` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
* `CPU <./install_cpu.html>`_
* `GPU <./install_gpu.html>`_


@@ -4,33 +4,26 @@

Install IPEX-LLM for CPU support using pip through:

- For **Linux users**:

  ```bash
  pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
  ```

- For **Windows users**:

  ```cmd
  pip install --pre --upgrade ipex-llm[all]
  ```

Please refer to [Environment Setup](#environment-setup) for more information.

> [!NOTE]
> The `all` option will trigger installation of all the dependencies for common LLM application development.

> [!IMPORTANT]
> `ipex-llm` is tested with Python 3.9, 3.10 and 3.11; Python 3.11 is recommended for best practices.

## Recommended Requirements
@@ -53,48 +46,39 @@ For optimal performance with LLM models using IPEX-LLM optimizations on Intel CP

First we recommend using [Conda](https://conda-forge.org/download/) to create a python 3.11 environment:

- For **Linux users**:

  ```bash
  conda create -n llm python=3.11
  conda activate llm

  pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
  ```

- For **Windows users**:

  ```cmd
  conda create -n llm python=3.11
  conda activate llm

  pip install --pre --upgrade ipex-llm[all]
  ```

Then, for running an LLM model with IPEX-LLM optimizations (taking `example.py` as an example; a minimal sketch of such a script appears after the options below):
- For **running on Client**:

  It is recommended to run directly with full utilization of all CPU cores:

  ```bash
  python example.py
  ```

- For **running on Server**:

  It is recommended to run with all the physical cores of a single socket:

  ```bash
  # e.g. for a server with 48 cores per socket
  export OMP_NUM_THREADS=48
  numactl -C 0-47 -m 0 python example.py
  ```
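For reference, a minimal `example.py` could look like the sketch below. This is illustrative only: it borrows the `open_llama_3b_v2` model and the `ipex-llm` `transformers`-style API shown later in this changeset, and the prompt and generation settings are placeholders.

```python
# example.py -- a minimal, illustrative IPEX-LLM run on CPU
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import LlamaTokenizer

model_path = "openlm-research/open_llama_3b_v2"  # or a local model path
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = LlamaTokenizer.from_pretrained(model_path)

input_ids = tokenizer.encode("Once upon a time, ", return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```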


@@ -6,21 +6,15 @@

IPEX-LLM on Windows supports Intel iGPU and dGPU.

> [!IMPORTANT]
> IPEX-LLM on Windows only supports PyTorch 2.1.

To apply Intel GPU acceleration, please first verify your GPU driver version.

> [!NOTE]
> The GPU driver version of your device can be checked in the "Task Manager" -> GPU 0 (or GPU 1, etc.) -> Driver version.

If you have a driver version lower than `31.0.101.5122`, it is recommended to [**update your GPU driver to the latest**](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html).

<!-- Intel® oneAPI Base Toolkit 2024.0 installation methods:
@@ -47,34 +41,28 @@ If you have driver version lower than `31.0.101.5122`, it is recommended to [**u

We recommend using [Miniforge](https://conda-forge.org/download/) to create a python 3.11 environment.

> [!IMPORTANT]
> `ipex-llm` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.

The easiest way to install `ipex-llm` is through the following commands, choosing either the US or CN website for `extra-index-url`:

- For **US**:

  ```cmd
  conda create -n llm python=3.11 libuv
  conda activate llm

  pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
  ```

- For **CN**:

  ```cmd
  conda create -n llm python=3.11 libuv
  conda activate llm

  pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
  ```

#### Install IPEX-LLM From Wheel

@@ -98,11 +86,9 @@ pip install intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-win_amd64.whl

pip install --pre --upgrade ipex-llm[xpu]
```

> [!NOTE]
> All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for `torch`, `torchvision`, and `intel_extension_for_pytorch` by replacing `cp311` with `cp39` or `cp310`, respectively.

### Runtime Configuration
@@ -116,27 +102,20 @@ call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"

Please also set the following environment variable if you would like to run LLMs on: -->

- For **Intel iGPU**:

  ```cmd
  set SYCL_CACHE_PERSISTENT=1
  set BIGDL_LLM_XMX_DISABLED=1
  ```

- For **Intel Arc™ A-Series Graphics**:

  ```cmd
  set SYCL_CACHE_PERSISTENT=1
  ```

> [!NOTE]
> For **the first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.

### Troubleshooting
@@ -173,458 +152,438 @@ IPEX-LLM GPU support on Linux has been verified on:

* Intel Data Center GPU Flex Series
* Intel Data Center GPU Max Series

> [!IMPORTANT]
> IPEX-LLM on Linux supports PyTorch 2.0 and PyTorch 2.1.
>
> **Warning**
>
> IPEX-LLM support for PyTorch 2.0 is deprecated as of ``ipex-llm >= 2.1.0b20240511``.

> [!IMPORTANT]
> We currently support the Ubuntu 20.04 operating system and later.

- For **PyTorch 2.1**:

  To enable IPEX-LLM for Intel GPUs with PyTorch 2.1, here are several prerequisite steps for tools installation and environment preparation:
  - Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.

    > **Tip**:
    >
    > Please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
    >
    > See [release page](https://dgpu-docs.intel.com/releases/index.html) for latest version.

    > **Note**:
    >
    > For Intel Core™ Ultra integrated GPU, please make sure level_zero version >= 1.3.28717. The level_zero version can be checked with ``sycl-ls``, and the version will be tagged as ``[ext_oneapi_level_zero:gpu]``.
    > ```
    > [opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2023.16.12.0.12_195853.xmain-hotfix]
    > [opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2023.16.12.0.12_195853.xmain-hotfix]
    > [opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO [24.09.28717.12]
    > [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.28717]
    > ```
    >
    > If you have level_zero version < 1.3.28717, you could update as follows:
    >
    > ```bash
    > wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-core_1.0.16238.4_amd64.deb
    > wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.16238.4/intel-igc-opencl_1.0.16238.4_amd64.deb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu-dbgsym_1.3.28717.12_amd64.ddeb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-level-zero-gpu_1.3.28717.12_amd64.deb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd-dbgsym_24.09.28717.12_amd64.ddeb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/intel-opencl-icd_24.09.28717.12_amd64.deb
    > wget https://github.com/intel/compute-runtime/releases/download/24.09.28717.12/libigdgmm12_22.3.17_amd64.deb
    > sudo dpkg -i *.deb
    > ```

  - Step 2: Download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html) with version 2024.0. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.
  Intel® oneAPI Base Toolkit 2024.0 installation methods:

  <details>
  <summary> For <b>APT installer</b> </summary>

  - Step 1: Set up repository

    ```bash
    wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
    echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
    sudo apt update
    ```

  - Step 2: Install the package

    ```bash
    sudo apt install intel-oneapi-common-vars=2024.0.0-49406 \
      intel-oneapi-common-oneapi-vars=2024.0.0-49406 \
      intel-oneapi-diagnostics-utility=2024.0.0-49093 \
      intel-oneapi-compiler-dpcpp-cpp=2024.0.2-49895 \
      intel-oneapi-dpcpp-ct=2024.0.0-49381 \
      intel-oneapi-mkl=2024.0.0-49656 \
      intel-oneapi-mkl-devel=2024.0.0-49656 \
      intel-oneapi-mpi=2021.11.0-49493 \
      intel-oneapi-mpi-devel=2021.11.0-49493 \
      intel-oneapi-dal=2024.0.1-25 \
      intel-oneapi-dal-devel=2024.0.1-25 \
      intel-oneapi-ippcp=2021.9.1-5 \
      intel-oneapi-ippcp-devel=2021.9.1-5 \
      intel-oneapi-ipp=2021.10.1-13 \
      intel-oneapi-ipp-devel=2021.10.1-13 \
      intel-oneapi-tlt=2024.0.0-352 \
      intel-oneapi-ccl=2021.11.2-5 \
      intel-oneapi-ccl-devel=2021.11.2-5 \
      intel-oneapi-dnnl-devel=2024.0.0-49521 \
      intel-oneapi-dnnl=2024.0.0-49521 \
      intel-oneapi-tcm-1.0=1.0.0-435
    ```

    > **Note**:
    >
    > You can uninstall the package by running the following command:
    >
    > ```bash
    > sudo apt autoremove intel-oneapi-common-vars
    > ```

  </details>

  <details>
  <summary> For <b>PIP installer</b> </summary>

  - Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``.

    ```bash
    export PYTHONUSERBASE=~/intel/oneapi
    pip install dpcpp-cpp-rt==2024.0.2 mkl-dpcpp==2024.0.0 onednn==2024.0.0 --user
    ```

    > **Note**:
    >
    > The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.

  - Step 2: Configure your working conda environment (e.g. with name ``llm``) to append the oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.

    ```bash
    conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
    ```

    > **Note**:
    >
    > You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
    > You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.

    > **Note**:
    >
    > You are recommended not to install other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
    > You can uninstall the oneAPI package by simply deleting the package folder, and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
    >
    > ```bash
    > rm -r ~/intel/oneapi
    > conda env config vars unset LD_LIBRARY_PATH -n llm
    > ```

  </details>

  <details>
  <summary> For <b>Offline installer</b> </summary>

  Using the offline installer allows you to customize the installation path.

  ```bash
  wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/20f4e6a1-6b0b-4752-b8c1-e5eacba10e01/l_BaseKit_p_2024.0.0.49564_offline.sh
  sudo sh ./l_BaseKit_p_2024.0.0.49564_offline.sh
  ```

  > **Note**:
  >
  > You can also modify the installation or uninstall the package by running the following commands:
  >
  > ```bash
  > cd /opt/intel/oneapi/installer
  > sudo ./installer
  > ```

  </details>
- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

  To enable IPEX-LLM for Intel GPUs with PyTorch 2.0, here are several prerequisite steps for tools installation and environment preparation:

  - Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.

    > **Tip**:
    >
    > Please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
    >
    > See [release page](https://dgpu-docs.intel.com/releases/index.html) for latest version.

  - Step 2: Download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html) with version 2023.2. OneDNN, OneMKL and DPC++ compiler are needed, others are optional.
  Intel® oneAPI Base Toolkit 2023.2 installation methods:

  <details>
  <summary> For <b>APT installer</b> </summary>

  - Step 1: Set up repository

    ```bash
    wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
    echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
    sudo apt update
    ```

  - Step 2: Install the packages

    ```bash
    sudo apt install -y intel-oneapi-common-vars=2023.2.0-49462 \
      intel-oneapi-compiler-cpp-eclipse-cfg=2023.2.0-49495 intel-oneapi-compiler-dpcpp-eclipse-cfg=2023.2.0-49495 \
      intel-oneapi-diagnostics-utility=2022.4.0-49091 \
      intel-oneapi-compiler-dpcpp-cpp=2023.2.0-49495 \
      intel-oneapi-mkl=2023.2.0-49495 intel-oneapi-mkl-devel=2023.2.0-49495 \
      intel-oneapi-mpi=2021.10.0-49371 intel-oneapi-mpi-devel=2021.10.0-49371 \
      intel-oneapi-tbb=2021.10.0-49541 intel-oneapi-tbb-devel=2021.10.0-49541 \
      intel-oneapi-ccl=2021.10.0-49084 intel-oneapi-ccl-devel=2021.10.0-49084 \
      intel-oneapi-dnnl-devel=2023.2.0-49516 intel-oneapi-dnnl=2023.2.0-49516
    ```

    > **Note**:
    >
    > You can uninstall the package by running the following command:
    >
    > ```bash
    > sudo apt autoremove intel-oneapi-common-vars
    > ```

  </details>

  <details>
  <summary> For <b>PIP installer</b> </summary>

  - Step 1: Install oneAPI in a user-defined folder, e.g., ``~/intel/oneapi``

    ```bash
    export PYTHONUSERBASE=~/intel/oneapi
    pip install dpcpp-cpp-rt==2023.2.0 mkl-dpcpp==2023.2.0 onednn-cpu-dpcpp-gpu-dpcpp==2023.2.0 --user
    ```

    > **Note**:
    >
    > The oneAPI packages are visible in ``pip list`` only if ``PYTHONUSERBASE`` is properly set.

  - Step 2: Configure your working conda environment (e.g. with name ``llm``) to append the oneAPI path (e.g. ``~/intel/oneapi/lib``) to the environment variable ``LD_LIBRARY_PATH``.

    ```bash
    conda env config vars set LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/intel/oneapi/lib -n llm
    ```

    > **Note**:
    >
    > You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
    > You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.

    > **Note**:
    >
    > You are recommended not to install other pip packages in the user-defined folder for oneAPI (e.g. ``~/intel/oneapi``).
    > You can uninstall the oneAPI package by simply deleting the package folder, and unsetting the configuration of your working conda environment (e.g., with name ``llm``).
    >
    > ```bash
    > rm -r ~/intel/oneapi
    > conda env config vars unset LD_LIBRARY_PATH -n llm
    > ```

  </details>

  <details>
  <summary> For <b>Offline installer</b> </summary>

  Using the offline installer allows you to customize the installation path.

  ```bash
  wget https://registrationcenter-download.intel.com/akdlm/IRC_NAS/992857b9-624c-45de-9701-f6445d845359/l_BaseKit_p_2023.2.0.49397_offline.sh
  sudo sh ./l_BaseKit_p_2023.2.0.49397_offline.sh
  ```

  > **Note**:
  >
  > You can also modify the installation or uninstall the package by running the following commands:
  >
  > ```bash
  > cd /opt/intel/oneapi/installer
  > sudo ./installer
  > ```

  </details>
### Install IPEX-LLM

#### Install IPEX-LLM From PyPI

We recommend using [Miniforge](https://conda-forge.org/download/) to create a python 3.11 environment:

> [!IMPORTANT]
> ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.11 is recommended for best practices.

> [!IMPORTANT]
> Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.

- For **PyTorch 2.1**:

  Choose either US or CN website for ``extra-index-url``:

  - For **US**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
    ```

    > **Note**:
    >
    > The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
    >
    > ```bash
    > pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
    > ```

  - For **CN**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
    ```

    > **Note**:
    >
    > The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
    >
    > ```bash
    > pip install --pre --upgrade ipex-llm[xpu_2.1] --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
    > ```

- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

  Choose either US or CN website for ``extra-index-url``:

  - For **US**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
    ```

  - For **CN**:

    ```bash
    conda create -n llm python=3.11
    conda activate llm

    pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510 --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
    ```
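As a quick sanity check that the matching versions called out above are in place (an illustrative snippet, not part of the installation guide itself), you can print the installed versions from Python:

```python
# Print the installed torch / IPEX versions to confirm they match the
# oneAPI Base Toolkit you installed (2.1.x with 2024.0, 2.0.x with 2023.2).
import torch
import intel_extension_for_pytorch as ipex

print("torch:", torch.__version__)
print("intel_extension_for_pytorch:", ipex.__version__)
```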
#### Install IPEX-LLM From Wheel

If you encounter network issues when installing IPEX, you can also install IPEX-LLM dependencies for Intel XPU from source archives. First you need to download and install torch/torchvision/ipex from the wheels listed below before installing `ipex-llm`.

- For **PyTorch 2.1**:

  ```bash
  # get the wheels on Linux system for IPEX 2.1.10+xpu
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.1.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.16.0a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.1.10%2Bxpu-cp311-cp311-linux_x86_64.whl
  ```

  Then you may install directly from the wheel archives using the following commands:

  ```bash
  # install the packages from the wheels
  pip install torch-2.1.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install torchvision-0.16.0a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install intel_extension_for_pytorch-2.1.10+xpu-cp311-cp311-linux_x86_64.whl

  # install ipex-llm for Intel GPU
  pip install --pre --upgrade ipex-llm[xpu]
  ```

- For **PyTorch 2.0** (deprecated for versions ``ipex-llm >= 2.1.0b20240511``):

  ```bash
  # get the wheels on Linux system for IPEX 2.0.110+xpu
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torch-2.0.1a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchvision-0.15.2a0%2Bcxx11.abi-cp311-cp311-linux_x86_64.whl
  wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.0.110%2Bxpu-cp311-cp311-linux_x86_64.whl
  ```

  Then you may install directly from the wheel archives using the following commands:

  ```bash
  # install the packages from the wheels
  pip install torch-2.0.1a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install torchvision-0.15.2a0+cxx11.abi-cp311-cp311-linux_x86_64.whl
  pip install intel_extension_for_pytorch-2.0.110+xpu-cp311-cp311-linux_x86_64.whl

  # install ipex-llm for Intel GPU
  pip install --pre --upgrade ipex-llm[xpu_2.0]==2.1.0b20240510
  ```

> [!NOTE]
> All the wheel packages mentioned here are for Python 3.11. If you would like to use Python 3.9 or 3.10, you should modify the wheel names for ``torch``, ``torchvision``, and ``intel_extension_for_pytorch`` by replacing ``cp311`` with ``cp39`` or ``cp310``, respectively.
### Runtime Configuration

To use GPU acceleration on Linux, several environment variables are required or recommended before running a GPU example.

- For **Intel Arc™ A-Series and Intel Data Center GPU Flex**:

  For Intel Arc™ A-Series Graphics and Intel Data Center GPU Flex Series, we recommend:

  ```bash
  # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
  # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
  source /opt/intel/oneapi/setvars.sh

  # Recommended Environment Variables for optimal performance
  export USE_XETLA=OFF
  export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
  export SYCL_CACHE_PERSISTENT=1
  ```

- For **Intel Data Center GPU Max**:

  For Intel Data Center GPU Max Series, we recommend:

  ```bash
  # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
  # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
  source /opt/intel/oneapi/setvars.sh

  # Recommended Environment Variables for optimal performance
  export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
  export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
  export SYCL_CACHE_PERSISTENT=1
  export ENABLE_SDP_FUSION=1
  ```

  Please note that ``libtcmalloc.so`` can be installed by ``conda install -c conda-forge -y gperftools=2.10``.

- For **Intel iGPU**:

  ```bash
  # Configure oneAPI environment variables. Required step for APT or offline installed oneAPI.
  # Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
  source /opt/intel/oneapi/setvars.sh

  export SYCL_CACHE_PERSISTENT=1
  export BIGDL_LLM_XMX_DISABLED=1
  ```

> [!NOTE]
> For **the first time** that **each model** runs on Intel iGPU/Intel Arc™ A300-Series or Pro A60, it may take several minutes to compile.
### Known issues

@@ -662,5 +621,5 @@ Error: libmkl_sycl_blas.so.4: cannot open shared object file: No such file or di

The reason for such errors is that oneAPI has not been initialized properly before running IPEX-LLM code or before importing the IPEX package.

* For oneAPI installed using APT or Offline Installer, make sure you execute `setvars.sh` of oneAPI Base Toolkit before running IPEX-LLM.
* For PIP-installed oneAPI, activate your working environment and run ``echo $LD_LIBRARY_PATH`` to check if the installation path is properly configured for the environment. If the output does not contain the oneAPI path (e.g. ``~/intel/oneapi/lib``), check [Prerequisites](#prerequisites-1) to re-install oneAPI with the PIP installer.
* Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.


@@ -1 +0,0 @@
# IPEX-LLM Known Issues


@@ -17,13 +17,11 @@ model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path="open

                                             load_in_4bit=True)
```

> [!TIP]
> [open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2) is a pretrained large language model hosted on Hugging Face. `openlm-research/open_llama_3b_v2` is its Hugging Face model id. `from_pretrained` will automatically download the model from Hugging Face to a local cache path (e.g. `~/.cache/huggingface`), load the model, and convert it to `ipex-llm` INT4 format.
>
> It may take a long time to download the model using the API. You can also download the model yourself, and set `pretrained_model_name_or_path` to the local path of the downloaded model. This way, `from_pretrained` will load and convert directly from the local path without downloading.

## Load Tokenizer

You also need a tokenizer for inference. Just use the official `transformers` API to load `LlamaTokenizer`:
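The tokenizer-loading code itself falls outside this hunk; a typical line would look like the following sketch (illustrative, using the standard Hugging Face `transformers` API and the same model id as above):

```python
from transformers import LlamaTokenizer

# load the tokenizer that matches the model converted above
tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_3b_v2")
```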