LLM: update key feature and installation page of document (#9068)
parent c91b2bd574
commit 760183bac6
8 changed files with 31 additions and 16 deletions
@@ -38,12 +38,12 @@ subtrees:
       title: "Key Features"
       subtrees:
         - entries:
+            - file: doc/LLM/Overview/KeyFeatures/optimize_model
             - file: doc/LLM/Overview/KeyFeatures/transformers_style_api
               subtrees:
                 - entries:
                     - file: doc/LLM/Overview/KeyFeatures/hugging_face_format
                     - file: doc/LLM/Overview/KeyFeatures/native_format
-            - file: doc/LLM/Overview/KeyFeatures/optimize_model
             - file: doc/LLM/Overview/KeyFeatures/langchain_api
             # - file: doc/LLM/Overview/KeyFeatures/cli
             - file: doc/LLM/Overview/KeyFeatures/gpu_supports
@@ -3,12 +3,12 @@ BigDL-LLM Key Features
 
 You may run the LLMs using ``bigdl-llm`` through one of the following APIs:
 
+* `PyTorch API <./optimize_model.html>`_
 * |transformers_style_api|_
 
   * |hugging_face_transformers_format|_
   * `Native Format <./native_format.html>`_
 
-* `General PyTorch Model Supports <./optimize_model.html>`_
 * `LangChain API <./langchain_api.html>`_
 * `GPU Supports <./gpu_supports.html>`_
@@ -1,22 +1,27 @@
-## General PyTorch Model Supports
+## PyTorch API
 
-You may apply BigDL-LLM optimizations on any Pytorch models, not only Hugging Face *Transformers* models for acceleration. With BigDL-LLM, PyTorch models (in FP16/BF16/FP32) can be optimized with low-bit quantizations (supported precisions include INT4/INT5/INT8).
+In general, a single `optimize_model` call is all you need to optimize any loaded PyTorch model, regardless of the library or API you are using. With BigDL-LLM, PyTorch models (in FP16/BF16/FP32) can be optimized with low-bit quantizations (supported precisions include INT4, INT5, INT8, etc.).
 
-You can easily enable BigDL-LLM INT4 optimizations on any Pytorch models just as follows:
+First, use any PyTorch API you like to load your model. Here we use the [Hugging Face Transformers](https://huggingface.co/docs/transformers/index) class `LlamaForCausalLM` to load the popular model [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) as an example:
 
 ```python
-# Create or load any Pytorch model
-model = ...
+# Create or load any PyTorch model, taking Llama-2-7b-chat-hf as an example
+from transformers import LlamaForCausalLM
+model = LlamaForCausalLM.from_pretrained('meta-llama/Llama-2-7b-chat-hf', torch_dtype='auto', low_cpu_mem_usage=True)
+```
 
-# Add only two lines to enable BigDL-LLM INT4 optimizations on model
+Then, simply call `optimize_model` to optimize the loaded model; INT4 optimization is applied by default:
+
+```python
 from bigdl.llm import optimize_model
 
+# With only one line to enable BigDL-LLM INT4 optimization
 model = optimize_model(model)
 ```
 
-After optimizing the model, you may straightly run the optimized model with no API changed and less inference latency.
+After optimizing the model, BigDL-LLM does not require any changes to the inference code; you can run the optimized model with any library, with very low latency.
 
 ```eval_rst
 .. seealso::
 
-   See the examples for Hugging Face *Transformers* models `here <https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/transformers/general_int4>`_. And examples for other general Pytorch models can be found `here <https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/pytorch-model>`_.
+   * For more detailed usage of ``optimize_model``, please refer to the `API documentation <https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/LLM/optimize.html>`_.
 ```
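Taken together, the rewritten page walks through a load, optimize, infer flow. The following is a minimal end-to-end sketch of that flow, assuming `bigdl-llm[all]` and `transformers` are installed; the prompt and generation settings are illustrative choices, not part of this diff:

```python
# End-to-end sketch of the PyTorch API flow shown in the hunk above.
# Assumes bigdl-llm[all] and transformers are installed; the prompt and
# max_new_tokens value are illustrative, not from the original page.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from bigdl.llm import optimize_model

model_path = 'meta-llama/Llama-2-7b-chat-hf'

# Load with any PyTorch API you like (Hugging Face Transformers here)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype='auto',
                                         low_cpu_mem_usage=True)
tokenizer = LlamaTokenizer.from_pretrained(model_path)

# One line to enable BigDL-LLM INT4 optimization
model = optimize_model(model)

# Inference code is unchanged from the unoptimized model
with torch.inference_mode():
    inputs = tokenizer("What is AI?", return_tensors="pt")
    output_ids = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```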
@@ -5,9 +5,11 @@
 Install BigDL-LLM for CPU supports using pip through:
 
 ```bash
-pip install bigdl-llm[all]
+pip install --pre --upgrade bigdl-llm[all] # install the latest bigdl-llm nightly build with 'all' option
 ```
 
+Please refer to [Environment Setup](#environment-setup) for more information.
+
 ```eval_rst
 .. note::
@@ -43,7 +45,7 @@ First we recommend using [Conda](https://docs.conda.io/en/latest/miniconda.html)
 conda create -n llm python=3.9
 conda activate llm
 
-pip install bigdl-llm[all] # install bigdl-llm for CPU with 'all' option
+pip install --pre --upgrade bigdl-llm[all] # install the latest bigdl-llm nightly build with 'all' option
 ```
 
 Then for running a LLM model with BigDL-LLM optimizations (taking `example.py` as an example):
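The diff references `example.py` by name only. As a hypothetical illustration of what such a script could contain, here is a sketch using the transformers-style API from these docs; the model choice and prompt are assumptions:

```python
# example.py -- hypothetical contents; the diff references the file name only.
# Uses the transformers-style API, applying INT4 optimization at load time.
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = 'meta-llama/Llama-2-7b-chat-hf'  # illustrative model choice

# load_in_4bit=True applies BigDL-LLM INT4 optimization while loading
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("Once upon a time", return_tensors="pt")
output_ids = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

You would then run it inside the activated conda environment with `python example.py`.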
@@ -5,9 +5,11 @@
 Install BigDL-LLM for GPU supports using pip through:
 
 ```bash
-pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
+pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu # install bigdl-llm for GPU
 ```
 
+Please refer to [Environment Setup](#environment-setup) for more information.
+
 ```eval_rst
 .. note::
@@ -25,6 +27,12 @@ BigDL-LLM for GPU supports has been verified on:
 * Intel Arc™ A-Series Graphics
 * Intel Data Center GPU Flex Series
 
+```eval_rst
+.. note::
+
+   We currently support the Ubuntu 20.04 operating system or later. Windows support is in progress.
+```
+
 To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation:
 
 * Step 1, only Linux system is supported now, Ubuntu 22.04 is preferred.
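For GPU runs, the optimized model additionally has to be moved to the `xpu` device. A sketch under the assumption that `bigdl-llm[xpu]` is installed as above and the oneAPI environment has been configured per the steps that follow; the model and prompt are illustrative:

```python
# Sketch of running an optimized model on an Intel GPU; assumes the
# bigdl-llm[xpu] install above and a configured oneAPI environment.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401, registers the 'xpu' device
from transformers import LlamaForCausalLM, LlamaTokenizer
from bigdl.llm import optimize_model

model_path = 'meta-llama/Llama-2-7b-chat-hf'  # illustrative model choice
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype='auto',
                                         low_cpu_mem_usage=True)
model = optimize_model(model)
model = model.to('xpu')  # move the optimized model to the Intel GPU

tokenizer = LlamaTokenizer.from_pretrained(model_path)
inputs = tokenizer("What is AI?", return_tensors="pt").to('xpu')
with torch.inference_mode():
    output_ids = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output_ids[0].cpu(), skip_special_tokens=True))
```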
@@ -32,8 +32,8 @@ BigDL-LLM
 
 +++
 
+:bdg-link:`PyTorch <./Overview/KeyFeatures/optimize_model.html>` |
 :bdg-link:`transformers-style <./Overview/KeyFeatures/transformers_style_api.html>` |
-:bdg-link:`Optimize Model <./Overview/KeyFeatures/optimize_model.html>` |
 :bdg-link:`LangChain <./Overview/KeyFeatures/langchain_api.html>` |
 :bdg-link:`GPU <./Overview/KeyFeatures/gpu_supports.html>`
@@ -4,6 +4,6 @@ BigDL-LLM API
 .. toctree::
    :maxdepth: 3
 
+   optimize.rst
    transformers.rst
    langchain.rst
-   optimize.rst
@@ -1,4 +1,4 @@
-BigDL-LLM Optimize API
+BigDL-LLM PyTorch API
 =====================
 
 llm.optimize