LLM: update example doc page (#9186)

binbin Deng 2023-10-17 16:26:11 +08:00 committed by GitHub
parent 66c2e45634
commit 330e67e2c0
3 changed files with 113 additions and 31 deletions


@@ -1,7 +1,7 @@
 BigDL-LLM Examples
 ================================
-You can use BigDL-LLM to run any Huggingface *Transfomers* models with INT4 optimizations on either servers or laptops.
+You can use BigDL-LLM to run any PyTorch model with INT4 optimizations on Intel XPU (from Laptop to GPU to Cloud).
 Here, we provide examples to help you quickly get started using BigDL-LLM to run some popular open-source models in the community. Please refer to the appropriate guide based on your device:


@@ -6,21 +6,59 @@ To run these examples, please first refer to [here](./install_cpu.html) for more
 The following models have been verified on either servers or laptops with Intel CPUs.
-| Model | Example |
-|-----------|----------------------------------------------------------|
-| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/native_int4), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/vicuna) |
-| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/llama2) |
-| MPT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/mpt) |
-| Falcon | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/falcon) |
-| ChatGLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/chatglm) |
-| ChatGLM2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/chatglm2) |
-| Qwen | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/qwen) |
-| MOSS | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/moss) |
-| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/baichuan) |
-| Dolly-v1 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/dolly_v1) |
-| Dolly-v2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/dolly_v2) |
-| RedPajama | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/native_int4), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/redpajama) |
-| Phoenix | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/native_int4), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/phoenix) |
-| StarCoder | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/native_int4), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/starcoder) |
-| InternLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/internlm) |
-| Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/transformers/transformers_int4/whisper) |
+## Example of PyTorch API
+
+| Model | Example of PyTorch API |
+|------------|-------------------------------------------------------|
+| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/llama2) |
+| ChatGLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/chatglm) |
+| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/mistral) |
+| Bark | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/bark) |
+| BERT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/bert) |
+| Openai Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/openai-whisper) |
+
+```eval_rst
+.. important::
+
+In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through PyTorch API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/More-Data-Types>`_.
+```
+
+
+## Example of `transformers`-style API
+| Model | Example of `transformers`-style API |
+|------------|-------------------------------------------------------|
+| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/vicuna) |
+| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/llama2) | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2) |
+| ChatGLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/chatglm) | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm) |
+| ChatGLM2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm2) |
+| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mistral) |
+| Falcon | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/falcon) |
+| MPT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) |
+| Dolly-v1 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) |
+| Dolly-v2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) |
+| Replit Code| [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) |
+| RedPajama | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/redpajama) |
+| Phoenix | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/phoenix) |
+| StarCoder | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/starcoder) |
+| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) |
+| Baichuan2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan2) |
+| InternLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm) |
+| Qwen | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen) |
+| Aquila | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila) |
+| MOSS | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/moss) |
+| Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/whisper) |
+```eval_rst
+.. important::
+In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through ``transformers``-style API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/More-Data-Types>`_.
+```
+```eval_rst
+.. seealso::
+See the complete examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU>`_.
+```


@@ -12,15 +12,59 @@ To run these examples, please first refer to [here](./install_gpu.html) for more
 The following models have been verified on either servers or laptops with Intel GPUs.
-| Model | Example |
-|-----------|----------------------------------------------------------|
-| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/llama2) |
-| MPT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/mpt) |
-| Falcon | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/falcon) |
-| ChatGLM2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/chatglm2) |
-| Qwen | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/qwen) |
-| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/baichuan) |
-| StarCoder | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/starcoder) |
-| InternLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/internlm) |
-| Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/whisper) |
-| GPT-J | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/gpu/gpt-j) |
+## Example of PyTorch API
+
+| Model | Example of PyTorch API |
+|------------|-------------------------------------------------------|
+| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/llama2) |
+| ChatGLM 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/chatglm2) |
+| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/mistral) |
+| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/baichuan) |
+| Baichuan2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/baichuan2) |
+| Replit | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/replit) |
+| StarCoder | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/starcoder) |
+| Dolly-v1 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/dolly-v1) |
+| Dolly-v2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/dolly-v2) |
+```eval_rst
+.. important::
+In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through PyTorch API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/More-Data-Types>`_.
+```
+## Example of `transformers`-style API
+| Model | Example of `transformers`-style API |
+|------------|-------------------------------------------------------|
+| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* |[link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/vicuna)|
+| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2) |
+| ChatGLM2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm2) |
+| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/mistral) |
+| Falcon | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/falcon) |
+| MPT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) |
+| Dolly-v1 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) |
+| Dolly-v2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) |
+| Replit | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) |
+| StarCoder | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/starcoder) |
+| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) |
+| Baichuan2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2) |
+| InternLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm) |
+| Qwen | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen) |
+| Aquila | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila) |
+| Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/whisper) |
+| Chinese Llama2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chinese-llama2) |
+| GPT-J | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/gpt-j) |
+```eval_rst
+.. important::
+In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through ``transformers``-style API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types>`_.
+```
+```eval_rst
+.. seealso::
+See the complete examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU>`_.
+```
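
The updated pages distinguish two ways of applying a low bit optimization: the PyTorch API and the `transformers`-style API. The sketch below is not part of this commit; it only illustrates how the two entry points might be called, assuming `bigdl-llm` is installed. The helper names and the `LOW_BIT_CHOICES` tuple are hypothetical, and the exact set of accepted `low_bit` strings is an assumption that should be checked against the linked More-Data-Types examples.

```python
# Hypothetical helpers; only the bigdl.llm calls inside them come from BigDL-LLM.
# Assumed subset of low-bit format names (verify against the linked examples).
LOW_BIT_CHOICES = ("sym_int4", "sym_int5", "sym_int8", "nf4")


def _check_low_bit(low_bit):
    """Reject format names outside the assumed subset before touching the model."""
    if low_bit not in LOW_BIT_CHOICES:
        raise ValueError(f"unsupported low_bit {low_bit!r}; expected one of {LOW_BIT_CHOICES}")


def apply_low_bit_pytorch(model, low_bit="sym_int8"):
    """PyTorch API: optimize a model object that has already been loaded."""
    _check_low_bit(low_bit)
    from bigdl.llm import optimize_model  # imported lazily; requires bigdl-llm
    return optimize_model(model, low_bit=low_bit)


def load_low_bit_transformers_style(model_path, low_bit="sym_int8"):
    """transformers-style API: load and optimize a checkpoint in one call."""
    _check_low_bit(low_bit)
    from bigdl.llm.transformers import AutoModelForCausalLM  # requires bigdl-llm
    return AutoModelForCausalLM.from_pretrained(
        model_path,
        load_in_low_bit=low_bit,
        trust_remote_code=True,
    )
```

For example, `load_low_bit_transformers_style("path/to/model", low_bit="nf4")` would load a checkpoint directly in the requested format, while `apply_low_bit_pytorch` suits a model that other code has already constructed. The model path here is a placeholder.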