# BigDL-LLM Transformers INT4 Optimization for Large Language Models
You can use BigDL-LLM to run any Hugging Face Transformers model with INT4 optimizations on either servers or laptops. This directory contains example scripts to help you quickly get started using BigDL-LLM to run some popular open-source models in the community. Each model has its own dedicated folder, where you can find detailed instructions on how to install and run it.
## Verified Models
| Model | Example |
|---|---|
| LLaMA | link |
| LLaMA 2 | link |
| MPT | link |
| Falcon | link |
| ChatGLM | link |
| ChatGLM2 | link |
| MOSS | link |
| Baichuan | link |
| Dolly-v1 | link |
| Dolly-v2 | link |
| RedPajama | link |
| Phoenix | link |
| StarCoder | link |
| InternLM | link |
| Whisper | link |
| Qwen | link |
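The examples for the models above all follow a similar pattern. As an illustrative sketch (assuming the `bigdl.llm.transformers` drop-in replacement for the Hugging Face `AutoModelForCausalLM` with `load_in_4bit=True`; the model path and prompt template below are hypothetical and should be adapted to the model you choose):

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in a LLaMA-2-style chat prompt template
    (illustrative; other models use different templates)."""
    return f"[INST] {user_message} [/INST]"

if __name__ == "__main__":
    # BigDL-LLM provides a transformers-style API; loading with
    # load_in_4bit=True applies the INT4 optimizations.
    from bigdl.llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    model_path = "meta-llama/Llama-2-7b-chat-hf"  # hypothetical example path
    model = AutoModelForCausalLM.from_pretrained(model_path,
                                                 load_in_4bit=True)
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    inputs = tokenizer(build_prompt("What is AI?"), return_tensors="pt")
    output = model.generate(inputs.input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

See each model's folder for the exact, verified invocation and prompt format.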
## Recommended Requirements
To run the examples, we recommend using Intel® Xeon® processors (server) or a 12th Gen or later Intel® Core™ processor (client).
For OS, BigDL-LLM supports Ubuntu 20.04 or later, CentOS 7 or later, and Windows 10/11.
## Best Known Configuration on Linux
For better performance, it is recommended to set environment variables on Linux with the help of BigDL-Nano:
```bash
pip install bigdl-nano
source bigdl-nano-init
```