
dingbaorong 5a2ce421af add cpu and gpu examples of flan-t5 (#9171 ) * add cpu and gpu examples of flan-t5 * address yuwen's comments * Add explanation why we add modules to not convert * Refine prompt and add a translation example * Add a empty line at the end of files * add examples of flan-t5 using optimize_mdoel api * address bin's comments * address binbin's comments * add flan-t5 in readme		2023-10-24 15:24:01 +08:00
aquila	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
baichuan	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
baichuan2	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
chatglm	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
chatglm2	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
dolly_v1	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
dolly_v2	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
falcon	add position_ids and fuse embedding for falcon (#9242 )	2023-10-24 09:58:20 +08:00
flan-t5	add cpu and gpu examples of flan-t5 (#9171 )	2023-10-24 15:24:01 +08:00
internlm	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
llama2	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
mistral	LLM: add mistral examples (#9121 )	2023-10-11 13:38:15 +08:00
moss	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
mpt	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
phi-1_5	phi-1_5 CPU and GPU examples (#9173 )	2023-10-24 15:08:04 +08:00
phoenix	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
qwen	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
redpajama	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
replit	LLM: Add Replit CPU and GPU example (#9028 )	2023-10-12 13:42:14 +08:00
starcoder	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
vicuna	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
whisper	LLM: update example layout (#9046 )	2023-10-09 15:36:39 +08:00
README.md	add cpu and gpu examples of flan-t5 (#9171 )	2023-10-24 15:24:01 +08:00

README.md

BigDL-LLM Transformers INT4 Optimization for Large Language Models

BigDL-LLM lets you run any Hugging Face Transformers model with INT4 optimizations on either servers or laptops. This directory contains example scripts to help you get started quickly with some popular open-source models. Each model has its own folder with detailed instructions on how to install and run it.
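As a rough illustration of what the per-model scripts do, the sketch below loads a model through the `bigdl.llm.transformers` drop-in classes with `load_in_4bit=True` and generates a short completion. It assumes `bigdl-llm` and `transformers` are installed. The helper names `load_int4_model` and `generate` are hypothetical and used only for this example; each model folder may need different options, so treat its own instructions as authoritative.

```python
def load_int4_model(model_path: str):
    """Load a causal LM with INT4 optimizations (hypothetical helper)."""
    # Imported lazily so this module can be imported without bigdl-llm installed.
    from bigdl.llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    # load_in_4bit=True asks BigDL-LLM to convert the weights to INT4 on load.
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        load_in_4bit=True,
        trust_remote_code=True,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 32) -> str:
    """Encode a prompt, generate a continuation, and decode it (hypothetical helper)."""
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

A typical call would be `model, tokenizer = load_int4_model("path/to/model")` followed by `print(generate(model, tokenizer, "What is AI?"))`, where `path/to/model` is a placeholder for a local checkpoint directory or a Hugging Face model ID.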

Verified models

Model	Example
LLaMA	link
LLaMA 2	link
MPT	link
Falcon	link
ChatGLM	link
ChatGLM2	link
MOSS	link
Baichuan	link
Baichuan2	link
Dolly-v1	link
Dolly-v2	link
RedPajama	link
Phoenix	link
StarCoder	link
InternLM	link
Whisper	link
Qwen	link
Aquila	link
Replit	link
Mistral	link
Flan-T5	link

Recommended Requirements

To run the examples, we recommend Intel® Xeon® processors (server) or 12th Gen or later Intel® Core™ processors (client).

For OS, BigDL-LLM supports Ubuntu 20.04 or later, CentOS 7 or later, and Windows 10/11.

Best Known Configuration on Linux

For better performance on Linux, we recommend setting environment variables with the help of BigDL-Nano:

pip install bigdl-nano
source bigdl-nano-init
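To confirm that the variables took effect, a small check like the one below can be run from the same shell session (`source` only changes the current shell). The variable names in `TUNING_VARS` are an assumption about what such an init script typically sets for CPU inference and are not taken from this README. The helper `report_tuning_env` is hypothetical.

```python
import os

# Assumed list of commonly tuned variables; adjust to match what
# bigdl-nano-init actually exports on your system.
TUNING_VARS = ("OMP_NUM_THREADS", "KMP_AFFINITY", "KMP_BLOCKTIME", "LD_PRELOAD")


def report_tuning_env(environ=None, names=TUNING_VARS):
    """Return a dict mapping each variable name to its value, or None if unset."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in names}


if __name__ == "__main__":
    for name, value in report_tuning_env().items():
        print(f"{name}={value if value is not None else '<unset>'}")
```

If every entry prints `<unset>`, the init script was probably run in a different shell or executed without `source`.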