From d03218674aa049aac583000d116f86f97a5e5d28 Mon Sep 17 00:00:00 2001 From: Jason Dai Date: Wed, 9 Aug 2023 14:47:26 +0800 Subject: [PATCH] Update llm readme (#8703) --- README.md | 9 ++++++--- python/llm/README.md | 18 ++++++++++++------ 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index f502cd84..1e1add49 100644 --- a/README.md +++ b/README.md @@ -9,13 +9,14 @@ _**Fast, Distributed, Secure AI for Big Data**_ --- ## Latest News -- **Try the latest [`bigdl-llm`](python/llm) for running LLM (language language model) on your Intel laptop using INT4 with very low latency!** *(It is built on top of the excellent work of [llama.cpp](https://github.com/ggerganov/llama.cpp), [gptq](https://github.com/IST-DASLab/gptq), [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), etc., and supports any Hugging Face Transformers model)* +- **Try the latest [`bigdl-llm`](python/llm) for running LLM (large language model) on your Intel laptop using INT4 with very low latency!**[^1] *(It is built on top of the excellent work of [llama.cpp](https://github.com/ggerganov/llama.cpp), [gptq](https://github.com/IST-DASLab/gptq), [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), etc., and supports any Hugging Face Transformers model)*

- - + +

+- **[Update] Over a dozen models have been verified on [`bigdl-llm`](python/llm)**, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, MPT, Falcon, Dolly-v1/Dolly-v2, StarCoder, Whisper, Qwen, Baichuan,* and more; see the complete list [here](python/llm/README.md#verified-models). --- ## Overview @@ -406,6 +407,8 @@ If you've found BigDL useful for your project, you may cite our papers as follow } ``` +[^1]: Performance varies by use, configuration and other factors. `bigdl-llm` may not optimize to the same degree for non-Intel products. Learn more at www.Intel.com/PerformanceIndex. + - *[BigDL](https://arxiv.org/abs/1804.05839): A Distributed Deep Learning Framework for Big Data* ``` @INPROCEEDINGS{10.1145/3357223.3362707, diff --git a/python/llm/README.md b/python/llm/README.md index f11b471e..84ee4f2b 100644 --- a/python/llm/README.md +++ b/python/llm/README.md @@ -1,15 +1,15 @@ ## BigDL-LLM -**`bigdl-llm`** is a library for running ***LLM*** (language language model) on your Intel ***laptop*** using INT4 with very low latency (for any Hugging Face *Transformers* model). +**`bigdl-llm`** is a library for running ***LLM*** (large language model) on your Intel ***laptop*** using INT4 with very low latency[^1] (for any Hugging Face *Transformers* model). 
-*(It is built on top of the excellent work of [llama.cpp](https://github.com/ggerganov/llama.cpp), [gptq](https://github.com/IST-DASLab/gptq), [ggml](https://github.com/ggerganov/ggml), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [gptq_for_llama](https://github.com/qwopqwop200/GPTQ-for-LLaMa), [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), [redpajama.cpp](https://github.com/togethercomputer/redpajama.cpp), [gptneox.cpp](https://github.com/byroneverson/gptneox.cpp), [bloomz.cpp](https://github.com/NouamaneTazi/bloomz.cpp/), etc.)* +>*(It is built on top of the excellent work of [llama.cpp](https://github.com/ggerganov/llama.cpp), [gptq](https://github.com/IST-DASLab/gptq), [ggml](https://github.com/ggerganov/ggml), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [gptq_for_llama](https://github.com/qwopqwop200/GPTQ-for-LLaMa), [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), [chatglm.cpp](https://github.com/li-plus/chatglm.cpp), [redpajama.cpp](https://github.com/togethercomputer/redpajama.cpp), [gptneox.cpp](https://github.com/byroneverson/gptneox.cpp), [bloomz.cpp](https://github.com/NouamaneTazi/bloomz.cpp/), etc.)* ### Demos -See the ***optimized performance*** of `phoenix-inst-chat-7b`, `vicuna-13b-v1.1`, and `starcoder-15b` models on a 12th Gen Intel Core CPU below. +See the ***optimized performance*** of `chatglm2-6b`, `vicuna-13b-v1.1`, and `starcoder-15b` models on a 12th Gen Intel Core CPU below.

- - + +

### Verified models @@ -22,6 +22,7 @@ We may use any Hugging Face Transfomer models on `bigdl-llm`, and the following | Falcon | [link](example/transformers/transformers_int4/falcon) | | ChatGLM | [link](example/transformers/transformers_int4/chatglm) | | ChatGLM2 | [link](example/transformers/transformers_int4/chatglm2) | +| Qwen | [link](example/transformers/transformers_int4/qwen) | | MOSS | [link](example/transformers/transformers_int4/moss) | | Baichuan | [link](example/transformers/transformers_int4/baichuan) | | Dolly-v1 | [link](example/transformers/transformers_int4/dolly_v1) | @@ -31,7 +32,6 @@ We may use any Hugging Face Transfomer models on `bigdl-llm`, and the following | StarCoder | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/starcoder) | | InternLM | [link](example/transformers/transformers_int4/internlm) | | Whisper | [link](example/transformers/transformers_int4/whisper) | -| Qwen | [link](example/transformers/transformers_int4/qwen) | ### Working with `bigdl-llm` @@ -44,6 +44,7 @@ We may use any Hugging Face Transfomer models on `bigdl-llm`, and the following - [Hugging Face `transformers` API](#hugging-face-transformers-api) - [LangChain API](#langchain-api) - [CLI Tool](#cli-tool) +- [`bigdl-llm` API Doc](#bigdl-llm-api-doc) - [`bigdl-llm` Dependence](#bigdl-llm-dependence) @@ -237,6 +238,11 @@ You may run the models using the LangChain API in `bigdl-llm`. llm-chat -m "/path/to/output/model.bin" -x llama ``` +### `bigdl-llm` API Doc +See the initial `bigdl-llm` API Doc [here](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/LLM/index.html). + +[^1]: Performance varies by use, configuration and other factors. `bigdl-llm` may not optimize to the same degree for non-Intel products. Learn more at www.Intel.com/PerformanceIndex. 
+ ### `bigdl-llm` Dependence The native code/lib in `bigdl-llm` has been built using the following tools; in particular, lower `LIBC` version on your Linux system may be incompatible with `bigdl-llm`.
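The READMEs updated by this patch center on running models "using INT4". For readers unfamiliar with the idea, the following is a toy sketch of symmetric 4-bit weight quantization in plain Python. It illustrates the concept only; the actual `bigdl-llm`/ggml implementation uses blocked quantization (per-block scales over groups of weights) and optimized native kernels, and all names below are illustrative.

```python
# Toy sketch of symmetric INT4 quantization (illustration only; NOT the
# actual bigdl-llm implementation, which uses blocked ggml-style formats).

def quantize_int4(weights):
    """Map floats to signed 4-bit integers in [-8, 7] with one shared scale."""
    scale = max(abs(w) for w in weights) / 7.0 or 1.0  # avoid zero scale
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from the 4-bit codes."""
    return [v * scale for v in q]

weights = [0.12, -0.50, 0.33, 0.07, -0.21]
q, scale = quantize_int4(weights)
restored = dequantize_int4(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

Each weight is stored in 4 bits plus a shared scale, roughly a 4x-8x memory reduction versus FP16/FP32; the reconstruction error stays below one quantization step (the scale).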