From 37f509bb95c729e4ccabbeb5409b2745dc1fe16c Mon Sep 17 00:00:00 2001
From: Jason Dai
Date: Thu, 14 Dec 2023 19:50:21 +0800
Subject: [PATCH] Update readme (#9692)

---
 README.md                                                  | 7 +++++--
 docs/readthedocs/source/index.rst                          | 3 ++-
 .../example/GPU/HF-Transformers-AutoModels/Model/README.md | 4 ++--
 python/llm/example/GPU/PyTorch-Models/Model/README.md      | 4 ++--
 python/llm/example/GPU/README.md                           | 2 +-
 5 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 7a43ba0d..a26b6080 100644
--- a/README.md
+++ b/README.md
@@ -11,8 +11,9 @@
 > *It is built on the excellent work of [llama.cpp](https://github.com/ggerganov/llama.cpp), [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), [qlora](https://github.com/artidoro/qlora), [gptq](https://github.com/IST-DASLab/gptq), [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), [awq](https://github.com/mit-han-lab/llm-awq), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), [vLLM](https://github.com/vllm-project/vllm), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [gptq_for_llama](https://github.com/qwopqwop200/GPTQ-for-LLaMa), [chatglm.cpp](https://github.com/li-plus/chatglm.cpp), [redpajama.cpp](https://github.com/togethercomputer/redpajama.cpp), [gptneox.cpp](https://github.com/byroneverson/gptneox.cpp), [bloomz.cpp](https://github.com/NouamaneTazi/bloomz.cpp/), etc.*
 
-### Latest update
-- [2023/12]`bigdl-llm` now supports [QA-LoRA](python/llm/example/GPU/QLoRA-FineTuning/alpaca-qlora#qa-lora) (see *["QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"](https://arxiv.org/abs/2309.14717)*)
+### Latest update :fire:
+- [2023/12] `bigdl-llm` now supports [Mixtral-8x7B](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) on both Intel [GPU](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) and [CPU](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mixtral).
+- [2023/12] `bigdl-llm` now supports [QA-LoRA](python/llm/example/GPU/QLoRA-FineTuning/alpaca-qlora#qa-lora) (see *["QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"](https://arxiv.org/abs/2309.14717)*)
 - [2023/12] `bigdl-llm` now supports [FP8 and FP4 inference](python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types) on Intel ***GPU***.
 - [2023/11] Initial support for directly loading [GGUF](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF), [AWQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/AWQ) and [GPTQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GPTQ) models in to `bigdl-llm` is available.
 - [2023/11] Initial support for [vLLM continuous batching](python/llm/example/CPU/vLLM-Serving) is availabe on Intel ***CPU***.
 
@@ -95,6 +96,8 @@ pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-w
 ```
 > Note: `bigdl-llm` has been tested on Python 3.9
 
+See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.
+
 ##### Run Model
 
 You may apply INT4 optimizations to any Hugging Face *Transformers* models as follows.
diff --git a/docs/readthedocs/source/index.rst b/docs/readthedocs/source/index.rst
index 8ebdfbae..4a696bfc 100644
--- a/docs/readthedocs/source/index.rst
+++ b/docs/readthedocs/source/index.rst
@@ -22,8 +22,9 @@ BigDL-LLM: low-Bit LLM library
 It is built on top of the excellent work of `llama.cpp <https://github.com/ggerganov/llama.cpp>`_, `gptq <https://github.com/IST-DASLab/gptq>`_, `bitsandbytes <https://github.com/TimDettmers/bitsandbytes>`_, `qlora <https://github.com/artidoro/qlora>`_, etc.
 
 ============================================
-Latest update 
+Latest update :fire:
 ============================================
+- [2023/12] ``bigdl-llm`` now supports `Mixtral-8x7B `_ on both Intel `GPU `_ and `CPU `_.
 - [2023/12] ``bigdl-llm`` now supports `QA-LoRA `_ (see `"QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models" <https://arxiv.org/abs/2309.14717>`_).
 - [2023/12] ``bigdl-llm`` now supports `FP8 and FP4 inference `_ on Intel **GPU**.
 - [2023/11] Initial support for directly loading `GGUF `_, `AWQ `_ and `GPTQ `_ models in to ``bigdl-llm`` is available.
diff --git a/python/llm/example/GPU/HF-Transformers-AutoModels/Model/README.md b/python/llm/example/GPU/HF-Transformers-AutoModels/Model/README.md
index b98d03e3..4d35b4aa 100644
--- a/python/llm/example/GPU/HF-Transformers-AutoModels/Model/README.md
+++ b/python/llm/example/GPU/HF-Transformers-AutoModels/Model/README.md
@@ -8,7 +8,7 @@ You can use BigDL-LLM to run almost every Huggingface Transformer models with IN
 - Intel Data Center GPU Max Series
 
 ## Recommended Requirements
-To apply Intel GPU acceleration, there’re several steps for tools installation and environment preparation.
+To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation. See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.
 
 Step 1, only Linux system is supported now, Ubuntu 22.04 is prefered.
 
@@ -16,7 +16,7 @@ Step 2, please refer to our [driver installation](https://dgpu-docs.intel.com/dr
 > **Note**: IPEX 2.0.110+xpu requires Intel GPU Driver version is [Stable 647.21](https://dgpu-docs.intel.com/releases/stable_647_21_20230714.html).
 
 Step 3, you also need to download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html). OneMKL and DPC++ compiler are needed, others are optional.
-> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version >= 2023.2.0.
+> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version == 2023.2.0.
 
 ## Best Known Configuration on Linux
 For better performance, it is recommended to set environment variables on Linux:
diff --git a/python/llm/example/GPU/PyTorch-Models/Model/README.md b/python/llm/example/GPU/PyTorch-Models/Model/README.md
index 75b7f668..31825764 100644
--- a/python/llm/example/GPU/PyTorch-Models/Model/README.md
+++ b/python/llm/example/GPU/PyTorch-Models/Model/README.md
@@ -8,7 +8,7 @@ You can use `optimize_model` API to accelerate general PyTorch models on Intel G
 - Intel Data Center GPU Max Series
 
 ## Recommended Requirements
-To apply Intel GPU acceleration, there’re several steps for tools installation and environment preparation.
+To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation. See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.
 
 Step 1, only Linux system is supported now, Ubuntu 22.04 is prefered.
 
@@ -16,7 +16,7 @@ Step 2, please refer to our [driver installation](https://dgpu-docs.intel.com/dr
 > **Note**: IPEX 2.0.110+xpu requires Intel GPU Driver version is [Stable 647.21](https://dgpu-docs.intel.com/releases/stable_647_21_20230714.html).
 
 Step 3, you also need to download and install [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html). OneMKL and DPC++ compiler are needed, others are optional.
-> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version >= 2023.2.0.
+> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version == 2023.2.0.
 
 ## Best Known Configuration on Linux
 For better performance, it is recommended to set environment variables on Linux:
diff --git a/python/llm/example/GPU/README.md b/python/llm/example/GPU/README.md
index 38b4f207..7ff7359b 100644
--- a/python/llm/example/GPU/README.md
+++ b/python/llm/example/GPU/README.md
@@ -19,7 +19,7 @@ This folder contains examples of running BigDL-LLM on Intel GPU:
 - Ubuntu 20.04 or later (Ubuntu 22.04 is preferred)
 
 ## Requirements
-To apply Intel GPU acceleration, there’re several steps for tools installation and environment preparation.
+To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation. See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.
 
 Step 1, please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
 > **Note**: IPEX 2.0.110+xpu requires Intel GPU Driver version is [Stable 647.21](https://dgpu-docs.intel.com/releases/stable_647_21_20230714.html).
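
For reviewers, the "Run Model" step that the README hunk above links to can be sketched as follows. This is a minimal sketch, not the repository's exact example: it assumes `bigdl-llm[xpu]` (plus the oneAPI runtime) is installed and an Intel XPU device is present, and the model path and prompt template are illustrative placeholders.

```python
# Minimal sketch of applying bigdl-llm's INT4 optimization to a Hugging Face
# Transformers model on an Intel GPU. bigdl-llm mirrors the transformers
# AutoModel API; load_in_4bit=True applies the INT4 optimization at load time.

def build_prompt(question: str) -> str:
    """Wrap a user question in a minimal single-turn prompt (placeholder template)."""
    return f"Q: {question}\nA:"

def run_int4_on_xpu(model_path: str, question: str, max_new_tokens: int = 32) -> str:
    """Load a model with INT4 optimization and generate on an Intel GPU.

    Requires `bigdl-llm[xpu]` and an available XPU device; `model_path` is an
    illustrative argument (any Hugging Face model id or local path).
    """
    from bigdl.llm.transformers import AutoModelForCausalLM  # drop-in for transformers
    from transformers import AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
    model = model.to("xpu")  # move the INT4-optimized model to the Intel GPU

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to("xpu")
    output = model.generate(inputs.input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    # On a machine with an Intel GPU, e.g.:
    #   run_int4_on_xpu("meta-llama/Llama-2-7b-chat-hf", "What is BigDL-LLM?")
    print(build_prompt("What is BigDL-LLM?"))
```

The design point is that `load_in_4bit=True` on the drop-in `AutoModelForCausalLM` does the quantization, so everything after loading is ordinary `transformers` code.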