Update readme (#9692)
This commit is contained in:
parent 1f0245039d
commit 37f509bb95
5 changed files with 12 additions and 8 deletions
@@ -11,8 +11,9 @@
> *It is built on the excellent work of [llama.cpp](https://github.com/ggerganov/llama.cpp), [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), [qlora](https://github.com/artidoro/qlora), [gptq](https://github.com/IST-DASLab/gptq), [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), [awq](https://github.com/mit-han-lab/llm-awq), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), [vLLM](https://github.com/vllm-project/vllm), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [gptq_for_llama](https://github.com/qwopqwop200/GPTQ-for-LLaMa), [chatglm.cpp](https://github.com/li-plus/chatglm.cpp), [redpajama.cpp](https://github.com/togethercomputer/redpajama.cpp), [gptneox.cpp](https://github.com/byroneverson/gptneox.cpp), [bloomz.cpp](https://github.com/NouamaneTazi/bloomz.cpp/), etc.*
### Latest update
- [2023/12]`bigdl-llm` now supports [QA-LoRA](python/llm/example/GPU/QLoRA-FineTuning/alpaca-qlora#qa-lora) (see *["QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"](https://arxiv.org/abs/2309.14717)*)

### Latest update :fire:
- [2023/12] `bigdl-llm` now supports [Mixtral-8x7B](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) on both Intel [GPU](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) and [CPU](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mixtral).
- [2023/12] `bigdl-llm` now supports [QA-LoRA](python/llm/example/GPU/QLoRA-FineTuning/alpaca-qlora#qa-lora) (see *["QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"](https://arxiv.org/abs/2309.14717)*)
- [2023/12] `bigdl-llm` now supports [FP8 and FP4 inference](python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types) on Intel ***GPU***.
- [2023/11] Initial support for directly loading [GGUF](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF), [AWQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/AWQ) and [GPTQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GPTQ) models into `bigdl-llm` is available.
- [2023/11] Initial support for [vLLM continuous batching](python/llm/example/CPU/vLLM-Serving) is available on Intel ***CPU***.
@@ -95,6 +96,8 @@ pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-w
```
> Note: `bigdl-llm` has been tested on Python 3.9

See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.

##### Run Model
You may apply INT4 optimizations to any Hugging Face *Transformers* model as follows.
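The README's own snippet is truncated by this hunk; as a minimal sketch of the documented pattern (the model path is a placeholder, and `load_in_4bit=True` is the switch that applies the INT4 optimizations):

```python
# bigdl-llm ships a drop-in replacement for transformers.AutoModelForCausalLM;
# load_in_4bit=True converts the weights to INT4 while the model loads.
import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the 'xpu' device
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "path/to/any-hf-transformers-model"  # placeholder

model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
model = model.to("xpu")  # run the INT4 model on the Intel GPU
tokenizer = AutoTokenizer.from_pretrained(model_path)

inputs = tokenizer("Once upon a time", return_tensors="pt").to("xpu")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```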
@@ -22,8 +22,9 @@ BigDL-LLM: low-Bit LLM library
It is built on top of the excellent work of `llama.cpp <https://github.com/ggerganov/llama.cpp>`_, `gptq <https://github.com/IST-DASLab/gptq>`_, `bitsandbytes <https://github.com/TimDettmers/bitsandbytes>`_, `qlora <https://github.com/artidoro/qlora>`_, etc.
============================================
Latest update
============================================
- [2023/12] ``bigdl-llm`` now supports `Mixtral-8x7B <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral>`_ on both Intel `GPU <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral>`_ and `CPU <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mixtral>`_.
- [2023/12] ``bigdl-llm`` now supports `QA-LoRA <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/QLoRA-FineTuning/alpaca-qlora#qa-lora>`_ (see `"QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models" <https://arxiv.org/abs/2309.14717>`_).
- [2023/12] ``bigdl-llm`` now supports `FP8 and FP4 inference <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types>`_ on Intel **GPU**.
- [2023/11] Initial support for directly loading `GGUF <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF>`_, `AWQ <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/AWQ>`_ and `GPTQ <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GPTQ>`_ models into ``bigdl-llm`` is available.
@@ -8,7 +8,7 @@ You can use BigDL-LLM to run almost every Huggingface Transformer models with IN
- Intel Data Center GPU Max Series

## Recommended Requirements
To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation.
To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation. See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.

Step 1, only Linux is supported for now, and Ubuntu 22.04 is preferred.
@@ -16,7 +16,7 @@ Step 2, please refer to our [driver installation](https://dgpu-docs.intel.com/dr
> **Note**: IPEX 2.0.110+xpu requires Intel GPU Driver version [Stable 647.21](https://dgpu-docs.intel.com/releases/stable_647_21_20230714.html).

Step 3, you also need to download and install the [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html). OneMKL and the DPC++ compiler are required; the other components are optional.
> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version >= 2023.2.0.
> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version == 2023.2.0.

## Best Known Configuration on Linux
For better performance, it is recommended to set environment variables on Linux:
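The variable list itself falls outside this hunk; as a sketch of the configuration the guide recommended around this time (the exact variables are an assumption, so defer to the linked installation guide):

```bash
# Make the oneAPI toolchain (DPC++ compiler, oneMKL) visible to the current shell.
source /opt/intel/oneapi/setvars.sh

# SYCL/XeTLA settings commonly recommended for bigdl-llm on Intel GPU.
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
```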
@@ -8,7 +8,7 @@ You can use `optimize_model` API to accelerate general PyTorch models on Intel G
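For context, a minimal sketch of that API, assuming the `bigdl.llm.optimize_model` entry point (the checkpoint path is a placeholder):

```python
from transformers import AutoModelForCausalLM
from bigdl.llm import optimize_model

# Load any PyTorch model as usual, then let bigdl-llm swap its linear layers
# for low-bit (INT4 by default) kernels before moving it to the GPU.
model = AutoModelForCausalLM.from_pretrained("path/to/any-pytorch-llm")  # placeholder
model = optimize_model(model)
model = model.to("xpu")
```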
- Intel Data Center GPU Max Series

## Recommended Requirements
To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation.
To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation. See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.

Step 1, only Linux is supported for now, and Ubuntu 22.04 is preferred.
@@ -16,7 +16,7 @@ Step 2, please refer to our [driver installation](https://dgpu-docs.intel.com/dr
> **Note**: IPEX 2.0.110+xpu requires Intel GPU Driver version [Stable 647.21](https://dgpu-docs.intel.com/releases/stable_647_21_20230714.html).

Step 3, you also need to download and install the [Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html). OneMKL and the DPC++ compiler are required; the other components are optional.
> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version >= 2023.2.0.
> **Note**: IPEX 2.0.110+xpu requires Intel® oneAPI Base Toolkit's version == 2023.2.0.

## Best Known Configuration on Linux
For better performance, it is recommended to set environment variables on Linux:
@@ -19,7 +19,7 @@ This folder contains examples of running BigDL-LLM on Intel GPU:
- Ubuntu 20.04 or later (Ubuntu 22.04 is preferred)

## Requirements
To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation.
To apply Intel GPU acceleration, there are several steps for tools installation and environment preparation. See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.

Step 1, please refer to our [driver installation](https://dgpu-docs.intel.com/driver/installation.html) for general purpose GPU capabilities.
> **Note**: IPEX 2.0.110+xpu requires Intel GPU Driver version [Stable 647.21](https://dgpu-docs.intel.com/releases/stable_647_21_20230714.html).