diff --git a/README.md b/README.md
index f9d92ae1..d7e1876f 100644
--- a/README.md
+++ b/README.md
@@ -192,6 +192,7 @@ Over 40 models have been optimized/verified on `bigdl-llm`, including *LLaMA/LLa
 | Ziya-Coding-34B-v1.0 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/ziya) | |
 | Phi-2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phi-2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/phi-2) |
 | Yuan2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/yuan2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/yuan2) |
+| Gemma | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/gemma) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/gemma) |

 ***For more details, please refer to the `bigdl-llm` [Document](https://test-bigdl-llm.readthedocs.io/en/main/doc/LLM/index.html), [Readme](python/llm), [Tutorial](https://github.com/intel-analytics/bigdl-llm-tutorial) and [API Doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/LLM/index.html).***

diff --git a/python/llm/example/CPU/HF-Transformers-AutoModels/Model/gemma/README.md b/python/llm/example/CPU/HF-Transformers-AutoModels/Model/gemma/README.md
index 43d10230..672ac2a4 100644
--- a/python/llm/example/CPU/HF-Transformers-AutoModels/Model/gemma/README.md
+++ b/python/llm/example/CPU/HF-Transformers-AutoModels/Model/gemma/README.md
@@ -4,7 +4,7 @@ In this directory, you will find examples on how you could apply BigDL-LLM INT4
 ## Requirements
 To run these examples with BigDL-LLM on Intel GPUs, we have some recommended requirements for your machine, please refer to [here](../../../README.md#requirements) for more information.

-**Important: According to Gemma's requirement, please make sure you have installed `transformers==4.38.0` to run the example.**
+**Important: According to Gemma's requirement, please make sure you have installed `transformers==4.38.1` to run the example.**

 ## Example: Predict Tokens using `generate()` API
 In the example [generate.py](./generate.py), we show a basic use case for a Gemma model to predict the next N tokens using `generate()` API, with BigDL-LLM INT4 optimizations on Intel GPUs.
@@ -20,8 +20,8 @@ conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu

-# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.0 or newer.
-pip install transformers==4.38.0
+# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.1 or newer.
+pip install transformers==4.38.1
 ```

 #### 1.2 Installation on Windows
@@ -32,8 +32,8 @@ conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu

-# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.0 or newer.
-pip install transformers==4.38.0
+# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.1 or newer.
+pip install transformers==4.38.1
 ```

 ### 2. Configures OneAPI environment variables
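As a quick sanity check that the bumped `transformers==4.38.1` pin works with the flow this example README documents, a minimal sketch of the INT4 `generate()` use case might look like the following. This is a hypothetical illustration, not the repository's `generate.py`: the checkpoint name, prompt, and `max_new_tokens` value are placeholders.

```python
# Hypothetical sketch of the BigDL-LLM INT4 generate() flow (not the
# repository's generate.py). Checkpoint, prompt, and token count are
# illustrative placeholders.
import torch
from transformers import AutoTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM  # BigDL-LLM drop-in class

model_path = "google/gemma-7b-it"  # assumed checkpoint; any Gemma model path works

# load_in_4bit=True converts the weights to INT4 as the model loads
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

input_ids = tokenizer("What is AI?", return_tensors="pt").input_ids
with torch.inference_mode():
    output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Here `load_in_4bit=True` is what applies BigDL-LLM's INT4 optimization at load time, which is the behavior these example READMEs document.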
diff --git a/python/llm/example/GPU/HF-Transformers-AutoModels/Model/gemma/README.md b/python/llm/example/GPU/HF-Transformers-AutoModels/Model/gemma/README.md
index 43d10230..672ac2a4 100644
--- a/python/llm/example/GPU/HF-Transformers-AutoModels/Model/gemma/README.md
+++ b/python/llm/example/GPU/HF-Transformers-AutoModels/Model/gemma/README.md
@@ -4,7 +4,7 @@ In this directory, you will find examples on how you could apply BigDL-LLM INT4
 ## Requirements
 To run these examples with BigDL-LLM on Intel GPUs, we have some recommended requirements for your machine, please refer to [here](../../../README.md#requirements) for more information.

-**Important: According to Gemma's requirement, please make sure you have installed `transformers==4.38.0` to run the example.**
+**Important: According to Gemma's requirement, please make sure you have installed `transformers==4.38.1` to run the example.**

 ## Example: Predict Tokens using `generate()` API
 In the example [generate.py](./generate.py), we show a basic use case for a Gemma model to predict the next N tokens using `generate()` API, with BigDL-LLM INT4 optimizations on Intel GPUs.
@@ -20,8 +20,8 @@ conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu

-# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.0 or newer.
-pip install transformers==4.38.0
+# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.1 or newer.
+pip install transformers==4.38.1
 ```

 #### 1.2 Installation on Windows
@@ -32,8 +32,8 @@ conda activate llm
 # below command will install intel_extension_for_pytorch==2.1.10+xpu as default
 pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu

-# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.0 or newer.
-pip install transformers==4.38.0
+# According to Gemma's requirement, please make sure you are using a stable version of Transformers, 4.38.1 or newer.
+pip install transformers==4.38.1
 ```

 ### 2. Configures OneAPI environment variables
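The GPU variant of the same flow differs only in moving the INT4 model and its inputs to Intel GPU through the `xpu` device. Again, this is a hedged sketch under the same assumptions (placeholder checkpoint and prompt), not the example's actual code; it presumes the `intel_extension_for_pytorch==2.1.10+xpu` stack installed by the steps in the diff above.

```python
# Hypothetical XPU sketch (not the repository's generate.py). Assumes the
# intel_extension_for_pytorch==2.1.10+xpu stack from the install steps above.
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the 'xpu' device
from transformers import AutoTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM

model_path = "google/gemma-7b-it"  # assumed checkpoint

model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to("xpu")  # run the INT4 weights on the Intel GPU
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

input_ids = tokenizer("What is AI?", return_tensors="pt").input_ids.to("xpu")
with torch.inference_mode():
    output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0].cpu(), skip_special_tokens=True))
```

Moving the generated tokens back with `.cpu()` before decoding keeps the tokenizer work on the host.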