[LLM] Small fixes to the Whisper transformers INT4 example (#8573)

* Small fixes to the whisper example

* Small fix

* Small fix
Yuwen Hu 2023-07-20 10:11:33 +08:00 committed by GitHub
parent 7a9fdf74df
commit cad78740a7
4 changed files with 13 additions and 11 deletions


@ -29,6 +29,7 @@ We may use any Hugging Face Transfomer models on `bigdl-llm`, and the following
 | Phoenix | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/phoenix) |
 | StarCoder | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/starcoder) |
 | InternLM | [link](example/transformers/transformers_int4/internlm) |
+| Whisper | [link](example/transformers/transformers_int4/whisper) |
 ### Working with `bigdl-llm`


@ -17,6 +17,7 @@ You can use BigDL-LLM to run any Huggingface Transformer models with INT4 optimi
 | Phoenix | [link](phoenix) |
 | StarCoder | [link](starcoder) |
 | InternLM | [link](internlm) |
+| Whisper | [link](whisper) |
 ## Recommended Requirements
 To run the examples, we recommend using Intel® Xeon® processors (server), or >= 12th Gen Intel® Core™ processor (client).


@ -5,8 +5,8 @@ In this directory, you will find examples on how you could apply BigDL-LLM INT4
 ## 0. Requirements
 To run these examples with BigDL-LLM, we have some recommended requirements for your machine, please refer to [here](../README.md#recommended-requirements) for more information.
-## Example: Predict Tokens using `generate()` API
+## Example: Recognize Tokens using `generate()` API
-In the example [generate.py](./generate.py), we show a basic use case for a Whisper model to predict the next N tokens using `generate()` API, with BigDL-LLM INT4 optimizations.
+In the example [generate.py](./generate.py), we show a basic use case for a Whisper model to conduct transcription using `generate()` API, with BigDL-LLM INT4 optimizations.
 ### 1. Install
 We suggest using conda to manage environment:
 ```bash
@ -18,7 +18,7 @@ pip install bigdl-llm[all] # install bigdl-llm with 'all' option
 ### 2. Run
 ```
-python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
+python ./recognize.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
 ```
 Arguments info:
@ -33,7 +33,7 @@ Arguments info:
 #### 2.1 Client
 On client Windows machine, it is recommended to run directly with full utilization of all cores:
 ```powershell
-python ./generate.py
+python ./recognize.py
 ```
 #### 2.2 Server
@ -53,7 +53,7 @@ numactl -C 0-47 -m 0 python ./generate.py
 #### [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
 ```log
-Inference time: 0.23290777206420898 s
+Inference time: xxxx s
 -------------------- Output --------------------
 [" Mr. Quilter is the Apostle of the Middle classes and we're glad to welcome his Gospel."]
 ```
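For orientation while reading the source diff that follows: the example this README documents reduces to loading a Hugging Face Whisper checkpoint through BigDL-LLM with 4-bit weights, building a processor, and calling `generate()` on extracted audio features. The sketch below is assembled from the calls visible in this commit plus standard Hugging Face Whisper usage; the `AutoModelForSpeechSeq2Seq` import is an assumption, as the import line is not shown in these hunks.

```python
# Minimal end-to-end sketch of the example described above, not the exact file.
# Assumption: bigdl.llm.transformers provides AutoModelForSpeechSeq2Seq
# accepting load_in_4bit=True, mirroring the load_in_4bit call in this commit.
from bigdl.llm.transformers import AutoModelForSpeechSeq2Seq
from transformers import WhisperProcessor
from datasets import load_dataset

# Load the Whisper model with BigDL-LLM INT4 optimizations
model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny",
                                                  load_in_4bit=True)
model.config.forced_decoder_ids = None

# Load the processor and build the language/task decoder prompt
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
forced_decoder_ids = processor.get_decoder_prompt_ids(language="english",
                                                      task="transcribe")

# Load the dummy LibriSpeech split and extract features from one clip
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy",
                  "clean", split="validation")
sample = ds[0]["audio"]
input_features = processor(sample["array"],
                           sampling_rate=sample["sampling_rate"],
                           return_tensors="pt").input_features

# Generate and decode the transcription
predicted_ids = model.generate(input_features,
                               forced_decoder_ids=forced_decoder_ids)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True))
```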


@ -23,9 +23,9 @@ from transformers import WhisperProcessor
 from datasets import load_dataset
 
 if __name__ == '__main__':
-    parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Whisper model')
+    parser = argparse.ArgumentParser(description='Recognize Tokens using `generate()` API for Whisper model')
     parser.add_argument('--repo-id-or-model-path', type=str, default="openai/whisper-tiny",
-                        help='The huggingface repo id for the whisper model to be downloaded'
+                        help='The huggingface repo id for the Whisper model to be downloaded'
                              ', or the path to the huggingface checkpoint folder')
     parser.add_argument('--repo-id-or-data-path', type=str,
                         default="hf-internal-testing/librispeech_asr_dummy",
@ -45,11 +45,11 @@ if __name__ == '__main__':
                                                   load_in_4bit=True)
     model.config.forced_decoder_ids = None
 
-    # Load tokenizer
+    # Load processor
     processor = WhisperProcessor.from_pretrained(model_path)
     forced_decoder_ids = processor.get_decoder_prompt_ids(language=language, task="transcribe")
 
-    # load dummy dataset and read audio files
+    # Load dummy dataset and read audio files
     ds = load_dataset(dataset_path, "clean", split="validation")
 
     # Generate predicted tokens
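The hunk above ends at the `# Generate predicted tokens` comment. For orientation, the typical Hugging Face Whisper continuation after such a comment looks like the sketch below; the names `ds`, `processor`, `model`, and `forced_decoder_ids` are taken from the hunk itself, while the rest is assumed rather than quoted from the file.

```python
# Sketch of the step following "# Generate predicted tokens" above; this
# continues from the variables defined in the diffed script, not quoted from it.
sample = ds[0]["audio"]  # first clip of the dummy LibriSpeech split
input_features = processor(sample["array"],
                           sampling_rate=sample["sampling_rate"],
                           return_tensors="pt").input_features

# forced_decoder_ids pins the language/task prompt built earlier in the script
predicted_ids = model.generate(input_features,
                               forced_decoder_ids=forced_decoder_ids)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True))
```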