[LLM] Small fixes to the Whisper transformers INT4 example (#8573)

* Small fixes to the whisper example

* Small fix

* Small fix
Yuwen Hu 2023-07-20 10:11:33 +08:00 committed by GitHub
parent 7a9fdf74df
commit cad78740a7
4 changed files with 13 additions and 11 deletions


@@ -28,7 +28,8 @@ We may use any Hugging Face Transformer models on `bigdl-llm`, and the following
 | RedPajama | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/redpajama) |
 | Phoenix | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/phoenix) |
 | StarCoder | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/starcoder) |
-| InternLM | [link](example/transformers/transformers_int4/internlm) |
+| InternLM | [link](example/transformers/transformers_int4/internlm) |
+| Whisper | [link](example/transformers/transformers_int4/whisper) |
 ### Working with `bigdl-llm`
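The table above points at per-model examples, but they all share the same loading pattern that this commit's diff shows for Whisper. A minimal sketch, assuming a causal LM with an illustrative placeholder repo id; the `AutoModelForCausalLM` import and `load_in_4bit=True` flag mirror the bigdl-llm transformers-style API visible in recognize.py below:

```python
# Minimal sketch of the shared INT4 loading pattern (the model id and the
# prompt are illustrative assumptions, not taken from this commit).
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "REPO_ID_OR_MODEL_PATH"  # e.g. one of the models listed above
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True)  # quantize to INT4 at load time
tokenizer = AutoTokenizer.from_pretrained(model_path)

input_ids = tokenizer.encode("Once upon a time", return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```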


@@ -16,7 +16,8 @@ You can use BigDL-LLM to run any Huggingface Transformer models with INT4 optimizations
 | RedPajama | [link](redpajama) |
 | Phoenix | [link](phoenix) |
 | StarCoder | [link](starcoder) |
-| InternLM | [link](internlm) |
+| InternLM | [link](internlm) |
+| Whisper | [link](whisper) |
 ## Recommended Requirements
 To run the examples, we recommend using Intel® Xeon® processors (server), or >= 12th Gen Intel® Core™ processors (client).


@@ -5,8 +5,8 @@ In this directory, you will find examples on how you could apply BigDL-LLM INT4
 ## 0. Requirements
 To run these examples with BigDL-LLM, we have some recommended requirements for your machine; please refer to [here](../README.md#recommended-requirements) for more information.
-## Example: Predict Tokens using `generate()` API
-In the example [generate.py](./generate.py), we show a basic use case for a Whisper model to predict the next N tokens using `generate()` API, with BigDL-LLM INT4 optimizations.
+## Example: Recognize Tokens using `generate()` API
+In the example [recognize.py](./recognize.py), we show a basic use case for a Whisper model to conduct transcription using the `generate()` API, with BigDL-LLM INT4 optimizations.
 ### 1. Install
 We suggest using conda to manage the environment:
 ```bash
@@ -18,7 +18,7 @@ pip install bigdl-llm[all] # install bigdl-llm with 'all' option
 ### 2. Run
 ```
-python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
+python ./recognize.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
 ```
 Arguments info:
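For reference, a concrete invocation with the defaults from recognize.py written out might look like the following; the `--language english` value is an assumption, since that argument's default is not visible in this diff:

```bash
python ./recognize.py --repo-id-or-model-path openai/whisper-tiny \
                      --repo-id-or-data-path hf-internal-testing/librispeech_asr_dummy \
                      --language english  # assumption: default not shown in the diff
```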
@@ -33,7 +33,7 @@ Arguments info:
 #### 2.1 Client
 On a client Windows machine, it is recommended to run directly with full utilization of all cores:
 ```powershell
-python ./generate.py
+python ./recognize.py
 ```
 #### 2.2 Server
@@ -53,7 +53,7 @@ numactl -C 0-47 -m 0 python ./generate.py
 #### [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
 ```log
-Inference time: 0.23290777206420898 s
+Inference time: xxxx s
 -------------------- Output --------------------
 [" Mr. Quilter is the Apostle of the Middle classes and we're glad to welcome his Gospel."]
 ```


@@ -23,9 +23,9 @@ from transformers import WhisperProcessor
 from datasets import load_dataset

 if __name__ == '__main__':
-    parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Whisper model')
+    parser = argparse.ArgumentParser(description='Recognize Tokens using `generate()` API for Whisper model')
     parser.add_argument('--repo-id-or-model-path', type=str, default="openai/whisper-tiny",
-                        help='The huggingface repo id for the whisper model to be downloaded'
+                        help='The huggingface repo id for the Whisper model to be downloaded'
                              ', or the path to the huggingface checkpoint folder')
     parser.add_argument('--repo-id-or-data-path', type=str,
                         default="hf-internal-testing/librispeech_asr_dummy",
@@ -45,11 +45,11 @@ if __name__ == '__main__':
                                                      load_in_4bit=True)
     model.config.forced_decoder_ids = None

-    # Load tokenizer
+    # Load processor
     processor = WhisperProcessor.from_pretrained(model_path)
     forced_decoder_ids = processor.get_decoder_prompt_ids(language=language, task="transcribe")

-    # load dummy dataset and read audio files
+    # Load dummy dataset and read audio files
     ds = load_dataset(dataset_path, "clean", split="validation")

     # Generate predicted tokens
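The hunk stops just short of the generation call. As context only, here is a minimal sketch of how the transcription step after this comment might look, following the standard Hugging Face Whisper flow; the feature-extraction and decoding calls are assumptions about the elided code, not part of this commit:

```python
# Sketch of the elided step (assumption: standard Whisper usage in
# transformers, not the literal code from this commit).
sample = ds[0]["audio"]  # first clip from the dummy LibriSpeech split
input_features = processor(sample["array"],
                           sampling_rate=sample["sampling_rate"],
                           return_tensors="pt").input_features

# Run generation with the INT4-optimized model and decode to text
predicted_ids = model.generate(input_features,
                               forced_decoder_ids=forced_decoder_ids)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription)
```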