[LLM] Small fixes to the Whisper transformers INT4 example (#8573)
* Small fixes to the whisper example
* Small fix
* Small fix
This commit is contained in: parent 7a9fdf74df, commit cad78740a7
4 changed files with 13 additions and 11 deletions
@@ -29,6 +29,7 @@ We may use any Hugging Face Transfomer models on `bigdl-llm`, and the following
 | Phoenix | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/phoenix) |
 | StarCoder | [link1](example/transformers/native_int4), [link2](example/transformers/transformers_int4/starcoder) |
 | InternLM | [link](example/transformers/transformers_int4/internlm) |
+| Whisper | [link](example/transformers/transformers_int4/whisper) |
 
 ### Working with `bigdl-llm`
 
@@ -17,6 +17,7 @@ You can use BigDL-LLM to run any Huggingface Transformer models with INT4 optimi
 | Phoenix | [link](phoenix) |
 | StarCoder | [link](starcoder) |
 | InternLM | [link](internlm) |
+| Whisper | [link](whisper) |
 
 ## Recommended Requirements
 To run the examples, we recommend using Intel® Xeon® processors (server), or >= 12th Gen Intel® Core™ processor (client).
 
@@ -5,8 +5,8 @@ In this directory, you will find examples on how you could apply BigDL-LLM INT4
 ## 0. Requirements
 To run these examples with BigDL-LLM, we have some recommended requirements for your machine, please refer to [here](../README.md#recommended-requirements) for more information.
 
-## Example: Predict Tokens using `generate()` API
-In the example [generate.py](./generate.py), we show a basic use case for a Whisper model to predict the next N tokens using `generate()` API, with BigDL-LLM INT4 optimizations.
+## Example: Recognize Tokens using `generate()` API
+In the example [recognize.py](./recognize.py), we show a basic use case for a Whisper model to conduct transcription using `generate()` API, with BigDL-LLM INT4 optimizations.
 ### 1. Install
 We suggest using conda to manage environment:
 ```bash
@@ -18,7 +18,7 @@ pip install bigdl-llm[all] # install bigdl-llm with 'all' option
 
 ### 2. Run
 ```
-python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
+python ./recognize.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --repo-id-or-data-path REPO_ID_OR_DATA_PATH --language LANGUAGE
 ```
 
 Arguments info:
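The two hunks above rename the entry point to `recognize.py`. For orientation, here is a minimal sketch of the loading step that script performs, stitched together from lines visible elsewhere in this diff (`load_in_4bit=True`, `model.config.forced_decoder_ids = None`, `WhisperProcessor.from_pretrained(model_path)`, and the `openai/whisper-tiny` default); the `bigdl.llm.transformers.AutoModelForSpeechSeq2Seq` import path is an assumption about bigdl-llm's transformers-style API, not something this diff shows:

```python
# Sketch only: the import path below is an assumption; the calls and flags
# are taken from the hunks shown elsewhere in this diff.
from bigdl.llm.transformers import AutoModelForSpeechSeq2Seq
from transformers import WhisperProcessor

model_path = "openai/whisper-tiny"  # default --repo-id-or-model-path

# Load the Whisper checkpoint with BigDL-LLM INT4 optimizations
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_path, load_in_4bit=True)
model.config.forced_decoder_ids = None  # the task is set via decoder prompt ids instead

# The processor handles both audio feature extraction and token decoding
processor = WhisperProcessor.from_pretrained(model_path)
```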
@@ -33,7 +33,7 @@ Arguments info:
 
 #### 2.1 Client
 On client Windows machine, it is recommended to run directly with full utilization of all cores:
 ```powershell
-python ./generate.py
+python ./recognize.py
 ```
 
 #### 2.2 Server
@@ -53,7 +53,7 @@ numactl -C 0-47 -m 0 python ./generate.py
 #### [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
 
 ```log
-Inference time: 0.23290777206420898 s
+Inference time: xxxx s
 -------------------- Output --------------------
 [" Mr. Quilter is the Apostle of the Middle classes and we're glad to welcome his Gospel."]
 ```
@@ -23,9 +23,9 @@ from transformers import WhisperProcessor
 from datasets import load_dataset
 
 if __name__ == '__main__':
-    parser = argparse.ArgumentParser(description='Predict Tokens using `generate()` API for Whisper model')
+    parser = argparse.ArgumentParser(description='Recognize Tokens using `generate()` API for Whisper model')
     parser.add_argument('--repo-id-or-model-path', type=str, default="openai/whisper-tiny",
-                        help='The huggingface repo id for the whisper model to be downloaded'
+                        help='The huggingface repo id for the Whisper model to be downloaded'
                              ', or the path to the huggingface checkpoint folder')
     parser.add_argument('--repo-id-or-data-path', type=str,
                         default="hf-internal-testing/librispeech_asr_dummy",
@@ -45,11 +45,11 @@ if __name__ == '__main__':
                                                       load_in_4bit=True)
     model.config.forced_decoder_ids = None
 
-    # Load tokenizer
+    # Load processor
     processor = WhisperProcessor.from_pretrained(model_path)
     forced_decoder_ids = processor.get_decoder_prompt_ids(language=language, task="transcribe")
 
-    # load dummy dataset and read audio files
+    # Load dummy dataset and read audio files
     ds = load_dataset(dataset_path, "clean", split="validation")
 
     # Generate predicted tokens
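The hunk stops at the `# Generate predicted tokens` comment, so the generation step itself is elided from this diff. As a hedged illustration of how the script plausibly continues, using only the `model`, `processor`, `forced_decoder_ids`, and `ds` objects defined above plus standard Hugging Face Whisper APIs (a sketch, not the commit's actual code):

```python
# Illustrative continuation (not shown in this diff): transcribe the first
# sample of the dummy dataset with the objects defined above.
sample = ds[0]["audio"]  # datasets' Audio feature exposes "array" and "sampling_rate"
input_features = processor(sample["array"],
                           sampling_rate=sample["sampling_rate"],
                           return_tensors="pt").input_features

# Decode with the transcription task forced via the decoder prompt ids
predicted_ids = model.generate(input_features, forced_decoder_ids=forced_decoder_ids)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription)
# Per the README's sample log, this prints something like:
# [" Mr. Quilter is the Apostle of the Middle classes and we're glad to welcome his Gospel."]
```

Note the design choice visible in the diff: `model.config.forced_decoder_ids` is set to `None`, so the task and language are controlled per call through `get_decoder_prompt_ids(language=language, task="transcribe")` rather than baked into the model config.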