update vllm_online_benchmark script to support long input (#12095)

* update vllm_online_benchmark script to support long input

* update guide
Shaojun Liu 2024-09-20 14:18:30 +08:00 committed by GitHub
parent 9650bf616a
commit 1295898830
2 changed files with 187 additions and 5 deletions


@@ -85,8 +85,9 @@ We can benchmark the api_server to get an estimation about TPS (transactions per
 After starting vllm service, Sending reqs through `vllm_online_benchmark.py`
 ```bash
-python vllm_online_benchmark.py $model_name $max_seqs
+python vllm_online_benchmark.py $model_name $max_seqs $input_length $output_length
 ```
+If `input_length` and `output_length` are not provided, the script will use the default values of 1024 and 512, respectively.
 And it will output like this:
 ```bash

File diff suppressed because one or more lines are too long
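
The diff of `vllm_online_benchmark.py` is not shown here, so the following is only a minimal sketch of how the optional `input_length` and `output_length` arguments described in the guide could be handled. The function name `parse_benchmark_args` and the constant names are hypothetical and are not taken from the actual script. Only the argument order and the defaults (1024 and 512) come from the guide text.

```python
# Hypothetical sketch: not the actual code from vllm_online_benchmark.py.
# Defaults are the values stated in the updated guide.
DEFAULT_INPUT_LENGTH = 1024
DEFAULT_OUTPUT_LENGTH = 512


def parse_benchmark_args(argv):
    """Parse [model_name, max_seqs, input_length?, output_length?].

    `argv` is the argument list without the script name, e.g. sys.argv[1:].
    """
    if len(argv) < 2:
        raise SystemExit(
            "usage: vllm_online_benchmark.py model_name max_seqs "
            "[input_length] [output_length]"
        )
    model_name = argv[0]
    max_seqs = int(argv[1])
    # Fall back to the documented defaults when the optional values are omitted.
    input_length = int(argv[2]) if len(argv) > 2 else DEFAULT_INPUT_LENGTH
    output_length = int(argv[3]) if len(argv) > 3 else DEFAULT_OUTPUT_LENGTH
    return model_name, max_seqs, input_length, output_length
```

In a script this would be called as `parse_benchmark_args(sys.argv[1:])`. Under these assumptions, passing only a model name and `max_seqs` selects the 1024/512 defaults.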