update vllm_online_benchmark script to support long input (#12095)
* update vllm_online_benchmark script to support long input
* update guide
parent 9650bf616a
commit 1295898830
2 changed files with 187 additions and 5 deletions
@@ -85,8 +85,9 @@ We can benchmark the api_server to get an estimation about TPS (transactions per
 
 After starting vllm service, Sending reqs through `vllm_online_benchmark.py`
 ```bash
-python vllm_online_benchmark.py $model_name $max_seqs
+python vllm_online_benchmark.py $model_name $max_seqs $input_length $output_length
 ```
+If `input_length` and `output_length` are not provided, the script will use the default values of 1024 and 512, respectively.
 
 And it will output like this:
 ```bash
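The guide change states that `input_length` and `output_length` fall back to 1024 and 512 when omitted. The script's own diff is suppressed on this page, so the following is only a minimal sketch of how such optional positional arguments could be handled. The helper name `parse_args`, the constant names, and the choice to allow `input_length` without `output_length` are assumptions for illustration, not the committed code.

```python
# Illustrative sketch only; the real vllm_online_benchmark.py may differ.
DEFAULT_INPUT_LENGTH = 1024   # default stated in the updated guide
DEFAULT_OUTPUT_LENGTH = 512   # default stated in the updated guide


def parse_args(argv):
    """Hypothetical helper: parse `model_name max_seqs [input_length] [output_length]`.

    `argv` is the argument list without the script name (e.g. sys.argv[1:]).
    """
    if len(argv) < 2 or len(argv) > 4:
        raise SystemExit(
            "usage: vllm_online_benchmark.py model_name max_seqs "
            "[input_length] [output_length]"
        )
    model_name = argv[0]
    max_seqs = int(argv[1])
    # Fall back to the documented defaults when the optional values are absent.
    input_length = int(argv[2]) if len(argv) > 2 else DEFAULT_INPUT_LENGTH
    output_length = int(argv[3]) if len(argv) > 3 else DEFAULT_OUTPUT_LENGTH
    return model_name, max_seqs, input_length, output_length
```

Under these assumptions, calling `parse_args(sys.argv[1:])` with only a model name and `max_seqs` yields the documented defaults, while passing all four values overrides them.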
File diff suppressed because one or more lines are too long