update vllm_online_benchmark script to support long input (#12095)
* update vllm_online_benchmark script to support long input
* update guide
parent 9650bf616a
commit 1295898830

2 changed files with 187 additions and 5 deletions

@@ -85,8 +85,9 @@ We can benchmark the api_server to get an estimation about TPS (transactions per
 
 After starting vllm service, Sending reqs through `vllm_online_benchmark.py`
 ```bash
-python vllm_online_benchmark.py $model_name $max_seqs
+python vllm_online_benchmark.py $model_name $max_seqs $input_length $output_length
 ```
+If `input_length` and `output_length` are not provided, the script will use the default values of 1024 and 512, respectively.
 
 And it will output like this:
 ```bash

File diff suppressed because one or more lines are too long
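
The script's own diff is not shown, so how it reads the two new positional arguments is not visible here. As a minimal sketch only, the behaviour described in the guide (defaults of 1024 and 512 when `input_length` and `output_length` are omitted) could look like the following. The helper name `parse_benchmark_args`, the integer conversion of `max_seqs`, and the choice to default each length independently are assumptions for illustration, not code taken from `vllm_online_benchmark.py`.

```python
# Default lengths quoted in the updated guide.
DEFAULT_INPUT_LENGTH = 1024
DEFAULT_OUTPUT_LENGTH = 512


def parse_benchmark_args(argv):
    """Hypothetical helper: read model_name, max_seqs and the optional lengths.

    `argv` is the argument list without the script name, e.g. sys.argv[1:].
    """
    if len(argv) < 2:
        raise ValueError(
            "usage: vllm_online_benchmark.py model_name max_seqs "
            "[input_length] [output_length]"
        )
    model_name = argv[0]
    max_seqs = int(argv[1])  # assumption: max_seqs is an integer count
    # Assumption: each length falls back to its default independently.
    input_length = int(argv[2]) if len(argv) > 2 else DEFAULT_INPUT_LENGTH
    output_length = int(argv[3]) if len(argv) > 3 else DEFAULT_OUTPUT_LENGTH
    return model_name, max_seqs, input_length, output_length
```

Under these assumptions, calling the helper with only a model name and `max_seqs` yields the guide's defaults, while passing all four values overrides both lengths.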
											
										
									
								
							