8 commits

Author SHA1 Message Date
Jun Wang
3700e81977
[fix] vllm-online-benchmark first token latency error (#12271)
2024-10-29 17:54:36 +08:00
Shaojun Liu
657889e3e4
use english prompt by default (#12115)
2024-09-24 17:40:50 +08:00
Shaojun Liu
1295898830
update vllm_online_benchmark script to support long input (#12095)
* update vllm_online_benchmark script to support long input

* update guide
2024-09-20 14:18:30 +08:00
Shaojun Liu
7e1e51d91a
Update vllm setting (#12059)
* revert

* update

* update

* update
2024-09-11 11:45:08 +08:00
Shaojun Liu
52863dd567
fix vllm_online_benchmark.py (#12056)
2024-09-11 09:45:30 +08:00
Shaojun Liu
1e8c87050f
fix model path (#11973)
2024-08-30 13:28:28 +08:00
Wang, Jian4
b119825152
Remove tgi parameter validation (#11688)
* remove validation

* add min warm up

* remove no need source
2024-07-30 16:37:44 +08:00
Wang, Jian4
b7bc1023fb
Add vllm_online_benchmark.py (#11458)
* init

* update and add

* update
2024-06-28 14:59:06 +08:00