Jun Wang · 3700e81977 · 2024-10-29 17:54:36 +08:00
    [fix] vllm-online-benchmark first token latency error (#12271)

Shaojun Liu · 657889e3e4 · 2024-09-24 17:40:50 +08:00
    use english prompt by default (#12115)

Shaojun Liu · 1295898830 · 2024-09-20 14:18:30 +08:00
    update vllm_online_benchmark script to support long input (#12095)
    * update vllm_online_benchmark script to support long input
    * update guide

Shaojun Liu · 7e1e51d91a · 2024-09-11 11:45:08 +08:00
    Update vllm setting (#12059)
    * revert
    * update
    * update
    * update

Shaojun Liu · 52863dd567 · 2024-09-11 09:45:30 +08:00
    fix vllm_online_benchmark.py (#12056)

Shaojun Liu · 1e8c87050f · 2024-08-30 13:28:28 +08:00
    fix model path (#11973)

Wang, Jian4 · b119825152 · 2024-07-30 16:37:44 +08:00
    Remove tgi parameter validation (#11688)
    * remove validation
    * add min warm up
    * remove no need source

Wang, Jian4 · b7bc1023fb · 2024-06-28 14:59:06 +08:00
    Add vllm_online_benchmark.py (#11458)
    * init
    * update and add
    * update