| Author | Commit | Message | Details | Date |
|---|---|---|---|---|
| Song Jiaming | b8b1b6888b | [LLM] Performance test (#8796) | | 2023-08-25 14:31:45 +08:00 |
| Ruonan Wang | e9aa2bd890 | LLM: reduce GPU 1st token latency and update example (#8763) | reduce 1st token latency; update example; fix; fix style; update readme of gpu benchmark | 2023-08-16 18:01:23 +08:00 |
| Ruonan Wang | 8805186f2f | LLM: add benchmark tool for gpu (#8760) | add benchmark tool for gpu; update | 2023-08-16 11:22:10 +08:00 |
| Ruonan Wang | 64b38e1dc8 | llm: benchmark tool for transformers int4 (separate 1st token and rest) (#8460) | add benchmark utils; fix; fix bug and add readme; hidden latency data | 2023-07-06 09:49:52 +08:00 |