107 commits

Author SHA1 Message Date
Xin Qiu
49a39452c6 update benchmark (#8899) 2023-09-06 15:11:43 +08:00
Song Jiaming
7b3ac66e17 [LLM] auto performance test fix specific settings to template (#8876) 2023-09-01 15:49:04 +08:00
Song Jiaming
c06f1ca93e [LLM] auto perf test to output to csv (#8846) 2023-09-01 10:48:00 +08:00
Song Jiaming
b8b1b6888b [LLM] Performance test (#8796) 2023-08-25 14:31:45 +08:00
Ruonan Wang
e9aa2bd890 LLM: reduce GPU 1st token latency and update example (#8763)
* reduce 1st token latency

* update example

* fix

* fix style

* update readme of gpu benchmark
2023-08-16 18:01:23 +08:00
Ruonan Wang
8805186f2f LLM: add benchmark tool for gpu (#8760)
* add benchmark tool for gpu

* update
2023-08-16 11:22:10 +08:00
Ruonan Wang
64b38e1dc8 llm: benchmark tool for transformers int4 (separate 1st token and rest) (#8460)
* add benchmark utils

* fix

* fix bug and add readme

* hidden latency data
2023-07-06 09:49:52 +08:00