| Author | Commit | Message | Date |
|---|---|---|---|
| Xin Qiu | 13e61738c5 | hide detail memory for each token in benchmark_utils.py (#10037) | 2024-01-30 16:04:17 +08:00 |
| Xin Qiu | 6fb3f40f7e | fix error for benchmark_util.py running on cpu (#9949) | 2024-01-22 10:14:40 +08:00 |
| Xin Qiu | 610b5226be | move reserved memory to benchmark_utils.py (#9907)<br>* move reserved memory to benchmark_utils.py<br>* meet code review | 2024-01-19 09:44:30 +08:00 |
| Ruonan Wang | 1363e666fc | LLM: update benchmark_util.py for beam search (#9126)<br>* update reorder_cache<br>* fix | 2023-10-11 09:41:53 +08:00 |
| Ruonan Wang | 057e77e229 | LLM: update benchmark_utils.py to handle do_sample=True (#8903) | 2023-09-07 14:20:47 +08:00 |
| Xin Qiu | 49a39452c6 | update benchmark (#8899) | 2023-09-06 15:11:43 +08:00 |
| Song Jiaming | b8b1b6888b | [LLM] Performance test (#8796) | 2023-08-25 14:31:45 +08:00 |
| Ruonan Wang | 64b38e1dc8 | llm: benchmark tool for transformers int4 (separate 1st token and rest) (#8460)<br>* add benchmark utils<br>* fix<br>* fix bug and add readme<br>* hidden latency data | 2023-07-06 09:49:52 +08:00 |