Ruonan Wang
|
4f34557224
|
LLM: support num_beams in all-in-one benchmark (#9141)
* support num_beams
* fix
|
2023-10-12 13:35:12 +08:00 |
|
Ruonan Wang
|
ad7d9231f5
|
LLM: add benchmark script for Max gpu and ipex fp16 gpu (#9112)
* add pvc bash
* meet code review
* rename to run-max-gpu.sh
|
2023-10-10 10:18:41 +08:00 |
|
Kai Huang
|
78ea7ddb1c
|
Combine apply_rotary_pos_emb for gpt-neox (#9074)
|
2023-10-07 16:27:46 +08:00 |
|
Cengguang Zhang
|
ad62c58b33
|
LLM: Enable jemalloc in benchmark scripts. (#9058)
* enable jemalloc.
* fix readme.
|
2023-09-26 15:37:49 +08:00 |
|
Kai Huang
|
6981745fe4
|
Optimize kv_cache for gpt-neox model family (#9015)
* override gptneox
* style
* move to utils
* revert
|
2023-09-20 19:59:19 +08:00 |
|
Cengguang Zhang
|
8299b68fea
|
update readme. (#8996)
|
2023-09-18 17:06:15 +08:00 |
|
Cengguang Zhang
|
cca84b0a64
|
LLM: update llm benchmark scripts. (#8943)
* update llm benchmark scripts.
* change tranformer_bf16 to pytorch_autocast_bf16.
* add autocast in transformer int4.
* revert autocast.
* add "pytorch_autocast_bf16" to doc
* fix comments.
|
2023-09-13 12:23:28 +08:00 |
|
Cengguang Zhang
|
3d2efe9608
|
LLM: update llm latency benchmark. (#8922)
|
2023-09-07 19:00:19 +08:00 |
|
binbin Deng
|
7897eb4b51
|
LLM: add benchmark scripts on GPU (#8916)
|
2023-09-07 18:08:17 +08:00 |
|
Song Jiaming
|
c06f1ca93e
|
[LLM] auto perf test to output to csv (#8846)
|
2023-09-01 10:48:00 +08:00 |
|