WeiguangHan
fd81d66047
LLM: Compress some models to save space ( #10315 )
...
* LLM: compress some models to save space
* add deleted comments
2024-03-04 17:53:03 +08:00
WeiguangHan
9724939499
temporarily disable bloom 2k input ( #10056 )
2024-01-31 17:49:12 +08:00
Yuwen Hu
1eaaace2dc
Update perf test all-in-one config for batch_size arg ( #10012 )
2024-01-26 16:46:36 +08:00
WeiguangHan
100e0a87e5
LLM: add compressed chatglm3 model ( #9892 )
...
* LLM: add compressed chatglm3 model
* small fix
* revert github action
2024-01-18 17:48:15 +08:00
Kai Huang
4d01069302
Temp remove baichuan2-13b 1k from arc perf test ( #9810 )
2023-12-29 12:54:13 +08:00
Kai Huang
40eaf76ae3
Add baichuan2-13b to Arc perf ( #9794 )
...
* add baichuan2-13b
* fix indent
* revert
2023-12-27 19:38:53 +08:00
WeiguangHan
c05d7e1532
LLM: add star_corder_15.5b model ( #9772 )
...
* LLM: add star_corder_15.5b model
* revert llm_performance_tests.yml
2023-12-26 18:55:56 +08:00
WeiguangHan
d4d2ccdd9d
LLM: remove startcorder-15.5b ( #9748 )
2023-12-21 18:52:52 +08:00
WeiguangHan
474c099559
LLM: using separate threads to do inference ( #9727 )
...
* using separate threads to do inference
* resolve some comments
* resolve some comments
* revert llm_performance_tests.yml file
2023-12-21 17:56:43 +08:00
WeiguangHan
3aa8b66bc3
LLM: remove starcoder-15.5b model temporarily ( #9720 )
2023-12-19 20:14:46 +08:00
Kai Huang
4c112ee70c
Rename qwen in model name for arc perf test ( #9712 )
2023-12-18 20:34:31 +08:00
Mingyu Wei
16febc949c
[LLM] Add exclude option in all-in-one performance test ( #9632 )
...
* add exclude option in all-in-one perf test
* update arc-perf-test.yaml
* Exclude in_out_pairs in main function
* fix some bugs
* address Kai's comments
* define excludes at the beginning
* add bloomz:2048 to exclude
2023-12-13 18:13:06 +08:00
Yuwen Hu
1012507a40
[LLM] Fix performance tests ( #9596 )
...
* Fix missing key for cpu_embedding
* Remove 512 as it stuck for now
* Small fix
2023-12-05 10:59:28 +08:00
WeiguangHan
5098bc3544
LLM: enable previous models ( #9505 )
...
* enable previous models
* test mistral model
* for test
* run models separately
* test all models
* for test
* revert the llm_performance_test.yaml
2023-11-28 10:21:07 +08:00
WeiguangHan
0d55bbd9f1
LLM: ajust the order of some models ( #9470 )
2023-11-15 17:04:59 +08:00
WeiguangHan
d109275333
temporarily disable the test of some models ( #9434 )
2023-11-13 18:50:53 +08:00
WeiguangHan
34449cb4bb
LLM: add remaining models to the arc perf test ( #9384 )
...
* add remaining models
* modify the filepath which stores the test result on ftp server
* resolve some comments
2023-11-09 14:28:42 +08:00
WeiguangHan
84ab614aab
LLM: add more models and skip runtime error ( #9349 )
...
* add more models and skip runtime error
* upgrade transformers
* temporarily removed Mistral-7B-v0.1
* temporarily disable the upload of arc perf result
2023-11-08 09:45:53 +08:00
WeiguangHan
9722e811be
LLM: add more models to the arc perf test ( #9297 )
...
* LLM: add more models to the arc perf test
* remove some old models
* install some dependencies
2023-11-01 16:56:32 +08:00
binbin Deng
f597a9d4f5
LLM: update perf test configuration ( #9264 )
2023-10-25 12:35:48 +08:00
WeiguangHan
f87f67ee1c
LLM: arc perf test for some popular models ( #9188 )
2023-10-19 15:56:15 +08:00