Commit graph

29 commits

Author SHA1 Message Date
RyuKosei
2fbd375a94
update several models for nightly perf test (#11643)
Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-07-25 14:06:08 +08:00
Xu, Shuo
64cfed602d
Add new models to benchmark (#11505)
* Add new models to benchmark

* remove Qwen/Qwen-VL-Chat to pass the validation

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-08 10:35:55 +08:00
Xu, Shuo
52519e07df
remove models we no longer need in benchmark. (#11492)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-02 17:20:48 +08:00
hxsz1997
44f22cba70
add config and default value (#11344)
* add config and default value

* add config in yaml

* remove lookahead and max_matching_ngram_size in config

* remove streaming and use_fp16_torch_dtype in test yaml

* update task in readme

* update commit of task
2024-06-18 15:28:57 +08:00
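
The commit above trims speculative-decoding and dtype knobs out of the test config and documents a `task` entry. A hypothetical sketch of the resulting YAML fragment (key names and values are assumed for illustration, not taken from the repository):

```yaml
# Assumed all-in-one test config shape after #11344 (illustrative only)
task: 'continuation'   # task option documented in the README per this commit
# removed by this commit:
#   lookahead, max_matching_ngram_size   (speculative-decoding knobs)
#   streaming, use_fp16_torch_dtype      (dropped from the test yaml)
```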
Wenjing Margaret Mao
bca5cbd96c
Modify arc nightly perf to fp16 (#11275)
* change api

* move to pr mode and remove the build

* add batch4 yaml and remove the bigcode

* remove batch4

* revert the starcoder

* remove the exclude

* revert

Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-17 13:47:22 +08:00
Shaojun Liu
f5ef94046e
exclude dolly-v2-12b for arc perf test (#11315)
* test arc perf

* test

* test

* exclude dolly-v2-12b:2048

* revert changes
2024-06-14 15:35:56 +08:00
Jiao Wang
0a06a6e1d4
Update tests for transformers 4.36 (#10858)
* update unit test

* update

* fix gpu attention test

* update example test

* replace replit code

* set safe_serialization false

* perf test

* delete

* revert
2024-05-24 10:26:38 +08:00
Kai Huang
1315150e64 Add baichuan2-13b 1k to arc nightly perf (#10406) 2024-03-15 10:29:11 +08:00
WeiguangHan
fd81d66047 LLM: Compress some models to save space (#10315)
* LLM: compress some models to save space

* add deleted comments
2024-03-04 17:53:03 +08:00
WeiguangHan
9724939499 temporarily disable bloom 2k input (#10056) 2024-01-31 17:49:12 +08:00
Yuwen Hu
1eaaace2dc Update perf test all-in-one config for batch_size arg (#10012) 2024-01-26 16:46:36 +08:00
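
Commit #10012 threads a `batch_size` argument through the all-in-one perf-test config. A hypothetical config fragment (key name assumed from the commit title, not verified against the repository):

```yaml
batch_size: 2   # assumed key; presumably defaults to 1 when omitted
```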
WeiguangHan
100e0a87e5 LLM: add compressed chatglm3 model (#9892)
* LLM: add compressed chatglm3 model

* small fix

* revert github action
2024-01-18 17:48:15 +08:00
Kai Huang
4d01069302 Temp remove baichuan2-13b 1k from arc perf test (#9810) 2023-12-29 12:54:13 +08:00
Kai Huang
40eaf76ae3 Add baichuan2-13b to Arc perf (#9794)
* add baichuan2-13b

* fix indent

* revert
2023-12-27 19:38:53 +08:00
WeiguangHan
c05d7e1532 LLM: add star_corder_15.5b model (#9772)
* LLM: add star_corder_15.5b model

* revert llm_performance_tests.yml
2023-12-26 18:55:56 +08:00
WeiguangHan
d4d2ccdd9d LLM: remove startcorder-15.5b (#9748) 2023-12-21 18:52:52 +08:00
WeiguangHan
474c099559 LLM: using separate threads to do inference (#9727)
* using separate threads to do inference

* resolve some comments

* resolve some comments

* revert llm_performance_tests.yml file
2023-12-21 17:56:43 +08:00
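
Commit #9727 moves inference into separate threads. A minimal sketch of the idea, one worker thread per model with results gathered under a lock; the model names and the stubbed-out generate call are hypothetical, not the repository's actual harness:

```python
import threading
import time

def run_inference(model_name, results, lock):
    # Stand-in for the real model.generate(...) call; here we only time a stub.
    start = time.perf_counter()
    output = f"output-from-{model_name}"
    elapsed = time.perf_counter() - start
    with lock:
        results[model_name] = (output, elapsed)

def benchmark(models):
    results, lock = {}, threading.Lock()
    threads = [threading.Thread(target=run_inference, args=(m, results, lock))
               for m in models]
    for t in threads:
        t.start()
    for t in threads:
        t.join()   # wait for every inference thread to finish
    return results

results = benchmark(["llama-2-7b", "chatglm3-6b"])
print(sorted(results))
```

Running each model in its own thread keeps one hung or crashing inference from blocking the whole test loop, which is a plausible motivation for the change.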
WeiguangHan
3aa8b66bc3 LLM: remove starcoder-15.5b model temporarily (#9720) 2023-12-19 20:14:46 +08:00
Kai Huang
4c112ee70c Rename qwen in model name for arc perf test (#9712) 2023-12-18 20:34:31 +08:00
Mingyu Wei
16febc949c [LLM] Add exclude option in all-in-one performance test (#9632)
* add exclude option in all-in-one perf test

* update arc-perf-test.yaml

* Exclude in_out_pairs in main function

* fix some bugs

* address Kai's comments

* define excludes at the beginning

* add bloomz:2048 to exclude
2023-12-13 18:13:06 +08:00
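
The exclude option added in #9632 filters specific model/input-length pairs (such as the `bloomz:2048` case named in the commit body) out of a run. A hypothetical config sketch, with key names assumed rather than verified against the repository:

```yaml
repo_id:
  - 'bigscience/bloomz-7b1'
  - 'THUDM/chatglm3-6b'
in_out_pairs:
  - '32-32'
  - '2048-256'
exclude:
  - 'bloomz:2048'   # skip only the 2048-token input for bloomz
```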
Yuwen Hu
1012507a40 [LLM] Fix performance tests (#9596)
* Fix missing key for cpu_embedding

* Remove 512 as it stuck for now

* Small fix
2023-12-05 10:59:28 +08:00
WeiguangHan
5098bc3544 LLM: enable previous models (#9505)
* enable previous models

* test mistral model

* for test

* run models separately

* test all models

* for test

* revert the llm_performance_test.yaml
2023-11-28 10:21:07 +08:00
WeiguangHan
0d55bbd9f1 LLM: adjust the order of some models (#9470) 2023-11-15 17:04:59 +08:00
WeiguangHan
d109275333 temporarily disable the test of some models (#9434) 2023-11-13 18:50:53 +08:00
WeiguangHan
34449cb4bb LLM: add remaining models to the arc perf test (#9384)
* add remaining models

* modify the filepath which stores the test result on ftp server

* resolve some comments
2023-11-09 14:28:42 +08:00
WeiguangHan
84ab614aab LLM: add more models and skip runtime error (#9349)
* add more models and skip runtime error

* upgrade transformers

* temporarily removed Mistral-7B-v0.1

* temporarily disable the upload of arc perf result
2023-11-08 09:45:53 +08:00
WeiguangHan
9722e811be LLM: add more models to the arc perf test (#9297)
* LLM: add more models to the arc perf test

* remove some old models

* install some dependencies
2023-11-01 16:56:32 +08:00
binbin Deng
f597a9d4f5 LLM: update perf test configuration (#9264) 2023-10-25 12:35:48 +08:00
WeiguangHan
f87f67ee1c LLM: arc perf test for some popular models (#9188) 2023-10-19 15:56:15 +08:00