Commit graph

217 commits

Author SHA1 Message Date
Yishuo Wang
5f5ac8a856
fix llama related import (#12611) 2024-12-25 16:23:52 +08:00
Yishuo Wang
a608f26cc8
use new fused layer norm (#12553) 2024-12-17 13:52:35 +08:00
Jin, Qiao
d984c0672a
Add MiniCPM-V-2_6 to arc perf test (#12349) 2024-11-06 16:32:28 +08:00
Jin, Qiao
7240c283a3
Add dummy model in iGPU perf (#12341)
* Add dummy model in iGPU perf

* Add dummy model in iGPU perf

* Fix
2024-11-05 17:56:10 +08:00
Zhao Changmin
8e9a3a1158
fix chatglm2 cpu ut (#12336) 2024-11-05 16:43:57 +08:00
Shaojun Liu
aae2490cb8
fix UT (#12247)
* fix ut

* Update test_transformers_api_attention.py

* Update test_transformers_api_mlp.py
2024-10-23 14:13:06 +08:00
Yuwen Hu
5935b25622
Further update windows gpu perf test regarding results integrity check (#12232) 2024-10-18 18:15:13 +08:00
Yuwen Hu
b88c1df324
Add Llama 3.1 & 3.2 to Arc Performance test (#12225)
* Add llama3.1 and llama3.2 in arc perf (#12202)

* Add llama3.1 and llama3.2 in arc perf

* Uninstall trl after arc test on transformers>=4.40

* Fix arc llama3 perf (#12212)

* Fix pip uninstall

* Uninstall trl after test on transformers==4.43.1

* Fix llama3 arc perf (#12218)

---------

Co-authored-by: Jin, Qiao <89779290+JinBridger@users.noreply.github.com>
2024-10-17 21:12:45 +08:00
Yuwen Hu
c9ac39fc1e
Add Llama 3.2 to iGPU performance test (transformers 4.45) (#12209)
* Add Llama 3.2 to iGPU Perf (#12200)

* Add Llama 3.2 to iGPU Perf

* Downgrade accelerate after step

* Temporarily disable model for test

* Temporarily change ERRORLEVEL check (#12201)

* Restore llama3.2 perf (#12206)

* Revert "Temporarily change ERRORLEVEL check"

This reverts commit 909dbbc930ab4283737161a55bb32006e6ca1991.

* Revert "Temporarily disable model for test"

This reverts commit 95322dc3c6429aa836f21bda0b5ba8d9b48592f8.

---------

Co-authored-by: Jin, Qiao <89779290+JinBridger@users.noreply.github.com>
2024-10-15 17:44:46 +08:00
Jin, Qiao
8e35800abe
Add llama 3.1 in igpu perf (#12194) 2024-10-14 15:14:34 +08:00
Yishuo Wang
b1408a1f1c
fix UT (#12005) 2024-09-04 18:02:49 +08:00
Ruonan Wang
4a61f7d20d
update mlp of llama (#11897)
* update mlp of llama

* relax threshold of  mlp test

* revert code
2024-08-22 20:34:53 +08:00
Yuwen Hu
eab6f6dde4
Spr perf small fix (#11874) 2024-08-21 09:35:26 +08:00
Yuwen Hu
0d58c2fdf9
Update performance test regarding updated default transformers==4.37.0 (#11869)
* Update igpu performance from transformers 4.36.2 to 4.37.0 (#11841)

* upgrade arc perf test to transformers 4.37 (#11842)

* fix load low bit com dtype (#11832)

* feat: add mixed_precision argument on ppl longbench evaluation

* fix: delete extra code

* feat: upgrade arc perf test to transformers 4.37

* fix: add missing codes

* fix: keep perf test for qwen-vl-chat in transformers 4.36

* fix: remove extra space

* fix: resolve pr comment

* fix: add empty line

* fix: add pip install for spr and core test

* fix: delete extra comments

* fix: remove python -m for pip

* Revert "fix load low bit com dtype (#11832)"

This reverts commit 6841a9ac8f.

---------

Co-authored-by: Zhao Changmin <changmin.zhao@intel.com>
Co-authored-by: Jinhe Tang <jin.tang1337@gmail.com>

* add transformers==4.36 for qwen vl in igpu-perf (#11846)

* add transformers==4.36.2 for qwen-vl

* Small update

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>

* fix: remove qwen-7b on core test (#11851)

* fix: remove qwen-7b on core test

* fix: change delete to comment

---------

Co-authored-by: Jinhe Tang <jin.tang1337@gmail.com>

* replce filename (#11854)

* fix: remove qwen-7b on core test

* fix: change delete to comment

* fix: replace filename

---------

Co-authored-by: Jinhe Tang <jin.tang1337@gmail.com>

* fix: delete extra comments (#11863)

* Remove transformers installation for temp test purposes

* Small fix

* Small update

---------

Co-authored-by: Chu,Youcheng <70999398+cranechu0131@users.noreply.github.com>
Co-authored-by: Zhao Changmin <changmin.zhao@intel.com>
Co-authored-by: Jinhe Tang <jin.tang1337@gmail.com>
Co-authored-by: Zijie Li <michael20001122@gmail.com>
Co-authored-by: Chu,Youcheng <1340390339@qq.com>
2024-08-20 17:59:28 +08:00
Yuwen Hu
5e8286f72d
Update ipex-llm default transformers version to 4.37.0 (#11859)
* Update default transformers version to 4.37.0

* Add dependency requirements for qwen and qwen-vl

* Temp fix transformers version for these not yet verified models

* Skip qwen test in UT for now as it requires transformers<4.37.0
2024-08-20 17:37:58 +08:00
Yuwen Hu
580c94d0e2
Remove gemma-2-9b-it 3k input from igpu-perf (#11834) 2024-08-17 13:10:05 +08:00
Jin, Qiao
9f17234f3b
Add MiniCPM-V-2_6 to iGPU Perf (#11810)
* Add MiniCPM-V-2_6 to iGPU Perf

* keep last model in yaml

* fix MINICPM_V_IDS

* Restore tested model list

* Small fix

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2024-08-16 18:41:21 +08:00
Yuwen Hu
6543321f04
Remove 4k igpu perf on gemma-2-9b-it (#11820) 2024-08-15 18:06:19 +08:00
Yuwen Hu
ec184af243
Add gemma-2-2b-it and gemma-2-9b-it to igpu nightly performance test (#11778)
* add yaml and modify `concat_csv.py` for `transformers` 4.43.1 (#11758)

* add yaml and modify `concat_csv.py` for `transformers` 4.43.1

* remove 4.43 for arc; fix;

* remove 4096-512 for 4.43

* comment some models

* Small fix

* uncomment models (#11777)

---------

Co-authored-by: Ch1y0q <qiyue2001@gmail.com>
2024-08-13 15:39:56 +08:00
Jinhe
27b4b104ed
Add qwen2-1.5b-instruct into igpu performance (#11735)
* updated qwen1.5B to all transformer==4.37 yaml

* updated qwen1.5B to all transformer==4.37 yaml
2024-08-08 16:42:18 +08:00
Ruonan Wang
00a5574c8a
Use merge_qkv to replace fused_qkv for llama2 (#11727)
* update 4.38

* support new versions

* update

* fix style

* fix style

* update rope

* temp test sdpa

* fix style

* fix cpu ut
2024-08-07 18:04:01 +08:00
SichengStevenLi
985213614b
Removed no longer needed models for Arc nightly perf (#11722)
* removed LLMs that are no longer needed

Removed: 
mistralai/Mistral-7B-v0.1
deepseek-ai/deepseek-coder-6.7b-instruct

* Update arc-perf-test-batch4.yaml

Removed: 
deepseek-ai/deepseek-coder-6.7b-instruct
mistralai/Mistral-7B-v0.1

* Update arc-perf-test.yaml

Removed: 
deepseek-ai/deepseek-coder-6.7b-instruct
mistralai/Mistral-7B-v0.1

* Create arc-perf-transformers-438.yaml

* Moved arc-perf-transformers-438.yaml location

* Create arc-perf-transformers-438-batch2.yaml

* Create arc-perf-transformers-438-batch4.yaml

* Delete python/llm/test/benchmark/arc-perf-transformers-438-batch2.yaml

* Delete python/llm/test/benchmark/arc-perf-transformers-438-batch4.yaml

* Delete python/llm/test/benchmark/arc-perf-transformers-438.yaml
2024-08-06 16:12:00 +08:00
hxsz1997
8ef4caaf5d
add 3k and 4k input of nightly perf test on iGPU (#11701)
* Add 3k&4k input in workflow for iGPU (#11685)

* add 3k&4k input in workflow

* comment for test

* comment models for accelarate test

* remove OOM models

* modify typo

* change test model (#11696)

* reverse test models (#11700)
2024-08-01 14:17:46 +08:00
RyuKosei
2fbd375a94
update several models for nightly perf test (#11643)
Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-07-25 14:06:08 +08:00
Yuwen Hu
2478e2c14b
Add check in iGPU perf workflow for results integrity (#11616)
* Add csv check for igpu benchmark workflow (#11610)

* add csv check for igpu benchmark workflow

* ready to test

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>

* Restore the temporarily removed models in iGPU-perf (#11615)

Co-authored-by: ATMxsp01 <shou.xu@intel.com>

---------

Co-authored-by: Xu, Shuo <100334393+ATMxsp01@users.noreply.github.com>
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-18 14:13:16 +08:00
Xu, Shuo
13a72dc51d
Test MiniCPM performance on iGPU in a more stable way (#11573)
* Test MiniCPM performance on iGPU in a more stable way

* small fix

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-12 17:07:41 +08:00
Xu, Shuo
1355b2ce06
Add model Qwen-VL-Chat to iGPU-perf (#11558)
* Add model Qwen-VL-Chat to iGPU-perf

* small fix

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-11 15:39:02 +08:00
Xu, Shuo
028ad4f63c
Add model phi-3-vision-128k-instruct to iGPU-perf benchmark (#11554)
* try to improve MIniCPM performance

* Add model phi-3-vision-128k-instruct to iGPU-perf benchmark

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-10 17:26:30 +08:00
Xu, Shuo
61613b210c
try to improve MIniCPM performance (#11552)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-10 16:58:23 +08:00
Yuwen Hu
8982ab73d5
Add Yi-6B and StableLM to iGPU perf test (#11546)
* Add transformer4.38.2 test to igpu benchmark (#11529)

* add transformer4.38.1 test to igpu benchmark

* use transformers4.38.2 & fix csv name error in 4.38 workflow

* add model Yi-6B-Chat & remove temporarily most models

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>

* filter some errorlevel (#11541)

Co-authored-by: ATMxsp01 <shou.xu@intel.com>

* Restore the temporarily removed models in iGPU-perf (#11544)

* filter some errorlevel

* restore the temporarily removed models in iGPU-perf

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>

---------

Co-authored-by: Xu, Shuo <100334393+ATMxsp01@users.noreply.github.com>
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-09 18:51:23 +08:00
Xu, Shuo
f9a199900d
add model RWKV/v5-Eagle-7B-HF to igpu benchmark (#11528)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-08 15:50:16 +08:00
Jun Wang
5a57e54400
[ADD] add 5 new models for igpu-perf (#11524) 2024-07-08 11:12:15 +08:00
Xu, Shuo
64cfed602d
Add new models to benchmark (#11505)
* Add new models to benchmark

* remove Qwen/Qwen-VL-Chat to pass the validation

---------

Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-08 10:35:55 +08:00
Yuwen Hu
8f376e5192
Change igpu perf to mainly test int4+fp16 (#11513) 2024-07-05 17:12:33 +08:00
Jun Wang
f07937945f
[REMOVE] remove all useless repo-id in benchmark/igpu-perf (#11508) 2024-07-04 16:38:34 +08:00
Xu, Shuo
52519e07df
remove models we no longer need in benchmark. (#11492)
Co-authored-by: ATMxsp01 <shou.xu@intel.com>
2024-07-02 17:20:48 +08:00
Wenjing Margaret Mao
c0e86c523a
Add qwen-moe batch1 to nightly perf (#11369)
* add moe

* reduce 437 models

* rename

* fix syntax

* add moe check result

* add 430 + 437

* all modes

* 4-37-4 exclud

* revert & comment

---------

Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-20 14:17:41 +08:00
Wenjing Margaret Mao
b2f62a8561
Add batch 4 perf test (#11355)
* copy files to this branch

* add tasks

* comment one model

* change the model to test the 4.36

* only test batch-4

* typo

* typo

* typo

* typo

* typo

* typo

* add 4.37-batch4

* change the file name

* revet yaml file

* no print

* add batch4 task

* revert

---------

Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-20 09:48:52 +08:00
hxsz1997
44f22cba70
add config and default value (#11344)
* add config and default value

* add config in taml

* remove lookahead and max_matching_ngram_size in config

* remove streaming and use_fp16_torch_dtype in test yaml

* update task in readme

* update commit of task
2024-06-18 15:28:57 +08:00
Wenjing Margaret Mao
bca5cbd96c
Modify arc nightly perf to fp16 (#11275)
* change api

* move to pr mode and remove the build

* add batch4 yaml and remove the bigcode

* remove batch4

* revert the starcode

* remove the exclude

* revert

---------

Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-17 13:47:22 +08:00
Shaojun Liu
f5ef94046e
exclude dolly-v2-12b for arc perf test (#11315)
* test arc perf

* test

* test

* exclude dolly-v2-12b:2048

* revert changes
2024-06-14 15:35:56 +08:00
Jin Qiao
3682c6a979
add glm4 and qwen2 to igpu perf (#11304) 2024-06-13 16:16:35 +08:00
Yishuo Wang
01fe0fc1a2
refactor chatglm2/3 (#11290) 2024-06-13 12:22:58 +08:00
Wenjing Margaret Mao
b61f6e3ab1
Add update_parent_folder for nightly_perf_test (#11287)
* add update_parent_folder and change the workflow file

* add update_parent_folder and change the workflow file

* move to pr mode and comment the test

* use one model per comfig

* revert

---------

Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-12 17:58:13 +08:00
Xin Qiu
592f7aa61e
Refine glm1-4 sdp (#11276)
* chatglm

* update

* update

* change chatglm

* update sdpa

* update

* fix style

* fix

* fix glm

* update glm2-32k

* update glm2-32k

* fix cpu

* update

* change lower_bound
2024-06-12 17:11:56 +08:00
Wenjing Margaret Mao
70b17c87be
Merge multiple batches (#11264)
* add merge steps

* move to pr mode

* remove build + add merge.py

* add tohtml and change cp

* change test_batch folder path

* change merge_temp path

* change to html folder

* revert

* change place

* revert 437

* revert space

---------

Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-07 18:38:45 +08:00
Wenjing Margaret Mao
231b968aba
Modify the check_results.py to support batch 2&4 (#11133)
* add batch 2&4 and exclude to perf_test

* modify the perf-test&437 yaml

* modify llm_performance_test.yml

* remove batch 4

* modify check_results.py to support batch 2&4

* change the batch_size format

* remove genxir

* add str(batch_size)

* change actual_test_casese in check_results file to support batch_size

* change html highlight

* less models to test html and html_path

* delete the moe model

* split batch html

* split

* use installing from pypi

* use installing from pypi - batch2

* revert cpp

* revert cpp

* merge two jobs into one, test batch_size in one job

* merge two jobs into one, test batch_size in one job

* change file directory in workflow

* try catch deal with odd file without batch_size

* modify pandas version

* change the dir

* organize the code

* organize the code

* remove Qwen-MOE

* modify based on feedback

* modify based on feedback

* modify based on second round of feedback

* modify based on second round of feedback + change run-arc.sh mode

* modify based on second round of feedback + revert config

* modify based on second round of feedback + revert config

* modify based on second round of feedback + remove comments

* modify based on second round of feedback + remove comments

* modify based on second round of feedback + revert arc-perf-test

* modify based on third round of feedback

* change error type

* change error type

* modify check_results.html

* split batch into two folders

* add all models

* move csv_name

* revert pr test

* revert pr test

---------

Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-05 15:04:55 +08:00
Jin Qiao
25b6402315
Add Windows GPU unit test (#11050)
* Add Windows GPU UT

* temporarily remove ut on arc

* retry

* retry

* retry

* fix

* retry

* retry

* fix

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* fix

* retry

* retry

* retry

* retry

* retry

* retry

* merge into single workflow

* retry inference test

* retry

* retrigger

* try to fix inference test

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* retry

* check lower_bound

* retry

* retry

* try example test

* try fix example test

* retry

* fix

* seperate function into shell script

* remove cygpath

* try remove all cygpath

* retry

* retry

* Revert "try remove all cygpath"

This reverts commit 7ceeff3e48f08429062ecef548c1a3ad3488756f.

* Revert "retry"

This reverts commit 40ea2457843bff6991b8db24316cde5de1d35418.

* Revert "retry"

This reverts commit 817d0db3e5aec3bd449d3deaf4fb01d3ecfdc8a3.

* enable ut

* fix

* retrigger

* retrigger

* update download url

* fix

* fix

* retry

* add comment

* fix
2024-05-28 13:29:47 +08:00
Jiao Wang
0a06a6e1d4
Update tests for transformers 4.36 (#10858)
* update unit test

* update

* update

* update

* update

* update

* fix gpu attention test

* update

* update

* update

* update

* update

* update

* update example test

* replace replit code

* update

* update

* update

* update

* set safe_serialization false

* perf test

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* delete

* update

* update

* update

* update

* update

* update

* revert

* update
2024-05-24 10:26:38 +08:00
Yishuo Wang
d830a63bb7
refactor qwen (#11074) 2024-05-20 18:08:37 +08:00