* add batch 2&4 and exclude to perf_test
* modify the perf-test&437 yaml
* modify llm_performance_test.yml
* remove batch 4
* modify check_results.py to support batch 2&4
* change the batch_size format
* remove genxir
* add str(batch_size)
* change actual_test_casese in check_results file to support batch_size
* change html highlight
* less models to test html and html_path
* delete the moe model
* split batch html
* split
* use installing from pypi
* use installing from pypi - batch2
* revert cpp
* revert cpp
* merge two jobs into one, test batch_size in one job
* merge two jobs into one, test batch_size in one job
* change file directory in workflow
* try catch deal with odd file without batch_size
* modify pandas version
* change the dir
* organize the code
* organize the code
* remove Qwen-MOE
* modify based on feedback
* modify based on feedback
* modify based on second round of feedback
* modify based on second round of feedback + change run-arc.sh mode
* modify based on second round of feedback + revert config
* modify based on second round of feedback + revert config
* modify based on second round of feedback + remove comments
* modify based on second round of feedback + remove comments
* modify based on second round of feedback + revert arc-perf-test
* modify based on third round of feedback
* change error type
* change error type
* modify check_results.html
* split batch into two folders
* add all models
* move csv_name
* revert pr test
* revert pr test
---------
Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>