* update bigdl_llm.py
* update the installation of harness
* fix partial function
* import ipex
* force seq len in decrease order
* put func outside class
* move comments
* default 'trust_remote_code' as True
* Update llm-harness-evaluation.yml
* modify output_path as a directory
* schedule nightly at 21 on Friday
* add tasks and models for nightly
* add accuracy regression
* comment out if to test
* mixed fp4
* for test
* add missing delimiter
* remove comma
* fixed golden results
* add mixed 4 golden result
* add more options
* add mistral results
* get golden result of stable lm
* move nightly scripts and results to test folder
* add license
* add fp8 stable lm golden
* run on all available devices
* trigger only when ready for review
* fix new line
* update golden
* add mistral
* add auto triggered acc test
* use llama 7b instead
* fix env
* debug download
* fix download prefix
* add cut dirs
* fix env of model path
* fix dataset download
* full job
* source xpu env vars
* use matrix to trigger model run
* reset batch=1
* remove redirect
* remove some trigger
* add task matrix
* add precision list
* test llama-7b-chat
* use /mnt/disk1 to store model and datasets
* remove installation test
* correct downloading path
* fix HF vars
* add bigdl-llm env vars
* rename file
* fix hf_home
* fix script path
* rename as harness evalution
* rerun