ipex-llm

Author	SHA1	Message	Date
Chen, Zhentao	a55cc91e1f	fix make_csv.py	2024-02-23 20:25:46 +08:00
Chen, Zhentao	a204337cad	Rename results	2024-02-23 17:12:37 +08:00
Chen, Zhentao	4fdf96dc8b	fix ACC_FOLDER	2024-02-23 17:11:03 +08:00
Chen, Zhentao	e838ec9e14	remove dependency	2024-02-23 16:33:40 +08:00
Chen, Zhentao	88f7f56980	rewrite html visualization	2024-02-23 16:33:39 +08:00
Chen, Zhentao	bfa98666a6	fall back to make_table.py	2024-02-23 16:33:38 +08:00
Chen, Zhentao	02cb96e7f6	fix Run Harness job	2024-02-23 16:33:37 +08:00
Chen, Zhentao	e1fcf54a0c	reformat	2024-02-23 16:33:36 +08:00
Chen, Zhentao	5399343adc	fix harness installation	2024-02-23 16:33:35 +08:00
Chen, Zhentao	9c8e349196	remove harness job output	2024-02-23 16:33:34 +08:00
Chen, Zhentao	8472de90e8	use stable lm to test pr	2024-02-23 16:33:34 +08:00
Chen, Zhentao	f315c7f93a	Move harness nightly related files to llm/test folder (#10209) * move harness nightly files to test folder * change workflow file path accordingly * use arc01 when pr * fix path * fix fp16 csv path	2024-02-23 11:12:36 +08:00
hxsz1997	5b387bb71a	Change the nightly test time of ppl and harness (#10198) * remove include and language option, select the corresponding dataset based on the model name in Run * change the nightly test time * change the nightly test time of harness and ppl	2024-02-21 17:39:33 +08:00
yb-peng	b1a97b71a9	Harness eval: Add is_last parameter and fix logical operator in highlight_vals (#10192) * Add is_last parameter and fix logical operator in highlight_vals * Add script to update HTML files in parent folder * Add running update_html_in_parent_folder.py in summarize step * Add licence info * Remove update_html_in_parent_folder.py in Summarize the results for pull request	2024-02-21 14:45:32 +08:00
Chen, Zhentao	39d37bd042	upgrade harness package version in workflow (#10188) * upgrade harness * update readme	2024-02-21 11:21:30 +08:00
yb-peng	de3dc609ee	Modify harness evaluation workflow (#10174) * Modify table head in harness * Specify the file path of fp16.csv * change run to run nightly and run pr to debug * Modify the way to get fp16.csv to downloading from github * Change the method to calculate diff in html table * Change the method to calculate diff in html table * Re-arrange job order * Re-arrange job order * Change limit * Change fp16.csv path * Change highlight rules * Change limit	2024-02-20 18:55:43 +08:00
hxsz1997	6e10d98a8d	Fix some typos (#10175) * add llm-ppl workflow * update the DATASET_DIR * test multiple precisions * modify nightly test * match the updated ppl code * add matrix.include * fix the include error * update the include * add more model * update the precision of include * update nightly time and add more models * fix the workflow_dispatch description, change default model of pr and modify the env * modify workflow_dispatch language options * modify options * modify language options * modeify workflow_dispatch type * modify type * modify the type of language * change seq_len type * fix some typos * revert changes to stress_test.txt	2024-02-20 14:14:53 +08:00
yb-peng	50fa004ba5	Specify the version of pandas in harness evaluation workflow (#10159) * Specify the version of pandas in harness evaluation workflow * Specify the version of pandas in harness evaluation workflow	2024-02-19 16:27:08 +08:00
Shaojun Liu	7a3a20cf5b	Fix: GitHub-owned GitHubAction not pinned by hash (#10152)	2024-02-18 16:49:28 +08:00
Shaojun Liu	c3daacec6d	Fix Token Permission issues (#10151) Co-authored-by: Your Name <Your Email>	2024-02-18 13:23:54 +08:00
yb-peng	b7c5104d98	remove limit in harness run (#10139)	2024-02-09 11:20:53 +08:00
yb-peng	b4dc33def6	In harness-evaluation workflow, add statistical tables (#10118) * chnage storage * fix typo * change label * change label to arc03 * change needs in the last step * add generate csv in harness/make_table_results.py * modify needs in the last job * add csv to html * mfix path issue in llm-harness-summary-nightly * modify output_path * modify args in make_table_results.py * modify make table command in summary * change pr env label * remove irrelevant code in summary; add set output path step; add limit in harness run * re-organize code structure * modify limit in run harness * modify csv_to_html input path * modify needs in summary-nightly	2024-02-08 19:01:05 +08:00
pengyb2001	f63eba6c5a	change pr test machine	2024-02-06 23:35:18 +08:00
pengyb2001	e627727b4b	change download path	2024-02-06 21:12:51 +08:00
pengyb2001	2c4e610743	remove irrelevant code	2024-02-06 20:12:10 +08:00
pengyb2001	d11ef0d117	remove retry in llm install part	2024-02-06 14:25:26 +08:00
pengyb2001	94723bb0b1	add retry in run llm install part;test arc05 with llama2	2024-02-06 14:09:14 +08:00
pengyb2001	2c75b5b981	remove mistral in pr job	2024-02-06 13:51:57 +08:00
pengyb2001	5edefe7d8e	remove nightly summary job	2024-02-06 13:50:38 +08:00
pengyb2001	bc92dbf7be	remove stableml;change schedule;change storage method	2024-02-06 11:20:37 +08:00
yb-peng	738275761d	In llm-harness-evaluation, add new models and change schedule to nightly (#10072) * add new models and change schedule to nightly * correct syntax error * modify env set up and job * change label and schedule time * change schedule time * change label	2024-02-04 13:12:09 +08:00
Chen, Zhentao	cad5c2f516	fixed harness deps version (#9854) * fixed harness deps version * fix typo	2024-01-08 15:22:42 +08:00
Chen, Zhentao	4a98bfa5ae	fix harness manual run env typo (#9763)	2023-12-22 18:42:35 +08:00
Chen, Zhentao	86a69e289c	fix harness runner label of manual trigger (#9754) * fix runner * update golden	2023-12-22 15:09:22 +08:00
Chen, Zhentao	b3647507c0	Fix harness workflow (#9704) * error when larger than 0.001 * fix env setup * fix typo * fix typo	2023-12-18 15:42:10 +08:00
Chen, Zhentao	972cdb9992	gsm8k OOM workaround (#9597) * update bigdl_llm.py * update the installation of harness * fix partial function * import ipex * force seq len in decrease order * put func outside class * move comments * default 'trust_remote_code' as True * Update llm-harness-evaluation.yml	2023-12-08 18:47:25 +08:00
Chen, Zhentao	8c8a27ded7	Add harness summary job (#9457) * format yml * add make_table_results * add summary job * add a job to print single result * upload full directory	2023-12-05 10:04:10 +08:00
Chen, Zhentao	29d5bb8df4	Harness workflow dispatch (#9591) * add set-matrix job * add workflow_dispatch * fix context * fix manual run * rename step * add quotes * add runner option * not required labels * add runner label to output * use double quote	2023-12-04 15:53:29 +08:00
Chen, Zhentao	9557aa9c21	Fix harness nightly (#9586) * update golden * loose the restriction of diff * only compare results when scheduled	2023-12-04 11:45:00 +08:00
Chen, Zhentao	cb228c70ea	Add harness nightly (#9552) * modify output_path as a directory * schedule nightly at 21 on Friday * add tasks and models for nightly * add accuracy regression * comment out if to test * mixed fp4 * for test * add missing delimiter * remove comma * fixed golden results * add mixed 4 golden result * add more options * add mistral results * get golden result of stable lm * move nightly scripts and results to test folder * add license * add fp8 stable lm golden * run on all available devices * trigger only when ready for review * fix new line * update golden * add mistral	2023-12-01 14:16:35 +08:00
Chen, Zhentao	4d7d5d4c59	Add 3 leaderboard tasks (#9566) * update leaderboard map * download model and dataset without overwritten * fix task drop * run on all available devices	2023-12-01 14:01:14 +08:00
Chen, Zhentao	c8e0c2ed48	Fixed dumped logs in harness (#9549) * install transformers==4.34.0 * modify output_path as a directory * add device and task to output dir parents	2023-11-30 12:47:56 +08:00
Chen, Zhentao	d19ca21957	patch bigdl-llm model to harness by binding instead of patch file (#9420) * add run_llb.py * fix args interpret * modify outputs * update workflow * add license * test mixed 4 bit * update readme * use autotokenizer * add timeout * refactor workflow file * fix working directory * fix env * throw exception if some jobs failed * improve terminal outputs * Disable var which cause the run stuck * fix unknown precision * fix key error * directly output config instead * rm harness submodule	2023-11-14 12:51:39 +08:00
Chen, Zhentao	f36d7b2d59	Fix harness stuck (#9435) * remove env to avoid being stuck * use small model for test	2023-11-13 15:29:53 +08:00
Chen, Zhentao	298b64217e	add auto triggered acc test (#9364) * add auto triggered acc test * use llama 7b instead * fix env * debug download * fix download prefix * add cut dirs * fix env of model path * fix dataset download * full job * source xpu env vars * use matrix to trigger model run * reset batch=1 * remove redirect * remove some trigger * add task matrix * add precision list * test llama-7b-chat * use /mnt/disk1 to store model and datasets * remove installation test * correct downloading path * fix HF vars * add bigdl-llm env vars * rename file * fix hf_home * fix script path * rename as harness evalution * rerun	2023-11-08 10:22:27 +08:00

45 commits