Commit graph

146 commits

Author SHA1 Message Date
Ruonan Wang
d6af4877dd
LLM: remove ipex.optimize for gpt-j (#10606)
* remove ipex.optimize

* fix

* fix
2024-04-01 12:21:49 +08:00
WeiguangHan
fbeb10c796
LLM: Set different env based on different Linux kernels (#10566) 2024-03-27 17:56:33 +08:00
Ruonan Wang
ea4bc450c4
LLM: add esimd sdp for pvc (#10543)
* add esimd sdp for pvc

* update

* fix

* fix batch
2024-03-26 19:04:40 +08:00
Shaojun Liu
c563b41491
add nightly_build workflow (#10533)
* add nightly_build workflow

* add create-job-status-badge action

* update

* update

* update

* update setup.py

* release

* revert
2024-03-26 12:47:38 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang (#30) 2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm

* rm python/llm/src/bigdl

* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
binbin Deng
85ef3f1d99 LLM: add empty cache in deepspeed autotp benchmark script (#10488) 2024-03-21 10:51:23 +08:00
Xiangyu Tian
5a5fd5af5b LLM: Add speculative benchmark on CPU/XPU (#10464)
Add speculative benchmark on CPU/XPU.
2024-03-21 09:51:06 +08:00
Xiangyu Tian
cbe24cc7e6 LLM: Enable BigDL IPEX Int8 (#10480)
Enable BigDL IPEX Int8
2024-03-20 15:59:54 +08:00
Jin Qiao
e41d556436 LLM: change fp16 benchmark to model.half (#10477)
* LLM: change fp16 benchmark to model.half

* fix
2024-03-20 13:38:39 +08:00
Jin Qiao
e9055c32f9 LLM: fix fp16 mem record in benchmark (#10461)
* LLM: fix fp16 mem record in benchmark

* change style
2024-03-19 16:17:23 +08:00
Jin Qiao
0451103a43 LLM: add int4+fp16 benchmark script for windows benchmarking (#10449)
* LLM: add fp16 for benchmark script

* remove transformer_int4_fp16_loadlowbit_gpu_win
2024-03-19 11:11:25 +08:00
Yuxuan Xia
f36224aac4 Fix ceval run.sh (#10410) 2024-03-14 10:57:25 +08:00
Wang, Jian4
0193f29411 LLM: Enable gguf float16 and Yuan2 model (#10372)
* enable float16

* add yun files

* enable yun

* enable set low_bit on yuan2

* update

* update license

* update generate

* update readme

* update python style

* update
2024-03-13 10:19:18 +08:00
Xiangyu Tian
0ded0b4b13 LLM: Enable BigDL IPEX optimization for int4 (#10319)
Enable BigDL IPEX optimization for int4
2024-03-12 17:08:50 +08:00
Lilac09
5809a3f5fe Add run-hbm.sh & add user guide for spr and hbm (#10357)
* add run-hbm.sh

* add spr and hbm guide

* only support quad mode

* only support quad mode

* update special cases

* update special cases
2024-03-12 16:15:27 +08:00
binbin Deng
5d996a5caf LLM: add benchmark script for deepspeed autotp on gpu (#10380) 2024-03-12 15:19:57 +08:00
WeiguangHan
17bdb1a60b LLM: add whisper models into nightly test (#10193)
* LLM: add whisper models into nightly test

* small fix

* small fix

* add more whisper models

* test all cases

* test specific cases

* collect the csv

* store the result

* to html

* small fix

* small test

* test all cases

* modify whisper_csv_to_html
2024-03-11 20:00:47 +08:00
Yuxuan Xia
0c8d3c9830 Add C-Eval HTML report (#10294)
* Add C-Eval HTML report

* Fix C-Eval workflow pr trigger path

* Fix C-Eval workflow typos

* Add permissions to C-Eval workflow

* Fix C-Eval workflow typo

* Add pandas dependency

* Fix C-Eval workflow typo
2024-03-07 16:44:49 +08:00
Shaojun Liu
178eea5009 upload bigdl-llm wheel to sourceforge for backup (#10321)
* test: upload to sourceforge

* update scripts

* revert
2024-03-05 16:36:01 +08:00
WeiguangHan
fd81d66047 LLM: Compress some models to save space (#10315)
* LLM: compress some models to save space

* add deleted comments
2024-03-04 17:53:03 +08:00
Yuwen Hu
27d9a14989 [LLM] all-in-one update: memory optimize and streaming output (#10302)
* Memory saving for continuous in-out pair run and add support for streaming output on MTL iGPU

* Small fix

* Small fix

* Add things back
2024-03-01 18:02:30 +08:00
Keyan (Kyrie) Zhang
59861f73e5 Add Deepseek-6.7B (#9991)
* Add new example Deepseek

* Add new example Deepseek

* Add new example Deepseek

* Add new example Deepseek

* Add new example Deepseek

* modify deepseek

* modify deepseek

* Add verified model in README

* Turn cpu_embedding=True in Deepseek example

---------

Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
2024-02-28 11:36:39 +08:00
hxsz1997
cba61a2909 Add html report of ppl (#10218)
* remove include and language option, select the corresponding dataset based on the model name in Run

* change the nightly test time

* change the nightly test time of harness and ppl

* save the ppl result to json file

* generate csv file and print table result

* generate html

* modify the way to get parent folder

* update html in parent folder

* add llm-ppl-summary and llm-ppl-summary-html

* modify echo single result

* remove download fp16.csv

* change model name of PR

* move ppl nightly related files to llm/test folder

* reformat

* separate make_table from make_table_and_csv.py

* separate make_csv from make_table_and_csv.py

* update llm-ppl-html

* remove comment

* add Download fp16.results
2024-02-27 17:37:08 +08:00
Chen, Zhentao
213ef06691 fix readme 2024-02-24 00:38:08 +08:00
Chen, Zhentao
6fe5344fa6 separate make_csv from the file 2024-02-23 16:33:38 +08:00
Chen, Zhentao
bfa98666a6 fall back to make_table.py 2024-02-23 16:33:38 +08:00
Chen, Zhentao
f315c7f93a Move harness nightly related files to llm/test folder (#10209)
* move harness nightly files to test folder

* change workflow file path accordingly

* use arc01 when pr

* fix path

* fix fp16 csv path
2024-02-23 11:12:36 +08:00
Yuwen Hu
21de2613ce [LLM] Add model loading time record for all-in-one benchmark (#10201)
* Add model loading time record in csv for all-in-one benchmark

* Small fix

* Small fix to number after .
2024-02-22 13:57:18 +08:00
Yuxuan Xia
7cbc2429a6 Fix C-Eval ChatGLM loading issue (#10206)
* Add c-eval workflow and modify running files

* Modify the chatglm evaluator file

* Modify the ceval workflow for triggering test

* Modify the ceval workflow file

* Modify the ceval workflow file

* Modify ceval workflow

* Adjust the ceval dataset download

* Add ceval workflow dependencies

* Modify ceval workflow dataset download

* Add ceval test dependencies

* Add ceval test dependencies

* Correct the result print

* Fix the nightly test trigger time

* Fix ChatGLM loading issue
2024-02-22 10:00:43 +08:00
yb-peng
b1a97b71a9 Harness eval: Add is_last parameter and fix logical operator in highlight_vals (#10192)
* Add is_last parameter and fix logical operator in highlight_vals

* Add script to update HTML files in parent folder

* Add running update_html_in_parent_folder.py in summarize step

* Add licence info

* Remove update_html_in_parent_folder.py in Summarize the results for pull request
2024-02-21 14:45:32 +08:00
Chen, Zhentao
39d37bd042 upgrade harness package version in workflow (#10188)
* upgrade harness

* update readme
2024-02-21 11:21:30 +08:00
Yuwen Hu
001c13243e [LLM] Add support for low_low_bit benchmark on Windows GPU (#10167)
* Add support for low_low_bit performance test on Windows GPU

* Small fix

* Small fix

* Save memory during converting model process

* Drop the results for first time when loading in low bit on mtl igpu for better performance

* Small fix
2024-02-21 10:51:52 +08:00
yb-peng
de3dc609ee Modify harness evaluation workflow (#10174)
* Modify table head in harness

* Specify the file path of fp16.csv

* change run to run nightly and run pr to debug

* Modify the way to get fp16.csv to downloading from github

* Change the method to calculate diff in html table

* Change the method to calculate diff in html table

* Re-arrange job order

* Re-arrange job order

* Change limit

* Change fp16.csv path

* Change highlight rules

* Change limit
2024-02-20 18:55:43 +08:00
hxsz1997
6e10d98a8d Fix some typos (#10175)
* add llm-ppl workflow

* update the DATASET_DIR

* test multiple precisions

* modify nightly test

* match the updated ppl code

* add matrix.include

* fix the include error

* update the include

* add more model

* update the precision of include

* update nightly time and add more models

* fix the workflow_dispatch description, change default model of pr and modify the env

* modify workflow_dispatch language options

* modify options

* modify language options

* modify workflow_dispatch type

* modify type

* modify the type of language

* change seq_len type

* fix some typos

* revert changes to stress_test.txt
2024-02-20 14:14:53 +08:00
yb-peng
e31210ba00 Modify html table style and add fp16.csv in harness (#10169)
* Specify the version of pandas in harness evaluation workflow

* Specify the version of pandas in harness evaluation workflow

* Modify html table style and add fp16.csv in harness

* Modify comments
2024-02-19 18:13:40 +08:00
Yuxuan Xia
209122559a Add Ceval workflow and modify the result printing (#10140)
* Add c-eval workflow and modify running files

* Modify the chatglm evaluator file

* Modify the ceval workflow for triggering test

* Modify the ceval workflow file

* Modify the ceval workflow file

* Modify ceval workflow

* Adjust the ceval dataset download

* Add ceval workflow dependencies

* Modify ceval workflow dataset download

* Add ceval test dependencies

* Add ceval test dependencies

* Correct the result print
2024-02-19 17:06:53 +08:00
yb-peng
b4dc33def6 In harness-evaluation workflow, add statistical tables (#10118)
* change storage

* fix typo

* change label

* change label to arc03

* change needs in the last step

* add generate csv in harness/make_table_results.py

* modify needs in the last job

* add csv to html

* fix path issue in llm-harness-summary-nightly

* modify output_path

* modify args in make_table_results.py

* modify make table command in summary

* change pr env label

* remove irrelevant code in summary; add set output path step; add limit in harness run

* re-organize code structure

* modify limit in run harness

* modify csv_to_html input path

* modify needs in summary-nightly
2024-02-08 19:01:05 +08:00
Yuxuan Xia
3832eb0ce0 Add ChatGLM C-Eval Evaluator (#10095)
* Add ChatGLM ceval evaluator

* Modify ChatGLM Evaluator Reference
2024-02-07 11:27:06 +08:00
Ovo233
2aaa21c41d LLM: Update ppl tests (#10092)
* update ppl tests

* use load_dataset api

* add exception handling

* add language argument

* address comments
2024-02-06 17:31:48 +08:00
dingbaorong
36c9442c6d Arc Stable version test (#10087)
* add batch_size in stable version test

* add batch_size in excludes

* add excludes for batch_size

* fix ci

* trigger regression test

* fix xpu version

* disable ci

* address kai's comment

---------

Co-authored-by: Ariadne <wyn2000330@126.com>
2024-02-06 10:23:50 +08:00
WeiguangHan
c2e562d037 LLM: add batch_size to the csv and html (#10080)
* LLM: add batch_size to the csv and html

* small fix
2024-02-04 16:35:44 +08:00
WeiguangHan
d2d3f6b091 LLM: ensure the result of daily arc perf test (#10016)
* ensure the result of daily arc perf test

* small fix

* small fix

* small fix

* small fix

* small fix

* small fix

* small fix

* small fix

* small fix

* small fix

* concat more csvs

* small fix

* revert some files
2024-01-31 18:26:21 +08:00
Ovo233
226f398c2a fix ppl test errors (#10036) 2024-01-30 16:26:21 +08:00
Xin Qiu
13e61738c5 hide detail memory for each token in benchmark_utils.py (#10037) 2024-01-30 16:04:17 +08:00
Xin Qiu
7952bbc919 add conf batch_size to run_model (#10010) 2024-01-26 15:48:48 +08:00
Chen, Zhentao
762adc4f9d Reformat summary table (#9942)
* reformat the table

* refactor the file

* read result.json only
2024-01-25 23:49:00 +08:00
Ziteng Zhang
8b08ad408b Add batch_size in all_in_one (#9999)
Add batch_size in all_in_one, except run_native_int4
2024-01-25 17:43:49 +08:00
Chen, Zhentao
86055d76d5 fix optimize_model not working (#9995) 2024-01-25 16:39:05 +08:00
Chen, Zhentao
301425e377 harness tests on pvc multiple xpus (#9908)
* add run_multi_llb.py

* update readme

* add job hint
2024-01-23 13:20:37 +08:00