Yuxuan Xia
209122559a
Add Ceval workflow and modify the result printing ( #10140 )
...
* Add c-eval workflow and modify running files
* Modify the chatglm evaluator file
* Modify the ceval workflow for triggering test
* Modify the ceval workflow file
* Modify the ceval workflow file
* Modify ceval workflow
* Adjust the ceval dataset download
* Add ceval workflow dependencies
* Modify ceval workflow dataset download
* Add ceval test dependencies
* Add ceval test dependencies
* Correct the result print
2024-02-19 17:06:53 +08:00
yb-peng
50fa004ba5
Specify the version of pandas in harness evaluation workflow ( #10159 )
...
* Specify the version of pandas in harness evaluation workflow
* Specify the version of pandas in harness evaluation workflow
2024-02-19 16:27:08 +08:00
Zhao Changmin
f8730e8dc1
Skip rescale rwkv linear when load_low_bit ( #10164 )
...
* rwkv_ld
2024-02-19 15:56:42 +08:00
Heyang Sun
3e2af5ec0a
Fix IPEX Baichuan Speculative ( #10162 )
...
* Fix IPEX Baichuan Speculative
* compatible with 13B
* Update speculative.py
2024-02-19 15:27:34 +08:00
Cheen Hau, 俊豪
6952847f68
GPU install doc - add pip install oneAPI for windows ( #10157 )
...
* Add instructions for pip install oneAPI for windows
* Improve clarity
* Format fix
* Fix
* Fix in runtime configuration
2024-02-19 14:46:08 +08:00
Yina Chen
23c91cdce6
[LLM] Add min_step_draft in speculative decoding ( #10142 )
...
* Fix gptj kvcache & position id
* Add min_draft_tokens in speculative decoding
* fix style
* update
2024-02-19 14:31:41 +08:00
Chen, Zhentao
14ba2c5135
Harness: remove deprecated files ( #10165 )
2024-02-19 14:27:49 +08:00
Wang, Jian4
d3591383d5
LLM : Add CPU chatglm3 speculative example ( #10004 )
...
* init chatglm
* update
* update
2024-02-19 13:38:52 +08:00
Wang, Jian4
f2417e083c
LLM: enable chatglm3-6b target_model ipex ( #10085 )
...
* init
* always make causal_mask
* not return last tensor
* update
* optimize_model = False
* enable optimized=False
* enable optimized_model=true
* speed_up ipex target_model
* remove if True
* use group_size
* update python style
* update
* update
2024-02-19 13:38:32 +08:00
Heyang Sun
177273c1a4
IPEX Speculative Support for Baichuan2 7B ( #10112 )
...
* IPEX Speculative Support for Baichuan2 7B
* fix license problems
* refine
2024-02-19 09:12:57 +08:00
Jason Dai
6f38e604de
Fix README.md ( #10156 )
2024-02-18 21:51:40 +08:00
Shaojun Liu
7a3a20cf5b
Fix: GitHub-owned GitHubAction not pinned by hash ( #10152 )
2024-02-18 16:49:28 +08:00
Shaojun Liu
c3daacec6d
Fix Token Permission issues ( #10151 )
...
Co-authored-by: Your Name <Your Email>
2024-02-18 13:23:54 +08:00
Yina Chen
1508d6b089
Fix gptj kvcache & position id ( #10141 )
2024-02-18 10:02:49 +08:00
Kai Huang
7400401706
Update gpu pip install oneapi doc ( #10137 )
...
* fix link
* fix
* fix
* minor
2024-02-09 11:27:40 +08:00
yb-peng
b7c5104d98
remove limit in harness run ( #10139 )
2024-02-09 11:20:53 +08:00
yb-peng
b4dc33def6
In harness-evaluation workflow, add statistical tables ( #10118 )
...
* change storage
* fix typo
* change label
* change label to arc03
* change needs in the last step
* add generate csv in harness/make_table_results.py
* modify needs in the last job
* add csv to html
* fix path issue in llm-harness-summary-nightly
* modify output_path
* modify args in make_table_results.py
* modify make table command in summary
* change pr env label
* remove irrelevant code in summary; add set output path step; add limit in harness run
* re-organize code structure
* modify limit in run harness
* modify csv_to_html input path
* modify needs in summary-nightly
2024-02-08 19:01:05 +08:00
Shaojun Liu
c2378a9546
Fix code scanning issues ( #10129 )
...
* Fix code scanning issues
* update oneccl_bind_pt link
* update
* update
---------
Co-authored-by: Your Name <Your Email>
2024-02-08 17:19:44 +08:00
Yishuo Wang
4d33aac7f9
quick fix qwen2 fp8 kv cache ( #10135 )
2024-02-08 17:04:59 +08:00
Cengguang Zhang
39d90839aa
LLM: add quantize kv cache for llama. ( #10086 )
...
* feat: add quantize kv cache for llama.
* fix style.
* add quantized attention forward function.
* revert style.
* fix style.
* fix style.
* update quantized kv cache and add quantize_qkv
* fix style.
* fix style.
* optimize quantize kv cache.
* fix style.
2024-02-08 16:49:22 +08:00
Yishuo Wang
d848efe17c
add quantize kv cache support for qwen2 ( #10134 )
2024-02-08 16:17:21 +08:00
SONG Ge
3f79128ed7
[LLM] Enable kv_cache optimization for Qwen2 on transformers-v4.37.0 ( #10131 )
...
* add support for kv_cache optimization on transformers-v4.37.0
* enable attention forward
* style fix
* disable rotary for now
2024-02-08 14:20:26 +08:00
Ruonan Wang
063dc145ac
LLM: basic support for q2k ( #10132 )
...
* basic support for q2k
* fix style
2024-02-08 13:52:01 +08:00
binbin Deng
11fe5a87ec
LLM: add Modelscope model example ( #10126 )
2024-02-08 11:18:07 +08:00
Cengguang Zhang
0cf6a12691
LLM: add default torch_dtype for fp16. ( #10124 )
...
* set default torch_dtype for fp16.
* fix style.
* bug fix.
* update bug fix.
2024-02-08 10:24:16 +08:00
Yishuo Wang
1aa0c623ce
disable fused layer norm on UHD ( #10130 )
2024-02-08 10:20:01 +08:00
Yuwen Hu
a8450fc300
[LLM] Support MLP optimization for Qwen1.5 ( #10123 )
2024-02-08 09:15:34 +08:00
Yuwen Hu
81ed65fbe7
[LLM] Add qwen1.5-7B in iGPU perf ( #10127 )
...
* Add qwen1.5 test config yaml with transformers 4.37.0
* Update for yaml file
2024-02-07 22:31:20 +08:00
Cheen Hau, 俊豪
a7f9a13f6e
Enhance gpu doc with PIP install oneAPI ( #10109 )
...
* Add pip install oneapi instructions
* Fixes
* Add instruction for oneapi2023
* Runtime config
* Fixes
* Remove "Currently, oneAPI installed with .. "
* Add pip package version for oneAPI 2024
* Reviewer comments
* Fix errors
2024-02-07 21:14:15 +08:00
hxsz1997
b4c327ea78
Llm ppl workflow bug fix ( #10128 )
...
* add llm-ppl workflow
* update the DATASET_DIR
* test multiple precisions
* modify nightly test
* match the updated ppl code
* add matrix.include
* fix the include error
* update the include
* add more model
* update the precision of include
* update nightly time and add more models
* fix the workflow_dispatch description, change default model of pr and modify the env
* modify workflow_dispatch language options
* modify options
* modify language options
* modify workflow_dispatch type
* modify type
* modify the type of language
* change seq_len type
2024-02-07 18:48:14 +08:00
hxsz1997
76bd792ff1
Fix llm ppl workflow workflow_dispatch bugs ( #10125 )
...
* add llm-ppl workflow
* update the DATASET_DIR
* test multiple precisions
* modify nightly test
* match the updated ppl code
* add matrix.include
* fix the include error
* update the include
* add more model
* update the precision of include
* update nightly time and add more models
* fix the workflow_dispatch description, change default model of pr and modify the env
* modify workflow_dispatch language options
* modify options
* modify language options
2024-02-07 17:41:44 +08:00
Jin Qiao
0fcfbfaf6f
LLM: add rwkv5 eagle GPU HF example ( #10122 )
...
* LLM: add rwkv5 eagle example
* fix
* fix link
2024-02-07 16:58:29 +08:00
Shaojun Liu
9f5a86f9db
fix OpenSSF Token-Permissions issues ( #10121 )
...
Co-authored-by: Your Name <Your Email>
2024-02-07 16:51:10 +08:00
binbin Deng
925f82107e
LLM: support models hosted by modelscope ( #10106 )
2024-02-07 16:46:36 +08:00
hxsz1997
1710ecb990
Add llm-ppl workflow ( #10074 )
...
* add llm-ppl workflow
* update the DATASET_DIR
* test multiple precisions
* modify nightly test
* match the updated ppl code
* add matrix.include
* fix the include error
* update the include
* add more model
* update the precision of include
* update nightly time and add more models
* fix the workflow_dispatch description, change default model of pr and modify the env
2024-02-07 16:29:57 +08:00
binbin Deng
c1ec3d8921
LLM: update FAQ about too many open files ( #10119 )
2024-02-07 15:02:24 +08:00
Keyan (Kyrie) Zhang
2e80701f58
Unit test on final logits and the logits of the last attention layer ( #10093 )
...
* Add unit test on final logits and attention
* Add unit test on final logits and attention
* Modify unit test on final logits and attention
2024-02-07 14:25:36 +08:00
Yuxuan Xia
3832eb0ce0
Add ChatGLM C-Eval Evaluator ( #10095 )
...
* Add ChatGLM ceval evaluator
* Modify ChatGLM Evaluator Reference
2024-02-07 11:27:06 +08:00
Shaojun Liu
5e9710cec4
Update threshold for cpu stable version tests ( #10108 )
...
* update threshold
* update
* test
* update
* update
* revert
* revert
---------
Co-authored-by: Your Name <Your Email>
2024-02-07 11:21:23 +08:00
Jin Qiao
63050c954d
fix ( #10117 )
2024-02-07 11:05:11 +08:00
Jin Qiao
d3d2ee1b63
LLM: add speech T5 GPU example ( #10090 )
...
* add speech t5 example
* fix
* fix
2024-02-07 10:50:02 +08:00
Jin Qiao
2f4c754759
LLM: add bark gpu example ( #10091 )
...
* add bark gpu example
* fix
* fix license
* add bark
* add example
* fix
* another way
2024-02-07 10:47:11 +08:00
Xiangyu Tian
8953acd7d6
[LLM] Fix log condition for BIGDL_OPT_IPEX ( #10115 )
...
Fix log condition for BIGDL_OPT_IPEX
2024-02-07 10:27:10 +08:00
yb-peng
3f60e9df89
Merge pull request #10101 from pengyb2001/eval_stat
...
Modify harness evaluation workflow
2024-02-07 00:02:57 +08:00
pengyb2001
f63eba6c5a
change pr test machine
2024-02-06 23:35:18 +08:00
pengyb2001
e627727b4b
change download path
2024-02-06 21:12:51 +08:00
pengyb2001
2c4e610743
remove irrelevant code
2024-02-06 20:12:10 +08:00
Jason Dai
e2233dddef
Update README ( #10111 )
2024-02-06 19:29:07 +08:00
SONG Ge
0eccb94d75
remove text-generation-webui from bigdl repo ( #10107 )
2024-02-06 17:46:52 +08:00
Ovo233
2aaa21c41d
LLM: Update ppl tests ( #10092 )
...
* update ppl tests
* use load_dataset api
* add exception handling
* add language argument
* address comments
2024-02-06 17:31:48 +08:00