hxsz1997
b6234eb4e2
Add task in allinone ( #11226 )
...
* add task
* update prompt
* modify typos
* add more cases in summarize
* Make the summarize & QA prompt preprocessing a util function
2024-06-06 17:22:40 +08:00
Wenjing Margaret Mao
231b968aba
Modify the check_results.py to support batch 2&4 ( #11133 )
...
* add batch 2&4 and exclude to perf_test
* modify the perf-test&437 yaml
* modify llm_performance_test.yml
* remove batch 4
* modify check_results.py to support batch 2&4
* change the batch_size format
* remove genxir
* add str(batch_size)
* change actual_test_cases in check_results file to support batch_size
* change html highlight
* fewer models to test html and html_path
* delete the moe model
* split batch html
* split
* use installing from pypi
* use installing from pypi - batch2
* revert cpp
* revert cpp
* merge two jobs into one, test batch_size in one job
* merge two jobs into one, test batch_size in one job
* change file directory in workflow
* use try/except to deal with odd files without batch_size
* modify pandas version
* change the dir
* organize the code
* organize the code
* remove Qwen-MOE
* modify based on feedback
* modify based on feedback
* modify based on second round of feedback
* modify based on second round of feedback + change run-arc.sh mode
* modify based on second round of feedback + revert config
* modify based on second round of feedback + revert config
* modify based on second round of feedback + remove comments
* modify based on second round of feedback + remove comments
* modify based on second round of feedback + revert arc-perf-test
* modify based on third round of feedback
* change error type
* change error type
* modify check_results.html
* split batch into two folders
* add all models
* move csv_name
* revert pr test
* revert pr test
---------
Co-authored-by: Yishuo Wang <yishuo.wang@intel.com>
2024-06-05 15:04:55 +08:00
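The "use try/except to deal with odd files without batch_size" step in the PR above can be sketched as follows; the filename pattern, helper name, and default are hypothetical illustrations, not the actual logic of check_results.py:

```python
import re

def batch_size_from_filename(name, default=1):
    # Assume result CSV names look like "...-batch2.csv"; older ("odd")
    # files may lack the batch suffix entirely.
    try:
        match = re.search(r"batch(\d+)", name)
        return int(match.group(1))  # raises AttributeError when no match
    except AttributeError:
        return default  # fall back for files without a batch_size marker

print(batch_size_from_filename("llama2-7b-batch4.csv"))  # -> 4
print(batch_size_from_filename("llama2-7b.csv"))         # -> 1
```

Catching the exception instead of pre-validating every filename keeps the common path short while still tolerating legacy files.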
Kai Huang
f93664147c
Update config.yaml ( #11208 )
...
* update config.yaml
* fix
* minor
* style
2024-06-04 19:58:18 +08:00
Yina Chen
711fa0199e
Fix fp6k phi3 ppl core dump ( #11204 )
2024-06-04 16:44:27 +08:00
Cengguang Zhang
3eb13ccd8c
LLM: fix input length condition in deepspeed all-in-one benchmark. ( #11185 )
2024-06-03 10:05:43 +08:00
ZehuaCao
4127b99ed6
Fix null pointer dereferences error. ( #11125 )
...
* delete unused function on tgi_server
* update
* update
* fix style
2024-05-30 16:16:10 +08:00
hxsz1997
62b2d8af6b
Add lookahead in all-in-one ( #11142 )
...
* add lookahead in all-in-one
* delete save to csv in run_transformer_int4_gpu
* change lookup to lookahead
* fix the error when adding model.peak_memory
* Set transformer_int4_gpu as the default option
* add comment of transformer_int4_fp16_lookahead_gpu
2024-05-28 15:39:58 +08:00
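The lookahead option added above refers to speculative decoding with prompt lookup. A minimal sketch of the lookup idea (an assumption about the general mechanism, not the repo's implementation) is: if the last few tokens have occurred earlier in the sequence, propose the tokens that followed them as a cheap draft for the model to verify.

```python
def lookahead_draft(tokens, ngram=2, draft_len=3):
    # Look for the most recent earlier occurrence of the trailing n-gram
    # and return the tokens that followed it as a draft continuation.
    key = tuple(tokens[-ngram:])
    for i in range(len(tokens) - ngram - 1, -1, -1):
        if tuple(tokens[i:i + ngram]) == key:
            return tokens[i + ngram:i + ngram + draft_len]
    return []  # no earlier occurrence: nothing to speculate

print(lookahead_draft([1, 2, 3, 4, 1, 2]))  # -> [3, 4, 1]
```

The verifier model then accepts the longest prefix of the draft that matches its own next-token predictions, which is what makes batched verification cheaper than token-by-token decoding.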
Zhao Changmin
15d906a97b
Update linux igpu run script ( #11098 )
...
* update run script
2024-05-22 17:18:07 +08:00
Kai Huang
f63172ef63
Align ppl with llama.cpp ( #11055 )
...
* update script
* remove
* add header
* update readme
2024-05-22 16:43:11 +08:00
Wang, Jian4
74950a152a
Fix tgi_api_server error file name ( #11075 )
2024-05-20 16:48:40 +08:00
Wang, Jian4
d9f71f1f53
Update benchmark util for example using ( #11027 )
...
* mv benchmark_util.py to utils/
* remove
* update
2024-05-15 14:16:35 +08:00
binbin Deng
4053a6ef94
Update environment variable setting in AutoTP with arc ( #11018 )
2024-05-15 10:23:58 +08:00
Shaojun Liu
7f8c5b410b
Quickstart: Run PyTorch Inference on Intel GPU using Docker (on Linux or WSL) ( #10970 )
...
* add entrypoint.sh
* add quickstart
* remove entrypoint
* update
* Install related library of benchmarking
* update
* print out results
* update docs
* minor update
* update
* update quickstart
* update
* update
* update
* update
* update
* update
* add chat & example section
* add more details
* minor update
* rename quickstart
* update
* minor update
* update
* update config.yaml
* update readme
* use --gpu
* add tips
* minor update
* update
2024-05-14 12:58:31 +08:00
ZehuaCao
99255fe36e
fix ppl ( #10996 )
2024-05-13 13:57:19 +08:00
Xin Qiu
dfa3147278
update ( #10944 )
2024-05-08 14:28:05 +08:00
Cengguang Zhang
0edef1f94c
LLM: add min_new_tokens to all in one benchmark. ( #10911 )
2024-05-06 09:32:59 +08:00
Yuwen Hu
1a8a93d5e0
Further fix nightly perf ( #10901 )
2024-04-28 10:18:58 +08:00
Yuwen Hu
ddfdaec137
Fix nightly perf ( #10899 )
...
* Fix nightly perf by adding default value in benchmark for use_fp16_torch_dtype
* further fixes
2024-04-28 09:39:29 +08:00
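The fix above (a default value for use_fp16_torch_dtype) follows a common pattern: older config.yaml files do not carry the new key, so the benchmark fills in a default instead of crashing on a KeyError. A hypothetical sketch, with the helper name and default invented for illustration:

```python
def load_api_params(conf):
    # Copy the parsed config and backfill keys that older
    # config.yaml files may not define yet.
    params = dict(conf)
    params.setdefault("use_fp16_torch_dtype", False)  # default added by the fix
    return params

print(load_api_params({}))  # -> {'use_fp16_torch_dtype': False}
```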
binbin Deng
f51bf018eb
Add benchmark script for pipeline parallel inference ( #10873 )
2024-04-26 15:28:11 +08:00
Cengguang Zhang
cd369c2715
LLM: add device id to benchmark utils. ( #10877 )
2024-04-25 14:01:51 +08:00
Cengguang Zhang
eb39c61607
LLM: add min new token to perf test. ( #10869 )
2024-04-24 14:32:02 +08:00
yb-peng
c9dee6cd0e
Update 8192.txt ( #10824 )
...
* Update 8192.txt
* Update 8192.txt with original text
2024-04-23 14:02:09 +08:00
Wang, Jian4
23c6a52fb0
LLM: Fix ipex torchscript=True error ( #10832 )
...
* remove
* update
* remove torchscript
2024-04-22 15:53:09 +08:00
Kai Huang
053ec30737
Transformers ppl evaluation on wikitext ( #10784 )
...
* transformers code
* cache
2024-04-18 15:27:18 +08:00
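Both this wikitext evaluation and the earlier "Align ppl with llama.cpp" change rest on the standard definition of perplexity: the exponential of the mean per-token negative log-likelihood. A minimal sketch (the function name and inputs are illustrative, not taken from the evaluation scripts):

```python
import math

def perplexity(nlls, n_tokens):
    # ppl = exp( sum of token negative log-likelihoods / token count );
    # agreeing on the token count and log base is what "aligning" with
    # llama.cpp amounts to.
    return math.exp(sum(nlls) / n_tokens)

print(perplexity([0.0, 0.0], 2))  # -> 1.0 (a perfectly confident model)
```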
hxsz1997
0d518aab8d
Merge pull request #10697 from MargarettMao/ceval
...
combine English and Chinese, remove NaN
2024-04-12 14:37:47 +08:00
jenniew
cdbb1de972
Mark Color Modification
2024-04-12 14:00:50 +08:00
yb-peng
2685c41318
Modify all-in-one benchmark ( #10726 )
...
* Update 8192 prompt in all-in-one
* Add cpu_embedding param for linux api
* Update run.py
* Update README.md
2024-04-11 13:38:50 +08:00
Wenjing Margaret Mao
289cc99cd6
Update README.md ( #10700 )
...
Edit "summarize the results"
2024-04-09 16:01:12 +08:00
Wenjing Margaret Mao
d3116de0db
Update README.md ( #10701 )
...
edit "summarize the results"
2024-04-09 15:50:25 +08:00
Chen, Zhentao
d59e0cce5c
Migrate harness to ipexllm ( #10703 )
...
* migrate to ipexlm
* fix workflow
* fix run_multi
* fix precision map
* rename ipexlm to ipexllm
* rename bigdl to ipex in comments
2024-04-09 15:48:53 +08:00
jenniew
591bae092c
combine English and Chinese, remove NaN
2024-04-08 19:37:51 +08:00
yb-peng
2d88bb9b4b
add test api transformer_int4_fp16_gpu ( #10627 )
...
* add test api transformer_int4_fp16_gpu
* update config.yaml and README.md in all-in-one
* modify run.py in all-in-one
* re-order test-api
* re-order test-api in config
* modify README.md in all-in-one
* modify README.md in all-in-one
* modify config.yaml
---------
Co-authored-by: pengyb2001 <arda@arda-arc21.sh.intel.com>
Co-authored-by: ivy-lv11 <zhicunlv@gmail.com>
2024-04-07 15:47:17 +08:00
Wang, Jian4
9ad4b29697
LLM: CPU benchmark using tcmalloc ( #10675 )
2024-04-07 14:17:01 +08:00
binbin Deng
d9a1153b4e
LLM: upgrade deepspeed in AutoTP on GPU ( #10647 )
2024-04-07 14:05:19 +08:00
binbin Deng
27be448920
LLM: add cpu_embedding and peak memory record for deepspeed autotp script ( #10621 )
2024-04-02 17:32:50 +08:00
Shaojun Liu
a10f5a1b8d
add python style check ( #10620 )
...
* add python style check
* fix style checks
* update runner
* add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow
* update tag to 2.1.0-SNAPSHOT
2024-04-02 16:17:56 +08:00
Ruonan Wang
d6af4877dd
LLM: remove ipex.optimize for gpt-j ( #10606 )
...
* remove ipex.optimize
* fix
* fix
2024-04-01 12:21:49 +08:00
WeiguangHan
fbeb10c796
LLM: Set different env based on different Linux kernels ( #10566 )
2024-03-27 17:56:33 +08:00
Ruonan Wang
ea4bc450c4
LLM: add esimd sdp for pvc ( #10543 )
...
* add esimd sdp for pvc
* update
* fix
* fix batch
2024-03-26 19:04:40 +08:00
Shaojun Liu
c563b41491
add nightly_build workflow ( #10533 )
...
* add nightly_build workflow
* add create-job-status-badge action
* update
* update
* update
* update setup.py
* release
* revert
2024-03-26 12:47:38 +08:00
Wang, Jian4
16b2ef49c6
Update document by heyang ( #30 )
2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm ( #24 )
...
* Rename bigdl/llm to ipex_llm
* rm python/llm/src/bigdl
* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
binbin Deng
85ef3f1d99
LLM: add empty cache in deepspeed autotp benchmark script ( #10488 )
2024-03-21 10:51:23 +08:00
Xiangyu Tian
5a5fd5af5b
LLM: Add speculative benchmark on CPU/XPU ( #10464 )
...
Add speculative benchmark on CPU/XPU.
2024-03-21 09:51:06 +08:00
Xiangyu Tian
cbe24cc7e6
LLM: Enable BigDL IPEX Int8 ( #10480 )
...
Enable BigDL IPEX Int8
2024-03-20 15:59:54 +08:00
Jin Qiao
e41d556436
LLM: change fp16 benchmark to model.half ( #10477 )
...
* LLM: change fp16 benchmark to model.half
* fix
2024-03-20 13:38:39 +08:00
Jin Qiao
e9055c32f9
LLM: fix fp16 mem record in benchmark ( #10461 )
...
* LLM: fix fp16 mem record in benchmark
* change style
2024-03-19 16:17:23 +08:00
Jin Qiao
0451103a43
LLM: add int4+fp16 benchmark script for windows benchmarking ( #10449 )
...
* LLM: add fp16 for benchmark script
* remove transformer_int4_fp16_loadlowbit_gpu_win
2024-03-19 11:11:25 +08:00
Yuxuan Xia
f36224aac4
Fix ceval run.sh ( #10410 )
2024-03-14 10:57:25 +08:00
Wang, Jian4
0193f29411
LLM : Enable gguf float16 and Yuan2 model ( #10372 )
...
* enable float16
* add yuan files
* enable yuan
* enable set low_bit on yuan2
* update
* update license
* update generate
* update readme
* update python style
* update
2024-03-13 10:19:18 +08:00