Commit graph

1652 commits

Author SHA1 Message Date
binbin Deng
97316bbb66 LLM: highlight transformers version requirement in mistral examples (#9380) 2023-11-08 16:05:03 +08:00
Ruonan Wang
7e8fb29b7c LLM: optimize QLoRA by reducing convert time (#9370) 2023-11-08 13:14:34 +08:00
Chen, Zhentao
298b64217e add auto triggered acc test (#9364)
* add auto triggered acc test

* use llama 7b instead

* fix env

* debug download

* fix download prefix

* add cut dirs

* fix env of model path

* fix dataset download

* full job

* source xpu env vars

* use matrix to trigger model run

* reset batch=1

* remove redirect

* remove some trigger

* add task matrix

* add precision list

* test llama-7b-chat

* use /mnt/disk1 to store model and datasets

* remove installation test

* correct downloading path

* fix HF vars

* add bigdl-llm env vars

* rename file

* fix hf_home

* fix script path

* rename as harness evaluation

* rerun
2023-11-08 10:22:27 +08:00
Yishuo Wang
bfd9f88f0d [LLM] Use fp32 as dtype when batch_size <=8 and qtype is q4_0/q8_0/fp8 (#9365) 2023-11-08 09:54:53 +08:00
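The dtype rule in the commit above can be sketched as a small selection helper. This is an illustrative sketch only, assuming the policy is exactly as the commit title states (fp32 accumulation for small batches with q4_0/q8_0/fp8 weights, fp16 otherwise); the function and constant names are not BigDL-LLM's actual API.

```python
# Hypothetical sketch of the dtype-selection rule from the commit:
# small batches with q4_0/q8_0/fp8 quantized weights compute in fp32,
# everything else falls back to fp16.
SMALL_BATCH_QTYPES = {"q4_0", "q8_0", "fp8"}

def select_compute_dtype(batch_size: int, qtype: str) -> str:
    """Pick the accumulation dtype for a low-bit matmul."""
    if batch_size <= 8 and qtype in SMALL_BATCH_QTYPES:
        return "fp32"
    return "fp16"
```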
WeiguangHan
84ab614aab LLM: add more models and skip runtime error (#9349)
* add more models and skip runtime error

* upgrade transformers

* temporarily removed Mistral-7B-v0.1

* temporarily disable the upload of arc perf result
2023-11-08 09:45:53 +08:00
Heyang Sun
fae6db3ddc [LLM] refactor cpu low-bit forward logic (#9366)
* [LLM] refactor cpu low-bit forward logic

* fix style

* Update low_bit_linear.py

* Update low_bit_linear.py

* refine
2023-11-07 15:09:16 +08:00
Heyang Sun
af94058203 [LLM] Support CPU deepspeed distributed inference (#9259)
* [LLM] Support CPU Deepspeed distributed inference

* Update run_deepspeed.py

* Rename

* fix style

* add new codes

* refine

* remove annotated codes

* refine

* Update README.md

* refine doc and example code
2023-11-06 17:56:42 +08:00
Jin Qiao
f9bf5382ff Fix: add aquila2 in README (#9362) 2023-11-06 16:37:57 +08:00
Jin Qiao
e6b6afa316 LLM: add aquila2 model example (#9356) 2023-11-06 15:47:39 +08:00
Xin Qiu
1420e45cc0 Chatglm2 rope optimization on xpu (#9350) 2023-11-06 13:56:34 +08:00
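For context on the RoPE optimization above: rotary position embedding rotates consecutive pairs of a head's hidden vector by position-dependent angles. The pure-Python sketch below shows the reference math only (the optimized XPU kernel in the commit is not reproduced here); the pairing scheme and `base` value are standard assumptions, not ChatGLM2 specifics.

```python
import math

def apply_rope(x, position, base=10000.0):
    """Apply rotary position embedding to one head's vector x (even length),
    rotating each consecutive pair (x[i], x[i+1]) by a position-dependent angle."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = position / (base ** (i / d))
        c, s = math.cos(theta), math.sin(theta)
        x0, x1 = x[i], x[i + 1]
        out.extend([x0 * c - x1 * s, x0 * s + x1 * c])
    return out
```

Because each pair undergoes a pure rotation, the vector's norm is preserved, which is a handy sanity check for any fused implementation.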
Shaojun Liu
833e4dbc8d fix llm-performance-test-on-arc bug (#9357) 2023-11-06 10:00:25 +08:00
Yining Wang
9377b9c5d7 add CodeShell CPU example (#9345)
* add CodeShell CPU example

* fix some problems
2023-11-03 13:15:54 +08:00
Jason Dai
11a05641a4 Update readme (#9348) 2023-11-03 11:27:07 +08:00
ZehuaCao
ef83c3302e Use to test llm-performance on spr-perf (#9316)
* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update action.yml

* Create cpu-perf-test.yaml

* Update action.yml

* Update action.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml

* Update llm_performance_tests.yml
2023-11-03 11:17:16 +08:00
Yuwen Hu
a0150bb205 [LLM] Move embedding layer to CPU for iGPU inference (#9343)
* Move embedding layer to CPU for iGPU llm inference

* Empty cache after to cpu

* Remove empty cache as it seems to have some negative effect to first token
2023-11-03 11:13:45 +08:00
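The placement policy in the commit above (large embedding table on CPU, rest of the model on the iGPU) can be sketched as a device-planning helper. This is a minimal illustration of the idea, assuming placement is decided by module name; the names and matching rule are hypothetical, not BigDL-LLM's implementation.

```python
# Hedged sketch: keep embedding modules on CPU for iGPU inference,
# place everything else on the default accelerator device.
def plan_device(module_name: str, default_device: str = "xpu") -> str:
    """Return the device a module should live on under this policy."""
    if "embed" in module_name.lower():
        return "cpu"
    return default_device

# Example layout for a few typical (assumed) module names:
layout = {name: plan_device(name)
          for name in ["model.embed_tokens", "model.layers.0", "lm_head"]}
```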
Cheen Hau, 俊豪
8f23fb04dc Add inference test for Whisper model on Arc (#9330)
* Add inference test for Whisper model

* Remove unnecessary inference time measurement
2023-11-03 10:15:52 +08:00
Zheng, Yi
63411dff75 Add cpu examples of WizardCoder (#9344)
* Add wizardcoder example

* Minor fixes
2023-11-02 20:22:43 +08:00
Lilac09
74a8ad32dc Add entry point to llm-serving-xpu (#9339)
* add entry point to llm-serving-xpu

* manually build

* manually build

* add entry point to llm-serving-xpu

* manually build

* add entry point to llm-serving-xpu

* add entry point to llm-serving-xpu

* add entry point to llm-serving-xpu
2023-11-02 16:31:07 +08:00
Ziteng Zhang
4df66f5cbc Update llm-finetune-lora-cpu dockerfile and readme
* Update README.md

* Update Dockerfile
2023-11-02 16:26:24 +08:00
dingbaorong
2e3bfbfe1f Add internlm_xcomposer cpu examples (#9337)
* add internlm-xcomposer cpu examples

* use chat

* some fixes

* add license

* address shengsheng's comments

* use demo.jpg
2023-11-02 15:50:02 +08:00
Jin Qiao
97a38958bd LLM: add CodeLlama CPU and GPU examples (#9338)
* LLM: add codellama CPU pytorch examples

* LLM: add codellama CPU transformers examples

* LLM: add codellama GPU transformers examples

* LLM: add codellama GPU pytorch examples

* LLM: add codellama in readme

* LLM: add LLaVA link
2023-11-02 15:34:25 +08:00
Chen, Zhentao
d4dffbdb62 Merge harness (#9319)
* add harness patch and llb script

* add readme

* add license

* use patch instead

* update readme

* rename tests to evaluation

* fix typo

* remove nano dependency

* add original harness link

* rename title of usage

* rename BigDLGPULM as BigDLLM

* empty commit to rerun job
2023-11-02 15:14:19 +08:00
Zheng, Yi
63b2556ce2 Add cpu examples of skywork (#9340) 2023-11-02 15:10:45 +08:00
dingbaorong
f855a864ef add llava gpu example (#9324)
* add llava gpu example

* use 7b model

* fix typo

* add in README
2023-11-02 14:48:29 +08:00
Ziteng Zhang
dd3cf2f153 LLM: Add python 3.10 & 3.11 UT

2023-11-02 14:09:29 +08:00
Wang, Jian4
149146004f LLM: Add qlora finetuning CPU example (#9275)
* add qlora finetuning example

* update readme

* update example

* remove merge.py and update readme
2023-11-02 09:45:42 +08:00
Jasonzzt
d1bdc0ef72 spr & arc ut with python 3.9 & 3.10 & 3.11 2023-11-01 22:57:48 +08:00
Jasonzzt
687da21467 test 3.11 2023-11-01 19:14:53 +08:00
WeiguangHan
9722e811be LLM: add more models to the arc perf test (#9297)
* LLM: add more models to the arc perf test

* remove some old models

* install some dependencies
2023-11-01 16:56:32 +08:00
Jasonzzt
3c3329010d add conda update -n base conda 2023-11-01 16:36:35 +08:00
Jasonzzt
2fff0e8c21 use runner avx2 with linux 2023-11-01 16:28:29 +08:00
Jasonzzt
964a8e6dc1 update conda 2023-11-01 16:20:19 +08:00
Jin Qiao
6a128aee32 LLM: add ui for portable-zip (#9262) 2023-11-01 15:36:59 +08:00
Jasonzzt
cb7ef38e86 rerun 2023-11-01 15:30:34 +08:00
Jasonzzt
8f6e979fad test again 2023-11-01 15:10:11 +08:00
Jasonzzt
b66584f23b test 2023-11-01 14:51:23 +08:00
Jasonzzt
ba148ff3ff test py311 2023-11-01 14:08:49 +08:00
Yishuo Wang
726203d778 [LLM] Replace Embedding layer to fix it on CPU (#9254) 2023-11-01 13:58:10 +08:00
Jasonzzt
6f1cee90a4 test 2023-11-01 13:58:03 +08:00
Jasonzzt
d51821e264 test 2023-11-01 13:49:32 +08:00
Jasonzzt
7c7a7f2ec1 spr & arc ut with python 3.9 & 3.10 & 3.11 2023-11-01 13:17:13 +08:00
Yang Wang
e1bc18f8eb fix import ipex problem (#9323)
* fix import ipex problem

* fix style
2023-10-31 20:31:34 -07:00
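A common shape for an "fix import ipex problem" change is to guard the import so CPU-only environments degrade gracefully instead of crashing. The sketch below is an assumption about the failure mode, not the actual fix in the commit; the helper name is illustrative.

```python
# Hedged sketch: import Intel Extension for PyTorch only if it is installed,
# returning None so callers can fall back to a CPU-only path.
def try_import_ipex():
    """Return the ipex module if available, else None."""
    try:
        import intel_extension_for_pytorch as ipex
        return ipex
    except ImportError:
        return None
```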
Cengguang Zhang
9f3d4676c6 LLM: Add qwen-vl gpu example (#9290)
* create qwen-vl gpu example.

* add readme.

* fix.

* change input figure and update outputs.

* add qwen-vl pytorch model gpu example.

* fix.

* add readme.
2023-11-01 11:01:39 +08:00
Ruonan Wang
7e73c354a6 LLM: decoupling bigdl-llm and bigdl-nano (#9306) 2023-11-01 11:00:54 +08:00
Yina Chen
2262ae4d13 Support MoFQ4 on arc (#9301)
* init

* update

* fix style

* fix style

* fix style

* meet comments
2023-11-01 10:59:46 +08:00
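The mixture-of-formats idea behind MoFQ4 is to quantize each tensor with both an int4 grid and an fp4 grid and keep whichever format reconstructs it with lower error. The toy sketch below illustrates only that selection logic; the grids, scaling, and function names are assumptions, not the Arc implementation in the commit.

```python
# Toy 4-bit level grids (assumed, not the real formats): int4 is uniform,
# fp4 is non-uniform with finer levels near zero.
INT4_GRID = [i / 7.0 for i in range(-8, 8)]
FP4_GRID = [0.0, 0.0625, 0.125, 0.25, 0.5, 1.0,
            -0.0625, -0.125, -0.25, -0.5, -1.0]

def quantize(values, grid):
    """Round each value to the nearest grid level after per-tensor scaling;
    return the dequantized values and the squared reconstruction error."""
    scale = max(abs(v) for v in values) or 1.0
    deq = [scale * min(grid, key=lambda g: abs(v / scale - g)) for v in values]
    err = sum((v - d) ** 2 for v, d in zip(values, deq))
    return deq, err

def pick_format(values):
    """Keep whichever format (q4 or fp4) quantizes this tensor with less error."""
    _, e_int4 = quantize(values, INT4_GRID)
    _, e_fp4 = quantize(values, FP4_GRID)
    return "q4" if e_int4 <= e_fp4 else "fp4"
```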
Jasonzzt
4f9fd0dffd arc-ut with 3.10 & 3.11 2023-11-01 10:51:57 +08:00
binbin Deng
8ef8e25178 LLM: improve response speed in multi-turn chat (#9299)
* update

* fix stop word and add chatglm2 support

* remove system prompt
2023-11-01 10:30:44 +08:00
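One of the bullets above mentions fixing stop-word handling in multi-turn chat. A minimal sketch of that behavior, assuming generation is cut at the earliest occurrence of any stop string in the accumulated output; the helper names and stop tokens are illustrative, not the repo's actual code.

```python
def find_stop(text: str, stop_words) -> int:
    """Return the cut position of the earliest stop word in text, or -1."""
    positions = [text.find(s) for s in stop_words if s in text]
    return min(positions) if positions else -1

def truncate_at_stop(text: str, stop_words) -> str:
    """Drop everything from the first stop word onward."""
    pos = find_stop(text, stop_words)
    return text[:pos] if pos >= 0 else text
```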
Cengguang Zhang
d4ab5904ef LLM: Add python 3.10 llm UT (#9302)
* add py310 test for llm-unit-test.

* add py310 llm-unit-tests

* add llm-cpp-build-py310

* test

* test

* test.

* test

* test

* fix deactivate.

* fix

* fix.

* fix

* test

* test

* test

* add build chatglm for win.

* test.

* fix
2023-11-01 10:15:32 +08:00
WeiguangHan
03aa368776 LLM: add the comparison between latest arc perf test and last one (#9296)
* add the comparison between latest test and last one to html

* resolve some comments

* modify some code logics
2023-11-01 09:53:02 +08:00
Jin Qiao
96f8158fe2 LLM: adjust dolly v2 GPU example README (#9318) 2023-11-01 09:50:22 +08:00