Commit graph

1613 commits

Author SHA1 Message Date
Jasonzzt
cb7ef38e86 rerun 2023-11-01 15:30:34 +08:00
Jasonzzt
8f6e979fad test again 2023-11-01 15:10:11 +08:00
Jasonzzt
b66584f23b test 2023-11-01 14:51:23 +08:00
Jasonzzt
ba148ff3ff test py311 2023-11-01 14:08:49 +08:00
Jasonzzt
6f1cee90a4 test 2023-11-01 13:58:03 +08:00
Jasonzzt
d51821e264 test 2023-11-01 13:49:32 +08:00
Jasonzzt
7c7a7f2ec1 spr & arc ut with python3,9&3.10&3.11 2023-11-01 13:17:13 +08:00
Jasonzzt
4f9fd0dffd arc-ut with 3.10 & 3.11 2023-11-01 10:51:57 +08:00
Cengguang Zhang
d4ab5904ef LLM: Add python 3.10 llm UT (#9302)
* add py310 test for llm-unit-test.

* add py310 llm-unit-tests

* add llm-cpp-build-py310

* test

* test

* test.

* test

* test

* fix deactivate.

* fix

* fix.

* fix

* test

* test

* test

* add build chatglm for win.

* test.

* fix
2023-11-01 10:15:32 +08:00
WeiguangHan
03aa368776 LLM: add the comparison between latest arc perf test and last one (#9296)
* add the comparison between latest test and last one to html

* resolve some comments

* modify some code logics
2023-11-01 09:53:02 +08:00
Jin Qiao
96f8158fe2 LLM: adjust dolly v2 GPU example README (#9318) 2023-11-01 09:50:22 +08:00
Jin Qiao
c44c6dc43a LLM: add chatglm3 examples (#9305) 2023-11-01 09:50:05 +08:00
Xin Qiu
06447a3ef6 add malloc and intel openmp to llm deps (#9322) 2023-11-01 09:47:45 +08:00
Cheen Hau, 俊豪
d638b93dfe Add test script and workflow for qlora fine-tuning (#9295)
* Add test script and workflow for qlora fine-tuning

* Test fix export model

* Download dataset

* Fix export model issue

* Reduce number of training steps

* Rename script

* Correction
2023-11-01 09:39:53 +08:00
Lilac09
2c2bc959ad add tools into previously built images (#9317)
* modify Dockerfile

* manually build

* modify Dockerfile

* add chat.py into inference-xpu

* add benchmark into inference-cpu

* manually build

* add benchmark into inference-cpu

* add benchmark into inference-cpu

* add benchmark into inference-cpu

* add chat.py into inference-xpu

* add chat.py into inference-xpu

* change ADD to COPY in dockerfile

* fix dependency issue

* temporarily remove run-spr in llm-cpu

* temporarily remove run-spr in llm-cpu
2023-10-31 16:35:18 +08:00
Ruonan Wang
d383ee8efb LLM: update QLoRA example about accelerate version(#9314) 2023-10-31 13:54:38 +08:00
Lilac09
030edeecac Ubuntu upgrade: fix installation error (#9309)
* upgrade ubuntu version in llm-inference cpu image

* fix installation issue

* fix installation issue

* fix installation issue
2023-10-31 09:55:15 +08:00
Lilac09
5842f7530e upgrade ubuntu version in llm-inference cpu image (#9307) 2023-10-30 16:51:38 +08:00
Cheen Hau, 俊豪
cee9eaf542 [LLM] Fix llm arc ut oom (#9300)
* Move model to cpu after testing so that gpu memory is deallocated

* Add code comment

---------

Co-authored-by: sgwhat <ge.song@intel.com>
2023-10-30 14:38:34 +08:00
dingbaorong
ee5becdd61 use coco image in Qwen-VL (#9298)
* use coco image

* add output

* address yuwen's comments
2023-10-30 14:32:35 +08:00
Yang Wang
163d033616 Support qlora in CPU (#9233)
* support qlora in CPU

* revert example

* fix style
2023-10-27 14:01:15 -07:00
Yang Wang
8838707009 Add deepspeed autotp example readme (#9289)
* Add deepspeed autotp example readme

* change word
2023-10-27 13:04:38 -07:00
dingbaorong
f053688cad add cpu example of LLaVA (#9269)
* add LLaVA cpu example

* Small text updates

* update link

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2023-10-27 18:59:20 +08:00
Zheng, Yi
7f2ad182fd Minor Fixes of README (#9294) 2023-10-27 18:25:46 +08:00
Zheng, Yi
1bff54a378 Display demo.jpg n the README.md of HuggingFace Transformers Agent (#9293)
* Display demo.jpg

* remove demo.jpg
2023-10-27 18:00:03 +08:00
Zheng, Yi
a4a1dec064 Add a cpu example of HuggingFace Transformers Agent (use vicuna-7b-v1.5) (#9284)
* Add examples of HF Agent

* Modify folder structure and add link of demo.jpg

* Fixes of readme

* Merge applications and Applications
2023-10-27 17:14:12 +08:00
Guoqiong Song
aa319de5e8 Add streaming-llm using llama2 on CPU (#9265)
Enable streaming-llm to let model take infinite inputs, tested on desktop and SPR10
2023-10-27 01:30:39 -07:00
Yuwen Hu
21631209a9 [LLM] Skip CPU performance test for now (#9291)
* Skip llm cpu performance test for now

* Add install for wheel package
2023-10-27 12:55:04 +08:00
Ziteng Zhang
46ab0419b8 Merge pull request #9279 from Jasonzzt/main
Add bigdl-llm-finetune-cpu to manually_build to upload image on hub
2023-10-27 09:55:08 +08:00
Cheen Hau, 俊豪
6c9ae420a5 Add regression test for optimize_model on gpu (#9268)
* Add MPT model to transformer API test

* Add regression test for optimize_model on gpu.

---------

Co-authored-by: sgwhat <ge.song@intel.com>
2023-10-27 09:23:19 +08:00
Yuwen Hu
733df28a2b [LLM] Migrate Arc UT to another runner (#9286)
* Separate arc llm ut to another runner

* Add dependency for einops
2023-10-26 19:08:57 +08:00
Cengguang Zhang
44b5fcc190 LLM: fix pretraining_tp argument issue. (#9281) 2023-10-26 18:43:58 +08:00
WeiguangHan
6b2a32eba2 LLM: add missing function for PyTorch InternLM model (#9285) 2023-10-26 18:05:23 +08:00
Yina Chen
f879c48f98 fp8 convert use ggml code (#9277) 2023-10-26 17:03:29 +08:00
Ziteng Zhang
916ccc0779 Update manually_build_for_testing.yml 2023-10-26 16:26:14 +08:00
Ziteng Zhang
14a23015f8 Update manually_build.yml 2023-10-26 16:24:03 +08:00
Jasonzzt
37b1708d16 Add bigdl-llm-finetune-cpu to manually_build 2023-10-26 15:53:44 +08:00
Jasonzzt
f2d1f5349c Merge branch 'main' of https://github.com/Jasonzzt/BigDL 2023-10-26 15:46:50 +08:00
Lilac09
4ed7f066d3 add bigdl-llm-finetune-xpu to manually_build (#9278) 2023-10-26 15:30:05 +08:00
Yina Chen
e2264e8845 Support arc fp4 (#9266)
* support arc fp4

* fix style

* fix style
2023-10-25 15:42:48 +08:00
Cheen Hau, 俊豪
ab40607b87 Enable unit test workflow on Arc (#9213)
* Add gpu workflow and a transformers API inference test

* Set device-specific env variables in script instead of workflow

* Fix status message

---------

Co-authored-by: sgwhat <ge.song@intel.com>
2023-10-25 15:17:18 +08:00
SONG Ge
160a1e5ee7 [WIP] Add UT for Mistral Optimized Model (#9248)
* add ut for mistral model

* update

* fix model path

* upgrade transformers version for mistral model

* refactor correctness ut for mustral model

* refactor mistral correctness ut

* revert test_optimize_model back

* remove mistral from test_optimize_model

* add to revert transformers version back to 4.31.0
2023-10-25 15:14:17 +08:00
Yang Wang
067c7e8098 Support deepspeed AutoTP (#9230)
* Support deepspeed

* add test script

* refactor convert

* refine example

* refine

* refine example

* fix style

* refine example and adapte latest ipex

* fix style
2023-10-24 23:46:28 -07:00
Yining Wang
a6a8afc47e Add qwen vl CPU example (#9221)
* eee

* add examples on CPU and GPU

* fix

* fix

* optimize model examples

* add Qwen-VL-Chat CPU example

* Add Qwen-VL CPU example

* fix optimize problem

* fix error

* Have updated, benchmark fix removed from this PR

* add generate API example

* Change formats in qwen-vl example

* Add CPU transformer int4 example for qwen-vl

* fix repo-id problem and add Readme

* change picture url

* Remove unnecessary file

---------

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2023-10-25 13:22:12 +08:00
binbin Deng
f597a9d4f5 LLM: update perf test configuration (#9264) 2023-10-25 12:35:48 +08:00
binbin Deng
770ac70b00 LLM: add low_bit option in benchmark scripts (#9257) 2023-10-25 10:27:48 +08:00
WeiguangHan
ec9195da42 LLM: using html to visualize the perf result for Arc (#9228)
* LLM: using html to visualize the perf result for Arc

* deploy the html file

* add python license

* reslove some comments
2023-10-24 18:05:25 +08:00
Jin Qiao
90162264a3 LLM: replace torch.float32 with auto type (#9261) 2023-10-24 17:12:13 +08:00
SONG Ge
bd5215d75b [LLM] Reimplement chatglm fuse rms optimization (#9260)
* re-implement chatglm rope rms

* update
2023-10-24 16:35:12 +08:00
dingbaorong
5a2ce421af add cpu and gpu examples of flan-t5 (#9171)
* add cpu and gpu examples of flan-t5

* address yuwen's comments
* Add explanation  why we add modules to not convert
* Refine prompt and add a translation example
* Add a empty line at the end of files

* add examples of flan-t5 using optimize_mdoel api

* address bin's comments

* address binbin's comments

* add flan-t5 in readme
2023-10-24 15:24:01 +08:00