Jason Dai
b3178d449f
Update README.md ( #9525 )
2023-11-23 21:45:20 +08:00
Jason Dai
82898a4203
Update GPU example README ( #9524 )
2023-11-23 21:20:26 +08:00
Jason Dai
064848028f
Update README.md ( #9523 )
2023-11-23 21:16:21 +08:00
Guancheng Fu
bf579507c2
Integrate vllm ( #9310 )
* done
* Rename structure
* add models
* Add structure/sampling_params,sequence
* add input_metadata
* add outputs
* Add policy,logger
* add and update
* add parallelconfig back
* core/scheduler.py
* Add llm_engine.py
* Add async_llm_engine.py
* Add tested entrypoint
* fix minor error
* Fix everything
* fix kv cache view
* fix
* fix
* fix
* format&refine
* remove logger from repo
* try to add token latency
* remove logger
* Refine config.py
* finish worker.py
* delete utils.py
* add license
* refine
* refine sequence.py
* remove sampling_params.py
* finish
* add license
* format
* add license
* refine
* refine
* Refine line too long
* remove exception
* so dumb style-check
* refine
* refine
* refine
* refine
* refine
* refine
* add README
* refine README
* add warning instead error
* fix padding
* add license
* format
* format
* format fix
* Refine vllm dependency (#1 )
vllm dependency clear
* fix licence
* fix format
* fix format
* fix
* adapt LLM engine
* fix
* add license
* fix format
* fix
* Moving README.md to the correct position
* Fix readme.md
* done
* guide for adding models
* fix
* Fix README.md
* Add new model readme
* remove ray-logic
* refactor arg_utils.py
* remove distributed_init_method logic
* refactor entrypoints
* refactor input_metadata
* refactor model_loader
* refactor utils.py
* refactor models
* fix api server
* remove vllm.structure
* revert by txy 1120
* remove utils
* format
* fix license
* add bigdl model
* Refer to a specific commit
* Change code base
* add comments
* add async_llm_engine comment
* refine
* formatted
* add worker comments
* add comments
* add comments
* fix style
* add changes
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com>
2023-11-23 16:46:45 +08:00
Heyang Sun
48fbb1eb94
support ccl (MPI) distributed mode in alpaca_qlora_finetuning_cpu ( #9507 )
2023-11-23 10:58:09 +08:00
Heyang Sun
11fa5a8a0e
Fix QLoRA CPU dispatch_model issue about accelerate ( #9506 )
2023-11-23 08:41:25 +08:00
Heyang Sun
1453046938
install bigdl-llm in deepspeed cpu inference example ( #9508 )
2023-11-23 08:39:21 +08:00
binbin Deng
86743fb57b
LLM: fix transformers version in CPU finetuning example ( #9511 )
2023-11-22 15:53:07 +08:00
binbin Deng
1a2129221d
LLM: support resume from checkpoint in Alpaca QLoRA ( #9502 )
2023-11-22 13:49:14 +08:00
Ruonan Wang
076d106ef5
LLM: GPU QLoRA update to bf16 to accelerate gradient checkpointing ( #9499 )
* update to bf16 to accelerate gradient checkpoint
* add utils and fix ut
2023-11-21 17:08:36 +08:00
binbin Deng
b7ae572ac3
LLM: update Alpaca QLoRA finetuning example on GPU ( #9492 )
2023-11-21 14:22:19 +08:00
Wang, Jian4
c5cb3ab82e
LLM: Add CPU alpaca qlora example ( #9469 )
* init
* update xpu to cpu
* update
* update readme
* update example
* update
* add refer
* add guide to train different datasets
* update readme
* update
2023-11-21 09:19:58 +08:00
binbin Deng
96fd26759c
LLM: fix QLoRA finetuning example on CPU ( #9489 )
2023-11-20 14:31:24 +08:00
binbin Deng
3dac21ac7b
LLM: add more example usages about alpaca qlora on different hardware ( #9458 )
2023-11-17 09:56:43 +08:00
Heyang Sun
921b263d6a
update deepspeed install and run guide in README ( #9441 )
2023-11-17 09:11:39 +08:00
Yina Chen
d5263e6681
Add awq load support ( #9453 )
* Support directly loading GPTQ models from huggingface
* fix style
* fix tests
* change example structure
* address comments
* fix style
* init
* address comments
* add examples
* fix style
* fix style
* fix style
* fix style
* update
* remove
* meet comments
* fix style
---------
Co-authored-by: Yang Wang <yang3.wang@intel.com>
2023-11-16 14:06:25 +08:00
Ruonan Wang
0f82b8c3a0
LLM: update qlora example ( #9454 )
* update qlora example
* fix loss=0
2023-11-15 09:24:15 +08:00
Yang Wang
51d07a9fd8
Support directly loading gptq models from huggingface ( #9391 )
* Support directly loading GPTQ models from huggingface
* fix style
* fix tests
* change example structure
* address comments
* fix style
* address comments
2023-11-13 20:48:12 -08:00
Heyang Sun
da6bbc8c11
fix deepspeed dependencies to install ( #9400 )
* remove redundant parameter from deepspeed install
* Update install.sh
* Update install.sh
2023-11-13 16:42:50 +08:00
Zheng, Yi
9b5d0e9c75
Add examples for Yi-6B ( #9421 )
2023-11-13 10:53:15 +08:00
Wang, Jian4
ac7fbe77e2
Update qlora readme ( #9416 )
2023-11-12 19:29:29 +08:00
Zheng, Yi
0674146cfb
Add cpu and gpu examples of distil-whisper ( #9374 )
* Add distil-whisper examples
* Fixes based on comments
* Minor fixes
---------
Co-authored-by: Ariadne330 <wyn2000330@126.com>
2023-11-10 16:09:55 +08:00
Ziteng Zhang
ad81b5d838
Update qlora README.md ( #9422 )
2023-11-10 15:19:25 +08:00
Heyang Sun
b23b91407c
fix llm-init on deepspeed missing lib ( #9419 )
2023-11-10 13:51:24 +08:00
dingbaorong
36fbe2144d
Add CPU examples of fuyu ( #9393 )
* add fuyu cpu examples
* add gpu example
* add comments
* add license
* remove gpu example
* fix inference time
2023-11-09 15:29:19 +08:00
binbin Deng
54d95e4907
LLM: add alpaca qlora finetuning example ( #9276 )
2023-11-08 16:25:17 +08:00
binbin Deng
97316bbb66
LLM: highlight transformers version requirement in mistral examples ( #9380 )
2023-11-08 16:05:03 +08:00
Heyang Sun
af94058203
[LLM] Support CPU deepspeed distributed inference ( #9259 )
* [LLM] Support CPU Deepspeed distributed inference
* Update run_deepspeed.py
* Rename
* fix style
* add new codes
* refine
* remove annotated codes
* refine
* Update README.md
* refine doc and example code
2023-11-06 17:56:42 +08:00
Jin Qiao
e6b6afa316
LLM: add aquila2 model example ( #9356 )
2023-11-06 15:47:39 +08:00
Yining Wang
9377b9c5d7
add CodeShell CPU example ( #9345 )
* add CodeShell CPU example
* fix some problems
2023-11-03 13:15:54 +08:00
Zheng, Yi
63411dff75
Add cpu examples of WizardCoder ( #9344 )
* Add wizardcoder example
* Minor fixes
2023-11-02 20:22:43 +08:00
dingbaorong
2e3bfbfe1f
Add internlm_xcomposer cpu examples ( #9337 )
* add internlm-xcomposer cpu examples
* use chat
* some fixes
* add license
* address shengsheng's comments
* use demo.jpg
2023-11-02 15:50:02 +08:00
Jin Qiao
97a38958bd
LLM: add CodeLlama CPU and GPU examples ( #9338 )
* LLM: add codellama CPU pytorch examples
* LLM: add codellama CPU transformers examples
* LLM: add codellama GPU transformers examples
* LLM: add codellama GPU pytorch examples
* LLM: add codellama in readme
* LLM: add LLaVA link
2023-11-02 15:34:25 +08:00
Zheng, Yi
63b2556ce2
Add cpu examples of skywork ( #9340 )
2023-11-02 15:10:45 +08:00
dingbaorong
f855a864ef
add llava gpu example ( #9324 )
* add llava gpu example
* use 7b model
* fix typo
* add in README
2023-11-02 14:48:29 +08:00
Wang, Jian4
149146004f
LLM: Add qlora finetuning CPU example ( #9275 )
* add qlora finetuning example
* update readme
* update example
* remove merge.py and update readme
2023-11-02 09:45:42 +08:00
Cengguang Zhang
9f3d4676c6
LLM: Add qwen-vl gpu example ( #9290 )
* create qwen-vl gpu example.
* add readme.
* fix.
* change input figure and update outputs.
* add qwen-vl pytorch model gpu example.
* fix.
* add readme.
2023-11-01 11:01:39 +08:00
Jin Qiao
96f8158fe2
LLM: adjust dolly v2 GPU example README ( #9318 )
2023-11-01 09:50:22 +08:00
Jin Qiao
c44c6dc43a
LLM: add chatglm3 examples ( #9305 )
2023-11-01 09:50:05 +08:00
Ruonan Wang
d383ee8efb
LLM: update QLoRA example about accelerate version ( #9314 )
2023-10-31 13:54:38 +08:00
dingbaorong
ee5becdd61
use coco image in Qwen-VL ( #9298 )
* use coco image
* add output
* address yuwen's comments
2023-10-30 14:32:35 +08:00
Yang Wang
8838707009
Add deepspeed autotp example readme ( #9289 )
* Add deepspeed autotp example readme
* change word
2023-10-27 13:04:38 -07:00
dingbaorong
f053688cad
add cpu example of LLaVA ( #9269 )
* add LLaVA cpu example
* Small text updates
* update link
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2023-10-27 18:59:20 +08:00
Zheng, Yi
7f2ad182fd
Minor Fixes of README ( #9294 )
2023-10-27 18:25:46 +08:00
Zheng, Yi
1bff54a378
Display demo.jpg in the README.md of HuggingFace Transformers Agent ( #9293 )
* Display demo.jpg
* remove demo.jpg
2023-10-27 18:00:03 +08:00
Zheng, Yi
a4a1dec064
Add a cpu example of HuggingFace Transformers Agent (use vicuna-7b-v1.5) ( #9284 )
* Add examples of HF Agent
* Modify folder structure and add link of demo.jpg
* Fixes of readme
* Merge applications and Applications
2023-10-27 17:14:12 +08:00
Guoqiong Song
aa319de5e8
Add streaming-llm using llama2 on CPU ( #9265 )
Enable streaming-llm to let model take infinite inputs, tested on desktop and SPR10
2023-10-27 01:30:39 -07:00
Yang Wang
067c7e8098
Support deepspeed AutoTP ( #9230 )
* Support deepspeed
* add test script
* refactor convert
* refine example
* refine
* refine example
* fix style
* refine example and adapt to latest ipex
* fix style
2023-10-24 23:46:28 -07:00
Yining Wang
a6a8afc47e
Add qwen vl CPU example ( #9221 )
* eee
* add examples on CPU and GPU
* fix
* fix
* optimize model examples
* add Qwen-VL-Chat CPU example
* Add Qwen-VL CPU example
* fix optimize problem
* fix error
* Have updated, benchmark fix removed from this PR
* add generate API example
* Change formats in qwen-vl example
* Add CPU transformer int4 example for qwen-vl
* fix repo-id problem and add Readme
* change picture url
* Remove unnecessary file
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
2023-10-25 13:22:12 +08:00
dingbaorong
5a2ce421af
add cpu and gpu examples of flan-t5 ( #9171 )
* add cpu and gpu examples of flan-t5
* address yuwen's comments
* Add explanation why we add modules to not convert
* Refine prompt and add a translation example
* Add an empty line at the end of files
* add examples of flan-t5 using optimize_model api
* address bin's comments
* address binbin's comments
* add flan-t5 in readme
2023-10-24 15:24:01 +08:00