105 commits

Author SHA1 Message Date
binbin Deng
2fe38b4b9b LLM: add mixtral GPU examples (#9661) 2023-12-12 20:26:36 +08:00
ZehuaCao
45721f3473 verify llava (#9649) 2023-12-11 14:26:05 +08:00
Heyang Sun
9f02f96160 [LLM] support for Yi AWQ model (#9648) 2023-12-11 14:07:34 +08:00
Yina Chen
70f5e7bf0d Support peft LoraConfig (#9636)
* support peft loraconfig

* use testcase to test

* fix style

* meet comments
2023-12-08 16:13:03 +08:00
binbin Deng
499100daf1 LLM: Add solution to fix oneccl related error (#9630) 2023-12-08 10:51:55 +08:00
Heyang Sun
3811cf43c9 [LLM] update AWQ documents (#9623)
* [LLM] update AWQ and verified models' documents

* refine

* refine links

* refine
2023-12-07 16:02:20 +08:00
Jason Dai
51b668f229 Update GGUF readme (#9611) 2023-12-06 18:21:54 +08:00
dingbaorong
a7bc89b3a1 remove q4_1 in gguf example (#9610)
* remove q4_1

* fixes
2023-12-06 16:00:05 +08:00
Yina Chen
404e101ded QALora example (#9551)
* Support qa-lora

* init

* update

* update

* update

* update

* update

* update merge

* update

* fix style & update scripts

* update

* address comments

* fix typo

* fix typo

---------

Co-authored-by: Yang Wang <yang3.wang@intel.com>
2023-12-06 15:36:21 +08:00
dingbaorong
89069d6173 Add gpu gguf example (#9603)
* add gpu gguf example

* some fixes

* address kai's comments

* address json's comments
2023-12-06 15:17:54 +08:00
Zheng, Yi
d154b38bf9 Add llama2 gpu low memory example (#9514)
* Add low memory example

* Minor fixes

* Update readme.md
2023-12-05 17:29:48 +08:00
Jason Dai
06febb5fa7 Update readme for FP8/FP4 inference examples (#9601) 2023-12-05 15:59:03 +08:00
dingbaorong
a66fbedd7e add gpu more data types example (#9592)
* add gpu more data types example

* add int8
2023-12-05 15:45:38 +08:00
Jinyi Wan
b721138132 Add cpu and gpu examples for BlueLM (#9589)
* Add cpu int4 example for BlueLM

* add example optimize_model cpu for bluelm

* add example gpu int4 blueLM

* add example optimize_model GPU for bluelm

* Fixing naming issues and BigDL package version.

* Fixing naming issues...

* Add BlueLM in README.md "Verified Models"
2023-12-05 13:59:02 +08:00
Guancheng Fu
8b00653039 fix doc (#9599) 2023-12-05 13:49:31 +08:00
Wang, Jian4
ed0dc57c6e LLM: Add cpu qlora support other models guide (#9567)
* use bf16 flag

* add using baichuan model

* update merge

* remove

* update
2023-12-01 11:18:04 +08:00
binbin Deng
4ff2ca9d0d LLM: fix loss error on Arc (#9550) 2023-11-29 15:16:18 +08:00
Guancheng Fu
963a5c8d79 Add vLLM-XPU version's README/examples (#9536)
* test

* test

* fix last kv cache

* add xpu readme

* remove numactl for xpu example

* fix link error

* update max_num_batched_tokens logic

* add explanation

* add xpu environment version requirement

* refine gpu memory

* fix

* fix style
2023-11-28 09:44:03 +08:00
binbin Deng
2b9c7d2a59 LLM: quick fix alpaca qlora finetuning script (#9534) 2023-11-27 11:04:27 +08:00
binbin Deng
6bec0faea5 LLM: support Mistral AWQ models (#9520) 2023-11-24 16:20:22 +08:00
Jason Dai
82898a4203 Update GPU example README (#9524) 2023-11-23 21:20:26 +08:00
binbin Deng
1a2129221d LLM: support resume from checkpoint in Alpaca QLoRA (#9502) 2023-11-22 13:49:14 +08:00
Ruonan Wang
076d106ef5 LLM: GPU QLoRA update to bf16 to accelerate gradient checkpointing (#9499)
* update to bf16 to accelerate gradient checkpoint

* add utils and fix ut
2023-11-21 17:08:36 +08:00
binbin Deng
b7ae572ac3 LLM: update Alpaca QLoRA finetuning example on GPU (#9492) 2023-11-21 14:22:19 +08:00
binbin Deng
3dac21ac7b LLM: add more example usages about alpaca qlora on different hardware (#9458) 2023-11-17 09:56:43 +08:00
Yina Chen
d5263e6681 Add awq load support (#9453)
* Support directly loading GPTQ models from huggingface

* fix style

* fix tests

* change example structure

* address comments

* fix style

* init

* address comments

* add examples

* fix style

* fix style

* fix style

* fix style

* update

* remove

* meet comments

* fix style

---------

Co-authored-by: Yang Wang <yang3.wang@intel.com>
2023-11-16 14:06:25 +08:00
Ruonan Wang
0f82b8c3a0 LLM: update qlora example (#9454)
* update qlora example

* fix loss=0
2023-11-15 09:24:15 +08:00
Yang Wang
51d07a9fd8 Support directly loading gptq models from huggingface (#9391)
* Support directly loading GPTQ models from huggingface

* fix style

* fix tests

* change example structure

* address comments

* fix style

* address comments
2023-11-13 20:48:12 -08:00
Zheng, Yi
9b5d0e9c75 Add examples for Yi-6B (#9421) 2023-11-13 10:53:15 +08:00
Zheng, Yi
0674146cfb Add cpu and gpu examples of distil-whisper (#9374)
* Add distil-whisper examples

* Fixes based on comments

* Minor fixes

---------

Co-authored-by: Ariadne330 <wyn2000330@126.com>
2023-11-10 16:09:55 +08:00
binbin Deng
54d95e4907 LLM: add alpaca qlora finetuning example (#9276) 2023-11-08 16:25:17 +08:00
binbin Deng
97316bbb66 LLM: highlight transformers version requirement in mistral examples (#9380) 2023-11-08 16:05:03 +08:00
Jin Qiao
e6b6afa316 LLM: add aquila2 model example (#9356) 2023-11-06 15:47:39 +08:00
Jin Qiao
97a38958bd LLM: add CodeLlama CPU and GPU examples (#9338)
* LLM: add codellama CPU pytorch examples

* LLM: add codellama CPU transformers examples

* LLM: add codellama GPU transformers examples

* LLM: add codellama GPU pytorch examples

* LLM: add codellama in readme

* LLM: add LLaVA link
2023-11-02 15:34:25 +08:00
dingbaorong
f855a864ef add llava gpu example (#9324)
* add llava gpu example

* use 7b model

* fix typo

* add in README
2023-11-02 14:48:29 +08:00
Cengguang Zhang
9f3d4676c6 LLM: Add qwen-vl gpu example (#9290)
* create qwen-vl gpu example.

* add readme.

* fix.

* change input figure and update outputs.

* add qwen-vl pytorch model gpu example.

* fix.

* add readme.
2023-11-01 11:01:39 +08:00
Jin Qiao
96f8158fe2 LLM: adjust dolly v2 GPU example README (#9318) 2023-11-01 09:50:22 +08:00
Jin Qiao
c44c6dc43a LLM: add chatglm3 examples (#9305) 2023-11-01 09:50:05 +08:00
Ruonan Wang
d383ee8efb LLM: update QLoRA example about accelerate version(#9314) 2023-10-31 13:54:38 +08:00
Yang Wang
8838707009 Add deepspeed autotp example readme (#9289)
* Add deepspeed autotp example readme

* change word
2023-10-27 13:04:38 -07:00
Yang Wang
067c7e8098 Support deepspeed AutoTP (#9230)
* Support deepspeed

* add test script

* refactor convert

* refine example

* refine

* refine example

* fix style

* refine example and adapt to latest ipex

* fix style
2023-10-24 23:46:28 -07:00
dingbaorong
5a2ce421af add cpu and gpu examples of flan-t5 (#9171)
* add cpu and gpu examples of flan-t5

* address yuwen's comments
* Add explanation of why we add modules to not convert
* Refine prompt and add a translation example
* Add an empty line at the end of files

* add examples of flan-t5 using optimize_model api

* address bin's comments

* address binbin's comments

* add flan-t5 in readme
2023-10-24 15:24:01 +08:00
Yining Wang
4a19f50d16 phi-1_5 CPU and GPU examples (#9173)
* eee

* add examples on CPU and GPU

* fix

* fix

* optimize model examples

* have updated

* Warmup and configs added

* Update two tables
2023-10-24 15:08:04 +08:00
Xin Qiu
0c5055d38c add position_ids and fuse embedding for falcon (#9242)
* add position_ids for falcon

* add cpu

* add cpu

* add license
2023-10-24 09:58:20 +08:00
Jin Qiao
a3b664ed03 LLM: add GPU More-Data-Types and Save/Load example (#9199) 2023-10-18 13:13:45 +08:00
Ruonan Wang
c0497ab41b LLM: support kv_cache optimization for Qwen-VL-Chat (#9193)
* support qwen_vl_chat

* fix style
2023-10-17 13:33:56 +08:00
Yang Wang
7a2de00b48 Fixes for xpu Bf16 training (#9156)
* Support bf16 training

* Use a stable transformer version

* remove env

* fix style
2023-10-14 21:28:59 -07:00
Jin Qiao
db7f938fdc LLM: add replit and starcoder to gpu pytorch model example (#9154) 2023-10-13 15:44:17 +08:00
Jin Qiao
797b156a0d LLM: add dolly-v1 and dolly-v2 to gpu pytorch model example (#9153) 2023-10-13 15:43:35 +08:00
Jin Qiao
f754ab3e60 LLM: add baichuan and baichuan2 to gpu pytorch model example (#9152) 2023-10-13 13:44:31 +08:00
JIN Qiao
1a1ddc4144 LLM: Add Replit CPU and GPU example (#9028) 2023-10-12 13:42:14 +08:00
JIN Qiao
d74834ff4c LLM: add gpu pytorch-models example llama2 and chatglm2 (#9142) 2023-10-12 13:41:48 +08:00
binbin Deng
995b0f119f LLM: update some gpu examples (#9136) 2023-10-11 14:23:56 +08:00
binbin Deng
2ad67a18b1 LLM: add mistral examples (#9121) 2023-10-11 13:38:15 +08:00
binbin Deng
5e9962b60e LLM: update example layout (#9046) 2023-10-09 15:36:39 +08:00