binbin Deng
2fe38b4b9b
LLM: add mixtral GPU examples ( #9661 )
2023-12-12 20:26:36 +08:00
ZehuaCao
45721f3473
verify llava ( #9649 )
2023-12-11 14:26:05 +08:00
Heyang Sun
9f02f96160
[LLM] support for Yi AWQ model ( #9648 )
2023-12-11 14:07:34 +08:00
Yina Chen
70f5e7bf0d
Support peft LoraConfig ( #9636 )
...
* support peft loraconfig
* use testcase to test
* fix style
* meet comments
2023-12-08 16:13:03 +08:00
binbin Deng
499100daf1
LLM: Add solution to fix oneccl related error ( #9630 )
2023-12-08 10:51:55 +08:00
Heyang Sun
3811cf43c9
[LLM] update AWQ documents ( #9623 )
...
* [LLM] update AWQ and verified models' documents
* refine
* refine links
* refine
2023-12-07 16:02:20 +08:00
Jason Dai
51b668f229
Update GGUF readme ( #9611 )
2023-12-06 18:21:54 +08:00
dingbaorong
a7bc89b3a1
remove q4_1 in gguf example ( #9610 )
...
* remove q4_1
* fixes
2023-12-06 16:00:05 +08:00
Yina Chen
404e101ded
QALora example ( #9551 )
...
* Support qa-lora
* init
* update
* update
* update
* update
* update
* update merge
* update
* fix style & update scripts
* update
* address comments
* fix typo
* fix typo
---------
Co-authored-by: Yang Wang <yang3.wang@intel.com>
2023-12-06 15:36:21 +08:00
dingbaorong
89069d6173
Add gpu gguf example ( #9603 )
...
* add gpu gguf example
* some fixes
* address kai's comments
* address json's comments
2023-12-06 15:17:54 +08:00
Zheng, Yi
d154b38bf9
Add llama2 gpu low memory example ( #9514 )
...
* Add low memory example
* Minor fixes
* Update readme.md
2023-12-05 17:29:48 +08:00
Jason Dai
06febb5fa7
Update readme for FP8/FP4 inference examples ( #9601 )
2023-12-05 15:59:03 +08:00
dingbaorong
a66fbedd7e
add gpu more data types example ( #9592 )
...
* add gpu more data types example
* add int8
2023-12-05 15:45:38 +08:00
Jinyi Wan
b721138132
Add cpu and gpu examples for BlueLM ( #9589 )
...
* Add cpu int4 example for BlueLM
* add example optimize_model cpu for bluelm
* add example gpu int4 blueLM
* add example optimize_model GPU for bluelm
* Fixing naming issues and BigDL package version.
* Fixing naming issues...
* Add BlueLM in README.md "Verified Models"
2023-12-05 13:59:02 +08:00
Guancheng Fu
8b00653039
fix doc ( #9599 )
2023-12-05 13:49:31 +08:00
Wang, Jian4
ed0dc57c6e
LLM: Add guide for cpu qlora support of other models ( #9567 )
...
* use bf16 flag
* add using baichuan model
* update merge
* remove
* update
2023-12-01 11:18:04 +08:00
binbin Deng
4ff2ca9d0d
LLM: fix loss error on Arc ( #9550 )
2023-11-29 15:16:18 +08:00
Guancheng Fu
963a5c8d79
Add vLLM-XPU version's README/examples ( #9536 )
...
* test
* test
* fix last kv cache
* add xpu readme
* remove numactl for xpu example
* fix link error
* update max_num_batched_tokens logic
* add explanation
* add xpu environment version requirement
* refine gpu memory
* fix
* fix style
2023-11-28 09:44:03 +08:00
binbin Deng
2b9c7d2a59
LLM: quick fix alpaca qlora finetuning script ( #9534 )
2023-11-27 11:04:27 +08:00
binbin Deng
6bec0faea5
LLM: support Mistral AWQ models ( #9520 )
2023-11-24 16:20:22 +08:00
Jason Dai
82898a4203
Update GPU example README ( #9524 )
2023-11-23 21:20:26 +08:00
binbin Deng
1a2129221d
LLM: support resume from checkpoint in Alpaca QLoRA ( #9502 )
2023-11-22 13:49:14 +08:00
Ruonan Wang
076d106ef5
LLM: GPU QLoRA update to bf16 to accelerate gradient checkpointing ( #9499 )
...
* update to bf16 to accelerate gradient checkpoint
* add utils and fix ut
2023-11-21 17:08:36 +08:00
binbin Deng
b7ae572ac3
LLM: update Alpaca QLoRA finetuning example on GPU ( #9492 )
2023-11-21 14:22:19 +08:00
binbin Deng
3dac21ac7b
LLM: add more example usages about alpaca qlora on different hardware ( #9458 )
2023-11-17 09:56:43 +08:00
Yina Chen
d5263e6681
Add awq load support ( #9453 )
...
* Support directly loading GPTQ models from huggingface
* fix style
* fix tests
* change example structure
* address comments
* fix style
* init
* address comments
* add examples
* fix style
* fix style
* fix style
* fix style
* update
* remove
* meet comments
* fix style
---------
Co-authored-by: Yang Wang <yang3.wang@intel.com>
2023-11-16 14:06:25 +08:00
Ruonan Wang
0f82b8c3a0
LLM: update qlora example ( #9454 )
...
* update qlora example
* fix loss=0
2023-11-15 09:24:15 +08:00
Yang Wang
51d07a9fd8
Support directly loading gptq models from huggingface ( #9391 )
...
* Support directly loading GPTQ models from huggingface
* fix style
* fix tests
* change example structure
* address comments
* fix style
* address comments
2023-11-13 20:48:12 -08:00
Zheng, Yi
9b5d0e9c75
Add examples for Yi-6B ( #9421 )
2023-11-13 10:53:15 +08:00
Zheng, Yi
0674146cfb
Add cpu and gpu examples of distil-whisper ( #9374 )
...
* Add distil-whisper examples
* Fixes based on comments
* Minor fixes
---------
Co-authored-by: Ariadne330 <wyn2000330@126.com>
2023-11-10 16:09:55 +08:00
binbin Deng
54d95e4907
LLM: add alpaca qlora finetuning example ( #9276 )
2023-11-08 16:25:17 +08:00
binbin Deng
97316bbb66
LLM: highlight transformers version requirement in mistral examples ( #9380 )
2023-11-08 16:05:03 +08:00
Jin Qiao
e6b6afa316
LLM: add aquila2 model example ( #9356 )
2023-11-06 15:47:39 +08:00
Jin Qiao
97a38958bd
LLM: add CodeLlama CPU and GPU examples ( #9338 )
...
* LLM: add codellama CPU pytorch examples
* LLM: add codellama CPU transformers examples
* LLM: add codellama GPU transformers examples
* LLM: add codellama GPU pytorch examples
* LLM: add codellama in readme
* LLM: add LLaVA link
2023-11-02 15:34:25 +08:00
dingbaorong
f855a864ef
add llava gpu example ( #9324 )
...
* add llava gpu example
* use 7b model
* fix typo
* add in README
2023-11-02 14:48:29 +08:00
Cengguang Zhang
9f3d4676c6
LLM: Add qwen-vl gpu example ( #9290 )
...
* create qwen-vl gpu example.
* add readme.
* fix.
* change input figure and update outputs.
* add qwen-vl pytorch model gpu example.
* fix.
* add readme.
2023-11-01 11:01:39 +08:00
Jin Qiao
96f8158fe2
LLM: adjust dolly v2 GPU example README ( #9318 )
2023-11-01 09:50:22 +08:00
Jin Qiao
c44c6dc43a
LLM: add chatglm3 examples ( #9305 )
2023-11-01 09:50:05 +08:00
Ruonan Wang
d383ee8efb
LLM: update QLoRA example about accelerate version ( #9314 )
2023-10-31 13:54:38 +08:00
Yang Wang
8838707009
Add deepspeed autotp example readme ( #9289 )
...
* Add deepspeed autotp example readme
* change word
2023-10-27 13:04:38 -07:00
Yang Wang
067c7e8098
Support deepspeed AutoTP ( #9230 )
...
* Support deepspeed
* add test script
* refactor convert
* refine example
* refine
* refine example
* fix style
* refine example and adapt to latest ipex
* fix style
2023-10-24 23:46:28 -07:00
dingbaorong
5a2ce421af
add cpu and gpu examples of flan-t5 ( #9171 )
...
* add cpu and gpu examples of flan-t5
* address yuwen's comments
* Add explanation why we add modules to not convert
* Refine prompt and add a translation example
* Add an empty line at the end of files
* add examples of flan-t5 using optimize_model api
* address bin's comments
* address binbin's comments
* add flan-t5 in readme
2023-10-24 15:24:01 +08:00
Yining Wang
4a19f50d16
phi-1_5 CPU and GPU examples ( #9173 )
...
* eee
* add examples on CPU and GPU
* fix
* fix
* optimize model examples
* have updated
* Warmup and configs added
* Update two tables
2023-10-24 15:08:04 +08:00
Xin Qiu
0c5055d38c
add position_ids and fuse embedding for falcon ( #9242 )
...
* add position_ids for falcon
* add cpu
* add cpu
* add license
2023-10-24 09:58:20 +08:00
Jin Qiao
a3b664ed03
LLM: add GPU More-Data-Types and Save/Load example ( #9199 )
2023-10-18 13:13:45 +08:00
Ruonan Wang
c0497ab41b
LLM: support kv_cache optimization for Qwen-VL-Chat ( #9193 )
...
* support qwen_vl_chat
* fix style
2023-10-17 13:33:56 +08:00
Yang Wang
7a2de00b48
Fixes for xpu Bf16 training ( #9156 )
...
* Support bf16 training
* Use a stable transformer version
* remove env
* fix style
2023-10-14 21:28:59 -07:00
Jin Qiao
db7f938fdc
LLM: add replit and starcoder to gpu pytorch model example ( #9154 )
2023-10-13 15:44:17 +08:00
Jin Qiao
797b156a0d
LLM: add dolly-v1 and dolly-v2 to gpu pytorch model example ( #9153 )
2023-10-13 15:43:35 +08:00
Jin Qiao
f754ab3e60
LLM: add baichuan and baichuan2 to gpu pytorch model example ( #9152 )
2023-10-13 13:44:31 +08:00
JIN Qiao
1a1ddc4144
LLM: Add Replit CPU and GPU example ( #9028 )
2023-10-12 13:42:14 +08:00
JIN Qiao
d74834ff4c
LLM: add gpu pytorch-models example llama2 and chatglm2 ( #9142 )
2023-10-12 13:41:48 +08:00
binbin Deng
995b0f119f
LLM: update some gpu examples ( #9136 )
2023-10-11 14:23:56 +08:00
binbin Deng
2ad67a18b1
LLM: add mistral examples ( #9121 )
2023-10-11 13:38:15 +08:00
binbin Deng
5e9962b60e
LLM: update example layout ( #9046 )
2023-10-09 15:36:39 +08:00