Wang, Jian4
								
							 
						 | 
						
							
							
							
							
								
							
							
								ed0dc57c6e
								
							
						 | 
						
							
							
								
								LLM: Add cpu qlora support other models guide (#9567)
							
							
							
							
							
							
							
							* use bf16 flag
* add using baichuan model
* update merge
* remove
* update 
							
						 | 
						
							2023-12-01 11:18:04 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jason Dai
								
							 
						 | 
						
							
							
							
							
								
							
							
								bda404fc8f
								
							
						 | 
						
							
							
								
								Update readme (#9575)
							
							
							
							
							
						 | 
						
							2023-11-30 22:45:52 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yishuo Wang
								
							 
						 | 
						
							
							
							
							
								
							
							
								66f5b45f57
								
							
						 | 
						
							
							
								
								[LLM] add a llama2 gguf example (#9553)
							
							
							
							
							
						 | 
						
							2023-11-30 16:37:17 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wang, Jian4
								
							 
						 | 
						
							
							
							
							
								
							
							
								a0a80d232e
								
							
						 | 
						
							
							
								
								LLM: Add qlora cpu distributed readme (#9561)
							
							
							
							
							
							
							
							* init readme
* add distributed guide
* update 
							
						 | 
						
							2023-11-30 13:42:30 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Qiyuan Gong
								
							 
						 | 
						
							
							
							
							
								
							
							
								d85a430a8c
								
							
						 | 
						
							
							
								
								Using bigdl-llm-init instead of bigdl-nano-init (#9558)
							
							
							
							
							
							
							
							* Replace `bigdl-nano-init` with `bigdl-llm-init`.
* Install `bigdl-llm` instead of `bigdl-nano`.
* Remove nano in README. 
							
						 | 
						
							2023-11-30 10:10:29 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								4ff2ca9d0d
								
							
						 | 
						
							
							
								
								LLM: fix loss error on Arc (#9550)
							
							
							
							
							
						 | 
						
							2023-11-29 15:16:18 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wang, Jian4
								
							 
						 | 
						
							
							
							
							
								
							
							
								b824754256
								
							
						 | 
						
							
							
								
								LLM: Update for cpu qlora mpirun (#9548)
							
							
							
							
							
						 | 
						
							2023-11-29 10:56:17 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Guancheng Fu
								
							 
						 | 
						
							
							
							
							
								
							
							
								963a5c8d79
								
							
						 | 
						
							
							
								
								Add vLLM-XPU version's README/examples (#9536)
							
							
							
							
							
							
							
							* test
* test
* fix last kv cache
* add xpu readme
* remove numactl for xpu example
* fix link error
* update max_num_batched_tokens logic
* add explanation
* add xpu environment version requirement
* refine gpu memory
* fix
* fix style 
							
						 | 
						
							2023-11-28 09:44:03 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Guancheng Fu
								
							 
						 | 
						
							
							
							
							
								
							
							
								b6c3520748
								
							
						 | 
						
							
							
								
								Remove xformers from vLLM-CPU (#9535)
							
							
							
							
							
						 | 
						
							2023-11-27 11:21:25 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								2b9c7d2a59
								
							
						 | 
						
							
							
								
								LLM: quick fix alpaca qlora finetuning script (#9534)
							
							
							
							
							
						 | 
						
							2023-11-27 11:04:27 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								6bec0faea5
								
							
						 | 
						
							
							
								
								LLM: support Mistral AWQ models (#9520)
							
							
							
							
							
						 | 
						
							2023-11-24 16:20:22 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jason Dai
								
							 
						 | 
						
							
							
							
							
								
							
							
								b3178d449f
								
							
						 | 
						
							
							
								
								Update README.md (#9525)
							
							
							
							
							
						 | 
						
							2023-11-23 21:45:20 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jason Dai
								
							 
						 | 
						
							
							
							
							
								
							
							
								82898a4203
								
							
						 | 
						
							
							
								
								Update GPU example README (#9524)
							
							
							
							
							
						 | 
						
							2023-11-23 21:20:26 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jason Dai
								
							 
						 | 
						
							
							
							
							
								
							
							
								064848028f
								
							
						 | 
						
							
							
								
								Update README.md (#9523)
							
							
							
							
							
						 | 
						
							2023-11-23 21:16:21 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Guancheng Fu
								
							 
						 | 
						
							
							
							
							
								
							
							
								bf579507c2
								
							
						 | 
						
							
							
								
								Integrate vllm (#9310)
							
							
							
							
							
							
							
							* done
* Rename structure
* add models
* Add structure/sampling_params,sequence
* add input_metadata
* add outputs
* Add policy,logger
* add and update
* add parallelconfig back
* core/scheduler.py
* Add llm_engine.py
* Add async_llm_engine.py
* Add tested entrypoint
* fix minor error
* Fix everything
* fix kv cache view
* fix
* fix
* fix
* format&refine
* remove logger from repo
* try to add token latency
* remove logger
* Refine config.py
* finish worker.py
* delete utils.py
* add license
* refine
* refine sequence.py
* remove sampling_params.py
* finish
* add license
* format
* add license
* refine
* refine
* Refine line too long
* remove exception
* so dumb style-check
* refine
* refine
* refine
* refine
* refine
* refine
* add README
* refine README
* add warning instead error
* fix padding
* add license
* format
* format
* format fix
* Refine vllm dependency (#1)
vllm dependency clear
* fix licence
* fix format
* fix format
* fix
* adapt LLM engine
* fix
* add license
* fix format
* fix
* Moving README.md to the correct position
* Fix readme.md
* done
* guide for adding models
* fix
* Fix README.md
* Add new model readme
* remove ray-logic
* refactor arg_utils.py
* remove distributed_init_method logic
* refactor entrypoints
* refactor input_metadata
* refactor model_loader
* refactor utils.py
* refactor models
* fix api server
* remove vllm.stucture
* revert by txy 1120
* remove utils
* format
* fix license
* add bigdl model
* Refer to a specific commit
* Change code base
* add comments
* add async_llm_engine comment
* refine
* formatted
* add worker comments
* add comments
* add comments
* fix style
* add changes
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com> 
							
						 | 
						
							2023-11-23 16:46:45 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
							
							
								
							
							
								48fbb1eb94
								
							
						 | 
						
							
							
								
								support ccl (MPI) distributed mode in alpaca_qlora_finetuning_cpu (#9507)
							
							
							
							
							
						 | 
						
							2023-11-23 10:58:09 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
							
							
								
							
							
								11fa5a8a0e
								
							
						 | 
						
							
							
								
								Fix QLoRA CPU dispatch_model issue about accelerate (#9506)
							
							
							
							
							
						 | 
						
							2023-11-23 08:41:25 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
							
							
								
							
							
								1453046938
								
							
						 | 
						
							
							
								
								install bigdl-llm in deepspeed cpu inference example (#9508)
							
							
							
							
							
						 | 
						
							2023-11-23 08:39:21 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								86743fb57b
								
							
						 | 
						
							
							
								
								LLM: fix transformers version in CPU finetuning example (#9511)
							
							
							
							
							
						 | 
						
							2023-11-22 15:53:07 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								1a2129221d
								
							
						 | 
						
							
							
								
								LLM: support resume from checkpoint in Alpaca QLoRA (#9502)
							
							
							
							
							
						 | 
						
							2023-11-22 13:49:14 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Ruonan Wang
								
							 
						 | 
						
							
							
							
							
								
							
							
								076d106ef5
								
							
						 | 
						
							
							
								
								LLM: GPU QLoRA update to bf16 to accelerate gradient checkpointing (#9499)
							
							
							
							
							
							
							
							* update to bf16 to accelerate gradient checkpoint
* add utils and fix ut 
							
						 | 
						
							2023-11-21 17:08:36 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								b7ae572ac3
								
							
						 | 
						
							
							
								
								LLM: update Alpaca QLoRA finetuning example on GPU (#9492)
							
							
							
							
							
						 | 
						
							2023-11-21 14:22:19 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wang, Jian4
								
							 
						 | 
						
							
							
							
							
								
							
							
								c5cb3ab82e
								
							
						 | 
						
							
							
								
								LLM: Add CPU alpaca qlora example (#9469)
							
							
							
							
							
							
							
							* init
* update xpu to cpu
* update
* update readme
* update example
* update
* add refer
* add guide to train different datasets
* update readme
* update 
							
						 | 
						
							2023-11-21 09:19:58 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								96fd26759c
								
							
						 | 
						
							
							
								
								LLM: fix QLoRA finetuning example on CPU (#9489)
							
							
							
							
							
						 | 
						
							2023-11-20 14:31:24 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								3dac21ac7b
								
							
						 | 
						
							
							
								
								LLM: add more example usages about alpaca qlora on different hardware (#9458)
							
							
							
							
							
						 | 
						
							2023-11-17 09:56:43 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
							
							
								
							
							
								921b263d6a
								
							
						 | 
						
							
							
								
								update deepspeed install and run guide in README (#9441)
							
							
							
							
							
						 | 
						
							2023-11-17 09:11:39 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yina Chen
								
							 
						 | 
						
							
							
							
							
								
							
							
								d5263e6681
								
							
						 | 
						
							
							
								
								Add awq load support (#9453)
							
							
							
							
							
							
							
							* Support directly loading GPTQ models from huggingface
* fix style
* fix tests
* change example structure
* address comments
* fix style
* init
* address comments
* add examples
* fix style
* fix style
* fix style
* fix style
* update
* remove
* meet comments
* fix style
---------
Co-authored-by: Yang Wang <yang3.wang@intel.com> 
							
						 | 
						
							2023-11-16 14:06:25 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Ruonan Wang
								
							 
						 | 
						
							
							
							
							
								
							
							
								0f82b8c3a0
								
							
						 | 
						
							
							
								
								LLM: update qlora example (#9454)
							
							
							
							
							
							
							
							* update qlora example
* fix loss=0 
							
						 | 
						
							2023-11-15 09:24:15 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yang Wang
								
							 
						 | 
						
							
							
							
							
								
							
							
								51d07a9fd8
								
							
						 | 
						
							
							
								
								Support directly loading gptq models from huggingface (#9391)
							
							
							
							
							
							
							
							* Support directly loading GPTQ models from huggingface
* fix style
* fix tests
* change example structure
* address comments
* fix style
* address comments 
							
						 | 
						
							2023-11-13 20:48:12 -08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
							
							
								
							
							
								da6bbc8c11
								
							
						 | 
						
							
							
								
								fix deepspeed dependencies to install (#9400)
							
							
							
							
							
							
							
							* remove redundant parameter from deepspeed install
* Update install.sh
* Update install.sh 
							
						 | 
						
							2023-11-13 16:42:50 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Zheng, Yi
								
							 
						 | 
						
							
							
							
							
								
							
							
								9b5d0e9c75
								
							
						 | 
						
							
							
								
								Add examples for Yi-6B (#9421)
							
							
							
							
							
						 | 
						
							2023-11-13 10:53:15 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wang, Jian4
								
							 
						 | 
						
							
							
							
							
								
							
							
								ac7fbe77e2
								
							
						 | 
						
							
							
								
								Update qlora readme (#9416)
							
							
							
							
							
						 | 
						
							2023-11-12 19:29:29 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Zheng, Yi
								
							 
						 | 
						
							
							
							
							
								
							
							
								0674146cfb
								
							
						 | 
						
							
							
								
								Add cpu and gpu examples of distil-whisper (#9374)
							
							
							
							
							
							
							
							* Add distil-whisper examples
* Fixes based on comments
* Minor fixes
---------
Co-authored-by: Ariadne330 <wyn2000330@126.com> 
							
						 | 
						
							2023-11-10 16:09:55 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Ziteng Zhang
								
							 
						 | 
						
							
							
							
							
								
							
							
								ad81b5d838
								
							
						 | 
						
							
							
								
								Update qlora README.md (#9422)
							
							
							
							
							
						 | 
						
							2023-11-10 15:19:25 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
							
							
								
							
							
								b23b91407c
								
							
						 | 
						
							
							
								
								fix llm-init on deepspeed missing lib (#9419)
							
							
							
							
							
						 | 
						
							2023-11-10 13:51:24 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dingbaorong
								
							 
						 | 
						
							
							
							
							
								
							
							
								36fbe2144d
								
							
						 | 
						
							
							
								
								Add CPU examples of fuyu (#9393)
							
							
							
							
							
							
							
							* add fuyu cpu examples
* add gpu example
* add comments
* add license
* remove gpu example
* fix inference time 
							
						 | 
						
							2023-11-09 15:29:19 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								54d95e4907
								
							
						 | 
						
							
							
								
								LLM: add alpaca qlora finetuning example (#9276)
							
							
							
							
							
						 | 
						
							2023-11-08 16:25:17 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									binbin Deng
								
							 
						 | 
						
							
							
							
							
								
							
							
								97316bbb66
								
							
						 | 
						
							
							
								
								LLM: highlight transformers version requirement in mistral examples (#9380)
							
							
							
							
							
						 | 
						
							2023-11-08 16:05:03 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Heyang Sun
								
							 
						 | 
						
							
							
							
							
								
							
							
								af94058203
								
							
						 | 
						
							
							
								
								[LLM] Support CPU deepspeed distributed inference (#9259)
							
							
							
							
							
							
							
							* [LLM] Support CPU Deepspeed distributed inference
* Update run_deepspeed.py
* Rename
* fix style
* add new codes
* refine
* remove annotated codes
* refine
* Update README.md
* refine doc and example code 
							
						 | 
						
							2023-11-06 17:56:42 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jin Qiao
								
							 
						 | 
						
							
							
							
							
								
							
							
								e6b6afa316
								
							
						 | 
						
							
							
								
								LLM: add aquila2 model example (#9356)
							
							
							
							
							
						 | 
						
							2023-11-06 15:47:39 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Yining Wang
								
							 
						 | 
						
							
							
							
							
								
							
							
								9377b9c5d7
								
							
						 | 
						
							
							
								
								add CodeShell CPU example (#9345)
							
							
							
							
							
							
							
							* add CodeShell CPU example
* fix some problems 
							
						 | 
						
							2023-11-03 13:15:54 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Zheng, Yi
								
							 
						 | 
						
							
							
							
							
								
							
							
								63411dff75
								
							
						 | 
						
							
							
								
								Add cpu examples of WizardCoder (#9344)
							
							
							
							
							
							
							
							* Add wizardcoder example
* Minor fixes 
							
						 | 
						
							2023-11-02 20:22:43 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dingbaorong
								
							 
						 | 
						
							
							
							
							
								
							
							
								2e3bfbfe1f
								
							
						 | 
						
							
							
								
								Add internlm_xcomposer cpu examples (#9337)
							
							
							
							
							
							
							
							* add internlm-xcomposer cpu examples
* use chat
* some fixes
* add license
* address shengsheng's comments
* use demo.jpg 
							
						 | 
						
							2023-11-02 15:50:02 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jin Qiao
								
							 
						 | 
						
							
							
							
							
								
							
							
								97a38958bd
								
							
						 | 
						
							
							
								
								LLM: add CodeLlama CPU and GPU examples (#9338)
							
							
							
							
							
							
							
							* LLM: add codellama CPU pytorch examples
* LLM: add codellama CPU transformers examples
* LLM: add codellama GPU transformers examples
* LLM: add codellama GPU pytorch examples
* LLM: add codellama in readme
* LLM: add LLaVA link 
							
						 | 
						
							2023-11-02 15:34:25 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Zheng, Yi
								
							 
						 | 
						
							
							
							
							
								
							
							
								63b2556ce2
								
							
						 | 
						
							
							
								
								Add cpu examples of skywork (#9340)
							
							
							
							
							
						 | 
						
							2023-11-02 15:10:45 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									dingbaorong
								
							 
						 | 
						
							
							
							
							
								
							
							
								f855a864ef
								
							
						 | 
						
							
							
								
								add llava gpu example (#9324)
							
							
							
							
							
							
							
							* add llava gpu example
* use 7b model
* fix typo
* add in README 
							
						 | 
						
							2023-11-02 14:48:29 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Wang, Jian4
								
							 
						 | 
						
							
							
							
							
								
							
							
								149146004f
								
							
						 | 
						
							
							
								
								LLM: Add qlora finetuning CPU example (#9275)
							
							
							
							
							
							
							
							* add qlora finetuning example
* update readme
* update example
* remove merge.py and update readme 
							
						 | 
						
							2023-11-02 09:45:42 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Cengguang Zhang
								
							 
						 | 
						
							
							
							
							
								
							
							
								9f3d4676c6
								
							
						 | 
						
							
							
								
								LLM: Add qwen-vl gpu example (#9290)
							
							
							
							
							
							
							
							* create qwen-vl gpu example.
* add readme.
* fix.
* change input figure and update outputs.
* add qwen-vl pytorch model gpu example.
* fix.
* add readme. 
							
						 | 
						
							2023-11-01 11:01:39 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jin Qiao
								
							 
						 | 
						
							
							
							
							
								
							
							
								96f8158fe2
								
							
						 | 
						
							
							
								
								LLM: adjust dolly v2 GPU example README (#9318)
							
							
							
							
							
						 | 
						
							2023-11-01 09:50:22 +08:00 | 
						
						
							
							
							
								
							
							
						 | 
					
				
					
						
							
								
								
									 
									Jin Qiao
								
							 
						 | 
						
							
							
							
							
								
							
							
								c44c6dc43a
								
							
						 | 
						
							
							
								
								LLM: add chatglm3 examples (#9305)
							
							
							
							
							
						 | 
						
							2023-11-01 09:50:05 +08:00 | 
						
						
							
							
							
								
							
							
						 |