Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								984697afe2 
								
							 
						 
						
							
							
								
								LLM: Add bloom gguf support ( #9734 )  
							
							 
							
							
							
							
							* init
* update bloom add merges
* update
* update readme
* update for llama error
* update 
							
						 
						
							2023-12-21 14:06:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								1fa7793fc0 
								
							 
						 
						
							
							
								
								Load Mixtral GGUF Model ( #9690 )  
							
							 
							
							
							
							
							* Load Mixtral GGUF Model
* refactor
* fix empty tensor when to cpu
* update gpu and cpu readmes
* add dtype when set tensor into module 
							
						 
						
							2023-12-19 13:54:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								12df70953e 
								
							 
						 
						
							
							
								
								LLM: add resume_from_checkpoint related section ( #9705 )  
							
							 
							
							
							
						 
						
							2023-12-18 12:27:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								b8437a1c1e 
								
							 
						 
						
							
							
								
								LLM: Add gguf mistral model support ( #9691 )  
							
							 
							
							
							
							
							* add mistral support
* need to upgrade transformers version
* update 
							
						 
						
							2023-12-15 13:37:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								496bb2e845 
								
							 
						 
						
							
							
								
								LLM: Support load BaiChuan model family gguf model ( #9685 )  
							
							 
							
							
							
							
							* support baichuan model family gguf model
* update gguf generate.py
* add verify models
* add support model_family
* update
* update style
* update type
* update readme
* update
* remove support model_family 
							
						 
						
							2023-12-15 13:34:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Lilac09 
								
							 
						 
						
							
							
							
							
								
							
							
								3afed99216 
								
							 
						 
						
							
							
								
								fix path issue ( #9696 )  
							
							 
							
							
							
						 
						
							2023-12-15 11:21:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								37f509bb95 
								
							 
						 
						
							
							
								
								Update readme ( #9692 )  
							
							 
							
							
							
						 
						
							2023-12-14 19:50:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ziteng Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								21c7503a42 
								
							 
						 
						
							
							
								
								[LLM] Correct prompt format of Qwen in generate.py ( #9678 )  
							
							 
							
							
							
							
							* Change qwen prompt format to chatml 
							
						 
						
							2023-12-14 14:01:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
							
							
								
							
							
								223c9622f7 
								
							 
						 
						
							
							
								
								[LLM] Mixtral CPU examples ( #9673 )  
							
							 
							
							
							
							
							* Mixtral CPU PyTorch and hugging face examples, based on #9661  and #9671  
							
						 
						
							2023-12-14 10:35:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
							
							
								
							
							
								877229f3be 
								
							 
						 
						
							
							
								
								[LLM]Add Yi-34B-AWQ to verified AWQ model. ( #9676 )  
							
							 
							
							
							
							
							* verify Yi-34B-AWQ
* update 
							
						 
						
							2023-12-14 09:55:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								68a4be762f 
								
							 
						 
						
							
							
								
								remove disco mixtral, update oneapi version ( #9671 )  
							
							 
							
							
							
						 
						
							2023-12-13 23:24:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
							
							
								
							
							
								503880809c 
								
							 
						 
						
							
							
								
								verify codeLlama ( #9668 )  
							
							 
							
							
							
						 
						
							2023-12-13 15:39:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								c64e2248ef 
								
							 
						 
						
							
							
								
								fix str returned by get_int_from_str rather than expected int ( #9667 )  
							
							 
							
							
							
						 
						
							2023-12-13 11:01:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								bf1bcf4a14 
								
							 
						 
						
							
							
								
								add official Mixtral model support ( #9663 )  
							
							 
							
							
							
						 
						
							2023-12-12 22:27:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								2fe38b4b9b 
								
							 
						 
						
							
							
								
								LLM: add mixtral GPU examples ( #9661 )  
							
							 
							
							
							
						 
						
							2023-12-12 20:26:36 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
							
							
								
							
							
								45721f3473 
								
							 
						 
						
							
							
								
								verify llava ( #9649 )  
							
							 
							
							
							
						 
						
							2023-12-11 14:26:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								9f02f96160 
								
							 
						 
						
							
							
								
								[LLM] support for Yi AWQ model ( #9648 )  
							
							 
							
							
							
						 
						
							2023-12-11 14:07:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
							
							
								
							
							
								70f5e7bf0d 
								
							 
						 
						
							
							
								
								Support peft LoraConfig ( #9636 )  
							
							 
							
							
							
							
							* support peft loraconfig
* use testcase to test
* fix style
* meet comments 
							
						 
						
							2023-12-08 16:13:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								499100daf1 
								
							 
						 
						
							
							
								
								LLM: Add solution to fix oneccl related error ( #9630 )  
							
							 
							
							
							
						 
						
							2023-12-08 10:51:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									ZehuaCao 
								
							 
						 
						
							
							
							
							
								
							
							
								6eca8a8bb5 
								
							 
						 
						
							
							
								
								update transformer version ( #9631 )  
							
							 
							
							
							
						 
						
							2023-12-08 09:36:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								3811cf43c9 
								
							 
						 
						
							
							
								
								[LLM] update AWQ documents ( #9623 )  
							
							 
							
							
							
							
							* [LLM] update AWQ and verified models' documents
* refine
* refine links
* refine 
							
						 
						
							2023-12-07 16:02:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								51b668f229 
								
							 
						 
						
							
							
								
								Update GGUF readme ( #9611 )  
							
							 
							
							
							
						 
						
							2023-12-06 18:21:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								a7bc89b3a1 
								
							 
						 
						
							
							
								
								remove q4_1 in gguf example ( #9610 )  
							
							 
							
							
							
							
							* remove q4_1
* fixes 
							
						 
						
							2023-12-06 16:00:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
							
							
								
							
							
								404e101ded 
								
							 
						 
						
							
							
								
								QALora example ( #9551 )  
							
							 
							
							
							
							
							* Support qa-lora
* init
* update
* update
* update
* update
* update
* update merge
* update
* fix style & update scripts
* update
* address comments
* fix typo
* fix typo
---------
Co-authored-by: Yang Wang <yang3.wang@intel.com> 
							
						 
						
							2023-12-06 15:36:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								89069d6173 
								
							 
						 
						
							
							
								
								Add gpu gguf example ( #9603 )  
							
							 
							
							
							
							
							* add gpu gguf example
* some fixes
* address kai's comments
* address json's comments 
							
						 
						
							2023-12-06 15:17:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ziteng Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								aeb77b2ab1 
								
							 
						 
						
							
							
								
								Add minimum Qwen model version ( #9606 )  
							
							 
							
							
							
						 
						
							2023-12-06 11:49:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								4e70e33934 
								
							 
						 
						
							
							
								
								[LLM] code and document for distributed qlora ( #9585 )  
							
							 
							
							
							
							
							* [LLM] code and document for distributed qlora
* doc
* refine for gradient checkpoint
* refine
* Update alpaca_qlora_finetuning_cpu.py
* Update alpaca_qlora_finetuning_cpu.py
* Update alpaca_qlora_finetuning_cpu.py
* add link in doc 
							
						 
						
							2023-12-06 09:23:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zheng, Yi 
								
							 
						 
						
							
							
							
							
								
							
							
								d154b38bf9 
								
							 
						 
						
							
							
								
								Add llama2 gpu low memory example ( #9514 )  
							
							 
							
							
							
							
							* Add low memory example
* Minor fixes
* Update readme.md 
							
						 
						
							2023-12-05 17:29:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								06febb5fa7 
								
							 
						 
						
							
							
								
								Update readme for FP8/FP4 inference examples ( #9601 )  
							
							 
							
							
							
						 
						
							2023-12-05 15:59:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								a66fbedd7e 
								
							 
						 
						
							
							
								
								add gpu more data types example ( #9592 )  
							
							 
							
							
							
							
							* add gpu more data types example
* add int8 
							
						 
						
							2023-12-05 15:45:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jinyi Wan 
								
							 
						 
						
							
							
							
							
								
							
							
								b721138132 
								
							 
						 
						
							
							
								
								Add cpu and gpu examples for BlueLM ( #9589 )  
							
							 
							
							
							
							
							* Add cpu int4 example for BlueLM
* add example optimize_model cpu for bluelm
* add example gpu int4 blueLM
* add example optimize_model GPU for bluelm
* Fixing naming issues and BigDL package version.
* Fixing naming issues...
* Add BlueLM in README.md "Verified Models" 
							
						 
						
							2023-12-05 13:59:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								8b00653039 
								
							 
						 
						
							
							
								
								fix doc ( #9599 )  
							
							 
							
							
							
						 
						
							2023-12-05 13:49:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								ed0dc57c6e 
								
							 
						 
						
							
							
								
								LLM: Add cpu qlora support other models guide ( #9567 )  
							
							 
							
							
							
							
							* use bf16 flag
* add using baichuan model
* update merge
* remove
* update 
							
						 
						
							2023-12-01 11:18:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								bda404fc8f 
								
							 
						 
						
							
							
								
								Update readme ( #9575 )  
							
							 
							
							
							
						 
						
							2023-11-30 22:45:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								66f5b45f57 
								
							 
						 
						
							
							
								
								[LLM] add a llama2 gguf example ( #9553 )  
							
							 
							
							
							
						 
						
							2023-11-30 16:37:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								a0a80d232e 
								
							 
						 
						
							
							
								
								LLM: Add qlora cpu distributed readme ( #9561 )  
							
							 
							
							
							
							
							* init readme
* add distributed guide
* update 
							
						 
						
							2023-11-30 13:42:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Qiyuan Gong 
								
							 
						 
						
							
							
							
							
								
							
							
								d85a430a8c 
								
							 
						 
						
							
							
								
								Using bigdl-llm-init instead of bigdl-nano-init ( #9558 )  
							
							 
							
							
							
							
							* Replace `bigdl-nano-init` with `bigdl-llm-init`.
* Install `bigdl-llm` instead of `bigdl-nano`.
* Remove nano in README. 
							
						 
						
							2023-11-30 10:10:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								4ff2ca9d0d 
								
							 
						 
						
							
							
								
								LLM: fix loss error on Arc ( #9550 )  
							
							 
							
							
							
						 
						
							2023-11-29 15:16:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								b824754256 
								
							 
						 
						
							
							
								
								LLM: Update for cpu qlora mpirun ( #9548 )  
							
							 
							
							
							
						 
						
							2023-11-29 10:56:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								963a5c8d79 
								
							 
						 
						
							
							
								
								Add vLLM-XPU version's README/examples ( #9536 )  
							
							 
							
							
							
							
							* test
* test
* fix last kv cache
* add xpu readme
* remove numactl for xpu example
* fix link error
* update max_num_batched_tokens logic
* add explanation
* add xpu environment version requirement
* refine gpu memory
* fix
* fix style 
							
						 
						
							2023-11-28 09:44:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								b6c3520748 
								
							 
						 
						
							
							
								
								Remove xformers from vLLM-CPU ( #9535 )  
							
							 
							
							
							
						 
						
							2023-11-27 11:21:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								2b9c7d2a59 
								
							 
						 
						
							
							
								
								LLM: quick fix alpaca qlora finetuning script ( #9534 )  
							
							 
							
							
							
						 
						
							2023-11-27 11:04:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								6bec0faea5 
								
							 
						 
						
							
							
								
								LLM: support Mistral AWQ models ( #9520 )  
							
							 
							
							
							
						 
						
							2023-11-24 16:20:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								b3178d449f 
								
							 
						 
						
							
							
								
								Update README.md ( #9525 )  
							
							 
							
							
							
						 
						
							2023-11-23 21:45:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								82898a4203 
								
							 
						 
						
							
							
								
								Update GPU example README ( #9524 )  
							
							 
							
							
							
						 
						
							2023-11-23 21:20:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								064848028f 
								
							 
						 
						
							
							
								
								Update README.md ( #9523 )  
							
							 
							
							
							
						 
						
							2023-11-23 21:16:21 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								bf579507c2 
								
							 
						 
						
							
							
								
								Integrate vllm ( #9310 )  
							
							 
							
							
							
							
							* done
* Rename structure
* add models
* Add structure/sampling_params,sequence
* add input_metadata
* add outputs
* Add policy,logger
* add and update
* add parallelconfig back
* core/scheduler.py
* Add llm_engine.py
* Add async_llm_engine.py
* Add tested entrypoint
* fix minor error
* Fix everything
* fix kv cache view
* fix
* fix
* fix
* format&refine
* remove logger from repo
* try to add token latency
* remove logger
* Refine config.py
* finish worker.py
* delete utils.py
* add license
* refine
* refine sequence.py
* remove sampling_params.py
* finish
* add license
* format
* add license
* refine
* refine
* Refine line too long
* remove exception
* so dumb style-check
* refine
* refine
* refine
* refine
* refine
* refine
* add README
* refine README
* add warning instead error
* fix padding
* add license
* format
* format
* format fix
* Refine vllm dependency (#1 )
vllm dependency clear
* fix licence
* fix format
* fix format
* fix
* adapt LLM engine
* fix
* add license
* fix format
* fix
* Moving README.md to the correct position
* Fix readme.md
* done
* guide for adding models
* fix
* Fix README.md
* Add new model readme
* remove ray-logic
* refactor arg_utils.py
* remove distributed_init_method logic
* refactor entrypoints
* refactor input_metadata
* refactor model_loader
* refactor utils.py
* refactor models
* fix api server
* remove vllm.stucture
* revert by txy 1120
* remove utils
* format
* fix license
* add bigdl model
* Refer to a specific commit
* Change code base
* add comments
* add async_llm_engine comment
* refine
* formatted
* add worker comments
* add comments
* add comments
* fix style
* add changes
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com> 
							
						 
						
							2023-11-23 16:46:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								48fbb1eb94 
								
							 
						 
						
							
							
								
								support ccl (MPI) distributed mode in alpaca_qlora_finetuning_cpu ( #9507 )  
							
							 
							
							
							
						 
						
							2023-11-23 10:58:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								11fa5a8a0e 
								
							 
						 
						
							
							
								
								Fix QLoRA CPU dispatch_model issue about accelerate ( #9506 )  
							
							 
							
							
							
						 
						
							2023-11-23 08:41:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								1453046938 
								
							 
						 
						
							
							
								
								install bigdl-llm in deepspeed cpu inference example ( #9508 )  
							
							 
							
							
							
						 
						
							2023-11-23 08:39:21 +08:00