4f34557224 | Ruonan Wang | 2023-10-12 13:35:12 +08:00
LLM: support num_beams in all-in-one benchmark (#9141)
* support num_beams
* fix

62ac7ae444 | Ruonan Wang | 2023-10-11 17:13:34 +08:00
LLM: fix inaccurate input / output tokens of current all-in-one benchmark (#9137)
* first fix
* fix all apis
* fix

eb3fb18eb4 | binbin Deng | 2023-10-11 15:03:39 +08:00
LLM: improve PyTorch API doc (#9128)

995b0f119f | binbin Deng | 2023-10-11 14:23:56 +08:00
LLM: update some gpu examples (#9136)

1c8d5da362 | Ruonan Wang | 2023-10-11 13:39:39 +08:00
LLM: fix llama tokenizer for all-in-one benchmark (#9129)
* fix tokenizer for gpu benchmark
* fix ipex fp16
* meet code review
* fix

2ad67a18b1 | binbin Deng | 2023-10-11 13:38:15 +08:00
LLM: add mistral examples (#9121)

1363e666fc | Ruonan Wang | 2023-10-11 09:41:53 +08:00
LLM: update benchmark_util.py for beam search (#9126)
* update reorder_cache
* fix

e8c5645067 | Guoqiong Song | 2023-10-10 17:01:35 -07:00
add LLM example of aquila on GPU (#9056)
* aquila, dolly-v1, dolly-v2, vacuna

388f688ef3 | Ruonan Wang | 2023-10-10 15:02:48 +08:00
LLM: update setup.py to add bigdl-core-xe package (#9122)

1709beba5b | Zhao Changmin | 2023-10-10 14:57:23 +08:00
LLM: Explicitly close pickle file pointer before removing temporary directory (#9120)
* fp close

0e09dd926b | Yuwen Hu | 2023-10-10 13:24:18 +08:00
[LLM] Fix example test (#9118)
* Update llm example test link due to example layout change
* Add better change detect

ad7d9231f5 | Ruonan Wang | 2023-10-10 10:18:41 +08:00
LLM: add benchmark script for Max gpu and ipex fp16 gpu (#9112)
* add pvc bash
* meet code review
* rename to run-max-gpu.sh

e4d1457a70 | binbin Deng | 2023-10-10 09:31:00 +08:00
LLM: improve transformers style API doc (#9113)

65212451cc | Yuwen Hu | 2023-10-09 16:55:25 +08:00
[LLM] Small update to performance tests (#9106)
* small updates to llm performance tests regarding model handling
* Small fix

edccfb2ed3 | Zhao Changmin | 2023-10-09 15:49:15 +08:00
LLM: Check model device type (#9092)
* check model device

5e9962b60e | binbin Deng | 2023-10-09 15:36:39 +08:00
LLM: update example layout (#9046)

4c4f8d1663 | Yina Chen | 2023-10-09 15:09:37 +08:00
[LLM]Fix Arc falcon abnormal output issue (#9096)
* update
* update
* fix error & style
* fix style
* update train
* to input_seq_size

548e4dd5fe | Zhao Changmin | 2023-10-09 11:13:44 +08:00
LLM: Adapt transformers models for optimize model SL (#9022)
* LLM: Adapt transformers model for SL

f64257a093 | Ruonan Wang | 2023-10-09 11:05:17 +08:00
LLM: basic api support for esimd fp16 (#9067)
* basic api support for fp16
* fix style
* fix
* fix error and style
* fix style
* meet code review
* update based on comments

65373d2a8b | JIN Qiao | 2023-10-09 10:51:19 +08:00
LLM: adjust portable zip content (#9054)
* LLM: adjust portable zip content
* LLM: adjust portable zip README

b3e94a32d4 | Xin Qiu | 2023-10-08 09:23:28 +08:00
change log4error import (#9098)

78ea7ddb1c | Kai Huang | 2023-10-07 16:27:46 +08:00
Combine apply_rotary_pos_emb for gpt-neox (#9074)

36dd4afd61 | Yang Wang | 2023-10-06 13:27:37 -07:00
Fix llama when rope scaling is not None (#9086)
* Fix llama when rope scaling is not None
* fix style
* fix style

fcb1c618a0 | Yang Wang | 2023-10-06 09:57:29 -07:00
using bigdl-llm fused rope for llama (#9066)
* optimize llama xpu rope
* fix bug
* fix style
* refine append cache
* remove check
* do not cache cos sin
* remove unnecessary changes
* clean up
* fix style
* check for training

aefa5a5bfe | Jiao Wang | 2023-10-05 11:59:17 -07:00
Qwen kv cache (#9079)
* qwen and aquila
* update
* update
* style

d5ca1f32b6 | Jiao Wang | 2023-10-05 11:10:57 -07:00
Aquila KV cache optimization (#9080)
* update
* update
* style

88565c76f6 | Yang Wang | 2023-10-04 21:18:52 -07:00
add export merged model example (#9018)
* add export merged model example
* add sources
* add script
* fix style

0cd8f1c79c | Yang Wang | 2023-10-04 21:04:55 -07:00
Use ipex fused rms norm for llama (#9081)
* also apply rmsnorm
* fix cpu

fb883100e7 | Cengguang Zhang | 2023-09-28 14:04:52 +08:00
LLM: support chatglm-18b convert attention forward in benchmark scripts. (#9072)
* add chatglm-18b convert.
* fix if statement.
* fix

6de2189e90 | Yishuo Wang | 2023-09-28 11:23:37 +08:00
[LLM] fix chatglm main choice (#9073)

ad62c58b33 | Cengguang Zhang | 2023-09-26 15:37:49 +08:00
LLM: Enable jemalloc in benchmark scripts. (#9058)
* enable jemalloc.
* fix readme.

b4a1266ef0 | Cengguang Zhang | 2023-09-25 14:16:59 +08:00
[WIP] LLM: add kv cache support for internlm. (#9036)
* LLM: add kv cache support for internlm
* add internlm apply_rotary_pos_emb
* fix.
* fix style.

975da86e00 | Ruonan Wang | 2023-09-25 13:03:57 +08:00
LLM: fix gptneox kv cache (#9044)

26213a5829 | Cengguang Zhang | 2023-09-22 17:38:38 +08:00
LLM: Change benchmark bf16 load format. (#9035)
* LLM: Change benchmark bf16 load format.
* comment on bf16 chatglm.
* fix.

023555fb1f | JinBridge | 2023-09-22 14:46:30 +08:00
LLM: Add one-click installer for Windows (#8999)
* LLM: init one-click installer for windows
* LLM: fix typo in one-click installer readme
* LLM: one-click installer try except logic
* LLM: one-click installer add dependency
* LLM: one-click installer adjust README.md
* LLM: one-click installer split README and add zip compress in setup.bat
* LLM: one-click installer verified internlm and llama2 and replace gif
* LLM: remove one-click installer images
* LLM: finetune the one-click installer README.md
* LLM: fix typo in one-click installer README.md
* LLM: rename one-click installer to protable executable
* LLM: rename other places to protable executable
* LLM: rename the zip filename to executable
* LLM: update .gitignore
* LLM: add colorama to setup.bat

028a6d9383 | Jiao Wang | 2023-09-21 21:27:23 -07:00
MPT model optimize for long sequence (#9020)
* mpt_long_seq
* update
* update
* update
* style
* style2
* update

b943d73844 | Ruonan Wang | 2023-09-21 21:28:03 +08:00
LLM: refactor kv cache (#9030)
* refactor utils
* meet code review; update all models
* small fix

868511cf02 | Cengguang Zhang | 2023-09-21 18:12:20 +08:00
LLM: fix kv cache issue of bloom and falcon. (#9029)

bf51ec40b2 | Ruonan Wang | 2023-09-21 17:16:07 +08:00
LLM: Fix empty cache (#9024)
* fix
* fix
* update example

714884414e | Yina Chen | 2023-09-21 16:42:11 +08:00
fix error (#9025)

edb225530b | binbin Deng | 2023-09-21 12:24:58 +08:00
add bark (#9016)

fa47967583 | SONG Ge | 2023-09-21 10:42:08 +08:00
[LLM] Optimize kv_cache for gptj model family (#9010)
* optimize gptj model family attention
* add license and comment for dolly-model
* remove xpu mentioned
* remove useless info
* code sytle
* style fix
* code style in gptj fix
* remove gptj arch
* move apply_rotary_pos_emb into utils
* kv_seq_length update
* use hidden_states instead of query layer to reach batch size

b3cad7de57 | Cengguang Zhang | 2023-09-20 21:10:53 +08:00
LLM: add bloom kv cache support (#9012)
* LLM: add bloom kv cache support
* fix style.

156af15d1e | Kai Huang | 2023-09-20 20:03:07 +08:00
Add NF3 (#9008)
* add nf3
* grammar

6981745fe4 | Kai Huang | 2023-09-20 19:59:19 +08:00
Optimize kv_cache for gpt-neox model family (#9015)
* override gptneox
* style
* move to utils
* revert

48b503c630 | JinBridge | 2023-09-20 15:52:56 +08:00
LLM: add example of aquila (#9006)
* LLM: add example of aquila
* LLM: replace AquilaChat with Aquila
* LLM: shorten prompt of aquila example

735a17f7b4 | Cengguang Zhang | 2023-09-20 15:36:30 +08:00
LLM: add kv cache to falcon family. (#8995)
* add kv cache to falcon family.
* fix: import error.
* refactor
* update comments.
* add two version falcon attention forward.
* fix
* fix.
* fix.
* fix.
* fix style.
* fix style.

94a7f8917b | Ruonan Wang | 2023-09-20 15:30:14 +08:00
LLM: fix optimized kv cache for baichuan-13b (#9009)
* fix baichuan 13b
* fix style
* fix
* fix style

c88f6ec457 | Yang Wang | 2023-09-19 10:15:44 -07:00
Experiment XPU QLora Finetuning (#8937)
* Support xpu finetuning
* support xpu finetuning
* fix style
* fix style
* fix style
* refine example
* add readme
* refine readme
* refine api
* fix fp16
* fix example
* refactor
* fix style
* fix compute type
* add qlora
* refine training args
* fix example
* fix style
* fast path forinference
* address comments
* refine readme
* revert lint

51518e029d | Jason Dai | 2023-09-19 20:01:33 +08:00
Update llm readme (#9005)