SONG Ge 
								
							 
						 
						
							
							
							
							
								
							
							
								160a1e5ee7 
								
							 
						 
						
							
							
								
								[WIP] Add UT for Mistral Optimized Model ( #9248 )  
							
							 
							
							... 
							
							
							
							* add ut for mistral model
* update
* fix model path
* upgrade transformers version for mistral model
* refactor correctness ut for mustral model
* refactor mistral correctness ut
* revert test_optimize_model back
* remove mistral from test_optimize_model
* add to revert transformers version back to 4.31.0 
							
						 
						
							2023-10-25 15:14:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								067c7e8098 
								
							 
						 
						
							
							
								
								Support deepspeed AutoTP ( #9230 )  
							
							 
							
							... 
							
							
							
							* Support deepspeed
* add test script
* refactor convert
* refine example
* refine
* refine example
* fix style
* refine example and adapte latest ipex
* fix style 
							
						 
						
							2023-10-24 23:46:28 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yining Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								a6a8afc47e 
								
							 
						 
						
							
							
								
								Add qwen vl CPU example  ( #9221 )  
							
							 
							
							... 
							
							
							
							* eee
* add examples on CPU and GPU
* fix
* fix
* optimize model examples
* add Qwen-VL-Chat CPU example
* Add Qwen-VL CPU example
* fix optimize problem
* fix error
* Have updated, benchmark fix removed from this PR
* add generate API example
* Change formats in qwen-vl example
* Add CPU transformer int4 example for qwen-vl
* fix repo-id problem and add Readme
* change picture url
* Remove unnecessary file
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com> 
							
						 
						
							2023-10-25 13:22:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								f597a9d4f5 
								
							 
						 
						
							
							
								
								LLM: update perf test configuration ( #9264 )  
							
							 
							
							
							
						 
						
							2023-10-25 12:35:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								770ac70b00 
								
							 
						 
						
							
							
								
								LLM: add low_bit option in benchmark scripts ( #9257 )  
							
							 
							
							
							
						 
						
							2023-10-25 10:27:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								ec9195da42 
								
							 
						 
						
							
							
								
								LLM: using html to visualize the perf result for Arc ( #9228 )  
							
							 
							
							... 
							
							
							
							* LLM: using html to visualize the perf result for Arc
* deploy the html file
* add python license
* reslove some comments 
							
						 
						
							2023-10-24 18:05:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								90162264a3 
								
							 
						 
						
							
							
								
								LLM: replace torch.float32 with auto type ( #9261 )  
							
							 
							
							
							
						 
						
							2023-10-24 17:12:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
							
							
								
							
							
								bd5215d75b 
								
							 
						 
						
							
							
								
								[LLM] Reimplement chatglm fuse rms optimization ( #9260 )  
							
							 
							
							... 
							
							
							
							* re-implement chatglm rope rms
* update 
							
						 
						
							2023-10-24 16:35:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								5a2ce421af 
								
							 
						 
						
							
							
								
								add cpu and gpu examples of flan-t5 ( #9171 )  
							
							 
							
							... 
							
							
							
							* add cpu and gpu examples of flan-t5
* address yuwen's comments
* Add explanation  why we add modules to not convert
* Refine prompt and add a translation example
* Add a empty line at the end of files
* add examples of flan-t5 using optimize_mdoel api
* address bin's comments
* address binbin's comments
* add flan-t5 in readme 
							
						 
						
							2023-10-24 15:24:01 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yining Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								4a19f50d16 
								
							 
						 
						
							
							
								
								phi-1_5 CPU and GPU examples ( #9173 )  
							
							 
							
							... 
							
							
							
							* eee
* add examples on CPU and GPU
* fix
* fix
* optimize model examples
* have updated
* Warmup and configs added
* Update two tables 
							
						 
						
							2023-10-24 15:08:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
							
							
								
							
							
								bfc1e2d733 
								
							 
						 
						
							
							
								
								add fused rms optimization for chatglm model ( #9256 )  
							
							 
							
							
							
						 
						
							2023-10-24 14:40:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								b15656229e 
								
							 
						 
						
							
							
								
								LLM: fix benchmark issue ( #9255 )  
							
							 
							
							
							
						 
						
							2023-10-24 14:15:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								f37547249d 
								
							 
						 
						
							
							
								
								Refine README/CICD ( #9253 )  
							
							 
							
							
							
						 
						
							2023-10-24 12:56:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								db37edae8a 
								
							 
						 
						
							
							
								
								LLM: update langchain api document page ( #9222 )  
							
							 
							
							
							
						 
						
							2023-10-24 10:13:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								0c5055d38c 
								
							 
						 
						
							
							
								
								add position_ids and fuse embedding for falcon ( #9242 )  
							
							 
							
							... 
							
							
							
							* add position_ids for falcon
* add cpu
* add cpu
* add license 
							
						 
						
							2023-10-24 09:58:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Wang, Jian4 
								
							 
						 
						
							
							
							
							
								
							
							
								c14a61681b 
								
							 
						 
						
							
							
								
								Add load low-bit in model-serving for reduce EPC ( #9239 )  
							
							 
							
							... 
							
							
							
							* init load low-bit
* fix
* fix 
							
						 
						
							2023-10-23 11:28:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
							
							
								
							
							
								0383306688 
								
							 
						 
						
							
							
								
								Add arc fp8 support ( #9232 )  
							
							 
							
							... 
							
							
							
							* add fp8 support
* add log
* fix style 
							
						 
						
							2023-10-20 17:15:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								118249b011 
								
							 
						 
						
							
							
								
								support transformers 4.34+ for llama ( #9229 )  
							
							 
							
							
							
						 
						
							2023-10-19 22:36:30 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								5850241423 
								
							 
						 
						
							
							
								
								correct Readme GPU example and API docstring ( #9225 )  
							
							 
							
							... 
							
							
							
							* update readme to correct GPU usage
* update from_pretrained supported low bit options
* fix stype check 
							
						 
						
							2023-10-19 16:08:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								f87f67ee1c 
								
							 
						 
						
							
							
								
								LLM: arc perf test for some popular models ( #9188 )  
							
							 
							
							
							
						 
						
							2023-10-19 15:56:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								b0ddde0410 
								
							 
						 
						
							
							
								
								Fix removing convert dtype bug ( #9216 )  
							
							 
							
							... 
							
							
							
							* Fix removing convert dtype bug
* fix style 
							
						 
						
							2023-10-18 11:24:22 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								942d6418e7 
								
							 
						 
						
							
							
								
								LLM: fix chatglm kv cache ( #9215 )  
							
							 
							
							
							
						 
						
							2023-10-18 19:09:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
							
							
								
							
							
								0765f94770 
								
							 
						 
						
							
							
								
								[LLM] Optimize kv_cache for mistral model family ( #9189 )  
							
							 
							
							... 
							
							
							
							* add kv_cache optimization for mistral model
* kv_cache optimize for mistral
* update stylr
* update 
							
						 
						
							2023-10-18 15:13:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								3555ebc148 
								
							 
						 
						
							
							
								
								LLM: fix wrong length in gptj kv_cache optimization ( #9210 )  
							
							 
							
							... 
							
							
							
							* fix wrong length in gptj kv cache
* update 
							
						 
						
							2023-10-18 14:59:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shengsheng Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								6dad8d16df 
								
							 
						 
						
							
							
								
								optimize NormHead for Baichuan2 ( #9205 )  
							
							 
							
							... 
							
							
							
							* optimize NormHead for Baichuan2
* fix ut and change name
* rename functions 
							
						 
						
							2023-10-18 14:05:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								a3b664ed03 
								
							 
						 
						
							
							
								
								LLM: add GPU More-Data-Types and Save/Load example ( #9199 )  
							
							 
							
							
							
						 
						
							2023-10-18 13:13:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								b9194c5786 
								
							 
						 
						
							
							
								
								LLM: skip some model tests using certain api ( #9163 )  
							
							 
							
							... 
							
							
							
							* LLM: Skip some model tests using certain api
* initialize variable named result 
							
						 
						
							2023-10-18 09:39:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								09815f7064 
								
							 
						 
						
							
							
								
								LLM: fix RMSNorm optimization of Baichuan2-13B/Baichuan-13B ( #9204 )  
							
							 
							
							... 
							
							
							
							* fix rmsnorm of baichuan2-13B
* update baichuan1-13B too
* fix style 
							
						 
						
							2023-10-17 18:40:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								d7ce78edf0 
								
							 
						 
						
							
							
								
								LLM: fix portable zip README image link ( #9201 )  
							
							 
							
							... 
							
							
							
							* LLM: fix portable zip readme img link
* LLM: make README first image center align 
							
						 
						
							2023-10-17 16:38:22 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cheen Hau, 俊豪 
								
							 
						 
						
							
							
							
							
								
							
							
								66c2e45634 
								
							 
						 
						
							
							
								
								Add unit tests for optimized model correctness ( #9151 )  
							
							 
							
							... 
							
							
							
							* Add test to check correctness of optimized model
* Refactor optimized model test
* Use models in llm-unit-test
* Use AutoTokenizer for bloom
* Print out each passed test
* Remove unused tokenizer from import 
							
						 
						
							2023-10-17 14:46:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								d946bd7c55 
								
							 
						 
						
							
							
								
								LLM: add CPU More-Data-Types and Save-Load examples ( #9179 )  
							
							 
							
							
							
						 
						
							2023-10-17 14:38:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								c0497ab41b 
								
							 
						 
						
							
							
								
								LLM: support kv_cache optimization for Qwen-VL-Chat ( #9193 )  
							
							 
							
							... 
							
							
							
							* dupport qwen_vl_chat
* fix style 
							
						 
						
							2023-10-17 13:33:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								1cd9ab15b8 
								
							 
						 
						
							
							
								
								LLM: fix ChatGLMConfig check ( #9191 )  
							
							 
							
							
							
						 
						
							2023-10-17 11:52:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								7160afd4d1 
								
							 
						 
						
							
							
								
								Support XPU DDP training and autocast for LowBitMatmul ( #9167 )  
							
							 
							
							... 
							
							
							
							* support autocast in low bit matmul
* Support XPU DDP training
* fix  amp 
							
						 
						
							2023-10-16 20:47:19 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								77afb8796b 
								
							 
						 
						
							
							
								
								LLM: fix convert of chatglm ( #9190 )  
							
							 
							
							
							
						 
						
							2023-10-17 10:48:13 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								af3b575c7e 
								
							 
						 
						
							
							
								
								expose modules_to_not_convert in optimize_model ( #9180 )  
							
							 
							
							... 
							
							
							
							* expose modules_to_not_convert in optimize_model
* some fixes 
							
						 
						
							2023-10-17 09:50:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								5ca8a851e9 
								
							 
						 
						
							
							
								
								LLM: add fuse optimization for Mistral. ( #9184 )  
							
							 
							
							... 
							
							
							
							* add fuse optimization for mistral.
* fix.
* fix
* fix style.
* fix.
* fix error.
* fix style.
* fix style. 
							
						 
						
							2023-10-16 16:50:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								49e1381c7f 
								
							 
						 
						
							
							
								
								update rope ( #9155 )  
							
							 
							
							
							
						 
						
							2023-10-15 21:51:45 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jason Dai 
								
							 
						 
						
							
							
							
							
								
							
							
								b192a8032c 
								
							 
						 
						
							
							
								
								Update llm-readme ( #9176 )  
							
							 
							
							
							
						 
						
							2023-10-16 10:54:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								a164c24746 
								
							 
						 
						
							
							
								
								LLM: add kv_cache optimization for chatglm2-6b-32k ( #9165 )  
							
							 
							
							
							
						 
						
							2023-10-16 10:43:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								7a2de00b48 
								
							 
						 
						
							
							
								
								Fixes for xpu Bf16 training ( #9156 )  
							
							 
							
							... 
							
							
							
							* Support bf16 training
* Use a stable transformer version
* remove env
* fix style 
							
						 
						
							2023-10-14 21:28:59 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								51a133de56 
								
							 
						 
						
							
							
								
								LLM: add fuse rope and norm optimization for Baichuan. ( #9166 )  
							
							 
							
							... 
							
							
							
							* add fuse rope optimization.
* add rms norm optimization. 
							
						 
						
							2023-10-13 17:36:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								db7f938fdc 
								
							 
						 
						
							
							
								
								LLM: add replit and starcoder to gpu pytorch model example ( #9154 )  
							
							 
							
							
							
						 
						
							2023-10-13 15:44:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								797b156a0d 
								
							 
						 
						
							
							
								
								LLM: add dolly-v1 and dolly-v2 to gpu pytorch model example ( #9153 )  
							
							 
							
							
							
						 
						
							2023-10-13 15:43:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								259cbb4126 
								
							 
						 
						
							
							
								
								[LLM] add initial bigdl-llm-init ( #9150 )  
							
							 
							
							
							
						 
						
							2023-10-13 15:31:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								433f408081 
								
							 
						 
						
							
							
								
								LLM: Add fuse rope and norm optimization for Aquila. ( #9161 )  
							
							 
							
							... 
							
							
							
							* add fuse norm optimization.
* add fuse rope optimization 
							
						 
						
							2023-10-13 14:18:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
							
							
								
							
							
								e7aa67e141 
								
							 
						 
						
							
							
								
								[LLM] Add rope  optimization for internlm ( #9159 )  
							
							 
							
							... 
							
							
							
							* add rope and norm optimization for internlm and gptneox
* revert gptneox back and split with pr#9155 #
* add norm_forward
* style fix
* update
* update 
							
						 
						
							2023-10-13 14:18:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jin Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								f754ab3e60 
								
							 
						 
						
							
							
								
								LLM: add baichuan and baichuan2 to gpu pytorch model example ( #9152 )  
							
							 
							
							
							
						 
						
							2023-10-13 13:44:31 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								b8aee7bb1b 
								
							 
						 
						
							
							
								
								LLM: Fix Qwen kv_cache optimization ( #9148 )  
							
							 
							
							... 
							
							
							
							* first commit
* ut pass
* accelerate rotate half by using common util function
* fix style 
							
						 
						
							2023-10-12 15:49:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								69942d3826 
								
							 
						 
						
							
							
								
								LLM: fix model check before attention optimization ( #9149 )  
							
							 
							
							
							
						 
						
							2023-10-12 15:21:51 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									JIN Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								1a1ddc4144 
								
							 
						 
						
							
							
								
								LLM: Add Replit CPU and GPU example ( #9028 )  
							
							 
							
							
							
						 
						
							2023-10-12 13:42:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									JIN Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								d74834ff4c 
								
							 
						 
						
							
							
								
								LLM: add gpu pytorch-models example llama2 and chatglm2 ( #9142 )  
							
							 
							
							
							
						 
						
							2023-10-12 13:41:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								4f34557224 
								
							 
						 
						
							
							
								
								LLM: support num_beams in all-in-one benchmark ( #9141 )  
							
							 
							
							... 
							
							
							
							* support num_beams
* fix 
							
						 
						
							2023-10-12 13:35:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								62ac7ae444 
								
							 
						 
						
							
							
								
								LLM: fix inaccurate input / output tokens of current all-in-one benchmark ( #9137 )  
							
							 
							
							... 
							
							
							
							* first fix
* fix all apis
* fix 
							
						 
						
							2023-10-11 17:13:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								eb3fb18eb4 
								
							 
						 
						
							
							
								
								LLM: improve PyTorch API doc ( #9128 )  
							
							 
							
							
							
						 
						
							2023-10-11 15:03:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								995b0f119f 
								
							 
						 
						
							
							
								
								LLM: update some gpu examples ( #9136 )  
							
							 
							
							
							
						 
						
							2023-10-11 14:23:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								1c8d5da362 
								
							 
						 
						
							
							
								
								LLM: fix llama tokenizer for all-in-one benchmark ( #9129 )  
							
							 
							
							... 
							
							
							
							* fix tokenizer for gpu benchmark
* fix ipex fp16
* meet code review
* fix 
							
						 
						
							2023-10-11 13:39:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								2ad67a18b1 
								
							 
						 
						
							
							
								
								LLM: add mistral examples ( #9121 )  
							
							 
							
							
							
						 
						
							2023-10-11 13:38:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								1363e666fc 
								
							 
						 
						
							
							
								
								LLM: update benchmark_util.py for beam search ( #9126 )  
							
							 
							
							... 
							
							
							
							* update reorder_cache
* fix 
							
						 
						
							2023-10-11 09:41:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guoqiong Song 
								
							 
						 
						
							
							
							
							
								
							
							
								e8c5645067 
								
							 
						 
						
							
							
								
								add LLM example of aquila on GPU ( #9056 )  
							
							 
							
							... 
							
							
							
							* aquila, dolly-v1, dolly-v2, vacuna 
							
						 
						
							2023-10-10 17:01:35 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								388f688ef3 
								
							 
						 
						
							
							
								
								LLM: update setup.py to add bigdl-core-xe package ( #9122 )  
							
							 
							
							
							
						 
						
							2023-10-10 15:02:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
							
							
								
							
							
								1709beba5b 
								
							 
						 
						
							
							
								
								LLM: Explicitly close pickle file pointer before removing temporary directory ( #9120 )  
							
							 
							
							... 
							
							
							
							* fp close 
							
						 
						
							2023-10-10 14:57:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								0e09dd926b 
								
							 
						 
						
							
							
								
								[LLM] Fix example test ( #9118 )  
							
							 
							
							... 
							
							
							
							* Update llm example test link due to example layout change
* Add better change detect 
							
						 
						
							2023-10-10 13:24:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								ad7d9231f5 
								
							 
						 
						
							
							
								
								LLM: add benchmark script for Max gpu and ipex fp16 gpu ( #9112 )  
							
							 
							
							... 
							
							
							
							* add pvc bash
* meet code review
* rename to run-max-gpu.sh 
							
						 
						
							2023-10-10 10:18:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								e4d1457a70 
								
							 
						 
						
							
							
								
								LLM: improve transformers style API doc ( #9113 )  
							
							 
							
							
							
						 
						
							2023-10-10 09:31:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								65212451cc 
								
							 
						 
						
							
							
								
								[LLM] Small update to performance tests ( #9106 )  
							
							 
							
							... 
							
							
							
							* small updates to llm performance tests regarding model handling
* Small fix 
							
						 
						
							2023-10-09 16:55:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
							
							
								
							
							
								edccfb2ed3 
								
							 
						 
						
							
							
								
								LLM: Check model device type ( #9092 )  
							
							 
							
							... 
							
							
							
							* check model device 
							
						 
						
							2023-10-09 15:49:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								5e9962b60e 
								
							 
						 
						
							
							
								
								LLM: update example layout ( #9046 )  
							
							 
							
							
							
						 
						
							2023-10-09 15:36:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
							
							
								
							
							
								4c4f8d1663 
								
							 
						 
						
							
							
								
								[LLM]Fix Arc falcon abnormal output issue ( #9096 )  
							
							 
							
							... 
							
							
							
							* update
* update
* fix error & style
* fix style
* update train
* to input_seq_size 
							
						 
						
							2023-10-09 15:09:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Zhao Changmin 
								
							 
						 
						
							
							
							
							
								
							
							
								548e4dd5fe 
								
							 
						 
						
							
							
								
								LLM: Adapt transformers models for optimize model SL ( #9022 )  
							
							 
							
							... 
							
							
							
							* LLM: Adapt transformers model for SL 
							
						 
						
							2023-10-09 11:13:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								f64257a093 
								
							 
						 
						
							
							
								
								LLM: basic api support for esimd fp16 ( #9067 )  
							
							 
							
							... 
							
							
							
							* basic api support for fp16
* fix style
* fix
* fix error and style
* fix style
* meet code review
* update based on comments 
							
						 
						
							2023-10-09 11:05:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									JIN Qiao 
								
							 
						 
						
							
							
							
							
								
							
							
								65373d2a8b 
								
							 
						 
						
							
							
								
								LLM: adjust portable zip content ( #9054 )  
							
							 
							
							... 
							
							
							
							* LLM: adjust portable zip content
* LLM: adjust portable zip README 
							
						 
						
							2023-10-09 10:51:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								b3e94a32d4 
								
							 
						 
						
							
							
								
								change log4error import ( #9098 )  
							
							 
							
							
							
						 
						
							2023-10-08 09:23:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								78ea7ddb1c 
								
							 
						 
						
							
							
								
								Combine apply_rotary_pos_emb for gpt-neox ( #9074 )  
							
							 
							
							
							
						 
						
							2023-10-07 16:27:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								36dd4afd61 
								
							 
						 
						
							
							
								
								Fix llama when rope scaling is not None ( #9086 )  
							
							 
							
							... 
							
							
							
							* Fix llama when rope scaling is not None
* fix style
* fix style 
							
						 
						
							2023-10-06 13:27:37 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								fcb1c618a0 
								
							 
						 
						
							
							
								
								using bigdl-llm fused rope for llama ( #9066 )  
							
							 
							
							... 
							
							
							
							* optimize llama xpu rope
* fix bug
* fix style
* refine append cache
* remove check
* do not cache cos sin
* remove unnecessary changes
* clean up
* fix style
* check for training 
							
						 
						
							2023-10-06 09:57:29 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								aefa5a5bfe 
								
							 
						 
						
							
							
								
								Qwen kv cache ( #9079 )  
							
							 
							
							... 
							
							
							
							* qwen and aquila
* update
* update
* style 
							
						 
						
							2023-10-05 11:59:17 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								d5ca1f32b6 
								
							 
						 
						
							
							
								
								Aquila KV cache optimization ( #9080 )  
							
							 
							
							... 
							
							
							
							* update
* update
* style 
							
						 
						
							2023-10-05 11:10:57 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								88565c76f6 
								
							 
						 
						
							
							
								
								add export merged model example ( #9018 )  
							
							 
							
							... 
							
							
							
							* add export merged model example
* add sources
* add script
* fix style 
							
						 
						
							2023-10-04 21:18:52 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yang Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								0cd8f1c79c 
								
							 
						 
						
							
							
								
								Use ipex fused rms norm for llama ( #9081 )  
							
							 
							
							... 
							
							
							
							* also apply rmsnorm
* fix cpu 
							
						 
						
							2023-10-04 21:04:55 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								fb883100e7 
								
							 
						 
						
							
							
								
								LLM: support chatglm-18b convert attention forward in benchmark scripts. ( #9072 )  
							
							 
							
							... 
							
							
							
							* add chatglm-18b convert.
* fix if statement.
* fix 
							
						 
						
							2023-09-28 14:04:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yishuo Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								6de2189e90 
								
							 
						 
						
							
							
								
								[LLM] fix chatglm main choice ( #9073 )  
							
							 
							
							
							
						 
						
							2023-09-28 11:23:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								ad62c58b33 
								
							 
						 
						
							
							
								
								LLM: Enable jemalloc in benchmark scripts. ( #9058 )  
							
							 
							
							... 
							
							
							
							* enable jemalloc.
* fix readme. 
							
						 
						
							2023-09-26 15:37:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								b4a1266ef0 
								
							 
						 
						
							
							
								
								[WIP] LLM: add kv cache support for internlm. ( #9036 )  
							
							 
							
							... 
							
							
							
							* LLM: add kv cache support for internlm
* add internlm apply_rotary_pos_emb
* fix.
* fix style. 
							
						 
						
							2023-09-25 14:16:59 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								975da86e00 
								
							 
						 
						
							
							
								
								LLM: fix gptneox kv cache ( #9044 )  
							
							 
							
							
							
						 
						
							2023-09-25 13:03:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								26213a5829 
								
							 
						 
						
							
							
								
								LLM: Change benchmark bf16 load format. ( #9035 )  
							
							 
							
							... 
							
							
							
							* LLM: Change benchmark bf16 load format.
* comment on bf16 chatglm.
* fix. 
							
						 
						
							2023-09-22 17:38:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									JinBridge 
								
							 
						 
						
							
							
							
							
								
							
							
								023555fb1f 
								
							 
						 
						
							
							
								
								LLM: Add one-click installer for Windows ( #8999 )  
							
							 
							
							... 
							
							
							
							* LLM: init one-click installer for windows
* LLM: fix typo in one-click installer readme
* LLM: one-click installer try except logic
* LLM: one-click installer add dependency
* LLM: one-click installer adjust README.md
* LLM: one-click installer split README and add zip compress in setup.bat
* LLM: one-click installer verified internlm and llama2 and replace gif
* LLM: remove one-click installer images
* LLM: finetune the one-click installer README.md
* LLM: fix typo in one-click installer README.md
* LLM: rename one-click installer to protable executable
* LLM: rename other places to protable executable
* LLM: rename the zip filename to executable
* LLM: update .gitignore
* LLM: add colorama to setup.bat 
							
						 
						
							2023-09-22 14:46:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Jiao Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								028a6d9383 
								
							 
						 
						
							
							
								
								MPT model optimize for long sequence ( #9020 )  
							
							 
							
							... 
							
							
							
							* mpt_long_seq
* update
* update
* update
* style
* style2
* update 
							
						 
						
							2023-09-21 21:27:23 -07:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								b943d73844 
								
							 
						 
						
							
							
								
								LLM: refactor kv cache ( #9030 )  
							
							 
							
							... 
							
							
							
							* refactor utils
* meet code review; update all models
* small fix 
							
						 
						
							2023-09-21 21:28:03 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								868511cf02 
								
							 
						 
						
							
							
								
								LLM: fix kv cache issue of bloom and falcon. ( #9029 )  
							
							 
							
							
							
						 
						
							2023-09-21 18:12:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								bf51ec40b2 
								
							 
						 
						
							
							
								
								LLM: Fix empty cache ( #9024 )  
							
							 
							
							... 
							
							
							
							* fix
* fix
* update example 
							
						 
						
							2023-09-21 17:16:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yina Chen 
								
							 
						 
						
							
							
							
							
								
							
							
								714884414e 
								
							 
						 
						
							
							
								
								fix error ( #9025 )  
							
							 
							
							
							
						 
						
							2023-09-21 16:42:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								edb225530b 
								
							 
						 
						
							
							
								
								add bark ( #9016 )  
							
							 
							
							
							
						 
						
							2023-09-21 12:24:58 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									SONG Ge 
								
							 
						 
						
							
							
							
							
								
							
							
								fa47967583 
								
							 
						 
						
							
							
								
								[LLM] Optimize kv_cache for gptj model family ( #9010 )  
							
							 
							
							... 
							
							
							
							* optimize gptj model family attention
* add license and comment for dolly-model
* remove xpu mentioned
* remove useless info
* code sytle
* style fix
* code style in gptj fix
* remove gptj arch
* move apply_rotary_pos_emb into utils
* kv_seq_length update
* use hidden_states instead of query layer to reach batch size 
							
						 
						
							2023-09-21 10:42:08 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								b3cad7de57 
								
							 
						 
						
							
							
								
								LLM: add bloom kv cache support ( #9012 )  
							
							 
							
							... 
							
							
							
							* LLM: add bloom kv cache support
* fix style. 
							
						 
						
							2023-09-20 21:10:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								156af15d1e 
								
							 
						 
						
							
							
								
								Add NF3 ( #9008 )  
							
							 
							
							... 
							
							
							
							* add nf3
* grammar 
							
						 
						
							2023-09-20 20:03:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								6981745fe4 
								
							 
						 
						
							
							
								
								Optimize kv_cache for gpt-neox model family ( #9015 )  
							
							 
							
							... 
							
							
							
							* override gptneox
* style
* move to utils
* revert 
							
						 
						
							2023-09-20 19:59:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									JinBridge 
								
							 
						 
						
							
							
							
							
								
							
							
								48b503c630 
								
							 
						 
						
							
							
								
								LLM: add example of aquila ( #9006 )  
							
							 
							
							... 
							
							
							
							* LLM: add example of aquila
* LLM: replace AquilaChat with Aquila
* LLM: shorten prompt of aquila example 
							
						 
						
							2023-09-20 15:52:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								735a17f7b4 
								
							 
						 
						
							
							
								
								LLM: add kv cache to falcon family. ( #8995 )  
							
							 
							
							... 
							
							
							
							* add kv cache to falcon family.
* fix: import error.
* refactor
* update comments.
* add two version falcon attention forward.
* fix
* fix.
* fix.
* fix.
* fix style.
* fix style. 
							
						 
						
							2023-09-20 15:36:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								94a7f8917b 
								
							 
						 
						
							
							
								
								LLM: fix optimized kv cache for baichuan-13b ( #9009 )  
							
							 
							
							... 
							
							
							
							* fix baichuan 13b
* fix style
* fix
* fix style 
							
						 
						
							2023-09-20 15:30:14 +08:00