binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								db8e90796a 
								
							 
						 
						
							
							
								
								LLM: add avg token latency information and benchmark guide of autotp ( #9940 )  
							
							 
							
							
							
						 
						
							2024-01-19 15:09:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								610b5226be 
								
							 
						 
						
							
							
								
								move reserved memory to benchmark_utils.py ( #9907 )  
							
							 
							
							
							
							
							* move reserved memory to benchmark_utils.py
* meet code review 
							
						 
						
							2024-01-19 09:44:30 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								a8c866c32b 
								
							 
						 
						
							
							
								
								add ppl benchmark ( #9914 )  
							
							 
							
							
							
							
							* add ppl benchmark
* add license
* add readme
* add dataset argument
* add dataset usage
* fixed low bit args
* correct result
* fix terminal display
* fix ppl update
* enable fp16 fp32 bf16
* format the desc
* fix model_kwargs
* add more readme 
							
						 
						
							2024-01-18 17:54:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								100e0a87e5 
								
							 
						 
						
							
							
								
								LLM: add compressed chatglm3 model ( #9892 )  
							
							 
							
							
							
							
							* LLM: add compressed chatglm3 model
* small fix
* revert github action 
							
						 
						
							2024-01-18 17:48:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								b059a32fff 
								
							 
						 
						
							
							
								
								LLM: add benchmark api for bigdl-llm fp16 on GPU ( #9919 )  
							
							 
							
							
							
							
							* add bmk for bigdl fp16
* fix 
							
						 
						
							2024-01-17 14:24:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								511cbcf773 
								
							 
						 
						
							
							
								
								LLM: add Ceval benchmark test. ( #9872 )  
							
							 
							
							
							
							
							* init ceval benchmark test.
* upload dataset.
* add other tests.
* add qwen evaluator.
* fix qwen evaluator style.
* fix qwen evaluator style.
* update qwen evaluator.
* add llama evaluator.
* update eval
* fix typo.
* fix
* fix typo.
* fix llama evaluator.
* fix bug.
* fix style.
* delete dataset.
* fix style.
* fix style.
* add README.md and fix typo.
* fix comments.
* remove run scripts 
							
						 
						
							2024-01-16 19:14:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								0e69bfe6b0 
								
							 
						 
						
							
							
								
								LLM: fix the performance drop of starcoder ( #9889 )  
							
							 
							
							
							
							
							* LLM: fix the performance drop of starcoder
* small fix
* small fix 
							
						 
						
							2024-01-12 09:14:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ziteng Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								4f4ce73f31 
								
							 
						 
						
							
							
								
								[LLM] Add transformer_autocast_bf16 into all-in-one ( #9890 )  
							
							 
							
							
							
							
							* Add transformer_autocast_bf16 into all-in-one 
							
						 
						
							2024-01-11 17:51:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								33fd1f9c76 
								
							 
						 
						
							
							
								
								LLM: fix input length logic for run_transformer_int4_gpu ( #9864 )  
							
							 
							
							
							
							
							* LLM: fix input length logic for run_transformer_int4_gpu
* small fix
* small fix
* small fix 
							
						 
						
							2024-01-10 18:20:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cheen Hau, 俊豪 
								
							 
						 
						
							
							
							
							
								
							
							
								b2aa267f50 
								
							 
						 
						
							
							
								
								Enhance LLM GPU installation document ( #9828 )  
							
							 
							
							
							
							
							* Improve gpu install doc
* Add troubleshooting - setvars.sh not done properly.
* Further improvements
* 2024.x.x -> 2024.0
* Fixes
* Fix Install BigDL-LLM From Wheel : bigdl-llm[xpu_2.0]
* Remove "export USE_XETLA=OFF" for Max GPU 
							
						 
						
							2024-01-09 16:30:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								f6bb4ab313 
								
							 
						 
						
							
							
								
								Arc stress test ( #9795 )  
							
							 
							
							
							
							
							* add arc stress test
* triger ci
* triger CI
* triger ci
* disable ci 
							
						 
						
							2023-12-27 21:02:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shaojun Liu 
								
							 
						 
						
							
							
							
							
								
							
							
								6c75c689ea 
								
							 
						 
						
							
							
								
								bigdl-llm stress test for stable version ( #9781 )  
							
							 
							
							
							
							
							* 1k-512 2k-512 baseline
* add cpu stress test
* update yaml name
* update
* update
* clean up
* test
* update
* update
* update
* test
* update 
							
						 
						
							2023-12-27 15:40:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								5cfb4c4f5b 
								
							 
						 
						
							
							
								
								Arc stable version performance regression test ( #9785 )  
							
							 
							
							
							
							
							* add arc stable version regression test
* empty gpu mem between different models
* triger ci
* comment spr test
* triger ci
* address kai's comments and disable ci
* merge fp8 and int4
* disable ci 
							
						 
						
							2023-12-27 11:01:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								c05d7e1532 
								
							 
						 
						
							
							
								
								LLM: add star_corder_15.5b model ( #9772 )  
							
							 
							
							
							
							
							* LLM: add star_corder_15.5b model
* revert llm_performance_tests.yml 
							
						 
						
							2023-12-26 18:55:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									dingbaorong 
								
							 
						 
						
							
							
							
							
								
							
							
								64d05e581c 
								
							 
						 
						
							
							
								
								add peak gpu mem stats in transformer_int4_gpu ( #9766 )  
							
							 
							
							
							
							
							* add peak gpu mem stats in transformer_int4_gpu
* address weiguang's comments 
							
						 
						
							2023-12-26 15:38:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								7fd7c37e1b 
								
							 
						 
						
							
							
								
								Enable fp8e5 harness ( #9761 )  
							
							 
							
							
							
							
							* fix precision format like fp8e5
* match fp8_e5m2 
							
						 
						
							2023-12-22 16:59:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								474c099559 
								
							 
						 
						
							
							
								
								LLM: using separate threads to do inference ( #9727 )  
							
							 
							
							
							
							
							* using separate threads to do inference
* resolve some comments
* resolve some comments
* revert llm_performance_tests.yml file 
							
						 
						
							2023-12-21 17:56:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								b06a3146c8 
								
							 
						 
						
							
							
								
								Fix 70b oom ( #9738 )  
							
							 
							
							
							
							
							* add default value to bigdl llm
* fix model oom 
							
						 
						
							2023-12-21 10:40:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								3e8d198b57 
								
							 
						 
						
							
							
								
								LLM: add eval func ( #9662 )  
							
							 
							
							
							
							
							* Add eval func
* add left eval 
							
						 
						
							2023-12-14 14:59:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								cbdd49f229 
								
							 
						 
						
							
							
								
								[LLM] win igpu performance for ipex 2.1 and oneapi 2024.0 ( #9679 )  
							
							 
							
							
							
							
							* Change igpu win tests for ipex 2.1 and oneapi 2024.0
* Qwen model repo id updates; updates model list for 512-64
* Add .eval for win igpu all-in-one benchmark for best performance 
							
						 
						
							2023-12-13 18:52:29 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Mingyu Wei 
								
							 
						 
						
							
							
							
							
								
							
							
								16febc949c 
								
							 
						 
						
							
							
								
								[LLM] Add exclude option in all-in-one performance test ( #9632 )  
							
							 
							
							
							
							
							* add exclude option in all-in-one perf test
* update arc-perf-test.yaml
* Exclude in_out_pairs in main function
* fix some bugs
* address Kai's comments
* define excludes at the beginning
* add bloomz:2048 to exclude 
							
						 
						
							2023-12-13 18:13:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								968d99e6f5 
								
							 
						 
						
							
							
								
								Remove empty cache between each iteration of generation ( #9660 )  
							
							 
							
							
							
						 
						
							2023-12-12 17:24:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								972cdb9992 
								
							 
						 
						
							
							
								
								gsm8k OOM workaround ( #9597 )  
							
							 
							
							
							
							
							* update bigdl_llm.py
* update the installation of harness
* fix partial function
* import ipex
* force seq len in decrease order
* put func outside class
* move comments
* default 'trust_remote_code' as True
* Update llm-harness-evaluation.yml 
							
						 
						
							2023-12-08 18:47:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								e9299adb3b 
								
							 
						 
						
							
							
								
								LLM: Highlight some values in the html ( #9635 )  
							
							 
							
							
							
							
							* highlight some values in the html
* revert the llm_performance_tests.yml 
							
						 
						
							2023-12-07 19:02:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								48b85593b3 
								
							 
						 
						
							
							
								
								Update all-in-one benchmark readme ( #9618 )  
							
							 
							
							
							
						 
						
							2023-12-07 10:32:09 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								0e8f4020e5 
								
							 
						 
						
							
							
								
								Add traceback error output for win igpu test api in benchmark ( #9607 )  
							
							 
							
							
							
						 
						
							2023-12-06 14:35:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								c998f5f2ba 
								
							 
						 
						
							
							
								
								[LLM] iGPU long context tests ( #9598 )  
							
							 
							
							
							
							
							* Temp enable PR
* Enable tests for 256-64
* Try again 128-64
* Empty cache after each iteration for igpu benchmark scripts
* Try tests for 512
* change order for 512
* Skip chatglm3 and llama2 for now
* Separate tests for 512-64
* Small fix
* Further fixes
* Change back to nightly again 
							
						 
						
							2023-12-06 10:19:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								8c8a27ded7 
								
							 
						 
						
							
							
								
								Add harness summary job ( #9457 )  
							
							 
							
							
							
							
							* format yml
* add make_table_results
* add summary job
* add a job to print single result
* upload full directory 
							
						 
						
							2023-12-05 10:04:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								3f4ad97929 
								
							 
						 
						
							
							
								
								[LLM] Add performance tests for windows iGPU ( #9584 )  
							
							 
							
							
							
							
							* Add support for win gpu benchmark with peak gpu memory monitoring
* Add win igpu tests
* Small fix
* Forward outputs
* Small fix
* Test and small fixes
* Small fix
* Small fix and test
* Small fixes
* Add tests for 512-64 and change back to nightly tests
* Small fix 
							
						 
						
							2023-12-04 20:50:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								cb228c70ea 
								
							 
						 
						
							
							
								
								Add harness nightly ( #9552 )  
							
							 
							
							
							
							
							* modify output_path as a directory
* schedule nightly at 21 on Friday
* add tasks and models for nightly
* add accuracy regression
* comment out if to test
* mixed fp4
* for test
* add  missing delimiter
* remove comma
* fixed golden results
* add mixed 4 golden result
* add more options
* add mistral results
* get golden result of stable lm
* move nightly scripts and results to test folder
* add license
* add fp8 stable lm golden
* run on all available devices
* trigger only when ready for review
* fix new line
* update golden
* add mistral 
							
						 
						
							2023-12-01 14:16:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								4d7d5d4c59 
								
							 
						 
						
							
							
								
								Add 3 leaderboard tasks ( #9566 )  
							
							 
							
							
							
							
							* update leaderboard map
* download model and dataset without overwritten
* fix task drop
* run on all available devices 
							
						 
						
							2023-12-01 14:01:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								c8e0c2ed48 
								
							 
						 
						
							
							
								
								Fixed dumped logs in harness ( #9549 )  
							
							 
							
							
							
							
							* install transformers==4.34.0
* modify output_path as a directory
* add device and task to output dir parents 
							
						 
						
							2023-11-30 12:47:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								45820cf3b9 
								
							 
						 
						
							
							
								
								add optimize model option ( #9530 )  
							
							 
							
							
							
						 
						
							2023-11-24 17:10:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								bf579507c2 
								
							 
						 
						
							
							
								
								Integrate vllm ( #9310 )  
							
							 
							
							
							
							
							* done
* Rename structure
* add models
* Add structure/sampling_params,sequence
* add input_metadata
* add outputs
* Add policy,logger
* add and update
* add parallelconfig back
* core/scheduler.py
* Add llm_engine.py
* Add async_llm_engine.py
* Add tested entrypoint
* fix minor error
* Fix everything
* fix kv cache view
* fix
* fix
* fix
* format&refine
* remove logger from repo
* try to add token latency
* remove logger
* Refine config.py
* finish worker.py
* delete utils.py
* add license
* refine
* refine sequence.py
* remove sampling_params.py
* finish
* add license
* format
* add license
* refine
* refine
* Refine line too long
* remove exception
* so dumb style-check
* refine
* refine
* refine
* refine
* refine
* refine
* add README
* refine README
* add warning instead error
* fix padding
* add license
* format
* format
* format fix
* Refine vllm dependency (#1 )
vllm dependency clear
* fix licence
* fix format
* fix format
* fix
* adapt LLM engine
* fix
* add license
* fix format
* fix
* Moving README.md to the correct position
* Fix readme.md
* done
* guide for adding models
* fix
* Fix README.md
* Add new model readme
* remove ray-logic
* refactor arg_utils.py
* remove distributed_init_method logic
* refactor entrypoints
* refactor input_metadata
* refactor model_loader
* refactor utils.py
* refactor models
* fix api server
* remove vllm.stucture
* revert by txy 1120
* remove utils
* format
* fix license
* add bigdl model
* Refer to a specfic commit
* Change code base
* add comments
* add async_llm_engine comment
* refine
* formatted
* add worker comments
* add comments
* add comments
* fix style
* add changes
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com> 
							
						 
						
							2023-11-23 16:46:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								139e98aa18 
								
							 
						 
						
							
							
								
								LLM: quick fix benchmark ( #9509 )  
							
							 
							
							
							
						 
						
							2023-11-22 10:19:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								c2aeb4d1e8 
								
							 
						 
						
							
							
								
								del model after test ( #9504 )  
							
							 
							
							
							
						 
						
							2023-11-21 18:41:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cheen Hau, 俊豪 
								
							 
						 
						
							
							
							
							
								
							
							
								3e39828420 
								
							 
						 
						
							
							
								
								Update all in one benchmark readme ( #9496 )  
							
							 
							
							
							
							
							* Add gperftools install to all in one benchmark readme
* Update readme 
							
						 
						
							2023-11-21 14:57:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								c487b53f21 
								
							 
						 
						
							
							
								
								LLM: only run arc perf test nightly ( #9448 )  
							
							 
							
							
							
							
							* LLM: only run arc perf test nightly
* deleted unused python scripts
* rebase main 
							
						 
						
							2023-11-15 19:38:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								dbbdb53a18 
								
							 
						 
						
							
							
								
								fix multiple gpu usage ( #9459 )  
							
							 
							
							
							
						 
						
							2023-11-14 17:06:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								d19ca21957 
								
							 
						 
						
							
							
								
								patch bigdl-llm model to harness by binding instead of patch file ( #9420 )  
							
							 
							
							
							
							
							* add run_llb.py
* fix args interpret
* modify outputs
* update workflow
* add license
* test mixed 4 bit
* update readme
* use autotokenizer
* add timeout
* refactor workflow file
* fix working directory
* fix env
* throw exception if some jobs failed
* improve terminal outputs
* Disable var which cause the run stuck
* fix unknown precision
* fix key error
* directly output config instead
* rm harness submodule 
							
						 
						
							2023-11-14 12:51:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								0ecb9efb05 
								
							 
						 
						
							
							
								
								use AutoTokenizer to enable more models ( #9446 )  
							
							 
							
							
							
						 
						
							2023-11-13 17:47:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								ece5805572 
								
							 
						 
						
							
							
								
								LLM: add chatglm3-6b to latency benchmark test. ( #9442 )  
							
							 
							
							
							
						 
						
							2023-11-13 17:24:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								5747e2fe69 
								
							 
						 
						
							
							
								
								fix multiple gpu usage of harness ( #9444 )  
							
							 
							
							
							
						 
						
							2023-11-13 16:53:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								b23b91407c 
								
							 
						 
						
							
							
								
								fix llm-init on deepspeed missing lib ( #9419 )  
							
							 
							
							
							
						 
						
							2023-11-10 13:51:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								298b64217e 
								
							 
						 
						
							
							
								
								add auto triggered acc test ( #9364 )  
							
							 
							
							
							
							
							* add auto triggered acc test
* use llama 7b instead
* fix env
* debug download
* fix download prefix
* add cut dirs
* fix env of model path
* fix dataset download
* full job
* source xpu env vars
* use matrix to trigger model run
* reset batch=1
* remove redirect
* remove some trigger
* add task matrix
* add precision list
* test llama-7b-chat
* use /mnt/disk1 to store model and datasets
* remove installation test
* correct downloading path
* fix HF vars
* add bigdl-llm env vars
* rename file
* fix hf_home
* fix script path
* rename as harness evalution
* rerun 
							
						 
						
							2023-11-08 10:22:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								84ab614aab 
								
							 
						 
						
							
							
								
								LLM: add more models and skip runtime error ( #9349 )  
							
							 
							
							
							
							
							* add more models and skip runtime error
* upgrade transformers
* temporarily removed Mistral-7B-v0.1
* temporarily disable the upload of arc perf result 
							
						 
						
							2023-11-08 09:45:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								af94058203 
								
							 
						 
						
							
							
								
								[LLM] Support CPU deepspeed distributed inference ( #9259 )  
							
							 
							
							
							
							
							* [LLM] Support CPU Deepspeed distributed inference
* Update run_deepspeed.py
* Rename
* fix style
* add new codes
* refine
* remove annotated codes
* refine
* Update README.md
* refine doc and example code 
							
						 
						
							2023-11-06 17:56:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								d4dffbdb62 
								
							 
						 
						
							
							
								
								Merge harness ( #9319 )  
							
							 
							
							
							
							
							* add harness patch and llb script
* add readme
* add license
* use patch instead
* update readme
* rename tests to evaluation
* fix typo
* remove nano dependency
* add original harness link
* rename title of usage
* rename BigDLGPULM as BigDLLM
* empty commit to rerun job 
							
						 
						
							2023-11-02 15:14:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								7e73c354a6 
								
							 
						 
						
							
							
								
								LLM: decoupling bigdl-llm and bigdl-nano ( #9306 )  
							
							 
							
							
							
						 
						
							2023-11-01 11:00:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								770ac70b00 
								
							 
						 
						
							
							
								
								LLM: add low_bit option in benchmark scripts ( #9257 )  
							
							 
							
							
							
						 
						
							2023-10-25 10:27:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								ec9195da42 
								
							 
						 
						
							
							
								
								LLM: using html to visualize the perf result for Arc ( #9228 )  
							
							 
							
							
							
							
							* LLM: using html to visualize the perf result for Arc
* deploy the html file
* add python license
* resolve some comments 
							
						 
						
							2023-10-24 18:05:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								b15656229e 
								
							 
						 
						
							
							
								
								LLM: fix benchmark issue ( #9255 )  
							
							 
							
							
							
						 
						
							2023-10-24 14:15:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								b9194c5786 
								
							 
						 
						
							
							
								
								LLM: skip some model tests using certain api ( #9163 )  
							
							 
							
							
							
							
							* LLM: Skip some model tests using certain api
* initialize variable named result 
							
						 
						
							2023-10-18 09:39:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								4f34557224 
								
							 
						 
						
							
							
								
								LLM: support num_beams in all-in-one benchmark ( #9141 )  
							
							 
							
							
							
							
							* support num_beams
* fix 
							
						 
						
							2023-10-12 13:35:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								62ac7ae444 
								
							 
						 
						
							
							
								
								LLM: fix inaccurate input / output tokens of current all-in-one benchmark ( #9137 )  
							
							 
							
							
							
							
							* first fix
* fix all apis
* fix 
							
						 
						
							2023-10-11 17:13:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								1c8d5da362 
								
							 
						 
						
							
							
								
								LLM: fix llama tokenizer for all-in-one benchmark ( #9129 )  
							
							 
							
							
							
							
							* fix tokenizer for gpu benchmark
* fix ipex fp16
* meet code review
* fix 
							
						 
						
							2023-10-11 13:39:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								1363e666fc 
								
							 
						 
						
							
							
								
								LLM: update benchmark_util.py for beam search ( #9126 )  
							
							 
							
							
							
							
							* update reorder_cache
* fix 
							
						 
						
							2023-10-11 09:41:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								0e09dd926b 
								
							 
						 
						
							
							
								
								[LLM] Fix example test ( #9118 )  
							
							 
							
							
							
							
							* Update llm example test link due to example layout change
* Add better change detect 
							
						 
						
							2023-10-10 13:24:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								ad7d9231f5 
								
							 
						 
						
							
							
								
								LLM: add benchmark script for Max gpu and ipex fp16 gpu ( #9112 )  
							
							 
							
							
							
							
							* add pvc bash
* meet code review
* rename to run-max-gpu.sh 
							
						 
						
							2023-10-10 10:18:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								65212451cc 
								
							 
						 
						
							
							
								
								[LLM] Small update to performance tests ( #9106 )  
							
							 
							
							
							
							
							* small updates to llm performance tests regarding model handling
* Small fix 
							
						 
						
							2023-10-09 16:55:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								78ea7ddb1c 
								
							 
						 
						
							
							
								
								Combine apply_rotary_pos_emb for gpt-neox ( #9074 )  
							
							 
							
							
							
						 
						
							2023-10-07 16:27:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								ad62c58b33 
								
							 
						 
						
							
							
								
								LLM: Enable jemalloc in benchmark scripts. ( #9058 )  
							
							 
							
							
							
							
							* enable jemalloc.
* fix readme. 
							
						 
						
							2023-09-26 15:37:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								26213a5829 
								
							 
						 
						
							
							
								
								LLM: Change benchmark bf16 load format. ( #9035 )  
							
							 
							
							
							
							
							* LLM: Change benchmark bf16 load format.
* comment on bf16 chatglm.
* fix. 
							
						 
						
							2023-09-22 17:38:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								6981745fe4 
								
							 
						 
						
							
							
								
								Optimize kv_cache for gpt-neox model family ( #9015 )  
							
							 
							
							
							
							
							* override gptneox
* style
* move to utils
* revert 
							
						 
						
							2023-09-20 19:59:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								37bb0cbf8f 
								
							 
						 
						
							
							
								
								Speed up gpt-j in gpubenchmark ( #9000 )  
							
							 
							
							
							
							
							* Speedup gpt-j in gpubenchmark
* meet code review 
							
						 
						
							2023-09-19 14:22:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								8299b68fea 
								
							 
						 
						
							
							
								
								update readme. ( #8996 )  
							
							 
							
							
							
						 
						
							2023-09-18 17:06:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								74338fd291 
								
							 
						 
						
							
							
								
								LLM: add auto torch dtype in benchmark. ( #8981 )  
							
							 
							
							
							
						 
						
							2023-09-18 15:48:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								32716106e0 
								
							 
						 
						
							
							
								
								update use_cache=True ( #8986 )  
							
							 
							
							
							
						 
						
							2023-09-18 07:59:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								64ee1d7689 
								
							 
						 
						
							
							
								
								update run_transformer_int4_gpu ( #8983 )  
							
							 
							
							
							
							
							* xpuperf
* update run.py
* clean up
* update
* update
* meet code review 
							
						 
						
							2023-09-15 15:10:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								cca84b0a64 
								
							 
						 
						
							
							
								
								LLM: update llm benchmark scripts. ( #8943 )  
							
							 
							
							
							
							
							* update llm benchmark scripts.
* change tranformer_bf16 to pytorch_autocast_bf16.
* add autocast in transformer int4.
* revert autocast.
* add "pytorch_autocast_bf16" to doc
* fix comments. 
							
						 
						
							2023-09-13 12:23:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								ea0853c0b5 
								
							 
						 
						
							
							
								
								update benchmark_utils readme ( #8925 )  
							
							 
							
							
							
							
							* update readme
* meet code review 
							
						 
						
							2023-09-08 10:30:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								3d2efe9608 
								
							 
						 
						
							
							
								
								LLM: update llm latency benchmark. ( #8922 )  
							
							 
							
							
							
						 
						
							2023-09-07 19:00:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								7897eb4b51 
								
							 
						 
						
							
							
								
								LLM: add benchmark scripts on GPU ( #8916 )  
							
							 
							
							
							
						 
						
							2023-09-07 18:08:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								d8a01d7c4f 
								
							 
						 
						
							
							
								
								fix chatglm in run.py ( #8919 )  
							
							 
							
							
							
						 
						
							2023-09-07 16:44:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								e9de9d9950 
								
							 
						 
						
							
							
								
								benchmark for native int4  ( #8918 )  
							
							 
							
							
							
							
							* native4
* update
* update
* update 
							
						 
						
							2023-09-07 15:56:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								057e77e229 
								
							 
						 
						
							
							
								
								LLM: update benchmark_utils.py to handle do_sample=True ( #8903 )  
							
							 
							
							
							
						 
						
							2023-09-07 14:20:47 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								5d9942a3ca 
								
							 
						 
						
							
							
								
								transformer int4 and native int4's benchmark script for 32 256 1k 2k input ( #8871 )  
							
							 
							
							
							
							
							* transformer
* move
* update
* add header
* update all-in-one
* clean up 
							
						 
						
							2023-09-07 09:49:55 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								49a39452c6 
								
							 
						 
						
							
							
								
								update benchmark ( #8899 )  
							
							 
							
							
							
						 
						
							2023-09-06 15:11:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Song Jiaming 
								
							 
						 
						
							
							
							
							
								
							
							
								7b3ac66e17 
								
							 
						 
						
							
							
								
								[LLM] auto performance test fix specific settings to template ( #8876 )  
							
							 
							
							
							
						 
						
							2023-09-01 15:49:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Song Jiaming 
								
							 
						 
						
							
							
							
							
								
							
							
								c06f1ca93e 
								
							 
						 
						
							
							
								
								[LLM] auto perf test to output to csv ( #8846 )  
							
							 
							
							
							
						 
						
							2023-09-01 10:48:00 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Song Jiaming 
								
							 
						 
						
							
							
							
							
								
							
							
								b8b1b6888b 
								
							 
						 
						
							
							
								
								[LLM] Performance test ( #8796 )  
							
							 
							
							
							
						 
						
							2023-08-25 14:31:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								e9aa2bd890 
								
							 
						 
						
							
							
								
								LLM: reduce GPU 1st token latency and update example ( #8763 )  
							
							 
							
							
							
							
							* reduce 1st token latency
* update example
* fix
* fix style
* update readme of gpu benchmark 
							
						 
						
							2023-08-16 18:01:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Song Jiaming 
								
							 
						 
						
							
							
							
							
								
							
							
								c1f9af6d97 
								
							 
						 
						
							
							
								
								[LLM] chatglm example and transformers low-bit examples ( #8751 )  
							
							 
							
							
							
						 
						
							2023-08-16 11:41:44 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								8805186f2f 
								
							 
						 
						
							
							
								
								LLM: add benchmark tool for gpu ( #8760 )  
							
							 
							
							
							
							
							* add benchmark tool for gpu
* update 
							
						 
						
							2023-08-16 11:22:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Song Jiaming 
								
							 
						 
						
							
							
							
							
								
							
							
								e717e304a6 
								
							 
						 
						
							
							
								
								LLM first example test and template ( #8658 )  
							
							 
							
							
							
						 
						
							2023-08-10 10:03:11 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								64b38e1dc8 
								
							 
						 
						
							
							
								
								llm: benchmark tool for transformers int4 (separate 1st token and rest) ( #8460 )  
							
							 
							
							
							
							
							* add benchmark utils
* fix
* fix bug and add readme
* hidden latency data 
							
						 
						
							2023-07-06 09:49:52 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Junwei Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								2fd751de7a 
								
							 
						 
						
							
							
								
								LLM: add a dev tool for getting glibc/glibcxx requirement ( #8399 )  
							
							 
							
							
							
							
							* add a dev tool
* pep8 change 
							
						 
						
							2023-06-30 11:09:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Shengsheng Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								02c583144c 
								
							 
						 
						
							
							
								
								[LLM] langchain integrations and examples ( #8256 )  
							
							 
							
							
							
							
							* langchain integrations and examples
* add licences and rename
* add licences
* fix license issues and change backbone to model_family
* update examples to use model_family param
* fix linting
* fix code style
* exclude langchain integration from stylecheck
* update langchain examples and update integrations based on latest changes
* update simple llama-cpp-python style API example
* remove bloom in README
* change default n_threads to 2 and remove redundant code
---------
Co-authored-by: leonardozcm <changmin.zhao@intel.com> 
							
						 
						
							2023-06-12 19:22:07 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Pingchuan Ma (Henry) 
								
							 
						 
						
							
							
							
							
								
							
							
								773255e009 
								
							 
						 
						
							
							
								
								[LLM] Add dev wheel building and basic UT script for LLM package on Linux ( #8264 )  
							
							 
							
							
							
							
							* add wheel build for linux
* test fix
* test self-hosted runner
* test fix
* update runner
* update runner
* update fix
* init cicd
* init cicd
* test conda
* update fix
* update no need manual python deps
* test fix bugs
* test fix bugs
* test fix bugs
* fix bugs 
							
						 
						
							2023-06-08 00:49:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Pingchuan Ma (Henry) 
								
							 
						 
						
							
							
							
							
								
							
							
								2ed5842448 
								
							 
						 
						
							
							
								
								[LLM] add convert's python deps for LLM ( #8260 )  
							
							 
							
							
							
							
							* add python deps for LLM
* update release.sh
* change deps group name
* update all
* fix update
* test fix
* update 
							
						 
						
							2023-06-06 16:01:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Pingchuan Ma (Henry) 
								
							 
						 
						
							
							
							
							
								
							
							
								c48d5f7cff 
								
							 
						 
						
							
							
								
								[LLM] Enable UT workflow logics for LLM ( #8243 )  
							
							 
							
							
							
							
							* check push connection
* enable UT workflow logics for LLM
* test fix
* add licenses
* test fix according to suggestions
* test fix
* update changes 
							
						 
						
							2023-06-02 17:06:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Pingchuan Ma (Henry) 
								
							 
						 
						
							
							
							
							
								
							
							
								141febec1f 
								
							 
						 
						
							
							
								
								Add dev wheel building script for LLM package on Windows ( #8238 )  
							
							 
							
							
							
							
							* Add dev wheel building script for LLM package on Windows
* delete conda
* delete python version check
* minor adjust
* wheel name fixed
* test check
* test fix
* change wheel name 
							
						 
						
							2023-06-01 11:55:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								8421af51ae 
								
							 
						 
						
							
							
								
								LLM: support converting to ggml format ( #8235 )  
							
							 
							
							
							
							
							* add convert
* fix
* fix
* fix
* try
* test
* update check
* fix
* fix 
							
						 
						
							2023-05-31 15:20:06 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Pingchuan Ma (Henry) 
								
							 
						 
						
							
							
							
							
								
							
							
								1f913a6941 
								
							 
						 
						
							
							
								
								[LLM] Add LLM pep8 coding style checking ( #8233 )  
							
							 
							
							
							
							
							* add LLM pep8 coding checking
* resolve bugs in testing scripts and code style revision 
							
						 
						
							2023-05-30 15:58:14 +08:00