Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								0e8f4020e5 
								
							 
						 
						
							
							
								
								Add traceback error output for win igpu test api in benchmark ( #9607 )  
							
							 
							
							
							
						 
						
							2023-12-06 14:35:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								c998f5f2ba 
								
							 
						 
						
							
							
								
								[LLM] iGPU long context tests ( #9598 )  
							
							 
							
							... 
							
							
							
							* Temp enable PR
* Enable tests for 256-64
* Try again 128-64
* Empty cache after each iteration for igpu benchmark scripts
* Try tests for 512
* change order for 512
* Skip chatglm3 and llama2 for now
* Separate tests for 512-64
* Small fix
* Further fixes
* Change back to nightly again 
							
						 
						
							2023-12-06 10:19:20 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								8c8a27ded7 
								
							 
						 
						
							
							
								
								Add harness summary job ( #9457 )  
							
							 
							
							... 
							
							
							
							* format yml
* add make_table_results
* add summary job
* add a job to print single result
* upload full directory 
							
						 
						
							2023-12-05 10:04:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								3f4ad97929 
								
							 
						 
						
							
							
								
								[LLM] Add performance tests for windows iGPU ( #9584 )  
							
							 
							
							... 
							
							
							
							* Add support for win gpu benchmark with peak gpu memory monitoring
* Add win igpu tests
* Small fix
* Forward outputs
* Small fix
* Test and small fixes
* Small fix
* Small fix and test
* Small fixes
* Add tests for 512-64 and change back to nightly tests
* Small fix 
							
						 
						
							2023-12-04 20:50:02 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								cb228c70ea 
								
							 
						 
						
							
							
								
								Add harness nightly ( #9552 )  
							
							 
							
							... 
							
							
							
							* modify output_path as a directory
* schedule nightly at 21 on Friday
* add tasks and models for nightly
* add accuracy regression
* comment out if to test
* mixed fp4
* for test
* add  missing delimiter
* remove comma
* fixed golden results
* add mixed 4 golden result
* add more options
* add mistral results
* get golden result of stable lm
* move nightly scripts and results to test folder
* add license
* add fp8 stable lm golden
* run on all available devices
* trigger only when ready for review
* fix new line
* update golden
* add mistral 
							
						 
						
							2023-12-01 14:16:35 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								4d7d5d4c59 
								
							 
						 
						
							
							
								
								Add 3 leaderboard tasks ( #9566 )  
							
							 
							
							... 
							
							
							
							* update leaderboard map
* download model and dataset without overwritten
* fix task drop
* run on all available devices 
							
						 
						
							2023-12-01 14:01:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								c8e0c2ed48 
								
							 
						 
						
							
							
								
								Fixed dumped logs in harness ( #9549 )  
							
							 
							
							... 
							
							
							
							* install transformers==4.34.0
* modify output_path as a directory
* add device and task to output dir parents 
							
						 
						
							2023-11-30 12:47:56 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								45820cf3b9 
								
							 
						 
						
							
							
								
								add optimize model option ( #9530 )  
							
							 
							
							
							
						 
						
							2023-11-24 17:10:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Guancheng Fu 
								
							 
						 
						
							
							
							
							
								
							
							
								bf579507c2 
								
							 
						 
						
							
							
								
								Integrate vllm ( #9310 )  
							
							 
							
							... 
							
							
							
							* done
* Rename structure
* add models
* Add structure/sampling_params,sequence
* add input_metadata
* add outputs
* Add policy,logger
* add and update
* add parallelconfig back
* core/scheduler.py
* Add llm_engine.py
* Add async_llm_engine.py
* Add tested entrypoint
* fix minor error
* Fix everything
* fix kv cache view
* fix
* fix
* fix
* format&refine
* remove logger from repo
* try to add token latency
* remove logger
* Refine config.py
* finish worker.py
* delete utils.py
* add license
* refine
* refine sequence.py
* remove sampling_params.py
* finish
* add license
* format
* add license
* refine
* refine
* Refine line too long
* remove exception
* so dumb style-check
* refine
* refine
* refine
* refine
* refine
* refine
* add README
* refine README
* add warning instead error
* fix padding
* add license
* format
* format
* format fix
* Refine vllm dependency (#1 )
vllm dependency clear
* fix licence
* fix format
* fix format
* fix
* adapt LLM engine
* fix
* add license
* fix format
* fix
* Moving README.md to the correct position
* Fix readme.md
* done
* guide for adding models
* fix
* Fix README.md
* Add new model readme
* remove ray-logic
* refactor arg_utils.py
* remove distributed_init_method logic
* refactor entrypoints
* refactor input_metadata
* refactor model_loader
* refactor utils.py
* refactor models
* fix api server
* remove vllm.stucture
* revert by txy 1120
* remove utils
* format
* fix license
* add bigdl model
* Refer to a specfic commit
* Change code base
* add comments
* add async_llm_engine comment
* refine
* formatted
* add worker comments
* add comments
* add comments
* fix style
* add changes
---------
Co-authored-by: xiangyuT <xiangyu.tian@intel.com>
Co-authored-by: Xiangyu Tian <109123695+xiangyuT@users.noreply.github.com>
Co-authored-by: leonardozcm <leonardo1997zcm@gmail.com> 
							
						 
						
							2023-11-23 16:46:45 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								139e98aa18 
								
							 
						 
						
							
							
								
								LLM: quick fix benchmark ( #9509 )  
							
							 
							
							
							
						 
						
							2023-11-22 10:19:57 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								c2aeb4d1e8 
								
							 
						 
						
							
							
								
								del model after test ( #9504 )  
							
							 
							
							
							
						 
						
							2023-11-21 18:41:50 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cheen Hau, 俊豪 
								
							 
						 
						
							
							
							
							
								
							
							
								3e39828420 
								
							 
						 
						
							
							
								
								Update all in one benchmark readme ( #9496 )  
							
							 
							
							... 
							
							
							
							* Add gperftools install to all in one benchmark readme
* Update readme 
							
						 
						
							2023-11-21 14:57:16 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								c487b53f21 
								
							 
						 
						
							
							
								
								LLM: only run arc perf test nightly ( #9448 )  
							
							 
							
							... 
							
							
							
							* LLM: only run arc perf test nightly
* deleted unused python scripts
* rebase main 
							
						 
						
							2023-11-15 19:38:14 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								dbbdb53a18 
								
							 
						 
						
							
							
								
								fix multiple gpu usage ( #9459 )  
							
							 
							
							
							
						 
						
							2023-11-14 17:06:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								d19ca21957 
								
							 
						 
						
							
							
								
								patch bigdl-llm model to harness by binding instead of patch file ( #9420 )  
							
							 
							
							... 
							
							
							
							* add run_llb.py
* fix args interpret
* modify outputs
* update workflow
* add license
* test mixed 4 bit
* update readme
* use autotokenizer
* add timeout
* refactor workflow file
* fix working directory
* fix env
* throw exception if some jobs failed
* improve terminal outputs
* Disable var which cause the run stuck
* fix unknown precision
* fix key error
* directly output config instead
* rm harness submodule 
							
						 
						
							2023-11-14 12:51:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								0ecb9efb05 
								
							 
						 
						
							
							
								
								use AutoTokenizer to enable more models ( #9446 )  
							
							 
							
							
							
						 
						
							2023-11-13 17:47:43 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								ece5805572 
								
							 
						 
						
							
							
								
								LLM: add chatglm3-6b to latency benchmark test. ( #9442 )  
							
							 
							
							
							
						 
						
							2023-11-13 17:24:37 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								5747e2fe69 
								
							 
						 
						
							
							
								
								fix multiple gpu usage of harness ( #9444 )  
							
							 
							
							
							
						 
						
							2023-11-13 16:53:23 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								b23b91407c 
								
							 
						 
						
							
							
								
								fix llm-init on deepspeed missing lib ( #9419 )  
							
							 
							
							
							
						 
						
							2023-11-10 13:51:24 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								298b64217e 
								
							 
						 
						
							
							
								
								add auto triggered acc test ( #9364 )  
							
							 
							
							... 
							
							
							
							* add auto triggered acc test
* use llama 7b instead
* fix env
* debug download
* fix download prefix
* add cut dirs
* fix env of model path
* fix dataset download
* full job
* source xpu env vars
* use matrix to trigger model run
* reset batch=1
* remove redirect
* remove some trigger
* add task matrix
* add precision list
* test llama-7b-chat
* use /mnt/disk1 to store model and datasets
* remove installation test
* correct downloading path
* fix HF vars
* add bigdl-llm env vars
* rename file
* fix hf_home
* fix script path
* rename as harness evalution
* rerun 
							
						 
						
							2023-11-08 10:22:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								84ab614aab 
								
							 
						 
						
							
							
								
								LLM: add more models and skip runtime error ( #9349 )  
							
							 
							
							... 
							
							
							
							* add more models and skip runtime error
* upgrade transformers
* temporarily removed Mistral-7B-v0.1
* temporarily disable the upload of arc perf result 
							
						 
						
							2023-11-08 09:45:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Heyang Sun 
								
							 
						 
						
							
							
							
							
								
							
							
								af94058203 
								
							 
						 
						
							
							
								
								[LLM] Support CPU deepspeed distributed inference ( #9259 )  
							
							 
							
							... 
							
							
							
							* [LLM] Support CPU Deepspeed distributed inference
* Update run_deepspeed.py
* Rename
* fix style
* add new codes
* refine
* remove annotated codes
* refine
* Update README.md
* refine doc and example code 
							
						 
						
							2023-11-06 17:56:42 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Chen, Zhentao 
								
							 
						 
						
							
							
							
							
								
							
							
								d4dffbdb62 
								
							 
						 
						
							
							
								
								Merge harness ( #9319 )  
							
							 
							
							... 
							
							
							
							* add harness patch and llb script
* add readme
* add license
* use patch instead
* update readme
* rename tests to evaluation
* fix typo
* remove nano dependency
* add original harness link
* rename title of usage
* rename BigDLGPULM as BigDLLM
* empty commit to rerun job 
							
						 
						
							2023-11-02 15:14:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								7e73c354a6 
								
							 
						 
						
							
							
								
								LLM: decoupling bigdl-llm and bigdl-nano ( #9306 )  
							
							 
							
							
							
						 
						
							2023-11-01 11:00:54 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								770ac70b00 
								
							 
						 
						
							
							
								
								LLM: add low_bit option in benchmark scripts ( #9257 )  
							
							 
							
							
							
						 
						
							2023-10-25 10:27:48 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								ec9195da42 
								
							 
						 
						
							
							
								
								LLM: using html to visualize the perf result for Arc ( #9228 )  
							
							 
							
							... 
							
							
							
							* LLM: using html to visualize the perf result for Arc
* deploy the html file
* add python license
* reslove some comments 
							
						 
						
							2023-10-24 18:05:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								b15656229e 
								
							 
						 
						
							
							
								
								LLM: fix benchmark issue ( #9255 )  
							
							 
							
							
							
						 
						
							2023-10-24 14:15:05 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									WeiguangHan 
								
							 
						 
						
							
							
							
							
								
							
							
								b9194c5786 
								
							 
						 
						
							
							
								
								LLM: skip some model tests using certain api ( #9163 )  
							
							 
							
							... 
							
							
							
							* LLM: Skip some model tests using certain api
* initialize variable named result 
							
						 
						
							2023-10-18 09:39:27 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								4f34557224 
								
							 
						 
						
							
							
								
								LLM: support num_beams in all-in-one benchmark ( #9141 )  
							
							 
							
							... 
							
							
							
							* support num_beams
* fix 
							
						 
						
							2023-10-12 13:35:12 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								62ac7ae444 
								
							 
						 
						
							
							
								
								LLM: fix inaccurate input / output tokens of current all-in-one benchmark ( #9137 )  
							
							 
							
							... 
							
							
							
							* first fix
* fix all apis
* fix 
							
						 
						
							2023-10-11 17:13:34 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								1c8d5da362 
								
							 
						 
						
							
							
								
								LLM: fix llama tokenizer for all-in-one benchmark ( #9129 )  
							
							 
							
							... 
							
							
							
							* fix tokenizer for gpu benchmark
* fix ipex fp16
* meet code review
* fix 
							
						 
						
							2023-10-11 13:39:39 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								1363e666fc 
								
							 
						 
						
							
							
								
								LLM: update benchmark_util.py for beam search ( #9126 )  
							
							 
							
							... 
							
							
							
							* update reorder_cache
* fix 
							
						 
						
							2023-10-11 09:41:53 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								0e09dd926b 
								
							 
						 
						
							
							
								
								[LLM] Fix example test ( #9118 )  
							
							 
							
							... 
							
							
							
							* Update llm example test link due to example layout change
* Add better change detect 
							
						 
						
							2023-10-10 13:24:18 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								ad7d9231f5 
								
							 
						 
						
							
							
								
								LLM: add benchmark script for Max gpu and ipex fp16 gpu ( #9112 )  
							
							 
							
							... 
							
							
							
							* add pvc bash
* meet code review
* rename to run-max-gpu.sh 
							
						 
						
							2023-10-10 10:18:41 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Yuwen Hu 
								
							 
						 
						
							
							
							
							
								
							
							
								65212451cc 
								
							 
						 
						
							
							
								
								[LLM] Small update to performance tests ( #9106 )  
							
							 
							
							... 
							
							
							
							* small updates to llm performance tests regarding model handling
* Small fix 
							
						 
						
							2023-10-09 16:55:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								78ea7ddb1c 
								
							 
						 
						
							
							
								
								Combine apply_rotary_pos_emb for gpt-neox ( #9074 )  
							
							 
							
							
							
						 
						
							2023-10-07 16:27:46 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								ad62c58b33 
								
							 
						 
						
							
							
								
								LLM: Enable jemalloc in benchmark scripts. ( #9058 )  
							
							 
							
							... 
							
							
							
							* enable jemalloc.
* fix readme. 
							
						 
						
							2023-09-26 15:37:49 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								26213a5829 
								
							 
						 
						
							
							
								
								LLM: Change benchmark bf16 load format. ( #9035 )  
							
							 
							
							... 
							
							
							
							* LLM: Change benchmark bf16 load format.
* comment on bf16 chatglm.
* fix. 
							
						 
						
							2023-09-22 17:38:38 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Kai Huang 
								
							 
						 
						
							
							
							
							
								
							
							
								6981745fe4 
								
							 
						 
						
							
							
								
								Optimize kv_cache for gpt-neox model family ( #9015 )  
							
							 
							
							... 
							
							
							
							* override gptneox
* style
* move to utils
* revert 
							
						 
						
							2023-09-20 19:59:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								37bb0cbf8f 
								
							 
						 
						
							
							
								
								Speed up gpt-j in gpubenchmark ( #9000 )  
							
							 
							
							... 
							
							
							
							* Speedup gpt-j in gpubenchmark
* meet code review 
							
						 
						
							2023-09-19 14:22:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								8299b68fea 
								
							 
						 
						
							
							
								
								update readme. ( #8996 )  
							
							 
							
							
							
						 
						
							2023-09-18 17:06:15 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								74338fd291 
								
							 
						 
						
							
							
								
								LLM: add auto torch dtype in benchmark. ( #8981 )  
							
							 
							
							
							
						 
						
							2023-09-18 15:48:25 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Ruonan Wang 
								
							 
						 
						
							
							
							
							
								
							
							
								32716106e0 
								
							 
						 
						
							
							
								
								update use_cahce=True ( #8986 )  
							
							 
							
							
							
						 
						
							2023-09-18 07:59:33 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								64ee1d7689 
								
							 
						 
						
							
							
								
								update run_transformer_int4_gpu ( #8983 )  
							
							 
							
							... 
							
							
							
							* xpuperf
* update run.py
* clean upo
* uodate
* update
* meet code review 
							
						 
						
							2023-09-15 15:10:04 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								cca84b0a64 
								
							 
						 
						
							
							
								
								LLM: update llm benchmark scripts. ( #8943 )  
							
							 
							
							... 
							
							
							
							* update llm benchmark scripts.
* change tranformer_bf16 to pytorch_autocast_bf16.
* add autocast in transformer int4.
* revert autocast.
* add "pytorch_autocast_bf16" to doc
* fix comments. 
							
						 
						
							2023-09-13 12:23:28 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								ea0853c0b5 
								
							 
						 
						
							
							
								
								update benchmark_utils readme ( #8925 )  
							
							 
							
							... 
							
							
							
							* update readme
* meet code review 
							
						 
						
							2023-09-08 10:30:26 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Cengguang Zhang 
								
							 
						 
						
							
							
							
							
								
							
							
								3d2efe9608 
								
							 
						 
						
							
							
								
								LLM: update llm latency benchmark. ( #8922 )  
							
							 
							
							
							
						 
						
							2023-09-07 19:00:19 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									binbin Deng 
								
							 
						 
						
							
							
							
							
								
							
							
								7897eb4b51 
								
							 
						 
						
							
							
								
								LLM: add benchmark scripts on GPU ( #8916 )  
							
							 
							
							
							
						 
						
							2023-09-07 18:08:17 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								d8a01d7c4f 
								
							 
						 
						
							
							
								
								fix chatglm in run.pu ( #8919 )  
							
							 
							
							
							
						 
						
							2023-09-07 16:44:10 +08:00  
						
						
							 
							
							
								 
							 
							
						 
					 
				
					
						
							
								
								
									 
									Xin Qiu 
								
							 
						 
						
							
							
							
							
								
							
							
								e9de9d9950 
								
							 
						 
						
							
							
								
								benchmark for native int4  ( #8918 )  
							
							 
							
							... 
							
							
							
							* native4
* update
* update
* update 
							
						 
						
							2023-09-07 15:56:15 +08:00