Heyang Sun | af94058203 | [LLM] Support CPU deepspeed distributed inference (#9259) | 2023-11-06 17:56:42 +08:00
* [LLM] Support CPU Deepspeed distributed inference
* Update run_deepspeed.py
* Rename
* fix style
* add new codes
* refine
* remove annotated codes
* refine
* Update README.md
* refine doc and example code

binbin Deng | 770ac70b00 | LLM: add low_bit option in benchmark scripts (#9257) | 2023-10-25 10:27:48 +08:00

WeiguangHan | ec9195da42 | LLM: using html to visualize the perf result for Arc (#9228) | 2023-10-24 18:05:25 +08:00
* LLM: using html to visualize the perf result for Arc
* deploy the html file
* add python license
* reslove some comments

Ruonan Wang | b15656229e | LLM: fix benchmark issue (#9255) | 2023-10-24 14:15:05 +08:00

WeiguangHan | b9194c5786 | LLM: skip some model tests using certain api (#9163) | 2023-10-18 09:39:27 +08:00
* LLM: Skip some model tests using certain api
* initialize variable named result

Ruonan Wang | 4f34557224 | LLM: support num_beams in all-in-one benchmark (#9141) | 2023-10-12 13:35:12 +08:00
* support num_beams
* fix

Ruonan Wang | 62ac7ae444 | LLM: fix inaccurate input / output tokens of current all-in-one benchmark (#9137) | 2023-10-11 17:13:34 +08:00
* first fix
* fix all apis
* fix

Ruonan Wang | 1c8d5da362 | LLM: fix llama tokenizer for all-in-one benchmark (#9129) | 2023-10-11 13:39:39 +08:00
* fix tokenizer for gpu benchmark
* fix ipex fp16
* meet code review
* fix

Ruonan Wang | ad7d9231f5 | LLM: add benchmark script for Max gpu and ipex fp16 gpu (#9112) | 2023-10-10 10:18:41 +08:00
* add pvc bash
* meet code review
* rename to run-max-gpu.sh

Cengguang Zhang | 26213a5829 | LLM: Change benchmark bf16 load format. (#9035) | 2023-09-22 17:38:38 +08:00
* LLM: Change benchmark bf16 load format.
* comment on bf16 chatglm.
* fix.

Xin Qiu | 37bb0cbf8f | Speed up gpt-j in gpubenchmark (#9000) | 2023-09-19 14:22:28 +08:00
* Speedup gpt-j in gpubenchmark
* meet code review

Cengguang Zhang | 74338fd291 | LLM: add auto torch dtype in benchmark. (#8981) | 2023-09-18 15:48:25 +08:00

Ruonan Wang | 32716106e0 | update use_cahce=True (#8986) | 2023-09-18 07:59:33 +08:00

Xin Qiu | 64ee1d7689 | update run_transformer_int4_gpu (#8983) | 2023-09-15 15:10:04 +08:00
* xpuperf
* update run.py
* clean upo
* uodate
* update
* meet code review

Cengguang Zhang | cca84b0a64 | LLM: update llm benchmark scripts. (#8943) | 2023-09-13 12:23:28 +08:00
* update llm benchmark scripts.
* change tranformer_bf16 to pytorch_autocast_bf16.
* add autocast in transformer int4.
* revert autocast.
* add "pytorch_autocast_bf16" to doc
* fix comments.

Cengguang Zhang | 3d2efe9608 | LLM: update llm latency benchmark. (#8922) | 2023-09-07 19:00:19 +08:00

binbin Deng | 7897eb4b51 | LLM: add benchmark scripts on GPU (#8916) | 2023-09-07 18:08:17 +08:00

Xin Qiu | d8a01d7c4f | fix chatglm in run.pu (#8919) | 2023-09-07 16:44:10 +08:00

Xin Qiu | e9de9d9950 | benchmark for native int4 (#8918) | 2023-09-07 15:56:15 +08:00
* native4
* update
* update
* update

Xin Qiu | 5d9942a3ca | transformer int4 and native int4's benchmark script for 32 256 1k 2k input (#8871) | 2023-09-07 09:49:55 +08:00
* transformer
* move
* update
* add header
* update all-in-one
* clean up

Song Jiaming | c06f1ca93e | [LLM] auto perf test to output to csv (#8846) | 2023-09-01 10:48:00 +08:00