Yang Wang | 9e763b049c |
Support running pipeline parallel inference by vertically partitioning model to different devices (#10392)

* support pipeline parallel inference
* fix logging
* remove benchmark file
* fic
* need to warmup twice
* support qwen and qwen2
* fix lint
* remove genxir
* refine
|
2024-03-18 13:04:45 -07:00 |