Commit graph

2 commits

Author SHA1 Message Date
Wang, Jian4
16b2ef49c6
Update_document by heyang (#30) 2024-03-25 10:06:02 +08:00
Yang Wang
9e763b049c Support running pipeline parallel inference by vertically partitioning model to different devices (#10392)
* support pipeline parallel inference

* fix logging

* remove benchmark file

* fic

* need to warmup twice

* support qwen and qwen2

* fix lint

* remove genxir

* refine
2024-03-18 13:04:45 -07:00