Commit graph

8 commits

Author SHA1 Message Date
Wang, Jian4
9c15abf825
Refactor fastapi-serving and add one card serving(#11581)
* init fastapi-serving one card

* mv api code to source

* update worker

* update for style-check

* add worker

* update bash

* update

* update worker name and add readme

* rename update

* rename to fastapi
2024-07-17 11:12:43 +08:00
Xiangyu Tian
7f5111a998
LLM: Refine start script for Pipeline Parallel Serving (#11557)
Refine start script and readme for Pipeline Parallel Serving
2024-07-11 15:45:27 +08:00
Xiangyu Tian
7d8bc83415
LLM: Partial Prefilling for Pipeline Parallel Serving (#11457)
LLM: Partial Prefilling for Pipeline Parallel Serving
2024-07-05 13:10:35 +08:00
binbin Deng
987017ef47
Update pipeline parallel serving for more model support (#11428) 2024-06-27 18:21:01 +08:00
Xiangyu Tian
8ddae22cfb
LLM: Refactor Pipeline-Parallel-FastAPI example (#11319)
Initially Refactor for Pipeline-Parallel-FastAPI example
2024-06-25 13:30:36 +08:00
Xiangyu Tian
4359ab3172
LLM: Add /generate_stream endpoint for Pipeline-Parallel-FastAPI example (#11187)
Add /generate_stream and OpenAI-formatted endpoint for Pipeline-Parallel-FastAPI example
2024-06-14 15:15:32 +08:00
Xiangyu Tian
2299698b45
Refine Pipeline Parallel FastAPI example (#11168) 2024-05-29 17:16:50 +08:00
Xiangyu Tian
5c8ccf0ba9
LLM: Add Pipeline-Parallel-FastAPI example (#10917)
Add multi-stage Pipeline-Parallel-FastAPI example

---------

Co-authored-by: hzjane <a1015616934@qq.com>
2024-05-27 14:46:29 +08:00