Wang, Jian4
|
9c15abf825
|
Refactor fastapi-serving and add one card serving (#11581)
* init fastapi-serving one card
* mv api code to source
* update worker
* update for style-check
* add worker
* update bash
* update
* update worker name and add readme
* rename update
* rename to fastapi
|
2024-07-17 11:12:43 +08:00 |
|
Xiangyu Tian
|
fd933c92d8
|
Fix: Correct num_requests in benchmark for Pipeline Parallel Serving (#11462)
|
2024-06-28 16:10:51 +08:00 |
|
Xiangyu Tian
|
8ddae22cfb
|
LLM: Refactor Pipeline-Parallel-FastAPI example (#11319)
Initial refactor of the Pipeline-Parallel-FastAPI example
|
2024-06-25 13:30:36 +08:00 |
|
Xiangyu Tian
|
4359ab3172
|
LLM: Add /generate_stream endpoint for Pipeline-Parallel-FastAPI example (#11187)
Add /generate_stream and OpenAI-formatted endpoint for Pipeline-Parallel-FastAPI example
|
2024-06-14 15:15:32 +08:00 |
|