Shaojun Liu
|
107f7aafd0
|
enable inference mode for deepspeed tp serving (#11742)
|
2024-08-08 14:38:30 +08:00 |
|
Zijie Li
|
8fb36b9f4a
|
add new benchmark_util.py (#11713)
* add new benchmark_util.py
|
2024-08-05 16:18:48 +08:00 |
|
binbin Deng
|
f97cce2642
|
Fix import error of ds autotp (#11307)
|
2024-06-13 16:22:52 +08:00 |
|
Shaojun Liu
|
85df5e7699
|
fix nightly perf test (#11251)
|
2024-06-07 09:33:14 +08:00 |
|
ZehuaCao
|
751e1a4e29
|
Fix concurrent issue in autoTP streaming. (#11150)
* add benchmark test
* update
|
2024-05-29 08:22:38 +08:00 |
|
ZehuaCao
|
63e95698eb
|
[LLM]Reopen autotp generate_stream (#11120)
* reopen autotp generate_stream
* fix style error
* update
|
2024-05-24 17:16:14 +08:00 |
|
Wang, Jian4
|
d9f71f1f53
|
Update benchmark util for example using (#11027)
* mv benchmark_util.py to utils/
* remove
* update
|
2024-05-15 14:16:35 +08:00 |
|
Xiangyu Tian
|
02870dc385
|
LLM: Refine README of AutoTP-FastAPI example (#10960)
|
2024-05-08 16:55:23 +08:00 |
|
Xiangyu Tian
|
13a44cdacb
|
LLM: Refine Deepspeed-AutoTP-FastAPI example (#10916)
|
2024-05-07 09:37:31 +08:00 |
|
Xiangyu Tian
|
3d4950b0f0
|
LLM: Enable batch generate (world_size>1) in Deepspeed-AutoTP-FastAPI example (#10876)
Enable batch generate (world_size>1) in Deepspeed-AutoTP-FastAPI example.
|
2024-04-26 13:24:28 +08:00 |
|
ZehuaCao
|
599a88db53
|
Add deepspeed-autoTP-Fastapi serving (#10748)
* add deepspeed-autoTP-Fastapi serving
* add readme
* add license
* update
* update
* fix
|
2024-04-16 14:03:23 +08:00 |
|