Wang, Jian4 | 16b2ef49c6 | Update_document by heyang (#30) | 2024-03-25 10:06:02 +08:00
binbin Deng | 5d7e044dbc | LLM: add low bit option in deepspeed autotp example (#10382) | 2024-03-12 17:07:09 +08:00
binbin Deng | db8e90796a | LLM: add avg token latency information and benchmark guide of autotp (#9940) | 2024-01-19 15:09:57 +08:00
Yuwen Hu | 23fc888abe | Update llm gpu xpu default related info to PyTorch 2.1 (#9866) | 2024-01-09 15:38:47 +08:00
binbin Deng | 294fd32787 | LLM: update DeepSpeed AutoTP example with GPU memory optimization (#9823) | 2024-01-09 09:22:49 +08:00
binbin Deng | ed8ed76d4f | LLM: update deepspeed autotp usage (#9733) | 2023-12-25 09:41:14 +08:00
Yang Wang | 8838707009 | Add deepspeed autotp example readme (#9289); change word | 2023-10-27 13:04:38 -07:00