Chu,Youcheng
29400e2e75
feat: change oneccl to internal ( #12296 )
...
* feat: change oneccl
* fix: restore llama-70b
* fix: remove tab
* fix: remove extra blank
* small fix
* add comments
* fix: add a blank space
2024-10-31 09:51:43 +08:00
Qiyuan Gong
762ad49362
Add RANK_WAIT_TIME into DeepSpeed-AutoTP to avoid CPU memory OOM ( #11704 )
...
* DeepSpeed-AutoTP starts multiple processes, each of which loads the model and converts it in CPU memory. If the model or the number of ranks is large, this can lead to OOM. Add RANK_WAIT_TIME to reduce memory usage by limiting how many ranks read the model in parallel.
2024-08-01 18:16:21 +08:00
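The idea behind RANK_WAIT_TIME in the entry above can be sketched as follows. This is only an illustration, not the code from that commit: the linear delay (rank index times wait time), the unit of seconds, the use of the launcher's LOCAL_RANK variable, and the helper names `rank_start_delay` and `staggered_load` are all assumptions.

```python
import os
import time


def rank_start_delay(local_rank: int, rank_wait_time: float) -> float:
    """Return how long a rank waits before loading (hypothetical helper).

    Assumption: the delay grows linearly with the rank index, so rank 0
    starts at once and each later rank starts rank_wait_time after the
    previous one.
    """
    if local_rank < 0 or rank_wait_time < 0:
        raise ValueError("local_rank and rank_wait_time must be non-negative")
    return local_rank * rank_wait_time


def staggered_load(load_fn, environ=os.environ, sleep=time.sleep):
    """Delay this rank's model load so that ranks do not all read at once.

    Assumption: LOCAL_RANK is set by the distributed launcher and
    RANK_WAIT_TIME is a number of seconds (default 0, meaning no staggering).
    """
    local_rank = int(environ.get("LOCAL_RANK", "0"))
    rank_wait_time = float(environ.get("RANK_WAIT_TIME", "0"))
    sleep(rank_start_delay(local_rank, rank_wait_time))
    # load_fn stands in for whatever loads and converts the model in CPU memory.
    return load_fn()
```

With a wait time long enough for one rank to finish loading and converting, peak CPU memory stays closer to that of a single load, at the cost of a slower startup.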
binbin Deng
7170dd9192
Update guide for running qwen with AutoTP ( #11065 )
2024-05-20 10:53:17 +08:00
Heyang Sun
26cae0a39c
Update FLEX in Deepspeed README ( #10774 )
...
* Update FLEX in Deepspeed README
* Update README.md
2024-04-17 09:28:24 +08:00
ZehuaCao
599a88db53
Add deepspeed-autoTP-Fastapi serving ( #10748 )
...
* add deepspeed-autoTP-Fastapi serving
* add readme
* add license
* update
* update
* fix
2024-04-16 14:03:23 +08:00
Shaojun Liu
f37a1f2a81
Upgrade to python 3.11 ( #10711 )
...
* create conda env with python 3.11
* recommend to use Python 3.11
* update
2024-04-09 17:41:17 +08:00
binbin Deng
d9a1153b4e
LLM: upgrade deepspeed in AutoTP on GPU ( #10647 )
2024-04-07 14:05:19 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package ( #10557 )
...
* Change to 'pip install .. --extra-index-url' for readthedocs
* Change to 'pip install .. --extra-index-url' for examples
* Change to 'pip install .. --extra-index-url' for remaining files
* Fix URL for ipex
* Add links for ipex US and CN servers
* Update ipex cpu url
* remove readme
* Update for github actions
* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang ( #30 )
2024-03-25 10:06:02 +08:00
binbin Deng
5d7e044dbc
LLM: add low bit option in deepspeed autotp example ( #10382 )
2024-03-12 17:07:09 +08:00
binbin Deng
db8e90796a
LLM: add avg token latency information and benchmark guide of autotp ( #9940 )
2024-01-19 15:09:57 +08:00
Yuwen Hu
23fc888abe
Update llm gpu xpu default related info to PyTorch 2.1 ( #9866 )
2024-01-09 15:38:47 +08:00
binbin Deng
294fd32787
LLM: update DeepSpeed AutoTP example with GPU memory optimization ( #9823 )
2024-01-09 09:22:49 +08:00
binbin Deng
ed8ed76d4f
LLM: update deepspeed autotp usage ( #9733 )
2023-12-25 09:41:14 +08:00
Yang Wang
8838707009
Add deepspeed autotp example readme ( #9289 )
...
* Add deepspeed autotp example readme
* change word
2023-10-27 13:04:38 -07:00