binbin Deng | e50c890e1f | Support finishing PP inference once eos_token_id is found (#11336) | 2024-06-18 09:55:40 +08:00

SONG Ge | ef4b6519fb | Add phi-3 model support for pipeline parallel inference (#11334) | 2024-06-17 17:44:24 +08:00
* add phi-3 model support
* add phi3 example

binbin Deng | 6ea1e71af0 | Update PP inference benchmark script (#11323) | 2024-06-17 09:59:36 +08:00

SONG Ge | be00380f1a | Fix pipeline parallel inference past_key_value error in Baichuan (#11318) | 2024-06-17 09:29:32 +08:00
* fix past_key_value error
* add baichuan2 example
* fix style
* update doc
* add script link in doc
* fix import error
* update

binbin Deng | 220151e2a1 | Refactor pipeline parallel multi-stage implementation (#11286) | 2024-06-13 10:00:23 +08:00