binbin Deng
e50c890e1f
Support finishing PP inference once eos_token_id is found ( #11336 )
2024-06-18 09:55:40 +08:00
SONG Ge
ef4b6519fb
Add phi-3 model support for pipeline parallel inference ( #11334 )
...
* add phi-3 model support
* add phi3 example
2024-06-17 17:44:24 +08:00
SONG Ge
be00380f1a
Fix pipeline parallel inference past_key_value error in Baichuan ( #11318 )
...
* fix past_key_value error
* add baichuan2 example
* fix style
* update doc
* add script link in doc
* fix import error
* update
2024-06-17 09:29:32 +08:00
binbin Deng
60cb1dac7c
Support PP for qwen1.5 ( #11300 )
2024-06-13 17:35:24 +08:00
binbin Deng
220151e2a1
Refactor pipeline parallel multi-stage implementation ( #11286 )
2024-06-13 10:00:23 +08:00
Yuwen Hu
af96579c76
Update installation guide for pipeline parallel inference ( #11224 )
...
* Update installation guide for pipeline parallel inference
* Small fix
* further fix
* Small fix
* Small fix
* Update based on comments
* Small fix
* Small fix
* Small fix
2024-06-05 17:54:29 +08:00
binbin Deng
fabf54e052
LLM: make pipeline parallel inference example more common ( #10786 )
2024-04-24 09:28:52 +08:00
Shaojun Liu
f37a1f2a81
Upgrade to python 3.11 ( #10711 )
...
* create conda env with python 3.11
* recommend to use Python 3.11
* update
2024-04-09 17:41:17 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package ( #10557 )
...
* Change to 'pip install .. --extra-index-url' for readthedocs
* Change to 'pip install .. --extra-index-url' for examples
* Change to 'pip install .. --extra-index-url' for remaining files
* Fix URL for ipex
* Add links for ipex US and CN servers
* Update ipex cpu url
* remove readme
* Update for github actions
* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang ( #30 )
2024-03-25 10:06:02 +08:00
Yang Wang
9e763b049c
Support running pipeline parallel inference by vertically partitioning model to different devices ( #10392 )
...
* support pipeline parallel inference
* fix logging
* remove benchmark file
* fic
* need to warmup twice
* support qwen and qwen2
* fix lint
* remove genxir
* refine
2024-03-18 13:04:45 -07:00