ipex-llm

Author	SHA1	Message	Date
binbin Deng	60cb1dac7c	Support PP for qwen1.5 (#11300)	2024-06-13 17:35:24 +08:00
binbin Deng	220151e2a1	Refactor pipeline parallel multi-stage implementation (#11286)	2024-06-13 10:00:23 +08:00
Yuwen Hu	af96579c76	Update installation guide for pipeline parallel inference (#11224) * Update installation guide for pipeline parallel inference * Small fix * further fix * Small fix * Small fix * Update based on comments * Small fix * Small fix * Small fix	2024-06-05 17:54:29 +08:00
binbin Deng	fabf54e052	LLM: make pipeline parallel inference example more common (#10786)	2024-04-24 09:28:52 +08:00
Shaojun Liu	f37a1f2a81	Upgrade to python 3.11 (#10711) * create conda env with python 3.11 * recommend to use Python 3.11 * update	2024-04-09 17:41:17 +08:00
Cheen Hau, 俊豪	1c5eb14128	Update pip install to use --extra-index-url for ipex package (#10557) * Change to 'pip install .. --extra-index-url' for readthedocs * Change to 'pip install .. --extra-index-url' for examples * Change to 'pip install .. --extra-index-url' for remaining files * Fix URL for ipex * Add links for ipex US and CN servers * Update ipex cpu url * remove readme * Update for github actions * Update for dockerfiles	2024-03-28 09:56:23 +08:00
Wang, Jian4	16b2ef49c6	Update_document by heyang (#30)	2024-03-25 10:06:02 +08:00
Yang Wang	9e763b049c	Support running pipeline parallel inference by vertically partitioning model to different devices (#10392) * support pipeline parallel inference * fix logging * remove benchmark file * fic * need to warmup twice * support qwen and qwen2 * fix lint * remove genxir * refine	2024-03-18 13:04:45 -07:00

8 commits