Commit graph

12 commits

Author SHA1 Message Date
Guancheng Fu
61c67af386
Fix vLLM-v2 install instructions(#10822) 2024-04-22 09:02:48 +08:00
Guancheng Fu
cbe7b5753f
Add vLLM[xpu] related code (#10779)
* Add ipex-llm side change

* add runable offline_inference

* refactor to call vllm2

* Verified async server

* add new v2 example

* add README

* fix

* change dir

* refactor readme.md

* add experimental

* fix
2024-04-18 15:29:20 +08:00
Shaojun Liu
f37a1f2a81
Upgrade to python 3.11 (#10711)
* create conda env with python 3.11

* recommend to use Python 3.11

* update
2024-04-09 17:41:17 +08:00
Jiao Wang
69bdbf5806
Fix vllm print error message issue (#10664)
* update chatglm readme

* Add condition to invalidInputError

* update

* update

* style
2024-04-05 15:08:13 -07:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package (#10557)
* Change to 'pip install .. --extra-index-url' for readthedocs

* Change to 'pip install .. --extra-index-url' for examples

* Change to 'pip install .. --extra-index-url' for remaining files

* Fix URL for ipex

* Add links for ipex US and CN servers

* Update ipex cpu url

* remove readme

* Update for github actions

* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Cheen Hau, 俊豪
f239bc329b
Specify oneAPI minor version in documentation (#10561) 2024-03-27 17:58:57 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang (#30) 2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm

* rm python/llm/src/bigdl

* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Guancheng Fu
2d930bdca8 Add vLLM bf16 support (#10278)
* add argument load_in_low_bit

* add docs

* modify gpu doc

* done

---------

Co-authored-by: ivy-lv11 <lvzc@lamda.nju.edu.cn>
2024-02-29 16:33:42 +08:00
Yuwen Hu
23fc888abe Update llm gpu xpu default related info to PyTorch 2.1 (#9866) 2024-01-09 15:38:47 +08:00
Guancheng Fu
8b00653039 fix doc (#9599) 2023-12-05 13:49:31 +08:00
Guancheng Fu
963a5c8d79 Add vLLM-XPU version's README/examples (#9536)
* test

* test

* fix last kv cache

* add xpu readme

* remove numactl for xpu example

* fix link error

* update max_num_batched_tokens logic

* add explaination

* add xpu environement version requirement

* refine gpu memory

* fix

* fix style
2023-11-28 09:44:03 +08:00