Guancheng Fu
61c67af386
Fix vLLM-v2 install instructions ( #10822 )
2024-04-22 09:02:48 +08:00
Guancheng Fu
cbe7b5753f
Add vLLM[xpu] related code ( #10779 )
* Add ipex-llm side change
* add runnable offline_inference
* refactor to call vllm2
* Verified async server
* add new v2 example
* add README
* fix
* change dir
* refactor readme.md
* add experimental
* fix
2024-04-18 15:29:20 +08:00
Shaojun Liu
f37a1f2a81
Upgrade to python 3.11 ( #10711 )
* create conda env with python 3.11
* recommend to use Python 3.11
* update
2024-04-09 17:41:17 +08:00
Jiao Wang
69bdbf5806
Fix vllm print error message issue ( #10664 )
* update chatglm readme
* Add condition to invalidInputError
* update
* update
* style
2024-04-05 15:08:13 -07:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package ( #10557 )
* Change to 'pip install .. --extra-index-url' for readthedocs
* Change to 'pip install .. --extra-index-url' for examples
* Change to 'pip install .. --extra-index-url' for remaining files
* Fix URL for ipex
* Add links for ipex US and CN servers
* Update ipex cpu url
* remove readme
* Update for github actions
* Update for dockerfiles
2024-03-28 09:56:23 +08:00
Cheen Hau, 俊豪
f239bc329b
Specify oneAPI minor version in documentation ( #10561 )
2024-03-27 17:58:57 +08:00
Wang, Jian4
16b2ef49c6
Update_document by heyang ( #30 )
2024-03-25 10:06:02 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm ( #24 )
* Rename bigdl/llm to ipex_llm
* rm python/llm/src/bigdl
* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Guancheng Fu
2d930bdca8
Add vLLM bf16 support ( #10278 )
* add argument load_in_low_bit
* add docs
* modify gpu doc
* done
---------
Co-authored-by: ivy-lv11 <lvzc@lamda.nju.edu.cn>
2024-02-29 16:33:42 +08:00
Yuwen Hu
23fc888abe
Update llm gpu xpu default related info to PyTorch 2.1 ( #9866 )
2024-01-09 15:38:47 +08:00
Guancheng Fu
8b00653039
fix doc ( #9599 )
2023-12-05 13:49:31 +08:00
Guancheng Fu
963a5c8d79
Add vLLM-XPU version's README/examples ( #9536 )
* test
* test
* fix last kv cache
* add xpu readme
* remove numactl for xpu example
* fix link error
* update max_num_batched_tokens logic
* add explanation
* add xpu environment version requirement
* refine gpu memory
* fix
* fix style
2023-11-28 09:44:03 +08:00