Shaojun Liu
a10f5a1b8d
add python style check ( #10620 )
...
* add python style check
* fix style checks
* update runner
* add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow
* update tag to 2.1.0-SNAPSHOT
2024-04-02 16:17:56 +08:00
Cengguang Zhang
58b57177e3
LLM: support bigdl quantize kv cache env and add warning. ( #10623 )
...
* LLM: support bigdl quantize kv cache env and add warning.
* fix style.
* fix comments.
2024-04-02 15:41:08 +08:00
Shaojun Liu
20a5e72da0
refine and verify ipex-llm-serving-xpu docker document ( #10615 )
...
* refine serving on cpu/xpu
* minor fix
* replace localhost with 0.0.0.0 so that service can be accessed through ip address
2024-04-02 11:45:45 +08:00
Yuwen Hu
89d780f2e9
Small fix to install guide ( #10618 )
2024-04-02 11:10:55 +08:00
Kai Huang
0a95c556a1
Fix starcoder first token perf ( #10612 )
...
* add bias check
* update
2024-04-02 09:21:38 +08:00
Cengguang Zhang
e567956121
LLM: add memory optimization for llama. ( #10592 )
...
* add initial memory optimization.
* fix logic.
* fix logic.
* remove env var check in mlp split.
2024-04-02 09:07:50 +08:00
Keyan (Kyrie) Zhang
01f491757a
Modify the link in Langchain-upstream ut ( #10608 )
...
* Modify the link in Langchain-upstream ut
* fix langchain-upstream ut
2024-04-01 17:03:40 +08:00
Ruonan Wang
bfc1caa5e5
LLM: support iq1s for llama2-70b-hf ( #10596 )
2024-04-01 13:13:13 +08:00
Ruonan Wang
d6af4877dd
LLM: remove ipex.optimize for gpt-j ( #10606 )
...
* remove ipex.optimize
* fix
* fix
2024-04-01 12:21:49 +08:00
Shaojun Liu
59058bb206
replace 2.5.0-SNAPSHOT with 2.1.0-SNAPSHOT for llm docker images ( #10603 )
2024-04-01 09:58:51 +08:00
Yishuo Wang
437a349dd6
fix rwkv with pip installer ( #10591 )
2024-03-29 17:56:45 +08:00
WeiguangHan
9a83f21b86
LLM: check user env ( #10580 )
...
* LLM: check user env
* small fix
* small fix
* small fix
2024-03-29 17:19:34 +08:00
Shaojun Liu
c4b533f0e1
nightly build docker images ( #10585 )
...
* nightly build docker images
2024-03-29 16:12:28 +08:00
Shaojun Liu
b06de94a50
verify xpu-inference image and refine document ( #10593 )
2024-03-29 16:11:12 +08:00
Yuxuan Xia
856f1ace2b
Add linux 6.5 kernel installation ( #10573 )
...
* Add linux 6.5 kernel installation
* Fix linux quick start typo
2024-03-29 16:02:19 +08:00
Keyan (Kyrie) Zhang
848fa04dd6
Fix typo in Baichuan2 example ( #10589 )
2024-03-29 13:31:47 +08:00
Shaojun Liu
52f1b541cf
refine and verify ipex-inference-cpu docker document ( #10565 )
...
* restructure the index
* refine and verify cpu-inference document
* update
2024-03-29 10:16:10 +08:00
Ruonan Wang
0136fad1d4
LLM: support iq1_s ( #10564 )
...
* init version
* update utils
* remove unused code
2024-03-29 09:43:55 +08:00
Qiyuan Gong
f4537798c1
Enable kv cache quantization by default for flex when 1 < batch <= 8 ( #10584 )
...
* Enable kv cache quantization by default for flex when 1 < batch <= 8.
* Change upper bound from <8 to <=8.
2024-03-29 09:43:42 +08:00
Cengguang Zhang
b44f7adbad
LLM: Disable esimd sdp for PVC GPU when batch size>1 ( #10579 )
...
* llm: disable esimd sdp for pvc bz>1.
* fix logic.
* fix: avoid calling get device name twice.
2024-03-28 22:55:48 +08:00
Yuwen Hu
e6c5a6a5e6
Small style fix in Install Guide ( #10581 )
...
* Remove strange bold style
* Small fix
2024-03-28 18:36:17 +08:00
Yuwen Hu
15b8964403
Win install change oneapi to pip installer ( #10577 )
...
* Update windows related guide to use pip installer for oneAPI
* Small style fix
* Add oneAPI version
* Update based on comments
* Small fix
2024-03-28 18:22:46 +08:00
Xin Qiu
5963239b46
Fix qwen's position_ids no enough ( #10572 )
...
* fix position_ids
* fix position_ids
2024-03-28 17:05:49 +08:00
ZehuaCao
52a2135d83
Replace ipex with ipex-llm ( #10554 )
...
* fix ipex with ipex_llm
* fix ipex with ipex_llm
* update
* update
* update
* update
* update
* update
* update
* update
2024-03-28 13:54:40 +08:00
Keyan (Kyrie) Zhang
0a2e820c9f
Modify install_linux_gpu.md ( #10576 )
2024-03-28 13:20:42 +08:00
Cheen Hau, 俊豪
1c5eb14128
Update pip install to use --extra-index-url for ipex package ( #10557 )
...
* Change to 'pip install .. --extra-index-url' for readthedocs
* Change to 'pip install .. --extra-index-url' for examples
* Change to 'pip install .. --extra-index-url' for remaining files
* Fix URL for ipex
* Add links for ipex US and CN servers
* Update ipex cpu url
* remove readme
* Update for github actions
* Update for dockerfiles
2024-03-28 09:56:23 +08:00
binbin Deng
92dfed77be
LLM: fix abnormal output of fp16 deepspeed autotp ( #10558 )
2024-03-28 09:35:48 +08:00
Kai Huang
e619142a16
Add SYCL_CACHE_PERSISTENT in doc and explain warmup in benchmark quickstart ( #10571 )
...
* update doc
* update
2024-03-27 21:03:51 +08:00
Jason Dai
c450c85489
Delete llm/readme.md ( #10569 )
2024-03-27 20:06:40 +08:00
Jason Dai
08e9aeb31f
Update index.rst
2024-03-27 19:41:19 +08:00
Yuwen Hu
1bae5f40d2
Hide pip installer for windows install ( #10568 )
...
* Hide oneAPI install with pip installer for now
* Small fix
2024-03-27 18:41:41 +08:00
Xiangyu Tian
51d34ca68e
Fix wrong import in speculative ( #10562 )
2024-03-27 18:21:07 +08:00
Cheen Hau, 俊豪
f239bc329b
Specify oneAPI minor version in documentation ( #10561 )
2024-03-27 17:58:57 +08:00
WeiguangHan
fbeb10c796
LLM: Set different env based on different Linux kernels ( #10566 )
2024-03-27 17:56:33 +08:00
hxsz1997
d86477f14d
Remove native_int4 in LangChain examples ( #10510 )
...
* rebase the modify to ipex-llm
* modify the typo
2024-03-27 17:48:16 +08:00
Shaojun Liu
924e01b842
Create scorecard.yml ( #10559 )
2024-03-27 16:51:10 +08:00
Guancheng Fu
04baac5a2e
Fix fastchat top_k ( #10560 )
...
* fix -1 top_k
* fix
* done
2024-03-27 16:01:58 +08:00
binbin Deng
fc8c7904f0
LLM: fix torch_dtype setting of apply fp16 optimization through optimize_model ( #10556 )
2024-03-27 14:18:45 +08:00
Ruonan Wang
ea4bc450c4
LLM: add esimd sdp for pvc ( #10543 )
...
* add esimd sdp for pvc
* update
* fix
* fix batch
2024-03-26 19:04:40 +08:00
Jin Qiao
817ef2d1de
Add verified models in document index ( #10546 )
...
* Add verified models in document index
* try to adjust column width
* try to adjust column width
* try to adjust column width
* try to adjust column width
* try replace link
* change to ipex-llm-tutorial
* try use raw html
* adjust table header
2024-03-26 18:25:32 +08:00
Jin Qiao
b78289a595
Remove ipex-llm dependency in readme ( #10544 )
2024-03-26 18:25:14 +08:00
Xiangyu Tian
11550d3f25
LLM: Add length check for IPEX-CPU speculative decoding ( #10529 )
...
* Add length check for IPEX-CPU speculative decoding.
2024-03-26 17:47:10 +08:00
Guancheng Fu
a3b007f3b1
[Serving] Fix fastchat breaks ( #10548 )
...
* fix fastchat
* fix doc
2024-03-26 17:03:52 +08:00
Yishuo Wang
69a28d6b4c
fix chatglm ( #10540 )
2024-03-26 16:01:00 +08:00
Shaojun Liu
2ecd737474
change bigdl-llm-tutorial to ipex-llm-tutorial in README ( #10547 )
...
* update bigdl-llm-tutorial to ipex-llm-tutorial
* change to ipex-llm-tutorial
2024-03-26 15:19:53 +08:00
Shaojun Liu
bb9be70105
replace bigdl-llm with ipex-llm ( #10545 )
2024-03-26 15:12:38 +08:00
Shaojun Liu
c563b41491
add nightly_build workflow ( #10533 )
...
* add nightly_build workflow
* add create-job-status-badge action
* update
* update
* update
* update setup.py
* release
* revert
2024-03-26 12:47:38 +08:00
binbin Deng
0a3e4e788f
LLM: fix mistral hidden_size setting for deepspeed autotp ( #10527 )
2024-03-26 10:55:44 +08:00
Xin Qiu
1dd40b429c
enable fp4 fused mlp and qkv ( #10531 )
...
* enable fp4 fused mlp and qkv
* update qwen
* update qwen2
2024-03-26 08:34:00 +08:00
Yuwen Hu
9367db7f2b
Small typo fix ( #10535 )
2024-03-25 18:48:44 +08:00