Commit graph

9 commits

Author SHA1 Message Date
Yuwen Hu
fd384ddfb8
Optimize StableLM (#10619)
* Initial commit for stablelm optimizations

* Small style fix

* add dependency

* Add mlp optimizations

* Small fix

* add attention forward

* Remove quantize kv for now as head_dim=80

* Add merged qkv

* fix lisence

* Python style fix

---------

Co-authored-by: qiuxin2012 <qiuxin2012cs@gmail.com>
2024-04-02 18:58:38 +08:00
Shaojun Liu
a10f5a1b8d
add python style check (#10620)
* add python style check

* fix style checks

* update runner

* add ipex-llm-finetune-qlora-cpu-k8s to manually_build workflow

* update tag to 2.1.0-SNAPSHOT
2024-04-02 16:17:56 +08:00
Cengguang Zhang
58b57177e3
LLM: support bigdl quantize kv cache env and add warning. (#10623)
* LLM: support bigdl quantize kv cache env and add warnning.

* fix style.

* fix comments.
2024-04-02 15:41:08 +08:00
Cengguang Zhang
e567956121
LLM: add memory optimization for llama. (#10592)
* add initial memory optimization.

* fix logic.

* fix logic,

* remove env var check in mlp split.
2024-04-02 09:07:50 +08:00
Qiyuan Gong
f4537798c1
Enable kv cache quantization by default for flex when 1 < batch <= 8 (#10584)
* Enable kv cache quantization by default for flex when 1 < batch <= 8.
* Change up bound from <8 to <=8.
2024-03-29 09:43:42 +08:00
Cengguang Zhang
b44f7adbad
LLM: Disable esimd sdp for PVC GPU when batch size>1 (#10579)
* llm: disable esimd sdp for pvc bz>1.

* fix logic.

* fix: avoid call get device name twice.
2024-03-28 22:55:48 +08:00
Ruonan Wang
ea4bc450c4
LLM: add esimd sdp for pvc (#10543)
* add esimd sdp for pvc

* update

* fix

* fix batch
2024-03-26 19:04:40 +08:00
Xin Qiu
1dd40b429c
enable fp4 fused mlp and qkv (#10531)
* enable fp4 fused mlp and qkv

* update qwen

* update qwen2
2024-03-26 08:34:00 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm

* rm python/llm/src/bigdl

* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Renamed from python/llm/src/bigdl/llm/transformers/models/utils.py (Browse further)