19 commits

Author SHA1 Message Date
Yishuo Wang
68857494a5
refactor to simplify following upgrade 2 (#12685) 2025-01-10 09:29:03 +08:00
Yishuo Wang
7234c9b27b
update quantize kv cache condition (#12681) 2025-01-09 15:23:04 +08:00
Yishuo Wang
4135b895b3
refactor chatglm2, internlm, stablelm and qwen (#12604) 2024-12-24 18:18:00 +08:00
Yishuo Wang
540eaeb12c
refactor attention_softmax (#12295) 2024-10-30 13:20:50 +08:00
Yishuo Wang
bb247e991b
refactor merge_qkv and attention_softmax (#12213) 2024-10-16 15:58:14 +08:00
Yishuo Wang
a945500a98
fix internlm xcomposser stream chat (#11564) 2024-07-11 18:21:17 +08:00
Yishuo Wang
994e49a510
optimize internlm xcomposser performance again (#11551) 2024-07-10 17:08:56 +08:00
Yishuo Wang
82f9514303
optimize internlm xcomposer2 performance (#11550) 2024-07-10 15:57:04 +08:00
Yishuo Wang
c6e5ad668d
fix internlm xcomposser meta-instruction typo (#11448) 2024-06-27 15:29:43 +08:00
Yishuo Wang
10e480ee96
refactor internlm and internlm2 (#11274) 2024-06-11 14:19:19 +08:00
Yishuo Wang
d307622797
fix first token sdp with batch (#11153) 2024-05-28 15:03:06 +08:00
Yina Chen
b6b70d1ba0
Divide core-xe packages (#11131)
* temp

* add batch

* fix style

* update package name

* fix style

* add workflow

* use temp version to run uts

* trigger performance test

* trigger win igpu perf

* revert workflow & setup
2024-05-28 12:00:18 +08:00
Yishuo Wang
1db9d9a63b
optimize internlm2 xcomposer agin (#11124) 2024-05-24 13:44:52 +08:00
Yishuo Wang
9372ce87ce
fix internlm xcomposer2 fp16 (#11123) 2024-05-24 11:03:31 +08:00
Yishuo Wang
37b98a531f
support running internlm xcomposer2 on gpu and add sdp optimization (#11115) 2024-05-23 17:26:24 +08:00
Yishuo Wang
0e53f20edb
support running internlm-xcomposer2 on cpu (#11111) 2024-05-23 16:36:09 +08:00
Cengguang Zhang
3e2662c87e
LLM: fix get env KV_CACHE_ALLOC_BLOCK_LENGTH type. (#10771) 2024-04-16 09:32:30 +08:00
Keyan (Kyrie) Zhang
585c174e92
Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables (#10707)
* Read the value of KV_CACHE_ALLOC_BLOCK_LENGTH from the environment variables.

* Fix style
2024-04-10 10:48:46 +08:00
Wang, Jian4
9df70d95eb
Refactor bigdl.llm to ipex_llm (#24)
* Rename bigdl/llm to ipex_llm

* rm python/llm/src/bigdl

* from bigdl.llm to from ipex_llm
2024-03-22 15:41:21 +08:00
Renamed from python/llm/src/bigdl/llm/transformers/models/internlm.py