Commit graph

9 commits

Author SHA1 Message Date
Yishuo Wang
7234c9b27b
update quantize kv cache condition (#12681) 2025-01-09 15:23:04 +08:00
Yishuo Wang
b050368efc
refactor yuan2 and starcoder2 and fix (#12589) 2024-12-20 16:41:50 +08:00
Yishuo Wang
540eaeb12c
refactor attention_softmax (#12295) 2024-10-30 13:20:50 +08:00
Yishuo Wang
bb247e991b
refactor merge_qkv and attention_softmax (#12213) 2024-10-16 15:58:14 +08:00
Yishuo Wang
c4e5806e01
add latest optimization in starcoder2 (#11236) 2024-06-06 14:02:17 +08:00
Yina Chen
b6b70d1ba0
Divide core-xe packages (#11131)
* temp
* add batch
* fix style
* update package name
* fix style
* add workflow
* use temp version to run uts
* trigger performance test
* trigger win igpu perf
* revert workflow & setup
2024-05-28 12:00:18 +08:00
Yishuo Wang
d884c62dc4
remove new_layout parameter (#10906) 2024-04-29 10:31:50 +08:00
Yishuo Wang
702e686901
optimize starcoder normal kv cache (#10642) 2024-04-03 15:27:02 +08:00
Yishuo Wang
ba8cc6bd68
optimize starcoder2-3b (#10625) 2024-04-02 17:16:29 +08:00