Commit graph

4 commits

Author SHA1 Message Date
binbin Deng
a6674f5bce
Fix should_use_fuse_rope error of Qwen1.5-MoE-A2.7B-Chat (#11216) 2024-06-05 15:56:10 +08:00
Yina Chen
b6b70d1ba0
Divide core-xe packages (#11131)
* temp

* add batch

* fix style

* update package name

* fix style

* add workflow

* use temp version to run uts

* trigger performance test

* trigger win igpu perf

* revert workflow & setup
2024-05-28 12:00:18 +08:00
Yishuo Wang
170e3d65e0
use new sdp and fp32 sdp (#11007) 2024-05-14 14:29:18 +08:00
Wang, Jian4
191b184341
LLM: Optimize cohere model (#10878)
* use mlp and rms

* optimize kv_cache

* add fuse qkv

* add flash attention and fp16 sdp

* error fp8 sdp

* fix optimized

* fix style

* update

* add for pp
2024-05-07 10:19:50 +08:00