binbin Deng
|
a6674f5bce
|
Fix should_use_fuse_rope error of Qwen1.5-MoE-A2.7B-Chat (#11216)
|
2024-06-05 15:56:10 +08:00 |
|
Yina Chen
|
b6b70d1ba0
|
Divide core-xe packages (#11131)
* temp
* add batch
* fix style
* update package name
* fix style
* add workflow
* use temp version to run uts
* trigger performance test
* trigger win igpu perf
* revert workflow & setup
|
2024-05-28 12:00:18 +08:00 |
|
Yishuo Wang
|
170e3d65e0
|
use new sdp and fp32 sdp (#11007)
|
2024-05-14 14:29:18 +08:00 |
|
Wang, Jian4
|
191b184341
|
LLM: Optimize cohere model (#10878)
* use mlp and rms
* optimize kv_cache
* add fuse qkv
* add flash attention and fp16 sdp
* error fp8 sdp
* fix optimized
* fix style
* update
* add for pp
|
2024-05-07 10:19:50 +08:00 |
|