2 commits

Author: Yishuo Wang
SHA1: 170e3d65e0
Date: 2024-05-14 14:29:18 +08:00
Message: use new sdp and fp32 sdp (#11007)

Author: Wang, Jian4
SHA1: 191b184341
Date: 2024-05-07 10:19:50 +08:00
Message: LLM: Optimize cohere model (#10878)
* use mlp and rms
* optimize kv_cache
* add fuse qkv
* add flash attention and fp16 sdp
* error fp8 sdp
* fix optimized
* fix style
* update
* add for pp