* use mlp and rms
* optimize kv_cache
* add fuse qkv
* add flash attention and fp16 sdp
* error fp8 sdp
* fix optimized
* fix style
* update
* add for pp

| Name |
|---|
| ipex_llm |