* use mlp and rms
* optimize kv_cache
* add fuse qkv
* add flash attention and fp16 sdp
* error fp8 sdp
* fix optimized
* fix style
* update
* add for pp

| Name |
|---|
| dev |
| example |
| portable-zip |
| scripts |
| src/ipex_llm |
| test |
| .gitignore |
| setup.py |
| version.txt |