Xin Qiu
|
1274cba79b
|
stablelm fp8 kv cache (#10672)
* stablelm fp8 kvcache
* update
* fix
* change to fp8 matmul
* fix style
* fix
* fix
* meet code review
* add comment
|
2024-04-08 15:16:46 +08:00 |
|
Xin Qiu
|
4c3e493b2d
|
fix stablelm2 1.6b (#10656)
* fix stablelm2 1.6b
* meet code review
|
2024-04-03 22:15:32 +08:00 |
|
Xin Qiu
|
3a9ab8f1ae
|
fix stablelm logits diff (#10636)
* fix logits diff
* Small fixes
---------
Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
|
2024-04-03 15:08:12 +08:00 |
|
Yuwen Hu
|
fd384ddfb8
|
Optimize StableLM (#10619)
* Initial commit for stablelm optimizations
* Small style fix
* add dependency
* Add mlp optimizations
* Small fix
* add attention forward
* Remove quantize kv for now as head_dim=80
* Add merged qkv
* fix license
* Python style fix
---------
Co-authored-by: qiuxin2012 <qiuxin2012cs@gmail.com>
|
2024-04-02 18:58:38 +08:00 |
|