* add initial quantize_kv support for yuan2 model
* fix yuan2 quantize_kv generation
* apply fp16 conv layer optimizations
* disable mlp for quantize_kv

| Name |
|---|
| llm |
| __init__.py |