* init ceval benchmark test.
* upload dataset.
* add other tests.
* add qwen evaluator.
* fix qwen evaluator style.
* fix qwen evaluator style.
* update qwen evaluator.
* add llama evaluator.
* update eval
* fix typo.
* fix
* fix typo.
* fix llama evaluator.
* fix bug.
* fix style.
* delete dataset.
* fix style.
* fix style.
* add README.md and fix typo.
* fix comments.
* remove run scripts

| Name | | |
|---|---|---|
| benchmark | ||
| test | ||
| print_glib_requirement.py | ||
| release.sh | ||
| release_default_linux.sh | ||
| release_default_windows.sh | ||