* Add supports for loading rwkv models using from_pretrained api
* Temporarily enable pr tests
* Add RWKV in tests and more in-out pairs
* Add rwkv for 512 tests
* Make iterations smaller
* Change back to nightly trigger
* Temp enable PR
* Enable tests for 256-64
* Try again 128-64
* Empty cache after each iteration for igpu benchmark scripts
* Try tests for 512
* change order for 512
* Skip chatglm3 and llama2 for now
* Separate tests for 512-64
* Small fix
* Further fixes
* Change back to nightly again