* remove q4_1
* fixes
* add gpu gguf example
* some fixes
* address kai's comments
* address json's comments