* first commit
* update example
* fix style
* update example
* embedding as const
* fix generate
* code refactor
* meet code review
* fix style
* change max_output_len to max_context_len
* fix all-in-one
* fix example
* add check for new tokens
* fix npu_model raise sym_int4 error
* add load_lowbit
* remove print&perf
* add minicpm example