* Add mix_precision argument to control whether to use INT8 lm_head for Qwen2-7B-Instruct
* Small fix
* Fix low-bit loading with mixed precision
* Small fix
* Update example accordingly
* Update for default prompt
* Update based on comments
* Final fix