* Add initial quickstart for Ollama portable zip
* Small fix
* Fixed based on comments
* Small fix
* Add demo image for run ollama
* Update download link
* Remove model with optimize_model=False in NPU verified models tables, and remove related example
* Remove experimental in run optimized model section title
* Unify model table order & example cmd
* Move embedding example to separate folder & update quickstart example link
* Add Quickstart reference in main NPU readme
* Small fix
* Small fix
* Move save/load examples under NPU/HF-Transformers-AutoModels
* Add low-bit and polish arguments for LLM Python examples
* Small fix
* Add low-bit and polish arguments for Multi-Model examples
* Polish argument for Embedding models
* Polish argument for LLM CPP examples
* Add low-bit and polish argument for Save-Load examples
* Add accuracy tuning tips for examples
* Update NPU qucikstart accuracy tuning with low-bit optimizations
* Add save/load section to qucikstart
* Update CPP example sample output to EN
* Add installation regarding cmake for CPP examples
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Small fix
* Unify max prompt length to 512
* Change recommended low-bit for Qwen2.5-3B-Instruct to asym_int4
* Update based on comments
* Small fix
* Update install_linux_gpu.zh-CN.md
Add the link for guide of windows installation.
* Update install_windows_gpu.zh-CN.md
Add the link for guide of linux installation.
* Update install_windows_gpu.md
Add the link for guide of Linux installation.
* Update install_linux_gpu.md
Add the link for guide of Windows installation.
* Update install_linux_gpu.md
Modify based on comments.
* Update install_windows_gpu.md
Modify based on comments
* Add initial NPU quickstart (c++ part unfinished)
* Small update
* Update based on comments
* Update main readme
* Remove LLaMA description
* Small fix
* Small fix
* Remove subsection link in main README
* Small fix
* Update based on comments
* Small fix
* TOC update and other small fixes
* Update for Chinese main readme
* Update based on comments and other small fixes
* Change order
* Add install_linux_gpu.zh-CN.md
* Add install_windows_gpu.zh-CN.md
* Update llama_cpp_quickstart.zh-CN.md
Related links updated to zh-CN version.
* Update install_linux_gpu.zh-CN.md
Added link to English version.
* Update install_windows_gpu.zh-CN.md
Add the link to English version.
* Update install_windows_gpu.md
Add the link to CN version.
* Update install_linux_gpu.md
Add the link to CN version.
* Update README.zh-CN.md
Modified the related link to zh-CN version.
* remove the openwebui in inference-cpp-xpu dockerfile
* update docker_cpp_xpu_quickstart.md
* add sample output in inference-cpp/readme
* remove the openwebui in main readme
* remove the openwebui in main readme