From ea5e46c8cbe2bac558d71e416ff4b2ba7d3f5a20 Mon Sep 17 00:00:00 2001
From: Ruonan Wang
Date: Tue, 16 Apr 2024 10:46:58 +0800
Subject: [PATCH] Small update of quickstart (#10772)

---
 .../source/doc/LLM/Quickstart/llama_cpp_quickstart.md | 5 +++++
 .../source/doc/LLM/Quickstart/ollama_quickstart.md    | 6 +++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/docs/readthedocs/source/doc/LLM/Quickstart/llama_cpp_quickstart.md b/docs/readthedocs/source/doc/LLM/Quickstart/llama_cpp_quickstart.md
index e3e9840b..f3d064a5 100644
--- a/docs/readthedocs/source/doc/LLM/Quickstart/llama_cpp_quickstart.md
+++ b/docs/readthedocs/source/doc/LLM/Quickstart/llama_cpp_quickstart.md
@@ -262,3 +262,8 @@ Log end
 
 #### Fail to quantize model
 If you encounter `main: failed to quantize model from xxx`, please make sure you have created related output directory.
+#### Program hangs during model loading
+If your program hangs after `llm_load_tensors: SYCL_Host buffer size = xx.xx MiB`, add `--no-mmap` to your command.
+
+#### How to set the `-ngl` parameter
+`-ngl` sets the number of layers to store in VRAM. If you have enough VRAM, we recommend putting all layers on the GPU; to do so, simply set `-ngl` to a large number such as 999.
\ No newline at end of file
diff --git a/docs/readthedocs/source/doc/LLM/Quickstart/ollama_quickstart.md b/docs/readthedocs/source/doc/LLM/Quickstart/ollama_quickstart.md
index f2bdbca2..4c893a93 100644
--- a/docs/readthedocs/source/doc/LLM/Quickstart/ollama_quickstart.md
+++ b/docs/readthedocs/source/doc/LLM/Quickstart/ollama_quickstart.md
@@ -69,7 +69,7 @@ Launch the Ollama service:
 
    set ZES_ENABLE_SYSMAN=1
    call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
 
-   ollama.exe serve
+   ollama serve
 
   ```
@@ -174,8 +174,8 @@ Then you can create the model in Ollama by `ollama create example -f Modelfile`
 
   .. code-block:: bash
 
     set no_proxy=localhost,127.0.0.1
-    ollama.exe create example -f Modelfile
-    ollama.exe run example
+    ollama create example -f Modelfile
+    ollama run example
 
  ```
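
The two llama.cpp FAQ entries added above combine into a single invocation; as an illustrative sketch only (the `./main` binary name, model path, and prompt below are placeholders, not part of the patch):

```bash
# Sketch: "./main" and the GGUF path are placeholders for your own
# llama.cpp build and model file.
# -ngl 999 offloads all layers to the GPU (any value >= the model's
# layer count has the same effect); --no-mmap works around the hang
# some users see after "llm_load_tensors: SYCL_Host buffer size = ...".
./main -m ./mistral-7b-v0.1.Q4_K_M.gguf -p "Once upon a time" -n 32 -ngl 999 --no-mmap
```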