Small update of quickstart (#10772)

This commit is contained in:
Ruonan Wang 2024-04-16 10:46:58 +08:00 committed by GitHub
parent 0a62933d36
commit ea5e46c8cb
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 8 additions and 3 deletions


@@ -262,3 +262,8 @@ Log end
 #### Fail to quantize model
 If you encounter `main: failed to quantize model from xxx`, please make sure you have created related output directory.
+#### Program hang during model loading
+If your program hangs after `llm_load_tensors: SYCL_Host buffer size = xx.xx MiB`, add `--no-mmap` to your command.
+#### How to set `-ngl` parameter
+`-ngl` sets the number of layers to store in VRAM. If your VRAM is large enough, we recommend putting all layers on the GPU; to do this, simply set `-ngl` to a large number such as 999.
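Taken together, the two FAQ entries added above would be used in a single invocation like the sketch below. This is only illustrative: the binary name (`main`) and the model path are placeholders, not taken from the commit.

```shell
# Illustrative only: offload all layers to the GPU (-ngl 999) and
# disable memory-mapped loading (--no-mmap) if loading hangs.
# Binary name and model path below are placeholders.
./main -m ./models/model-q4_0.gguf -ngl 999 --no-mmap -p "Once upon a time"
```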


@@ -69,7 +69,7 @@ Launch the Ollama service:
 set ZES_ENABLE_SYSMAN=1
 call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
-ollama.exe serve
+ollama serve
 ```
@@ -174,8 +174,8 @@ Then you can create the model in Ollama by `ollama create example -f Modelfile`
 .. code-block:: bash

 set no_proxy=localhost,127.0.0.1
-ollama.exe create example -f Modelfile
+ollama create example -f Modelfile
-ollama.exe run example
+ollama run example
 ```