Small update of quickstart (#10772)
This commit is contained in:
parent
0a62933d36
commit
ea5e46c8cb
2 changed files with 8 additions and 3 deletions
@@ -262,3 +262,8 @@ Log end
 #### Fail to quantize model
 
 If you encounter `main: failed to quantize model from xxx`, please make sure you have created the related output directory.
+
+#### Program hang during model loading
+If your program hangs after `llm_load_tensors: SYCL_Host buffer size = xx.xx MiB`, you can add `--no-mmap` to your command.
+#### How to set `-ngl` parameter
+`-ngl` sets the number of layers to store in VRAM. If you have enough VRAM, we recommend putting all layers on the GPU; simply set `-ngl` to a large number like 999 to achieve this.
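The first two tips above can be sketched as a shell snippet. This is illustrative only: the paths and the assembled command are placeholders of mine, not from the patch, and the command is only printed rather than run since it needs a real model file.

```shell
#!/bin/sh
# Tip 1: "main: failed to quantize model from xxx" usually means the
# output directory does not exist yet, so create it first.
mkdir -p ./models/quantized   # placeholder path

# Tip 2 + 3: if loading hangs after the SYCL_Host buffer log line, add
# --no-mmap; a large -ngl value offloads all layers to the GPU.
CMD="./main -m ./models/quantized/model.gguf -ngl 999 --no-mmap -p 'Hello'"
echo "$CMD"
```

The `999` is just "larger than any model's layer count"; llama.cpp clamps it to the actual number of layers.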
@@ -69,7 +69,7 @@ Launch the Ollama service:
 
   set ZES_ENABLE_SYSMAN=1
   call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
 
-  ollama.exe serve
+  ollama serve
 
 ```
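Before launching the service, it can help to confirm the environment the two `set`/`call` lines above are supposed to establish. The variable name comes from the snippet; the check itself is a sketch of mine (shown in POSIX shell rather than the batch syntax of the original):

```shell
#!/bin/sh
# Sketch: verify ZES_ENABLE_SYSMAN is set before starting the Ollama service.
export ZES_ENABLE_SYSMAN=1   # mirrors `set ZES_ENABLE_SYSMAN=1` on Windows
if [ "${ZES_ENABLE_SYSMAN:-}" = "1" ]; then
  echo "sysman enabled"
else
  echo "set ZES_ENABLE_SYSMAN=1 before running ollama serve" >&2
fi
```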
@@ -174,8 +174,8 @@ Then you can create the model in Ollama by `ollama create example -f Modelfile`
 .. code-block:: bash
 
   set no_proxy=localhost,127.0.0.1
 
-  ollama.exe create example -f Modelfile
-  ollama.exe run example
+  ollama create example -f Modelfile
+  ollama run example
 
 ```
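The `ollama create example -f Modelfile` step above assumes a Modelfile already exists. A minimal sketch of generating one (the gguf path and the context-size value are placeholders of mine, not from the patch):

```shell
#!/bin/sh
# Hypothetical Modelfile for `ollama create example -f Modelfile`.
# FROM and PARAMETER are standard Modelfile directives.
cat > Modelfile <<'EOF'
FROM ./model.gguf
PARAMETER num_ctx 2048
EOF
grep -q '^FROM' Modelfile && echo "Modelfile ready"
```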