Small update of quickstart (#10772)
parent 0a62933d36
commit ea5e46c8cb

2 changed files with 8 additions and 3 deletions
@@ -262,3 +262,8 @@ Log end
#### Fail to quantize model
If you encounter `main: failed to quantize model from xxx`, please make sure you have created the related output directory.

#### Program hangs during model loading
If your program hangs after `llm_load_tensors:  SYCL_Host buffer size =    xx.xx MiB`, you can add `--no-mmap` to your command.

#### How to set the `-ngl` parameter
`-ngl` sets the number of layers to store in VRAM. If you have enough VRAM, we recommend putting all layers on the GPU; just set `-ngl` to a large number like 999 to achieve this.
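For the quantize failure above, the fix is usually just creating the output directory before invoking the tool, since llama.cpp's quantize binary does not create it for you. A minimal sketch; the `llama-quantize` binary name and all paths are assumptions (older builds ship the tool simply as `quantize`):

```shell
# The output directory must exist beforehand; otherwise quantization
# fails with "main: failed to quantize model from ...".
mkdir -p ./models/output

# Hypothetical build and model paths; adjust to your setup.
if [ -x ./build/bin/llama-quantize ]; then
    ./build/bin/llama-quantize \
        ./models/ggml-model-f16.gguf \
        ./models/output/ggml-model-Q4_K_M.gguf \
        Q4_K_M
fi
```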
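The last two entries combine naturally in one command. A sketch only; the `llama-cli` binary name, model path, and prompt are assumptions, not taken from the diff:

```shell
# -ngl 999  : offload (effectively) all layers to the GPU, VRAM permitting.
# --no-mmap : load tensors without mmap, avoiding the hang some users see
#             after the "SYCL_Host buffer size" log line.
if [ -x ./build/bin/llama-cli ]; then
    ./build/bin/llama-cli \
        -m ./models/llama-2-7b.Q4_K_M.gguf \
        -ngl 999 --no-mmap \
        -p "Building a website can be done in 10 simple steps:"
fi
```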
@@ -69,7 +69,7 @@ Launch the Ollama service:
         set ZES_ENABLE_SYSMAN=1
         call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"

-        ollama.exe serve
+        ollama serve

```
@@ -174,8 +174,8 @@ Then you can create the model in Ollama by `ollama create example -f Modelfile`
      .. code-block:: bash

         set no_proxy=localhost,127.0.0.1
-        ollama.exe create example -f Modelfile
-        ollama.exe run example
+        ollama create example -f Modelfile
+        ollama run example

```
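The hunk above only drops the Windows `.exe` suffix; the create/run flow itself is unchanged. A sketch of the whole sequence, where the model name `example` comes from the diff but the GGUF path in the Modelfile is hypothetical:

```shell
# A Modelfile tells Ollama which local GGUF file to wrap (path assumed).
cat > Modelfile <<'EOF'
FROM ./mistral-7b-v0.1.Q4_K_M.gguf
EOF

# Registering and running require the ollama binary and a running
# `ollama serve` instance; guard so the sketch degrades gracefully.
if command -v ollama >/dev/null 2>&1; then
    ollama create example -f Modelfile
    ollama run example "Why is the sky blue?"
fi
```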