add ollama quickstart (#10649)
Co-authored-by: arda <arda@arda-arc12.sh.intel.com>
This commit is contained in:
parent 1ae519ec69
commit f789c2eee4
1 changed file with 44 additions and 0 deletions
@@ -0,0 +1,44 @@
# Run Ollama on Intel GPU
### 1 Install Ollama integrated with IPEX-LLM
First, ensure that IPEX-LLM is installed by following the [IPEX-LLM Installation Quickstart for Windows with Intel GPU](install_windows_gpu.html), and activate your conda environment.

Run `pip install --pre --upgrade ipex-llm[cpp]`, then execute `init-ollama`; you should now see a symbolic link named `ollama` in your current directory.
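Put together, the install step looks roughly like this (a sketch; the conda environment name `llm-cpp` is only an example, use whatever environment you created during the IPEX-LLM installation):

```shell
# Activate the conda environment from the IPEX-LLM installation
# ("llm-cpp" is an example name, not prescribed by this guide)
conda activate llm-cpp

# Install IPEX-LLM with the llama.cpp/Ollama backend
pip install --pre --upgrade ipex-llm[cpp]

# Create the `ollama` symbolic link in the current directory
init-ollama
```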
### 2 Verify Ollama Serve
To avoid potential proxy issues, run `export no_proxy=localhost,127.0.0.1`. Then run `export ZES_ENABLE_SYSMAN=1` and `source /opt/intel/oneapi/setvars.sh` to enable driver initialization for system management and load the required oneAPI dependencies.
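For convenience, the same settings can be applied together before starting the service:

```shell
# Keep requests to the local Ollama server from going through a proxy
export no_proxy=localhost,127.0.0.1

# Enable driver initialization for system management and load the oneAPI environment
export ZES_ENABLE_SYSMAN=1
source /opt/intel/oneapi/setvars.sh
```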
Start the service using `./ollama serve`. It should display something like:

To expose the `ollama` service port and access it from another machine, use `OLLAMA_HOST=0.0.0.0 ./ollama serve`.
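Once the port is exposed, you can sanity-check reachability from the other machine, for example with Ollama's model-listing endpoint (the address below is a placeholder, and this check is an extra suggestion rather than part of the original steps):

```shell
# From another machine, list the models known to the remote Ollama server
# (replace 192.168.1.10 with the address of the machine running `ollama serve`)
curl http://192.168.1.10:11434/api/tags
```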
Open another terminal and use `./ollama pull <model_name>` to download a model locally.
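For example (the model name `llama2` is purely illustrative; substitute any model from the Ollama library):

```shell
# Download a model into the local Ollama model store
./ollama pull llama2
```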

Verify the setup with the following command:
```shell
curl http://localhost:11434/api/generate -d '
{
  "model": "<model_name>",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```
Expected results:

### 3 Example: Ollama Run
You can use `./ollama run <model_name>` to automatically pull and load the model for an interactive streaming chat.
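For example (again, the model name is only illustrative):

```shell
# Pull the model if it is not already present, then start an interactive streaming chat
./ollama run llama2
```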

