Add troubleshooting entries for ollama and llama.cpp (#12358)

* add ollama troubleshoot en

* zh ollama troubleshoot

* llamacpp trouble shoot

* llamacpp trouble shoot

* fix

* save gpu memory
Jinhe 2024-11-07 15:49:20 +08:00 committed by GitHub
parent ce0c6ae423
commit 71ea539351
4 changed files with 12 additions and 0 deletions


@ -366,3 +366,6 @@ On latest version of `ipex-llm`, you might come across `native API failed` error
#### 15. `signal: bus error (core dumped)` error
If you meet this error, please check your Linux kernel version first. You may encounter this issue on higher kernel versions (like kernel 6.15). You can also refer to [this issue](https://github.com/intel-analytics/ipex-llm/issues/10955) to see if it helps.
#### 16. `backend buffer base cannot be NULL` error
If you meet `ggml-backend.c:96: GGML_ASSERT(base != NULL && "backend buffer base cannot be NULL") failed`, simply add the `-c xx` parameter during inference, for example `-c 1024`, to resolve this problem.
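As a sketch, the fix amounts to passing an explicit context size when launching llama.cpp; the binary name, model file, and prompt below are placeholders, not part of this PR:

```shell
# Hypothetical invocation; assumes ipex-llm's llama.cpp build provides `llama-cli`
# on PATH and that `model.gguf` is your local GGUF model.
# `-c 1024` sets an explicit context size so the backend buffer is allocated
# before inference, avoiding the NULL-base assertion.
./llama-cli -m model.gguf -p "Once upon a time" -n 32 -c 1024
```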


@ -367,3 +367,6 @@ Log end
#### 15. `signal: bus error (core dumped)` error
If you meet this error, please check your Linux kernel version first. Higher kernel versions (such as 6.15) may cause this issue. You can also refer to [this issue](https://github.com/intel-analytics/ipex-llm/issues/10955) to see if it helps.
#### 16. `backend buffer base cannot be NULL` error
If you meet the `ggml-backend.c:96: GGML_ASSERT(base != NULL && "backend buffer base cannot be NULL") failed` error, passing the `-c xx` parameter during inference, for example `-c 1024`, resolves it.


@ -223,3 +223,6 @@ If you find ollama hang when multiple different questions is asked or context is
#### 7. `signal: bus error (core dumped)` error
If you meet this error, please check your Linux kernel version first. You may encounter this issue on higher kernel versions (like kernel 6.15). You can also refer to [this issue](https://github.com/intel-analytics/ipex-llm/issues/10955) to see if it helps.
#### 8. Save GPU memory by specifying `OLLAMA_NUM_PARALLEL=1`
If you have limited GPU memory, use `set OLLAMA_NUM_PARALLEL=1` on Windows or `export OLLAMA_NUM_PARALLEL=1` on Linux before running `ollama serve` to reduce GPU usage. The default `OLLAMA_NUM_PARALLEL` in upstream ollama is 4.
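A minimal Linux session might look like the following sketch; the model name is hypothetical and only for illustration (on Windows, use `set OLLAMA_NUM_PARALLEL=1` in the same console instead):

```shell
# Limit ollama to a single parallel request slot before starting the server;
# fewer slots means a smaller KV-cache allocation and lower GPU memory usage.
export OLLAMA_NUM_PARALLEL=1
./ollama serve &             # server inherits the reduced parallelism
./ollama run llama3 "hello"  # hypothetical model name, for illustration only
```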


@ -218,3 +218,6 @@ By default, Ollama unloads the model from GPU memory every 5 minutes. For ollama's la
#### 7. `signal: bus error (core dumped)` error
If you meet this error, please check your Linux kernel version first. Higher kernel versions (such as 6.15) may cause this issue. You can also refer to [this issue](https://github.com/intel-analytics/ipex-llm/issues/10955) to see if it helps.
#### 8. Save GPU memory by setting `OLLAMA_NUM_PARALLEL=1`
If your GPU memory is limited, you can reduce memory usage by running `set OLLAMA_NUM_PARALLEL=1` (Windows) or `export OLLAMA_NUM_PARALLEL=1` (Linux) before running `ollama serve`. Ollama's default `OLLAMA_NUM_PARALLEL` is 4.