diff --git a/docs/mddocs/Quickstart/bmg_quickstart.md b/docs/mddocs/Quickstart/bmg_quickstart.md
index ff1afd26..af29f003 100644
--- a/docs/mddocs/Quickstart/bmg_quickstart.md
+++ b/docs/mddocs/Quickstart/bmg_quickstart.md
@@ -11,11 +11,13 @@ This guide demonstrates how to install and use IPEX-LLM on the Intel Arc B-Serie
1. [Linux](#1-linux)
1.1 [Install Prerequisites](#11-install-prerequisites)
- 1.2 [Install IPEX-LLM](#12-install-ipex-llm)
-2. [Windows](#2-windows)
+ 1.2 [Install IPEX-LLM for PyTorch and HuggingFace](#for-pytorch-and-huggingface)
+ 1.3 [Install IPEX-LLM for llama.cpp and Ollama](#for-llamacpp-and-ollama)
+2. [Windows](#2-windows)
2.1 [Install Prerequisites](#21-install-prerequisites)
- 2.2 [Install IPEX-LLM](#22-install-ipex-llm)
-3. [Use Cases](#3-use-cases)
+ 2.2 [Install IPEX-LLM for PyTorch and HuggingFace](#for-pytorch-and-huggingface-1)
+ 2.3 [Install IPEX-LLM for llama.cpp and Ollama](#for-llamacpp-and-ollama-1)
+3. [Use Cases](#3-use-cases)
3.1 [PyTorch](#31-pytorch)
3.2 [Ollama](#32-ollama)
3.3 [llama.cpp](#33-llamacpp)
@@ -59,7 +61,7 @@ conda activate llm
With the `llm` environment active, install the appropriate `ipex-llm` package based on your use case:
-#### For PyTorch:
+#### For PyTorch and HuggingFace:
Install the `ipex-llm[xpu-arc]` package. Choose either the US or CN website for `extra-index-url`:
- For **US**:
@@ -109,7 +111,7 @@ conda activate llm
With the `llm` environment active, install the appropriate `ipex-llm` package based on your use case:
-#### For PyTorch:
+#### For PyTorch and HuggingFace:
Install the `ipex-llm[xpu-arc]` package. Choose either the US or CN website for `extra-index-url`:
- For **US**:
diff --git a/docs/mddocs/Quickstart/llama_cpp_quickstart.md b/docs/mddocs/Quickstart/llama_cpp_quickstart.md
index 52c9a702..cd6b5f15 100644
--- a/docs/mddocs/Quickstart/llama_cpp_quickstart.md
+++ b/docs/mddocs/Quickstart/llama_cpp_quickstart.md
@@ -5,6 +5,14 @@
[ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp) provides fast LLM inference in pure C++ across a variety of hardware; you can now use the C++ interface of [`ipex-llm`](https://github.com/intel-analytics/ipex-llm) as an accelerated backend for `llama.cpp` running on Intel **GPU** *(e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max)*.
+> [!NOTE]
+> For installation on Intel Arc B-Series GPU (such as **B580**), please refer to this [guide](./bmg_quickstart.md).
+
+> [!NOTE]
+> Our latest version is consistent with [3f1ae2e](https://github.com/ggerganov/llama.cpp/commit/3f1ae2e32cde00c39b96be6d01c2997c29bae555) of llama.cpp.
+>
+> `ipex-llm[cpp]==2.2.0b20241204` is consistent with [a1631e5](https://github.com/ggerganov/llama.cpp/commit/a1631e53f6763e17da522ba219b030d8932900bd) of llama.cpp.
+
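+If you specifically need the build that matches the older [a1631e5](https://github.com/ggerganov/llama.cpp/commit/a1631e53f6763e17da522ba219b030d8932900bd) commit, a minimal sketch (assuming that version is still published on the package index) is to pin it explicitly:
+
+```bash
+# Pin the ipex-llm[cpp] build that tracks llama.cpp commit a1631e5 (version from the note above)
+pip install --pre "ipex-llm[cpp]==2.2.0b20241204"
+```
+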
See the demo of running LLaMA2-7B on Intel Arc GPU below.
@@ -16,16 +24,6 @@ See the demo of running LLaMA2-7B on Intel Arc GPU below.
-> [!NOTE]
-> `ipex-llm[cpp]==2.2.0b20241204` is consistent with [a1631e5](https://github.com/ggerganov/llama.cpp/commit/a1631e53f6763e17da522ba219b030d8932900bd) of llama.cpp.
->
-> Our latest version is consistent with [3f1ae2e](https://github.com/ggerganov/llama.cpp/commit/3f1ae2e32cde00c39b96be6d01c2997c29bae555) of llama.cpp.
-
-> [!NOTE]
-> Starting from `ipex-llm[cpp]==2.2.0b20240912`, oneAPI dependency of `ipex-llm[cpp]` on Windows will switch from `2024.0.0` to `2024.2.1` .
->
-> For this update, it's necessary to create a new conda environment to install the latest version on Windows. If you directly upgrade to `ipex-llm[cpp]>=2.2.0b20240912` in the previous cpp conda environment, you may encounter the error `Can't find sycl7.dll`.
-
## Table of Contents
- [Prerequisites](./llama_cpp_quickstart.md#0-prerequisites)
- [Install IPEX-LLM for llama.cpp](./llama_cpp_quickstart.md#1-install-ipex-llm-for-llamacpp)
@@ -368,4 +366,4 @@ On latest version of `ipex-llm`, you might come across `native API failed` error
If you meet this error, please check your Linux kernel version first. You may encounter this issue on higher kernel versions (like kernel 6.15). You can also refer to [this issue](https://github.com/intel-analytics/ipex-llm/issues/10955) to see if it helps.
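+
+For instance, you can print the running kernel release (a quick, standard check; any equivalent method works) to see whether you are on one of the affected newer kernels:
+
+```bash
+# Show the running Linux kernel version, e.g. 6.15.x
+uname -r
+```
+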
#### 16. `backend buffer base cannot be NULL` error
-If you meet `ggml-backend.c:96: GGML_ASSERT(base != NULL && "backend buffer base cannot be NULL") failed`, simply adding `-c xx` parameter during inference, for example `-c 1024` would resolve this problem.
\ No newline at end of file
+If you meet `ggml-backend.c:96: GGML_ASSERT(base != NULL && "backend buffer base cannot be NULL") failed`, simply adding the `-c xx` parameter during inference, for example `-c 1024`, should resolve this problem.
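+
+As a rough illustration only (the binary name, model path and prompt below are placeholders for your own run command; the point is just the added `-c 1024`):
+
+```bash
+# Explicitly set the context size to avoid the NULL backend buffer assert
+./llama-cli -m ./your-model.gguf -p "Once upon a time" -n 64 -c 1024 -ngl 99
+```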
diff --git a/docs/mddocs/Quickstart/ollama_quickstart.md b/docs/mddocs/Quickstart/ollama_quickstart.md
index 0590007f..0ed97830 100644
--- a/docs/mddocs/Quickstart/ollama_quickstart.md
+++ b/docs/mddocs/Quickstart/ollama_quickstart.md
@@ -5,6 +5,14 @@
[ollama/ollama](https://github.com/ollama/ollama) is a popular framework designed to build and run language models on a local machine; you can now use the C++ interface of [`ipex-llm`](https://github.com/intel-analytics/ipex-llm) as an accelerated backend for `ollama` running on Intel **GPU** *(e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max)*.
+> [!NOTE]
+> For installation on Intel Arc B-Series GPU (such as **B580**), please refer to this [guide](./bmg_quickstart.md).
+
+> [!NOTE]
+> Our current version is consistent with [v0.4.6](https://github.com/ollama/ollama/releases/tag/v0.4.6) of ollama.
+>
+> `ipex-llm[cpp]==2.2.0b20241204` is consistent with [v0.3.6](https://github.com/ollama/ollama/releases/tag/v0.3.6) of ollama.
+
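+Once the `ollama` binary provided by `ipex-llm[cpp]` has been set up (see the run instructions later in this guide), a quick sanity check is to print its version and confirm which ollama release your install tracks:
+
+```bash
+# Should report the bundled ollama version, e.g. 0.4.6
+./ollama -v
+```
+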
See the demo of running LLaMA2-7B on Intel Arc GPU below.
@@ -16,11 +24,6 @@ See the demo of running LLaMA2-7B on Intel Arc GPU below.
-> [!NOTE]
-> `ipex-llm[cpp]==2.2.0b20241204` is consistent with [v0.3.6](https://github.com/ollama/ollama/releases/tag/v0.3.6) of ollama.
->
-> Our current version is consistent with [v0.4.6](https://github.com/ollama/ollama/releases/tag/v0.4.6) of ollama.
-
> [!NOTE]
> Starting from `ipex-llm[cpp]==2.2.0b20240912`, the oneAPI dependency of `ipex-llm[cpp]` on Windows will switch from `2024.0.0` to `2024.2.1`.
>