diff --git a/docs/mddocs/Quickstart/README.md b/docs/mddocs/Quickstart/README.md
index 2f76c59b..49a839c2 100644
--- a/docs/mddocs/Quickstart/README.md
+++ b/docs/mddocs/Quickstart/README.md
@@ -23,7 +23,8 @@ This section includes efficient guide to show you how to:
 - [Run llama.cpp with IPEX-LLM on Intel GPU](./llama_cpp_quickstart.md)
 - [Run Ollama with IPEX-LLM on Intel GPU](./ollama_quickstart.md)
 - [Run Llama 3 on Intel GPU using llama.cpp and ollama with IPEX-LLM](./llama3_llamacpp_ollama_quickstart.md)
-- [Run RAGFlow with IPEX_LLM on Intel GPU](./ragflow_quickstart.md)
+- [Run RAGFlow with IPEX-LLM on Intel GPU](./ragflow_quickstart.md)
+- [Run GraphRAG with IPEX-LLM on Intel GPU](./graphrag_quickstart.md)
 
 ## Serving

diff --git a/docs/mddocs/Quickstart/graphrag_quickstart.md b/docs/mddocs/Quickstart/graphrag_quickstart.md
new file mode 100644
index 00000000..c5517847
--- /dev/null
+++ b/docs/mddocs/Quickstart/graphrag_quickstart.md
@@ -0,0 +1,204 @@

# Run GraphRAG with IPEX-LLM on Intel GPU

The [GraphRAG project](https://github.com/microsoft/graphrag) is designed to leverage large language models (LLMs) for extracting structured and meaningful data from unstructured texts; by integrating it with [`ipex-llm`](https://github.com/intel-analytics/ipex-llm), users can now easily utilize local LLMs running on Intel GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max).

## Table of Contents

- [Install and Start `Ollama` Service on Intel GPU](#1-install-and-start-ollama-service-on-intel-gpu)
- [Prepare LLM and Embedding Model](#2-prepare-llm-and-embedding-model)
- [Setup Python Environment for GraphRAG](#3-setup-python-environment-for-graphrag)
- [Index GraphRAG](#4-index-graphrag)
- [Query GraphRAG](#5-query-graphrag)

## Quickstart

### 1. Install and Start `Ollama` Service on Intel GPU

Follow the steps in the [Run Ollama with IPEX-LLM on Intel GPU Guide](./ollama_quickstart.md) to install and run Ollama on Intel GPU. Ensure that `ollama serve` is running correctly and can be accessed through a local URL (e.g., `http://127.0.0.1:11434`).

> [!TIP]
> If your local LLM is running on Intel Arc™ A-Series Graphics with Linux OS (Kernel 6.2), it is recommended to additionally set the following environment variable for optimal performance before executing `ollama serve`:
>
> ```bash
> export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
> ```

### 2. Prepare LLM and Embedding Model

In another terminal window, separate from the one where you executed `ollama serve`, download the LLM and embedding model using the following commands:

- For **Linux users**:

  ```bash
  export no_proxy=localhost,127.0.0.1
  # LLM
  ./ollama pull mistral
  # Embedding model
  ./ollama pull nomic-embed-text
  ```

- For **Windows users**:

  Please run the following command in Miniforge or Anaconda Prompt.

  ```cmd
  set no_proxy=localhost,127.0.0.1
  :: LLM
  ollama pull mistral
  :: Embedding model
  ollama pull nomic-embed-text
  ```

> [!TIP]
> Here we take [`mistral`](https://ollama.com/library/mistral) and [`nomic-embed-text`](https://ollama.com/library/nomic-embed-text) as examples. You could also try other LLMs or embedding models from [`ollama.com`](https://ollama.com/search?p=1).
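Before moving on, you can optionally confirm that both models are visible to the running Ollama service. Below is a minimal sketch of such a check (an assumption for illustration, not part of this guide's required steps), querying Ollama's `/api/tags` endpoint at the default local URL:

```python
# Minimal sketch: confirm the Ollama service is reachable and both models are pulled.
# Assumes the service listens on the default http://localhost:11434; adjust if yours differs.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = [m["name"] for m in json.load(resp).get("models", [])]

print("Available models:", models)

for required in ("mistral", "nomic-embed-text"):
    if not any(name.startswith(required) for name in models):
        print(f"'{required}' not found -- run `ollama pull {required}` first")
```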
### 3. Setup Python Environment for GraphRAG

To run the LLM and embedding model on a local machine, we utilize the [`graphrag-local-ollama`](https://github.com/TheAiSingularity/graphrag-local-ollama) repository:

```shell
git clone https://github.com/TheAiSingularity/graphrag-local-ollama.git
cd graphrag-local-ollama

conda create -n graphrag-local-ollama python=3.10
conda activate graphrag-local-ollama

pip install -e .

pip install ollama
pip install plotly
```

in which `pip install ollama` enables calling RESTful APIs through Python, and `pip install plotly` is used for visualizing the knowledge graph.

### 4. Index GraphRAG

The environment is now ready for GraphRAG with local LLMs and embedding models running on Intel GPUs. Before querying GraphRAG, it is necessary to index it first, which can be a resource-intensive operation.

> [!TIP]
> Refer to [here](https://microsoft.github.io/graphrag/) for more details on the GraphRAG indexing process.

#### Prepare Input Corpus

Some [sample documents](https://github.com/TheAiSingularity/graphrag-local-ollama/tree/main/input) are used here as the input corpus for indexing GraphRAG, based on which the LLM will create a knowledge graph.

Prepare the input corpus, and then initialize the workspace:

- For **Linux users**:

  ```bash
  # define input corpus
  mkdir -p ./ragtest/input
  cp input/* ./ragtest/input

  export no_proxy=localhost,127.0.0.1

  # initialize ragtest folder
  python -m graphrag.index --init --root ./ragtest

  # prepare settings.yaml, please make sure the initialized settings.yaml in the ragtest folder is replaced by the settings.yaml in the graphrag-local-ollama folder
  cp settings.yaml ./ragtest
  ```

- For **Windows users**:

  Please run the following command in Miniforge or Anaconda Prompt.

  ```cmd
  :: define input corpus
  mkdir ragtest && cd ragtest && mkdir input && cd ..
  xcopy input\* .\ragtest\input

  set no_proxy=localhost,127.0.0.1

  :: initialize ragtest folder
  python -m graphrag.index --init --root .\ragtest

  :: prepare settings.yaml, please make sure the initialized settings.yaml in the ragtest folder is replaced by the settings.yaml in the graphrag-local-ollama folder
  copy settings.yaml .\ragtest /y
  ```

#### Update `settings.yaml`

In the `settings.yaml` file inside the `ragtest` folder, add the configuration `request_timeout: 1800.0` for `llm`. In addition, if you would like to use LLMs or embedding models other than `mistral` or `nomic-embed-text`, you need to update the `settings.yaml` in the `ragtest` folder accordingly:

```yml
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: mistral # change it accordingly if using another LLM
  model_supports_json: true
  request_timeout: 1800.0 # add this configuration; you could also increase the request_timeout
  api_base: http://localhost:11434/v1

embeddings:
  async_mode: threaded
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: nomic_embed_text # change it accordingly if using another embedding model
    api_base: http://localhost:11434/api
```

Finally, conduct GraphRAG indexing, which may take a while:

```shell
python -m graphrag.index --root ragtest
```

You will get the message `🚀 All workflows completed successfully.` once the GraphRAG indexing has finished successfully.
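If you want to confirm that indexing actually produced a knowledge graph, a small sketch like the following (assuming the default `ragtest` workspace layout used above, where each run writes to a timestamped folder under `ragtest/output`) locates the generated `.graphml` artifact, whose path is also needed for the optional visualization step below:

```python
# Minimal sketch: locate the knowledge-graph artifact produced by GraphRAG indexing.
# Assumes the default workspace layout used above: ragtest/output/<run-id>/artifacts/.
from pathlib import Path

candidates = sorted(Path("ragtest/output").glob("*/artifacts/summarized_graph.graphml"))
if candidates:
    # Folder names are timestamps, so the last entry is the most recent run.
    print("Latest knowledge graph:", candidates[-1])
else:
    print("No .graphml artifact found -- check the indexing logs for errors.")
```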
#### (Optional) Visualize Knowledge Graph

For a clearer view of the knowledge graph, you can visualize it by specifying the path to the generated `.graphml` file in the `visualize-graphml.py` script (the timestamped folder name will differ for your run), like below:

- For **Linux users**:

  ```python
  graph = nx.read_graphml('ragtest/output/20240715-151518/artifacts/summarized_graph.graphml')
  ```

- For **Windows users**:

  ```python
  graph = nx.read_graphml('ragtest\\output\\20240715-151518\\artifacts\\summarized_graph.graphml')
  ```

and then run the following command to interactively visualize the knowledge graph:

```shell
python visualize-graphml.py
```
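In case you would like to adapt the visualization yourself, below is a rough, self-contained sketch of the same idea; it is not the repository's actual `visualize-graphml.py`, just an illustrative outline using `networkx` and `plotly` (the path and styling are assumptions):

```python
# Rough sketch of visualizing a GraphRAG .graphml file with networkx + plotly.
# This is NOT the repository's visualize-graphml.py; path and styling are illustrative.
import networkx as nx
import plotly.graph_objects as go

graph = nx.read_graphml("ragtest/output/20240715-151518/artifacts/summarized_graph.graphml")
pos = nx.spring_layout(graph, seed=42)  # compute a 2D layout for plotting

# Build edge coordinates (None separates individual line segments).
edge_x, edge_y = [], []
for src, dst in graph.edges():
    edge_x += [pos[src][0], pos[dst][0], None]
    edge_y += [pos[src][1], pos[dst][1], None]

node_x = [pos[n][0] for n in graph.nodes()]
node_y = [pos[n][1] for n in graph.nodes()]

fig = go.Figure([
    go.Scatter(x=edge_x, y=edge_y, mode="lines", line=dict(width=0.5), hoverinfo="none"),
    go.Scatter(x=node_x, y=node_y, mode="markers", text=list(graph.nodes()), hoverinfo="text"),
])
fig.show()  # opens an interactive view in the browser
```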