Add privateGPT quickstart (#10932)
* Add privateGPT quickstart
* Update privateGPT_quickstart.md
* Update _toc.yml
* Update _toc.yml

---------

Co-authored-by: Shengsheng Huang <shengsheng.huang@intel.com>
parent f4c615b1ee
commit 37820e1d86
4 changed files with 96 additions and 0 deletions

@@ -61,6 +61,9 @@
<li>
  <a href="doc/LLM/Quickstart/axolotl_quickstart.html">Finetune LLM with Axolotl on Intel GPU</a>
</li>
<li>
  <a href="doc/LLM/Quickstart/privateGPT_quickstart.html">Run PrivateGPT with IPEX-LLM on Intel GPU</a>
</li>
<li>
  <a href="doc/LLM/Quickstart/deepspeed_autotp_fastapi_quickstart.html">Run IPEX-LLM serving on Multiple Intel GPUs
    using DeepSpeed AutoTP and FastApi</a>

@@ -26,6 +26,7 @@ subtrees:
- file: doc/LLM/Quickstart/chatchat_quickstart
- file: doc/LLM/Quickstart/webui_quickstart
- file: doc/LLM/Quickstart/open_webui_with_ollama_quickstart
- file: doc/LLM/Quickstart/privateGPT_quickstart
- file: doc/LLM/Quickstart/continue_quickstart
- file: doc/LLM/Quickstart/benchmark_quickstart
- file: doc/LLM/Quickstart/llama_cpp_quickstart

@@ -16,6 +16,7 @@ This section includes efficient guide to show you how to:
* `Run Local RAG using Langchain-Chatchat on Intel GPU <./chatchat_quickstart.html>`_
* `Run Text Generation WebUI on Intel GPU <./webui_quickstart.html>`_
* `Run Open WebUI on Intel GPU <./open_webui_with_ollama_quickstart.html>`_
* `Run PrivateGPT with IPEX-LLM on Intel GPU <./privateGPT_quickstart.html>`_
* `Run Coding Copilot (Continue) in VSCode with Intel GPU <./continue_quickstart.html>`_
* `Run Dify on Intel GPU <./dify_quickstart.html>`_
* `Run llama.cpp with IPEX-LLM on Intel GPU <./llama_cpp_quickstart.html>`_

@@ -25,5 +26,6 @@ This section includes efficient guide to show you how to:
* `Finetune LLM with Axolotl on Intel GPU <./axolotl_quickstart.html>`_
* `Run IPEX-LLM serving on Multiple Intel GPUs using DeepSpeed AutoTP and FastApi <./deepspeed_autotp_fastapi_quickstart.html>`_

.. |bigdl_llm_migration_guide| replace:: ``bigdl-llm`` Migration Guide
.. _bigdl_llm_migration_guide: bigdl_llm_migration.html

@@ -0,0 +1,90 @@

# Run PrivateGPT with IPEX-LLM on Intel GPU

[zylon-ai/private-gpt](https://github.com/zylon-ai/private-gpt) is a production-ready AI project that lets you ask questions about your documents using the power of Large Language Models (LLMs), even without an Internet connection. You can easily run PrivateGPT using `Ollama` with IPEX-LLM on an Intel **GPU** *(e.g., a local PC with an iGPU, or a discrete GPU such as Arc, Flex or Max)*.

See the demo of running Mistral-7B on an Intel iGPU below:

<video src="https://github.com/bigdl-project/bigdl-project.github.io/raw/2ca10e374da99310f7b8f5b01bbd69242eab4d3a/assets/quickstart/privateGPT_quickstart/privateGPT-windows-MTL.mp4" width="100%" controls></video>

## Quickstart

### 1. Run Ollama with Intel GPU

Follow the instructions in the [Run Ollama with IPEX-LLM on Intel GPU](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/ollama_quickstart.html) guide to install and run Ollama Serve. Please ensure that the Ollama server keeps running while you're using PrivateGPT.
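
As a quick reference only, the Linux flow from that guide looks roughly like the sketch below; the environment name and the exact setup steps are illustrative assumptions, so follow the linked guide for the authoritative instructions:

```bash
# Activate the conda environment where ipex-llm[cpp] is installed
# (the environment name here is an example)
conda activate llm-cpp

# Create the Ollama binary links provided by IPEX-LLM
init-ollama

# Keep requests to the local server off any proxy
export no_proxy=localhost,127.0.0.1

# Start the Ollama service; keep this terminal open while PrivateGPT runs
./ollama serve
```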

### 2. Install PrivateGPT

#### Download PrivateGPT

Use `git` to clone the [zylon-ai/private-gpt](https://github.com/zylon-ai/private-gpt) repository.
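
For example, from the directory where you want the project to live:

```cmd
git clone https://github.com/zylon-ai/private-gpt.git
cd private-gpt
```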

#### Install Dependencies

Run the following commands to install PrivateGPT's dependencies:

```cmd
pip install poetry
pip install ffmpy==0.3.1
poetry install --extras "ui llms-ollama embeddings-ollama vector-stores-qdrant"
```
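
The `--extras` list selects the optional components this guide relies on: the web UI, Ollama as both the LLM and embedding backend, and Qdrant as the vector store.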

### 3. Start PrivateGPT

#### Configure PrivateGPT

To configure PrivateGPT, modify `settings.yaml` and `settings-ollama.yaml`:

* `settings.yaml` is always loaded and contains the default configuration. To run PrivateGPT locally, you need to replace the tokenizer path under the `llm` option with your local path.
* `settings-ollama.yaml` is loaded when the `ollama` profile is specified in the `PGPT_PROFILES` environment variable, and can override settings from the default `settings.yaml`. You can adjust it to your preference; note that to use the options `llm_model: <Model Name>` and `embedding_model: <Embedding Model Name>`, you first need to fetch those models locally with `ollama pull`, as sketched after this list.
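
As an illustration only (the model names below are examples, not requirements, and the keys follow the upstream `settings-ollama.yaml` layout), the Ollama-related settings might look like:

```yaml
llm:
  mode: ollama

embedding:
  mode: ollama

ollama:
  llm_model: mistral                 # any model you have fetched with `ollama pull`
  embedding_model: nomic-embed-text  # likewise, pull this before starting PrivateGPT
  api_base: http://localhost:11434   # where your Ollama server is listening
```

With the settings above, you would first run `ollama pull mistral` and `ollama pull nomic-embed-text`.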

To learn more about PrivateGPT's configuration, please refer to [PrivateGPT Main Concepts](https://docs.privategpt.dev/installation/getting-started/main-concepts).

#### Start the Service

Please ensure that the Ollama server keeps running in a terminal while you're using PrivateGPT.

Run the following commands in another terminal to start the service:

```eval_rst
.. tabs::
   .. tab:: Linux

      .. code-block:: bash

         export no_proxy=localhost,127.0.0.1
         PGPT_PROFILES=ollama make run

      .. note::

         Setting ``PGPT_PROFILES=ollama`` will load the configuration from ``settings.yaml`` and ``settings-ollama.yaml``.

   .. tab:: Windows

      .. code-block:: bash

         set no_proxy=localhost,127.0.0.1
         set PGPT_PROFILES=ollama
         make run

      .. note::

         Setting ``PGPT_PROFILES=ollama`` will load the configuration from ``settings.yaml`` and ``settings-ollama.yaml``.
```
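
Once the service is up, open the UI in your browser; assuming the upstream default of `server.port: 8001` in `settings.yaml`, it is served at `http://localhost:8001` (check `settings.yaml` if you have changed the port).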

### 4. Using PrivateGPT

#### Chat with the Model

To chat with the LLM, select the "LLM Chat" option in the upper left corner of the page.

<a href="https://raw.githubusercontent.com/bigdl-project/bigdl-project.github.io/2ca10e374da99310f7b8f5b01bbd69242eab4d3a/assets/quickstart/privateGPT_quickstart/privateGPT-LLM-Chat.png" target="_blank">
  <img src="https://raw.githubusercontent.com/bigdl-project/bigdl-project.github.io/2ca10e374da99310f7b8f5b01bbd69242eab4d3a/assets/quickstart/privateGPT_quickstart/privateGPT-LLM-Chat.png" width="100%" />
</a>

#### Using RAG

Select the "Query Files" option in the upper left corner of the page, then click the "Upload File(s)" button to upload documents. Once document vectorization is complete, you can proceed with document-based Q&A.

<a href="https://raw.githubusercontent.com/bigdl-project/bigdl-project.github.io/2ca10e374da99310f7b8f5b01bbd69242eab4d3a/assets/quickstart/privateGPT_quickstart/privateGPT-Query-Files.png" target="_blank">
  <img src="https://raw.githubusercontent.com/bigdl-project/bigdl-project.github.io/2ca10e374da99310f7b8f5b01bbd69242eab4d3a/assets/quickstart/privateGPT_quickstart/privateGPT-Query-Files.png" width="100%" />
</a>