# LlamaIndex Examples

This folder contains examples showcasing how to use LlamaIndex with `ipex-llm`.

LlamaIndex is a data framework designed to improve large language models by providing tools for easier data ingestion, management, and application integration.

## Prerequisites

Ensure `ipex-llm` is installed by following the IPEX-LLM Installation Guide before proceeding with the examples provided here.

## Retrieval-Augmented Generation (RAG) Example

The RAG example (`rag.py`) is adapted from the official LlamaIndex RAG example. It builds a pipeline that ingests data (e.g. the Llama 2 paper in PDF format) into a vector database (e.g. PostgreSQL), and then builds a retrieval pipeline on top of that vector database.
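For orientation, the core of such a pipeline looks roughly like the sketch below, written against public LlamaIndex APIs once the dependencies in the next section are installed. The database name, table name, and embedding model here are illustrative placeholders, not `rag.py`'s actual defaults; `rag.py` additionally wires an `ipex-llm` model into the query step to synthesize answers.

```python
# Minimal sketch of the ingest-and-retrieve flow described above.
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

# Embedding model used for both ingestion and query-time retrieval
# (placeholder choice; -e selects the real one in rag.py).
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Load the source PDF(s) from the data/ directory.
documents = SimpleDirectoryReader("data").load_data()

# Persist embeddings in a pgvector-backed PostgreSQL table.
vector_store = PGVectorStore.from_params(
    database="llama_demo",      # placeholder database name
    host="localhost",
    port="5432",
    user="<user>",
    password="<password>",
    table_name="llama2_paper",  # placeholder table name
    embed_dim=384,              # must match the embedding model's output size
)
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=StorageContext.from_defaults(vector_store=vector_store),
)

# Retrieval alone needs no LLM; rag.py builds a query engine on top of
# this retriever with an ipex-llm model to generate the final answer.
retriever = index.as_retriever(similarity_top_k=3)
for node in retriever.retrieve("How does Llama 2 compare to other open-source models?"):
    print(node.get_content()[:200])
```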

### Setting up Dependencies

- **Install LlamaIndex Packages**

  ```bash
  pip install llama-index-readers-file llama-index-vector-stores-postgres llama-index-embeddings-huggingface
  ```

- **Database Setup (using PostgreSQL)**:

  - Installation:

    ```bash
    sudo apt-get install postgresql-client
    sudo apt-get install postgresql
    ```

  - Initialization:

    Switch to the `postgres` user and launch the `psql` console:

    ```bash
    sudo su - postgres
    psql
    ```

    Then, create a new user role:

    ```sql
    CREATE ROLE <user> WITH LOGIN PASSWORD '<password>';
    ALTER ROLE <user> SUPERUSER;
    ```

- **Pgvector Installation**: Follow the installation instructions on [pgvector's GitHub](https://github.com/pgvector/pgvector) and refer to the installation notes for additional help. A quick connection check appears after this list.

- **Data Preparation**: Download the Llama 2 paper and save it as `data/llama2.pdf`, which serves as the default source file for retrieval:

  ```bash
  mkdir data
  wget --user-agent "Mozilla" "https://arxiv.org/pdf/2307.09288.pdf" -O "data/llama2.pdf"
  ```
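With the database and pgvector set up, you can optionally verify the connection before moving on. The snippet below is an illustrative check, not part of the example; it assumes `psycopg2` is available (e.g. via `pip install psycopg2-binary`) and uses the placeholder credentials created above.

```python
# Sanity-check the PostgreSQL + pgvector setup.
import psycopg2

# Connect with the role created during initialization (placeholder credentials).
conn = psycopg2.connect(
    host="localhost", port=5432, user="<user>", password="<password>", dbname="postgres"
)
conn.autocommit = True
with conn.cursor() as cur:
    # Needs the SUPERUSER role granted above; fails if pgvector is not installed.
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector';")
    print("pgvector version:", cur.fetchone()[0])
conn.close()
```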

### Running the RAG Example

In the current directory, run the example with the following command:

```bash
python rag.py -m <path_to_model>
```

Additional Parameters for Configuration:

- `-m MODEL_PATH`: **required**, path to the LLM model
- `-e EMBEDDING_MODEL_PATH`: path to the embedding model
- `-u USERNAME`: username for the PostgreSQL database
- `-p PASSWORD`: password for the PostgreSQL database
- `-q QUESTION`: the question you want to ask
- `-d DATA`: path to the source data used for retrieval (in PDF format)
- `-n N_PREDICT`: maximum number of tokens to predict
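For reference, options like these map onto a standard `argparse` setup along the lines of the sketch below. The long-option names and defaults are illustrative, not necessarily the ones `rag.py` actually uses.

```python
import argparse

# Sketch of how the options listed above could be parsed.
parser = argparse.ArgumentParser(description="LlamaIndex RAG example")
parser.add_argument("-m", "--model-path", required=True,
                    help="path to the LLM model")
parser.add_argument("-e", "--embedding-model-path",
                    help="path to the embedding model")
parser.add_argument("-u", "--username", help="PostgreSQL username")
parser.add_argument("-p", "--password", help="PostgreSQL password")
parser.add_argument("-q", "--question",
                    default="How does Llama 2 compare to other open-source models?",
                    help="question to ask")
parser.add_argument("-d", "--data", default="data/llama2.pdf",
                    help="path to source data used for retrieval (PDF)")
parser.add_argument("-n", "--n-predict", type=int, default=32,
                    help="maximum number of tokens to predict")
args = parser.parse_args()
```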

### Example Output

A query such as "How does Llama 2 compare to other open-source models?", with the Llama 2 paper as the data source and the Llama-2-7b-chat-hf model, will produce output like the following:

```
Llama 2 performs better than most open-source models on the benchmarks we tested. Specifically, it outperforms all open-source models on MMLU and BBH, and is close to GPT-3.5 on these benchmarks. Additionally, Llama 2 is on par or better than PaLM-2-L on almost all benchmarks. The only exception is the coding benchmarks, where Llama 2 lags significantly behind GPT-4 and PaLM-2-L. Overall, Llama 2 demonstrates strong performance on a wide range of natural language processing tasks.
```