
# LangChain API

You may run the models using the LangChain API in `ipex-llm`.

## Using Hugging Face `transformers` INT4 Format

You may run any Hugging Face *Transformers* model (with INT4 optimizations applied) using the LangChain API as follows:

```python
from ipex_llm.langchain.llms import TransformersLLM
from ipex_llm.langchain.embeddings import TransformersEmbeddings
from langchain.chains.question_answering import load_qa_chain

# create embeddings and the LLM from the model's local path (or Hugging Face model id)
embeddings = TransformersEmbeddings.from_model_id(model_id=model_path)
ipex_llm = TransformersLLM.from_model_id(model_id=model_path, ...)

# build a question-answering chain on top of the INT4-optimized model
doc_chain = load_qa_chain(ipex_llm, ...)
output = doc_chain.run(...)
```
```eval_rst
.. seealso::

   See the examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/transformers_int4>`_.
```
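
For reference, below is a minimal runnable sketch of the same flow with the `...` placeholders filled in. The model path, document text, and question are hypothetical, and the `model_kwargs` shown are only one reasonable configuration:

```python
from ipex_llm.langchain.llms import TransformersLLM
from langchain.chains.question_answering import load_qa_chain
from langchain.docstore.document import Document

# hypothetical model path: any local path or Hugging Face model id works
model_path = "meta-llama/Llama-2-7b-chat-hf"

# load the model with INT4 optimizations applied
llm = TransformersLLM.from_model_id(
    model_id=model_path,
    model_kwargs={"temperature": 0, "max_length": 256, "trust_remote_code": True},
)

# wrap raw text as LangChain documents and ask a question over them
docs = [Document(page_content="IPEX-LLM is a library for accelerating "
                              "LLM inference on Intel hardware.")]
doc_chain = load_qa_chain(llm, chain_type="stuff")
output = doc_chain.run(input_documents=docs, question="What is IPEX-LLM?")
print(output)
```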

## Using Native INT4 Format

You may also convert Hugging Face Transformers models into native INT4 format, and then run the converted models using the LangChain API as follows.

```eval_rst
.. note::

   * Currently only llama/bloom/gptneox/starcoder model families are supported; for other models, you may use the Hugging Face ``transformers`` INT4 format as described `above <./langchain_api.html#using-hugging-face-transformers-int4-format>`_.

   * You may choose the corresponding API developed for specific native models to load the converted model.
```
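
The conversion step itself is not shown in this guide; below is a minimal sketch, assuming the `llm_convert` utility exported by `ipex_llm` (the input checkpoint and output paths are hypothetical placeholders):

```python
from ipex_llm import llm_convert

# convert a Hugging Face checkpoint into a native INT4 binary;
# model_family should match the model (llama/bloom/gptneox/starcoder)
converted_model_path = llm_convert(
    model="/path/to/hf/model/",   # hypothetical input checkpoint directory
    outfile="/path/to/output/",   # hypothetical output directory
    outtype="int4",
    model_family="llama",
)
```

With the converted binary in place, load it through the model-specific LangChain wrappers: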
```python
from ipex_llm.langchain.llms import LlamaLLM
from ipex_llm.langchain.embeddings import LlamaEmbeddings
from langchain.chains.question_answering import load_qa_chain

# switch to GptneoxEmbeddings/BloomEmbeddings/StarcoderEmbeddings to load other models
embeddings = LlamaEmbeddings(model_path='/path/to/converted/model.bin')
# switch to GptneoxLLM/BloomLLM/StarcoderLLM to load other models
ipex_llm = LlamaLLM(model_path='/path/to/converted/model.bin')

doc_chain = load_qa_chain(ipex_llm, ...)
doc_chain.run(...)
```
```eval_rst
.. seealso::

   See the examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/native_int4>`_.
```
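
Since the embeddings classes follow LangChain's embeddings interface, a converted model can also back a vector store. Below is a brief sketch using FAISS, assuming `LlamaEmbeddings` exposes the standard `embed_documents`/`embed_query` methods; the model path, texts, and query are hypothetical, and `faiss` must be installed separately:

```python
from ipex_llm.langchain.embeddings import LlamaEmbeddings
from langchain.vectorstores import FAISS

embeddings = LlamaEmbeddings(model_path='/path/to/converted/model.bin')

# index a few hypothetical documents and retrieve the closest match
db = FAISS.from_texts(
    ["IPEX-LLM accelerates LLM inference on Intel CPUs and GPUs.",
     "LangChain provides building blocks for LLM applications."],
    embeddings,
)
print(db.similarity_search("What does IPEX-LLM do?", k=1))
```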