* Change order of LLM in header * Some updates to footer * Add BigDL-LLM index page and basic file structure * Update index page for key features * Add initial content for BigDL-LLM in 5 mins * Improvement to footnote * Add initial contents based on current contents we have * Add initial quick links * Small fix * Rename file * Hide cli section for now and change model supports to examples * Hugging Face format -> Hugging Face transformers format * Add placeholder for GPU supports * Add GPU related content structure * Add cpu/gpu installation initial contents * Add initial contents for GPU supports * Add image link to LLM index page * Hide tips and known issues for now * Small fix * Update based on comments * Small fix * Add notes for Python 3.9 * Add placehoder optimize model & reveal CLI; small revision * examples add gpu part * Hide CLI part again for first version of merging * add keyfeatures-optimize_model part (#1) * change gif link to the ones hosted on github * Small fix --------- Co-authored-by: plusbang <binbin1.deng@intel.com> Co-authored-by: binbin Deng <108676127+plusbang@users.noreply.github.com>
		
			
				
	
	
	
	
		
			2.1 KiB
		
	
	
	
	
	
	
	
			
		
		
	
	
			2.1 KiB
		
	
	
	
	
	
	
	
LangChain API
You may run the models using the LangChain API in bigdl-llm.
Using Hugging Face transformers INT4 Format
You may run any Hugging Face Transformers model (with INT4 optimiztions applied) using the LangChain API as follows:
from bigdl.llm.langchain.llms import TransformersLLM
from bigdl.llm.langchain.embeddings import TransformersEmbeddings
from langchain.chains.question_answering import load_qa_chain
embeddings = TransformersEmbeddings.from_model_id(model_id=model_path)
bigdl_llm = TransformersLLM.from_model_id(model_id=model_path, ...)
doc_chain = load_qa_chain(bigdl_llm, ...)
output = doc_chain.run(...)
.. seealso::
   See the examples `here <https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/transformers_int4>`_.
Using Native INT4 Format
You may also convert Hugging Face Transformers models into native INT4 format, and then run the converted models using the LangChain API as follows.
.. note::
   * Currently only llama/bloom/gptneox/starcoder/chatglm model families are supported; for other models, you may use the Hugging Face ``transformers`` INT4 format as described `above <./langchain_api.html#using-hugging-face-transformers-int4-format>`_.
   * You may choose the corresponding API developed for specific native models to load the converted model.
from bigdl.llm.langchain.llms import LlamaLLM
from bigdl.llm.langchain.embeddings import LlamaEmbeddings
from langchain.chains.question_answering import load_qa_chain
# switch to ChatGLMEmbeddings/GptneoxEmbeddings/BloomEmbeddings/StarcoderEmbeddings to load other models
embeddings = LlamaEmbeddings(model_path='/path/to/converted/model.bin')
# switch to ChatGLMLLM/GptneoxLLM/BloomLLM/StarcoderLLM to load other models
bigdl_llm = LlamaLLM(model_path='/path/to/converted/model.bin')
doc_chain = load_qa_chain(bigdl_llm, ...)
doc_chain.run(...)
.. seealso::
   See the examples `here <https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/langchain/native_int4>`_.