IPEX-LLM Documentation
Table of Contents
- LLM in 5 minutes
- Installation
- Docker Guides
- Overview of IPEX-LLM Containers for Intel GPU
- Python Inference using IPEX-LLM on Intel GPU
- Run/Develop PyTorch in VSCode with Docker on Intel GPU
- Run llama.cpp/Ollama/Open-WebUI on an Intel GPU via Docker
- FastChat Serving with IPEX-LLM on Intel GPUs via Docker
- vLLM Serving with IPEX-LLM on Intel GPUs via Docker
- vLLM Serving with IPEX-LLM on Intel CPU via Docker
- Quickstart
- bigdl-llm Migration Guide
- Install IPEX-LLM on Linux with Intel GPU
- Install IPEX-LLM on Windows with Intel GPU
- Run Local RAG using Langchain-Chatchat on Intel CPU and GPU
- Run Text Generation WebUI on Intel GPU
- Run Open WebUI with Intel GPU
- Run PrivateGPT with IPEX-LLM on Intel GPU
- Run Coding Copilot in VSCode with Intel GPU
- Run Dify on Intel GPU
- Run Performance Benchmarking with IPEX-LLM
- Run llama.cpp with IPEX-LLM on Intel GPU
- Run Ollama with IPEX-LLM on Intel GPU
- Run Llama 3 on Intel GPU using llama.cpp and Ollama with IPEX-LLM
- Serving using IPEX-LLM and FastChat
- Serving using IPEX-LLM and vLLM on Intel GPU
- Finetune LLM with Axolotl on Intel GPU
- Run IPEX-LLM serving on Multiple Intel GPUs using DeepSpeed AutoTP and FastAPI
- Run RAGFlow with IPEX-LLM on Intel GPU
- Key Features
- Examples
- API Reference
- FAQ