# IPEX-LLM Documentation

## Table of Contents

- [LLM in 5 minutes](./Overview/llm.md)
- [Installation](./Overview/install.md)
  - [CPU](./Overview/install_cpu.md)
  - [GPU](./Overview/install_gpu.md)
- [Docker Guides](./DockerGuides/)
  - [Overview of IPEX-LLM Containers for Intel GPU](./DockerGuides/docker_windows_gpu.md)
  - [Python Inference using IPEX-LLM on Intel GPU](./DockerGuides/docker_pytorch_inference_gpu.md)
  - [Run/Develop PyTorch in VSCode with Docker on Intel GPU](./DockerGuides/docker_run_pytorch_inference_in_vscode.md)
  - [Run llama.cpp/Ollama/Open-WebUI on an Intel GPU via Docker](./DockerGuides/docker_cpp_xpu_quickstart.md)
  - [FastChat Serving with IPEX-LLM on Intel GPUs via Docker](./DockerGuides/fastchat_docker_quickstart.md)
  - [vLLM Serving with IPEX-LLM on Intel GPUs via Docker](./DockerGuides/vllm_docker_quickstart.md)
  - [vLLM Serving with IPEX-LLM on Intel CPU via Docker](./DockerGuides/vllm_cpu_docker_quickstart.md)
- [Quickstart](https://github.com/intel-analytics/ipex-llm/tree/main/docs/mddocs/Quickstart/)
  - [`bigdl-llm` Migration Guide](./Quickstart/bigdl_llm_migration.md)
  - [Install IPEX-LLM on Linux with Intel GPU](./Quickstart/install_linux_gpu.md)
  - [Install IPEX-LLM on Windows with Intel GPU](./Quickstart/install_windows_gpu.md)
  - [Run Local RAG using Langchain-Chatchat on Intel CPU and GPU](./Quickstart/chatchat_quickstart.md)
  - [Run Text Generation WebUI on Intel GPU](./Quickstart/webui_quickstart.md)
  - [Run Open WebUI with Intel GPU](./Quickstart/open_webui_with_ollama_quickstart.md)
  - [Run PrivateGPT with IPEX-LLM on Intel GPU](./Quickstart/privateGPT_quickstart.md)
  - [Run Coding Copilot in VSCode with Intel GPU](./Quickstart/continue_quickstart.md)
  - [Run Dify on Intel GPU](./Quickstart/dify_quickstart.md)
  - [Run Performance Benchmarking with IPEX-LLM](./Quickstart/benchmark_quickstart.md)
  - [Run llama.cpp with IPEX-LLM on Intel GPU](./Quickstart/llama_cpp_quickstart.md)
  - [Run Ollama with IPEX-LLM on Intel GPU](./Quickstart/ollama_quickstart.md)
  - [Run Llama 3 on Intel GPU using llama.cpp and Ollama with IPEX-LLM](./Quickstart/llama3_llamacpp_ollama_quickstart.md)
  - [Serving using IPEX-LLM and FastChat](./Quickstart/fastchat_quickstart.md)
  - [Serving using IPEX-LLM and vLLM on Intel GPU](./Quickstart/vLLM_quickstart.md)
  - [Finetune LLM with Axolotl on Intel GPU](./Quickstart/axolotl_quickstart.md)
  - [Run IPEX-LLM serving on Multiple Intel GPUs using DeepSpeed AutoTP and FastAPI](./Quickstart/deepspeed_autotp_fastapi_quickstart.md)
  - [Run RAGFlow with IPEX-LLM on Intel GPU](./Quickstart/ragflow_quickstart.md)
- [Key Features](./Overview/KeyFeatures/)
  - [PyTorch API](./Overview/KeyFeatures/optimize_model.md)
  - [`transformers`-style API](./Overview/KeyFeatures/hugging_face_format.md)
  - [GPU Support](./Overview/KeyFeatures/gpu_supports.md)
    - [Inference on GPU](./Overview/KeyFeatures/inference_on_gpu.md)
    - [Finetune (QLoRA)](./Overview/KeyFeatures/finetune.md)
    - [Multi Intel GPUs selection](./Overview/KeyFeatures/multi_gpus_selection.md)
- [Examples](../../python/llm/example/)
  - [CPU](../../python/llm/example/CPU/)
  - [GPU](../../python/llm/example/GPU/)
- [API Reference](./PythonAPI/)
  - [IPEX-LLM PyTorch API](./PythonAPI/optimize.md)
  - [IPEX-LLM `transformers`-style API](./PythonAPI/transformers.md)
- [FAQ](./Overview/FAQ/faq.md)