ipex-llm/docs/readthedocs/source/doc/LLM/Overview/install_cpu.md
Yuwen Hu f0aaa130a9
Update miniconda/anaconda -> miniforge in documentation (#11176)
* Update miniconda/anaconda -> miniforge in installation guide

* Update for all Quickstart

* further fix for docs
2024-05-30 17:40:18 +08:00


IPEX-LLM Installation: CPU

Quick Installation

Install IPEX-LLM with CPU support using pip:

.. tabs::

   .. tab:: Linux

      .. code-block:: bash

         pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu

   .. tab:: Windows

      .. code-block:: cmd

         pip install --pre --upgrade ipex-llm[all]

Please refer to Environment Setup for more information.

.. note::

   The ``all`` option installs all the dependencies for common LLM application development.

.. important::

   ``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11; Python 3.11 is recommended.
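
The version guidance above can be checked before installing. The following is a minimal sketch, not part of IPEX-LLM; the helper name `check_python` and the three labels it returns are assumptions made for illustration.

```python
import sys

# Versions the guide lists as tested; 3.11 is the one it recommends.
TESTED_VERSIONS = {(3, 9), (3, 10), (3, 11)}
RECOMMENDED_VERSION = (3, 11)


def check_python(version_info=None):
    """Classify a Python version as 'recommended', 'tested', or 'untested'.

    Hypothetical helper: accepts any sequence whose first two items are
    (major, minor); defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = (version_info[0], version_info[1])
    if major_minor == RECOMMENDED_VERSION:
        return "recommended"
    if major_minor in TESTED_VERSIONS:
        return "tested"
    return "untested"


if __name__ == "__main__":
    major, minor = sys.version_info[0], sys.version_info[1]
    print(f"Python {major}.{minor}: {check_python()}")
```

An "untested" result does not necessarily mean the package will fail to install, only that the version falls outside the list above.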

The recommended hardware and operating systems for a smooth IPEX-LLM experience on CPU are listed below:

  • Hardware

    • PCs equipped with a 12th Gen Intel® Core™ processor or newer, and at least 16GB RAM
    • Servers equipped with Intel® Xeon® processors and at least 32GB RAM
  • Operating System

    • Ubuntu 20.04 or later
    • CentOS 7 or later
    • Windows 10/11, with or without WSL
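
The memory guideline above can be checked programmatically. This is a rough sketch using only the standard library; the helper names and the 10% tolerance are assumptions (the operating system usually reports slightly less than the nominal installed RAM). `os.sysconf` is not available on native Windows, so the query returns `None` there.

```python
import os

GIB = 1024 ** 3


def total_ram_gib():
    """Return physical RAM in GiB, or None if sysconf cannot report it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # AttributeError: no os.sysconf (e.g. native Windows).
        # ValueError/OSError: the name is unknown or unsupported here.
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size / GIB


def meets_ram_guideline(ram_gib, is_server=False, tolerance=0.9):
    """Compare RAM against the guide's 16GB (PC) / 32GB (server) figures.

    `tolerance` is an assumed allowance for memory reserved by firmware.
    """
    required = 32 if is_server else 16
    return ram_gib >= required * tolerance


if __name__ == "__main__":
    ram = total_ram_gib()
    if ram is None:
        print("Could not determine RAM on this platform")
    else:
        print(f"{ram:.1f} GiB, PC guideline met: {meets_ram_guideline(ram)}")
```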

Environment Setup

For optimal LLM performance with IPEX-LLM optimizations on Intel CPUs, we recommend the following best practices for setting up the environment.

First, we recommend using Conda to create a Python 3.11 environment:

.. tabs::

   .. tab:: Linux

      .. code-block:: bash

         conda create -n llm python=3.11
         conda activate llm

         pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu

   .. tab:: Windows

      .. code-block:: cmd

         conda create -n llm python=3.11
         conda activate llm

         pip install --pre --upgrade ipex-llm[all]
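
After the install finishes, one way to confirm that the package is visible in the active environment is to query its metadata. This is a small sketch using the standard library only; `installed_version` is a hypothetical helper, and the distribution name `ipex-llm` is taken from the pip command above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("ipex-llm")
    if version is None:
        print("ipex-llm is not installed in this environment")
    else:
        print(f"ipex-llm {version} is installed")
```

Run it inside the activated `llm` environment; a "not installed" message usually means pip ran against a different interpreter.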

Then, to run an LLM with IPEX-LLM optimizations (taking ``example.py`` as an example):

.. tabs::

   .. tab:: Client

      It is recommended to run directly with full utilization of all CPU cores:

      .. code-block:: bash

         python example.py

   .. tab:: Server

      It is recommended to run with all the physical cores of a single socket:

      .. code-block:: bash

         # e.g. for a server with 48 cores per socket
         export OMP_NUM_THREADS=48
         numactl -C 0-47 -m 0 python example.py
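
For servers with a different core count, the same launch line can be derived from the number of physical cores per socket. The sketch below is illustrative only: `single_socket_launch` is a hypothetical helper, and it assumes that the physical cores of socket 0 are numbered `0` to `N-1` and belong to NUMA node 0, as in the 48-core example above. Check the actual layout with `lscpu` or `numactl -H` before relying on it.

```python
import shlex


def single_socket_launch(cores_per_socket, script="example.py"):
    """Build the environment and numactl command for a run pinned to socket 0.

    Assumes core IDs 0..cores_per_socket-1 are the physical cores of
    socket 0 and that their memory is NUMA node 0.
    """
    if cores_per_socket < 1:
        raise ValueError("cores_per_socket must be at least 1")
    env = {"OMP_NUM_THREADS": str(cores_per_socket)}
    command = [
        "numactl",
        "-C", f"0-{cores_per_socket - 1}",  # pin to the socket's cores
        "-m", "0",                          # allocate memory on node 0
        "python", script,
    ]
    return env, command


if __name__ == "__main__":
    env, command = single_socket_launch(48)
    exports = " ".join(f"{key}={value}" for key, value in env.items())
    print(f"export {exports}")
    print(shlex.join(command))
```

With `cores_per_socket=48` this reproduces the two lines shown in the Server tab.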