Update Linux Quickstart (#10499)

* fix quick start
* update toc
* expose docker

parent 3a3756b51d
commit 92ee2077b3

5 changed files with 27 additions and 19 deletions

Quickstart sidebar navigation (HTML): the quicklinks list gains entries for the new Linux and Docker guides.

````diff
@@ -9,9 +9,15 @@
                     <i class="fa-solid fa-chevron-down"></i>
                 </label>
                 <ul class="nav bigdl-quicklinks-section-nav">
+                    <li>
+                        <a href="doc/LLM/Quickstart/install_linux_gpu.html">Install BigDL-LLM on Linux with Intel GPU</a>
+                    </li>
                     <li>
                         <a href="doc/LLM/Quickstart/install_windows_gpu.html">Install BigDL-LLM on Windows with Intel GPU</a>
                     </li>
+                    <li>
+                        <a href="doc/LLM/Quickstart/docker_windows_gpu.html">Install BigDL-LLM in Docker on Windows with Intel GPU</a>
+                    </li>
                     <li>
                         <a href="doc/LLM/Quickstart/webui_quickstart.html">Use Text Generation WebUI on Windows with Intel GPU</a>
                     </li>
````

Documentation table of contents (YAML): the Quickstart subtree picks up the same two pages.

````diff
@@ -38,7 +38,9 @@ subtrees:
             title: "Quickstart"
             subtrees:
               - entries:
+                - file: doc/LLM/Quickstart/install_linux_gpu
                 - file: doc/LLM/Quickstart/install_windows_gpu
+                - file: doc/LLM/Quickstart/docker_windows_gpu
                 - file: doc/LLM/Quickstart/webui_quickstart
                 - file: doc/LLM/Quickstart/benchmark_quickstart
                 - file: doc/LLM/Quickstart/llama_cpp_quickstart
````

Quickstart index (reStructuredText):

````diff
@@ -7,7 +7,9 @@ BigDL-LLM Quickstart
 
 This section includes efficient guide to show you how to:
 
+* `Install BigDL-LLM on Linux with Intel GPU <./install_linux_gpu.html>`_
 * `Install BigDL-LLM on Windows with Intel GPU <./install_windows_gpu.html>`_
+* `Install BigDL-LLM in Docker on Windows with Intel GPU <./docker_windows_gpu.html>`_
 * `Use Text Generation WebUI on Windows with Intel GPU <./webui_quickstart.html>`_
 * `Conduct Performance Benchmarking with BigDL-LLM <./benchmark_quickstart.html>`_
 * `Use llama.cpp with BigDL-LLM on Intel GPU <./llama_cpp_quickstart.html>`_
````

Linux GPU install guide (Markdown), the bulk of the change:

````diff
@@ -2,14 +2,12 @@
 
 This guide demonstrates how to install BigDL-LLM on Linux with Intel GPUs. It applies to Intel Data Center GPU Flex Series and Max Series, as well as Intel Arc Series GPU.
 
-BigDL-LLM currently supports the Ubuntu 20.04 operating system and later, and supports PyTorch 2.0 and PyTorch 2.1 on Linux. This example installs BigDL-LLM with PyTorch 2.1 using `pip`. For more details and other options like installing with wheel, please refer to the [Installation Webpage](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux).
+BigDL-LLM currently supports the Ubuntu 20.04 operating system and later, and supports PyTorch 2.0 and PyTorch 2.1 on Linux. This page demonstrates BigDL-LLM with PyTorch 2.1. Check the [Installation](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux) page for more details.
 
 
 ## Install Intel GPU Driver
 
-This guide demonstrates how to install driver on linux with **kernel version 6.2** on Intel GPU.
-We assume that you have the 6.2 kernel on your linux machine.
+### For Linux kernel 6.2
 
-
 * Install arc driver
     ```bash
````

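The new heading scopes the driver steps to kernel 6.2. A generic way for readers to confirm the running kernel before following that section (not part of this diff):

```bash
# Print the running kernel release; the section expects a 6.2.x kernel,
# e.g. something like 6.2.0-39-generic on Ubuntu 22.04 with the HWE stack
uname -r
```
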
````diff
@@ -20,7 +18,7 @@ We assume that you have the 6.2 kernel on your linux machine.
     sudo tee /etc/apt/sources.list.d/intel-gpu-jammy.list
     ```
 
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/driver_install1.png" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/wget.png" width=100%; />
 
 * Install drivers
 
````

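For context, the `sudo tee` line above is typically the tail of the repository-setup pipeline from Intel's client-GPU driver instructions. A sketch of the usual full command, assuming the key URL and `jammy arc` repository line from Intel's public docs rather than this diff:

```bash
# Register Intel's graphics package repository for Ubuntu 22.04 (jammy)
wget -qO - https://repositories.intel.com/graphics/intel-graphics.key | \
  sudo gpg --dearmor --output /usr/share/keyrings/intel-graphics.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc" | \
  sudo tee /etc/apt/sources.list.d/intel-gpu-jammy.list
```
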
````diff
@@ -45,9 +43,9 @@ We assume that you have the 6.2 kernel on your linux machine.
     sudo reboot
     ```
 
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/driver_install2.png" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/i915.png" width=100%; />
 
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/driver_install3.png" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/gawk.png" width=100%; />
 
 
 * Configure permissions
````

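After the reboot, a standard way to confirm the i915 driver is active, plus the usual shape of the "Configure permissions" step that follows (generic commands, assumed rather than taken from this diff):

```bash
# Confirm the i915 kernel module is loaded after reboot
lsmod | grep i915
# Typical permission setup: let the current user access the GPU device nodes
sudo gpasswd -a ${USER} render
newgrp render   # apply the new group in the current shell, or log out and back in
```
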
````diff
@@ -63,7 +61,7 @@ We assume that you have the 6.2 kernel on your linux machine.
 
 ## Setup Python Environment
 
-* Install the Miniconda as follows
+Install the Miniconda as follows if you don't have conda installed on your machine:
   ```bash
   wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
 
````

````diff
@@ -75,7 +73,7 @@ We assume that you have the 6.2 kernel on your linux machine.
   conda --version
   # rm Miniconda3-latest-Linux-x86_64.sh # if you don't need this file any longer
   ```
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/python_env1.png" alt="image-20240221102252569" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/install_conda.png" alt="image-20240221102252569" width=100%; />
 
 
 ## Install oneAPI 
````

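Taken together, the two Miniconda hunks describe this flow; a minimal end-to-end sketch (the installer itself is interactive):

```bash
# Download and run the Miniconda installer, then verify conda is on PATH
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh   # accept the license and default location
source ~/.bashrc                         # reload the shell so `conda` resolves
conda --version
# rm Miniconda3-latest-Linux-x86_64.sh   # optional cleanup, as the guide notes
```
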
````diff
@@ -88,9 +86,9 @@ We assume that you have the 6.2 kernel on your linux machine.
 
   sudo apt install intel-basekit
   ```
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/install_oneapi_1.png" alt="image-20240221102252565" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/oneapi.png" alt="image-20240221102252565" width=100%; />
 
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/install_oneapi_2.png" alt="image-20240221102252565" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/basekit.png" alt="image-20240221102252565" width=100%; />
 
 
 ## Install `bigdl-llm`
````

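`sudo apt install intel-basekit` presumes Intel's oneAPI apt repository is already configured. A sketch of the usual setup, with the key and repository line taken from Intel's public oneAPI instructions rather than this diff:

```bash
# Add Intel's oneAPI apt repository, then install the Base Toolkit
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | \
  gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | \
  sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
sudo apt install intel-basekit
```
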
````diff
@@ -103,24 +101,24 @@ We assume that you have the 6.2 kernel on your linux machine.
   pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://developer.intel.com/ipex-whl-stable-xpu
   ```
 
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/big_dl1.png" alt="image-20240221102252564" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/create_conda_env.png" alt="image-20240221102252564" width=100%; />
 
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/big_dl2.png" alt="image-20240221102252564" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/create_conda_env.png" alt="image-20240221102252564" width=100%; />
 
 
-* You can verfy if bigdl-llm is successfully by simply importing a few classes from the library. For example, execute the following import command in terminal:
+* You can verify if bigdl-llm is successfully installed by simply importing a few classes from the library. For example, execute the following import command in the terminal:
   ```bash
   source /opt/intel/oneapi/setvars.sh
 
   python
 
-  > from bigdl.llm.transformers import AutoModel,AutoModelForCausalLM
+  > from bigdl.llm.transformers import AutoModel, AutoModelForCausalLM
   ```
 
-  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/big_dl3.png" alt="image-20240221102252562" width=100%; />
+  > <img src="https://llm-assets.readthedocs.io/en/latest/_images/verify_bigdl_import.png" alt="image-20240221102252562" width=100%; />
 
 
-## Runtime Configuration
+## Runtime Configurations
 
 To use GPU acceleration on Linux, several environment variables are required or recommended before running a GPU example.
 
````

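The install and verification steps above compress into a few commands; a sketch assuming a fresh conda environment (the env name `llm` and Python 3.9 are illustrative choices, not mandated by the diff):

```bash
# Create an isolated environment and install bigdl-llm with XPU support
conda create -n llm python=3.9 -y
conda activate llm
pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://developer.intel.com/ipex-whl-stable-xpu

# Verify the install: source oneAPI first, then import the transformers-style classes
source /opt/intel/oneapi/setvars.sh
python -c "from bigdl.llm.transformers import AutoModel, AutoModelForCausalLM; print('bigdl-llm OK')"
```
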
````diff
@@ -147,7 +145,7 @@ To use GPU acceleration on Linux, several environment variables are required or
   export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
   export ENABLE_SDP_FUSION=1
   ```
-  Please note that libtcmalloc.so can be installed by conda install -c conda-forge -y gperftools=2.10
+  Please note that `libtcmalloc.so` can be installed by ```conda install -c conda-forge -y gperftools=2.10```.
 
 
 ## A Quick Example
````

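Combined with the exports in the hunk, a typical runtime setup looks like the sketch below; the `LD_PRELOAD` path assumes `libtcmalloc.so` lands in the active conda env's `lib` directory, which the diff hints at but does not spell out:

```bash
# Install tcmalloc into the active conda env, then set the recommended variables
conda install -c conda-forge -y gperftools=2.10
export LD_PRELOAD=${LD_PRELOAD}:${CONDA_PREFIX}/lib/libtcmalloc.so
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
export ENABLE_SDP_FUSION=1
```
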
````diff
@@ -213,5 +211,5 @@ Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface
 ## Tips & Troubleshooting
 
 ### Warmup for optimial performance on first run
-When running LLMs on GPU for the first time, you might notice the performance is lower than expected, with delays up to several minutes before the first token is generated. This delay occurs because the GPU kernels require compilation and initialization, which varies across different GPU models. To achieve optimal and consistent performance, we recommend a one-time warm-up by running `model.generate(...)` an additional time before starting your actual generation tasks. If you're developing an application, you can incorporate this warmup step into start-up or loading routine to enhance the user experience.
+When running LLMs on GPU for the first time, you might notice the performance is lower than expected, with delays up to several minutes before the first token is generated. This delay occurs because the GPU kernels require compilation and initialization, which varies across different GPU types. To achieve optimal and consistent performance, we recommend a one-time warm-up by running `model.generate(...)` an additional time before starting your actual generation tasks. If you're developing an application, you can incorporate this warmup step into start-up or loading routine to enhance the user experience.
 
````

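The warm-up tip amounts to calling `model.generate(...)` once and discarding the result before timing anything. A hedged sketch using the phi-1.5 model the guide's quick example references; the prompt, token counts, and 4-bit loading are illustrative assumptions:

```bash
source /opt/intel/oneapi/setvars.sh
python - <<'EOF'
import intel_extension_for_pytorch as ipex  # registers the "xpu" device
from transformers import AutoTokenizer
from bigdl.llm.transformers import AutoModelForCausalLM

model_path = "microsoft/phi-1_5"  # model featured in the guide's quick example
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True,
                                             trust_remote_code=True).to("xpu")

inputs = tokenizer("What is AI?", return_tensors="pt").to("xpu")
model.generate(inputs.input_ids, max_new_tokens=32)           # warm-up: compiles GPU kernels
output = model.generate(inputs.input_ids, max_new_tokens=32)  # subsequent calls run at full speed
print(tokenizer.decode(output[0], skip_special_tokens=True))
EOF
```
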