# Getting started with BigDL LLM on Windows
## Install docker
New users can quickly get started with Docker by following this official link.
For Windows users, make sure Hyper-V is enabled on your machine. The instructions for installing Docker on Windows can be accessed from here.
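As a quick sanity check, the snippet below (run in an elevated PowerShell; these are standard Windows and Docker commands, not part of BigDL) enables the Hyper-V feature and verifies that the Docker CLI can reach the daemon:

```powershell
# Enable Hyper-V (requires administrator privileges; a reboot is needed afterwards)
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All

# Verify that Docker is installed and the daemon is reachable
docker --version
docker info
```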
## Pull bigdl-llm-cpu image
To pull the image from Docker Hub, run the following command in a console:

```bash
docker pull intelanalytics/bigdl-llm-cpu:2.4.0-SNAPSHOT
```

To check whether the image has been downloaded successfully, you can use the following command (`sls` is the PowerShell alias for `Select-String`):

```powershell
docker images | sls intelanalytics/bigdl-llm-cpu
```
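If you are working in a classic Command Prompt rather than PowerShell, the standard Windows `findstr` utility (plain Windows tooling, not anything BigDL-specific) works the same way:

```cmd
docker images | findstr intelanalytics/bigdl-llm-cpu
```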
## Start bigdl-llm-cpu container
To run the image and do inference, you can create and run a .bat script on Windows. An example could be:
```bat
@echo off
set DOCKER_IMAGE=intelanalytics/bigdl-llm-cpu:2.4.0-SNAPSHOT
set CONTAINER_NAME=my_container
set MODEL_PATH=D:/llm/models[change to your model path]

:: Run the Docker container
docker run -itd ^
    --net=host ^
    --cpuset-cpus="0-7" ^
    --cpuset-mems="0" ^
    --memory="8G" ^
    --name=%CONTAINER_NAME% ^
    -v %MODEL_PATH%:/llm/models ^
    %DOCKER_IMAGE%
```
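Save the script under any name you like (`start-llm-container.bat` below is just an example) and run it from a command prompt; you can then confirm that the container is running:

```cmd
start-llm-container.bat
docker ps --filter "name=my_container"
```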
After the container is booted, you can get into it through `docker exec`:

```bash
docker exec -it my_container bash
```
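Inside the container, you can optionally verify that the model directory bound from the host (the `-v %MODEL_PATH%:/llm/models` option in the script above) is visible:

```bash
ls /llm/models
```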
To run inference with BigDL-LLM on CPU, you can refer to this documentation.
## Getting started with chat
`chat.py` can be used to start a conversation with a specified model. The script is located in the `/llm` directory.
You can download models on the host machine and bind the model directory into the container when starting it.
After entering the container through `docker exec`, you can run `chat.py` by:

```bash
cd /llm
python chat.py --model-path YOUR_MODEL_PATH
```
If your model is chatglm-6b and it is placed under the mounted `/llm/models` directory, you can execute:

```bash
python chat.py --model-path /llm/models/chatglm-6b
```
Here is a demonstration:
## Getting started with tutorials
You can start a Jupyter Lab service to explore bigdl-llm-tutorial, which can help you build a more sophisticated chatbot.
To start the service, run the script under `/llm`:
```bash
cd /llm
./start-notebook.sh [--port EXPECTED_PORT]
```
You can assign a port to the service; otherwise the default port 12345 is used.
If you booted the container in host network mode, you can access http://127.0.0.1:12345/lab once the service is running to open the tutorials; otherwise, you need to map the correct ports between the container and the host.
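For example, if you prefer not to use host networking, a minimal variant of the `docker run` command above (a sketch that keeps only the options relevant here) publishes the notebook port explicitly; adjust the port if you pass a different one to `start-notebook.sh`:

```bat
:: Publish the notebook port instead of using --net=host
docker run -itd ^
    -p 12345:12345 ^
    --name=%CONTAINER_NAME% ^
    -v %MODEL_PATH%:/llm/models ^
    %DOCKER_IMAGE%
```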
Here is a demonstration of how to use the tutorial in the browser: