# Getting started with BigDL LLM on Windows
## Install docker
New users can quickly get started with Docker by following this official link.
For Windows users, make sure Hyper-V is enabled on your machine. The instructions for installing Docker on Windows can be accessed from here.
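As a quick sanity check, the snippet below (run in an elevated PowerShell; these are standard Windows and Docker commands, not part of BigDL) enables the Hyper-V feature and verifies that the Docker CLI can reach the daemon:

```powershell
# Enable Hyper-V (requires administrator privileges; a reboot is needed afterwards)
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All

# Verify that Docker is installed and the daemon is reachable
docker --version
docker info
```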
## Pull bigdl-llm-cpu image
To pull the image from Docker Hub, run the following command in a console:

```bash
docker pull intelanalytics/bigdl-llm-cpu:2.4.0-SNAPSHOT
```

To check whether the image has been downloaded successfully, you can use the following command (`sls` is the PowerShell alias for `Select-String`):

```powershell
docker images | sls intelanalytics/bigdl-llm-cpu
```
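If you are working in a classic Command Prompt rather than PowerShell, the standard Windows `findstr` utility (plain Windows tooling, not anything BigDL-specific) works the same way:

```cmd
docker images | findstr intelanalytics/bigdl-llm-cpu
```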
## Start bigdl-llm-cpu container
To run the image and do inference, you can create and run a .bat script on Windows. An example could be:
```bat
@echo off
set DOCKER_IMAGE=intelanalytics/bigdl-llm-cpu:2.4.0-SNAPSHOT
set CONTAINER_NAME=my_container
set MODEL_PATH=D:/llm/models[change to your model path]

:: Run the Docker container
docker run -itd ^
    --net=host ^
    --cpuset-cpus="0-7" ^
    --cpuset-mems="0" ^
    --memory="8G" ^
    --name=%CONTAINER_NAME% ^
    -v %MODEL_PATH%:/llm/models ^
    %DOCKER_IMAGE%
```
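Save the script under any name you like (`start-llm-container.bat` below is just an example) and run it from a command prompt; you can then confirm that the container is running:

```cmd
start-llm-container.bat
docker ps --filter "name=my_container"
```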
After the container is booted, you can get into it through `docker exec`:

```bash
docker exec -it my_container bash
```
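Inside the container, you can optionally verify that the model directory bound from the host (the `-v %MODEL_PATH%:/llm/models` option in the script above) is visible:

```bash
ls /llm/models
```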
To run inference with BigDL-LLM on CPU, you can refer to this documentation.
## Getting started with chat
`chat.py` can be used to start a conversation with a specified model. The script is located in the `/llm` directory.
You can download models on the host machine and bind the model directory into the container when starting it.
After entering the container through `docker exec`, you can run `chat.py` by:

```bash
cd /llm
python chat.py --model-path YOUR_MODEL_PATH
```
If your model is chatglm-6b and it is placed under the mounted `/llm/models` directory, you can execute:

```bash
python chat.py --model-path /llm/models/chatglm-6b
```
Here is a demonstration:
## Getting started with tutorials
You can start a Jupyter Lab service to explore bigdl-llm-tutorial, which can help you build a more sophisticated chatbot.
To start the service, run the script under `/llm`:
```bash
cd /llm
./start-notebook.sh [--port EXPECTED_PORT]
```
You can assign a port to the service; otherwise the default port 12345 is used.
If you booted the container in host network mode, you can access http://127.0.0.1:12345/lab once the service is running to open the tutorials; otherwise, you need to map the correct ports between the container and the host.
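For example, if you prefer not to use host networking, a minimal variant of the `docker run` command above (a sketch that keeps only the options relevant here) publishes the notebook port explicitly; adjust the port if you pass a different one to `start-notebook.sh`:

```bat
:: Publish the notebook port instead of using --net=host
docker run -itd ^
    -p 12345:12345 ^
    --name=%CONTAINER_NAME% ^
    -v %MODEL_PATH%:/llm/models ^
    %DOCKER_IMAGE%
```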
Here is a demonstration of how to use the tutorial in the browser: