ipex-llm/docker/llm/serving/cpu/docker

Build/Use BigDL-LLM-serving CPU image

Build Image

docker build \
  --build-arg http_proxy=.. \
  --build-arg https_proxy=.. \
  --build-arg no_proxy=.. \
  --rm --no-cache -t intelanalytics/bigdl-llm-serving-cpu:2.4.0-SNAPSHOT .

Use the image for CPU serving

You can use the following bash script to start the container. Note that the CPU settings (--cpuset-cpus, --cpuset-mems) are chosen for Xeon CPUs; adjust them if you are not using a Xeon CPU.

#!/bin/bash
export DOCKER_IMAGE=intelanalytics/bigdl-llm-serving-cpu:2.4.0-SNAPSHOT

sudo docker run -itd \
        --net=host \
        --cpuset-cpus="0-47" \
        --cpuset-mems="0" \
        --memory="32G" \
        --name=CONTAINER_NAME \
        --shm-size="16g" \
        $DOCKER_IMAGE
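
On a host whose core count differs from the Xeon layout assumed above, the --cpuset-cpus value can be derived rather than hard-coded. The snippet below is a minimal sketch, assuming you want to pin the container to every logical CPU visible to the current shell; the variable names NUM_CPUS and CPUSET_CPUS are illustrative and not part of the original script. On a multi-socket machine you may prefer to restrict the range to a single NUMA node (as reported by lscpu) and set --cpuset-mems to match.

```shell
#!/bin/bash
# Sketch only: NUM_CPUS and CPUSET_CPUS are illustrative names, not part of
# the original start script.

# nproc (GNU coreutils) prints the number of processing units available
# to the current process.
NUM_CPUS="$(nproc)"

# Build a zero-based range such as "0-47" for a 48-CPU host.
CPUSET_CPUS="0-$((NUM_CPUS - 1))"

# Print the flag so it can be copied into the docker run command.
echo "--cpuset-cpus=\"${CPUSET_CPUS}\""
```

The resulting value can replace the hard-coded "0-47" in the docker run command above.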

Once the container is running, you can open a shell inside it with docker exec.
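
The docker exec step might look like the following. This is a hedged sketch: enter_container is a hypothetical helper, not something shipped with the image, and it assumes that bash is available inside the container and that the argument matches the value given to --name when the container was started.

```shell
#!/bin/bash
# Hypothetical helper (not part of the image): open an interactive bash
# shell inside a running container.
# $1 is the container name, i.e. the value passed to --name in docker run.
# Assumes bash exists inside the image.
enter_container() {
    sudo docker exec -it "$1" bash
}
```

For example, `enter_container CONTAINER_NAME` would attach to the container started by the script above; `sudo docker ps` lists the names of running containers if you need to confirm it is up first.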

To run model serving with BigDL-LLM as the backend, refer to this document.