diff --git a/docker/llm/finetune/lora/cpu/docker/README.md b/docker/llm/finetune/lora/cpu/docker/README.md
index d93f3f2a..de5df38f 100644
--- a/docker/llm/finetune/lora/cpu/docker/README.md
+++ b/docker/llm/finetune/lora/cpu/docker/README.md
@@ -5,7 +5,7 @@
 You can download directly from Dockerhub like:
 
 ```bash
-docker pull intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT
+docker pull intelanalytics/ipex-llm-finetune-lora-cpu:2.1.0-SNAPSHOT
 ```
 
 Or build the image from source:
@@ -17,7 +17,7 @@ export HTTPS_PROXY=your_https_proxy
 docker build \
   --build-arg http_proxy=${HTTP_PROXY} \
   --build-arg https_proxy=${HTTPS_PROXY} \
-  -t intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT \
+  -t intelanalytics/ipex-llm-finetune-lora-cpu:2.1.0-SNAPSHOT \
   -f ./Dockerfile .
 ```
 
@@ -33,7 +33,7 @@ docker run -itd \
   -e WORKER_COUNT_DOCKER=your_worker_count \
   -v your_downloaded_base_model_path:/ipex_llm/model \
   -v your_downloaded_data_path:/ipex_llm/data/alpaca_data_cleaned_archive.json \
-  intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT \
+  intelanalytics/ipex-llm-finetune-lora-cpu:2.1.0-SNAPSHOT \
   bash
 ```
diff --git a/docker/llm/finetune/lora/cpu/kubernetes/values.yaml b/docker/llm/finetune/lora/cpu/kubernetes/values.yaml
index b082d0aa..aebfd767 100644
--- a/docker/llm/finetune/lora/cpu/kubernetes/values.yaml
+++ b/docker/llm/finetune/lora/cpu/kubernetes/values.yaml
@@ -1,4 +1,4 @@
-imageName: intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT
+imageName: intelanalytics/ipex-llm-finetune-lora-cpu:2.1.0-SNAPSHOT
 trainerNum: 8
 microBatchSize: 8
 nfsServerIp: your_nfs_server_ip
diff --git a/docker/llm/finetune/qlora/xpu/docker/README.md b/docker/llm/finetune/qlora/xpu/docker/README.md
index 56926293..6575fc27 100644
--- a/docker/llm/finetune/qlora/xpu/docker/README.md
+++ b/docker/llm/finetune/qlora/xpu/docker/README.md
@@ -7,7 +7,7 @@ The following shows how to fine-tune LLM with Quantization (QLoRA built on IPEX-
 You can download directly from Dockerhub like:
 
 ```bash
-docker pull intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT
+docker pull intelanalytics/ipex-llm-finetune-qlora-xpu:2.1.0-SNAPSHOT
 ```
 
 Or build the image from source:
@@ -19,7 +19,7 @@ export HTTPS_PROXY=your_https_proxy
 docker build \
   --build-arg http_proxy=${HTTP_PROXY} \
   --build-arg https_proxy=${HTTPS_PROXY} \
-  -t intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT \
+  -t intelanalytics/ipex-llm-finetune-qlora-xpu:2.1.0-SNAPSHOT \
   -f ./Dockerfile .
 ```
@@ -43,7 +43,7 @@ docker run -itd \
   -v $BASE_MODE_PATH:/model \
   -v $DATA_PATH:/data/alpaca-cleaned \
   --shm-size="16g" \
-  intelanalytics/ipex-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT
+  intelanalytics/ipex-llm-fintune-qlora-xpu:2.1.0-SNAPSHOT
 ```
 
 The download and mount of base model and data to a docker container demonstrates a standard fine-tuning process. You can skip this step for a quick start, and in this way, the fine-tuning codes will automatically download the needed files:
@@ -60,7 +60,7 @@ docker run -itd \
   -e http_proxy=${HTTP_PROXY} \
   -e https_proxy=${HTTPS_PROXY} \
   --shm-size="16g" \
-  intelanalytics/ipex-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT
+  intelanalytics/ipex-llm-fintune-qlora-xpu:2.1.0-SNAPSHOT
 ```
 
 However, we do recommend you to handle them manually, because the automatical download can be blocked by Internet access and Huggingface authentication etc. according to different environment, and the manual method allows you to fine-tune in a custom way (with different base model and dataset).
diff --git a/docker/llm/serving/cpu/docker/Dockerfile b/docker/llm/serving/cpu/docker/Dockerfile
index 6c5c4684..06ddb7f5 100644
--- a/docker/llm/serving/cpu/docker/Dockerfile
+++ b/docker/llm/serving/cpu/docker/Dockerfile
@@ -1,4 +1,4 @@
-FROM intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT
+FROM intelanalytics/ipex-llm-cpu:2.1.0-SNAPSHOT
 
 ARG http_proxy
 ARG https_proxy
diff --git a/docker/llm/serving/cpu/docker/README.md b/docker/llm/serving/cpu/docker/README.md
index f56b93a4..b71c8578 100644
--- a/docker/llm/serving/cpu/docker/README.md
+++ b/docker/llm/serving/cpu/docker/README.md
@@ -6,7 +6,7 @@ docker build \
   --build-arg http_proxy=.. \
   --build-arg https_proxy=.. \
   --build-arg no_proxy=.. \
-  --rm --no-cache -t intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT .
+  --rm --no-cache -t intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT .
 ```
 
 ### Use the image for doing cpu serving
@@ -16,7 +16,7 @@ You could use the following bash script to start the container. Please be noted
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT
 
 sudo docker run -itd \
   --net=host \
@@ -36,7 +36,7 @@ Also you can set environment variables and start arguments while running a conta
 To start a controller container:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT
 controller_host=localhost
 controller_port=23000
 api_host=localhost
@@ -59,7 +59,7 @@ sudo docker run -itd \
 To start a worker container:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT
 export MODEL_PATH=YOUR_MODEL_PATH
 controller_host=localhost
 controller_port=23000
diff --git a/docker/llm/serving/cpu/kubernetes/README.md b/docker/llm/serving/cpu/kubernetes/README.md
index 5a337b45..7e8cb0e5 100644
--- a/docker/llm/serving/cpu/kubernetes/README.md
+++ b/docker/llm/serving/cpu/kubernetes/README.md
@@ -2,7 +2,7 @@
 
 ## Image
 
-To deploy IPEX-LLM-serving cpu in Kubernetes environment, please use this image: `intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT`
+To deploy IPEX-LLM-serving cpu in Kubernetes environment, please use this image: `intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT`
 
 ## Before deployment
 
@@ -73,7 +73,7 @@ spec:
       dnsPolicy: "ClusterFirst"
       containers:
      - name: fastchat-controller # fixed
-        image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
+        image: intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT
         imagePullPolicy: IfNotPresent
         env:
        - name: CONTROLLER_HOST # fixed
@@ -146,7 +146,7 @@ spec:
       dnsPolicy: "ClusterFirst"
       containers:
      - name: fastchat-worker # fixed
-        image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
+        image: intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT
         imagePullPolicy: IfNotPresent
         env:
        - name: CONTROLLER_HOST # fixed
diff --git a/docker/llm/serving/cpu/kubernetes/deployment.yaml b/docker/llm/serving/cpu/kubernetes/deployment.yaml
index 71c3ad42..d1aaf5c1 100644
--- a/docker/llm/serving/cpu/kubernetes/deployment.yaml
+++ b/docker/llm/serving/cpu/kubernetes/deployment.yaml
@@ -24,7 +24,7 @@ spec:
       dnsPolicy: "ClusterFirst"
       containers:
      - name: fastchat-controller # fixed
-        image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
+        image: intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT
         imagePullPolicy: IfNotPresent
         env:
        - name: CONTROLLER_HOST # fixed
@@ -91,7 +91,7 @@ spec:
       dnsPolicy: "ClusterFirst"
       containers:
      - name: fastchat-worker # fixed
-        image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
+        image: intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT
         imagePullPolicy: IfNotPresent
         env:
        - name: CONTROLLER_HOST # fixed
diff --git a/docker/llm/serving/xpu/docker/Dockerfile b/docker/llm/serving/xpu/docker/Dockerfile
index 87cb85b8..a946d619 100644
--- a/docker/llm/serving/xpu/docker/Dockerfile
+++ b/docker/llm/serving/xpu/docker/Dockerfile
@@ -1,4 +1,4 @@
-FROM intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT
+FROM intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
 
 ARG http_proxy
 ARG https_proxy
diff --git a/docker/llm/serving/xpu/docker/README.md b/docker/llm/serving/xpu/docker/README.md
index 16822dff..2933a34f 100644
--- a/docker/llm/serving/xpu/docker/README.md
+++ b/docker/llm/serving/xpu/docker/README.md
@@ -6,7 +6,7 @@ docker build \
   --build-arg http_proxy=.. \
   --build-arg https_proxy=.. \
   --build-arg no_proxy=.. \
-  --rm --no-cache -t intelanalytics/ipex-llm-serving-xpu:2.5.0-SNAPSHOT .
+  --rm --no-cache -t intelanalytics/ipex-llm-serving-xpu:2.1.0-SNAPSHOT .
 ```
 
 
@@ -18,7 +18,7 @@ To map the `xpu` into the container, you need to specify `--device=/dev/dri` whe
 An example could be:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-xpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-xpu:2.1.0-SNAPSHOT
 
 sudo docker run -itd \
   --net=host \
diff --git a/docs/readthedocs/source/doc/LLM/Quickstart/docker_windows_gpu.md b/docs/readthedocs/source/doc/LLM/Quickstart/docker_windows_gpu.md
index f19283ef..fd5a52c3 100644
--- a/docs/readthedocs/source/doc/LLM/Quickstart/docker_windows_gpu.md
+++ b/docs/readthedocs/source/doc/LLM/Quickstart/docker_windows_gpu.md
@@ -55,7 +55,7 @@ It applies to Intel Core Core 12 - 14 gen integrated GPUs (iGPUs) and Intel Arc
 ### 1. Prepare ipex-llm-xpu Docker Image
 Run the following command in WSL:
 ```bash
-docker pull intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT
+docker pull intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
 ```
 This step will take around 20 minutes depending on your network.
 
@@ -64,7 +64,7 @@
 To map the xpu into the container, an example (docker_setup.sh) could be:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
 export CONTAINER_NAME=my_container
 export MODEL_PATH=/llm/models[change to your model path]
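Every hunk in this patch makes the same mechanical change: the image tag `2.5.0-SNAPSHOT` becomes `2.1.0-SNAPSHOT`. A bulk retag like this can be generated and verified with grep and sed rather than edited file by file. The sketch below is illustrative, not part of the patch: it builds a throwaway sandbox standing in for the repository (the sandbox paths and file contents are assumptions), assumes GNU sed (`sed -i` differs on BSD/macOS), and ends with a check that no stale references to the old tag survive.

```shell
#!/bin/bash
# Sketch: retag IPEX-LLM image references in bulk, then verify none were missed.
# Demonstrated in a throwaway sandbox; in a real checkout you would run the
# grep/sed pair at the repository root instead. GNU sed assumed.
set -euo pipefail

old="2.5.0-SNAPSHOT"
new="2.1.0-SNAPSHOT"

# --- sandbox standing in for the repo (illustrative paths and contents) ---
root=$(mktemp -d)
mkdir -p "$root/docker" "$root/docs"
printf 'imageName: intelanalytics/ipex-llm-finetune-lora-cpu:%s\n' "$old" \
  > "$root/docker/values.yaml"
printf 'docker pull intelanalytics/ipex-llm-xpu:%s\n' "$old" \
  > "$root/docs/quickstart.md"

# Rewrite every file that mentions the old tag, in place.
grep -rlF "$old" "$root" | xargs -r sed -i "s/${old}/${new}/g"

# Verify: any surviving old-tag reference means a file was missed.
if grep -rqF "$old" "$root"; then
  echo "ERROR: stale ${old} references remain" >&2
  exit 1
fi
echo "retag complete"
```

Running the resulting `git diff` on a real checkout and eyeballing it for unintended hits (e.g. changelog entries that should keep the old version) is still worthwhile before committing.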