diff --git a/docker/llm/README.md b/docker/llm/README.md index 1f418bb9..eba61568 100644 --- a/docker/llm/README.md +++ b/docker/llm/README.md @@ -1,18 +1,18 @@ -# Getting started with BigDL-LLM in Docker +# Getting started with IPEX-LLM in Docker ### Index -- [Docker installation guide for BigDL-LLM on CPU](#docker-installation-guide-for-bigdl-llm-on-cpu) - - [BigDL-LLM on Windows](#bigdl-llm-on-windows) - - [BigDL-LLM on Linux/MacOS](#bigdl-llm-on-linuxmacos) -- [Docker installation guide for BigDL LLM on XPU](#docker-installation-guide-for-bigdl-llm-on-xpu) -- [Docker installation guide for BigDL LLM Serving on CPU](#docker-installation-guide-for-bigdl-llm-serving-on-cpu) -- [Docker installation guide for BigDL LLM Serving on XPU](#docker-installation-guide-for-bigdl-llm-serving-on-xpu) -- [Docker installation guide for BigDL LLM Fine Tuning on CPU](#docker-installation-guide-for-bigdl-llm-fine-tuning-on-cpu) -- [Docker installation guide for BigDL LLM Fine Tuning on XPU](#docker-installation-guide-for-bigdl-llm-fine-tuning-on-xpu) +- [Docker installation guide for IPEX-LLM on CPU](#docker-installation-guide-for-ipex-llm-on-cpu) + - [IPEX-LLM on Windows](#ipex-llm-on-windows) + - [IPEX-LLM on Linux/MacOS](#ipex-llm-on-linuxmacos) +- [Docker installation guide for IPEX LLM on XPU](#docker-installation-guide-for-ipex-llm-on-xpu) +- [Docker installation guide for IPEX LLM Serving on CPU](#docker-installation-guide-for-ipex-llm-serving-on-cpu) +- [Docker installation guide for IPEX LLM Serving on XPU](#docker-installation-guide-for-ipex-llm-serving-on-xpu) +- [Docker installation guide for IPEX LLM Fine Tuning on CPU](#docker-installation-guide-for-ipex-llm-fine-tuning-on-cpu) +- [Docker installation guide for IPEX LLM Fine Tuning on XPU](#docker-installation-guide-for-ipex-llm-fine-tuning-on-xpu) -## Docker installation guide for BigDL-LLM on CPU +## Docker installation guide for IPEX-LLM on CPU -### BigDL-LLM on Windows +### IPEX-LLM on Windows #### Install docker @@ -23,26 +23,26 @@ The instructions for installing can be accessed from [here](https://docs.docker.com/desktop/install/windows-install/). -#### Pull bigdl-llm-cpu image +#### Pull ipex-llm-cpu image To pull image from hub, you can execute command on console: ```bash -docker pull intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT ``` to check if the image is successfully downloaded, you can use: ```powershell -docker images | sls intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT +docker images | sls intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT ``` -#### Start bigdl-llm-cpu container +#### Start ipex-llm-cpu container To run the image and do inference, you could create and run a bat script on Windows. An example on Windows could be: ```bat @echo off -set DOCKER_IMAGE=intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT +set DOCKER_IMAGE=intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT set CONTAINER_NAME=my_container set MODEL_PATH=D:/llm/models[change to your model path] @@ -62,7 +62,7 @@ After the container is booted, you could get into the container through `docker docker exec -it my_container bash ``` -To run inference using `BigDL-LLM` using cpu, you could refer to this [documentation](https://github.com/intel-analytics/BigDL/tree/main/python/llm#cpu-int4). +To run inference using `IPEX-LLM` using cpu, you could refer to this [documentation](https://github.com/intel-analytics/IPEX/tree/main/python/llm#cpu-int4). 
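After attaching to the container, a quick sanity check that your model folder is visible inside it can save time later. A minimal check, assuming the `.bat` script mounts `MODEL_PATH` at `/llm/models` (as in the Linux example further below), could be:

```bash
# Inside the container: confirm the host model folder is mounted where expected
ls /llm/models
```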
#### Getting started with chat @@ -89,7 +89,7 @@ Here is a demostration: #### Getting started with tutorials -You could start a jupyter-lab serving to explore bigdl-llm-tutorial which can help you build a more sophisticated Chatbo. +You could start a Jupyter Lab service to explore the ipex-llm-tutorial, which can help you build a more sophisticated chatbot. To start serving, run the script under '/llm': ```bash @@ -107,12 +107,12 @@ Here is a demostration of how to use tutorial in explorer: -### BigDL-LLM on Linux/MacOS +### IPEX-LLM on Linux/MacOS To run container on Linux/MacOS: ```bash #/bin/bash -export DOCKER_IMAGE=intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT +export DOCKER_IMAGE=intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT export CONTAINER_NAME=my_container export MODEL_PATH=/llm/models[change to your model path] @@ -126,23 +126,23 @@ docker run -itd \ $DOCKER_IMAGE ``` -Also, you could use chat.py and bigdl-llm-tutorial for development. +Also, you could use chat.py and ipex-llm-tutorial for development. [Getting started with chat](#getting-started-with-chat) [Getting started with tutorials](#getting-started-with-tutorials) -## Docker installation guide for BigDL LLM on XPU +## Docker installation guide for IPEX LLM on XPU First, pull docker image from docker hub: ``` -docker pull intelanalytics/bigdl-llm-xpu:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT ``` To map the xpu into the container, you need to specify --device=/dev/dri when booting the container. An example could be: ```bash #/bin/bash -export DOCKER_IMAGE=intelanalytics/bigdl-llm-xpu:2.5.0-SNAPSHOT +export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT export CONTAINER_NAME=my_container export MODEL_PATH=/llm/models[change to your model path] @@ -168,20 +168,20 @@ root@arda-arc12:/# sycl-ls [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.26241] ``` -To run inference using `BigDL-LLM` using xpu, you could refer to this [documentation](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU). +To run inference with `IPEX-LLM` on XPU, you could refer to this [documentation](https://github.com/intel-analytics/IPEX-LLM/tree/main/python/llm/example/GPU). -## Docker installation guide for BigDL LLM Serving on CPU +## Docker installation guide for IPEX LLM Serving on CPU ### Boot container Pull image: ``` -docker pull intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT ``` You could use the following bash script to start the container. Please be noted that the CPU config is specified for Xeon CPUs, change it accordingly if you are not using a Xeon CPU. ```bash -export DOCKER_IMAGE=intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT +export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT export CONTAINER_NAME=my_container export MODEL_PATH=/llm/models[change to your model path] @@ -198,14 +198,11 @@ After the container is booted, you could get into the container through `docker ### Models -Using BigDL-LLM in FastChat does not impose any new limitations on model usage. Therefore, all Hugging Face Transformer models can be utilized in FastChat. +Using IPEX-LLM in FastChat does not impose any new limitations on model usage. Therefore, all Hugging Face Transformer models can be utilized in FastChat. -FastChat determines the Model adapter to use through path matching. 
Therefore, in order to load models using BigDL-LLM, you need to make some modifications to the model's name. +FastChat determines the Model adapter to use through path matching. Therefore, in order to load models using IPEX-LLM, you need to make some modifications to the model's name. -For instance, assuming you have downloaded the `llama-7b-hf` from [HuggingFace](https://huggingface.co/decapoda-research/llama-7b-hf). Then, to use the `BigDL-LLM` as backend, you need to change name from `llama-7b-hf` to `bigdl-7b`. -The key point here is that the model's path should include "bigdl" and should not include paths matched by other model adapters. - -A special case is `ChatGLM` models. For these models, you do not need to do any changes after downloading the model and the `BigDL-LLM` backend will be used automatically. +A special case is `ChatGLM` models. For these models, you do not need to do any changes after downloading the model and the `IPEX-LLM` backend will be used automatically. ### Start the service @@ -237,11 +234,11 @@ python3 -m fastchat.serve.gradio_web_server This is the user interface that users will interact with. -By following these steps, you will be able to serve your models using the web UI with `BigDL-LLM` as the backend. You can open your browser and chat with a model now. +By following these steps, you will be able to serve your models using the web UI with `IPEX-LLM` as the backend. You can open your browser and chat with a model now. #### Serving with OpenAI-Compatible RESTful APIs -To start an OpenAI API server that provides compatible APIs using `BigDL-LLM` backend, you need three main components: an OpenAI API Server that serves the in-coming requests, model workers that host one or more models, and a controller to coordinate the web server and model workers. +To start an OpenAI API server that provides compatible APIs using `IPEX-LLM` backend, you need three main components: an OpenAI API Server that serves the in-coming requests, model workers that host one or more models, and a controller to coordinate the web server and model workers. First, launch the controller @@ -262,13 +259,13 @@ python3 -m fastchat.serve.openai_api_server --host localhost --port 8000 ``` -## Docker installation guide for BigDL LLM Serving on XPU +## Docker installation guide for IPEX LLM Serving on XPU ### Boot container Pull image: ``` -docker pull intelanalytics/bigdl-llm-serving-xpu:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-serving-xpu:2.5.0-SNAPSHOT ``` To map the `xpu` into the container, you need to specify `--device=/dev/dri` when booting the container. @@ -276,7 +273,7 @@ To map the `xpu` into the container, you need to specify `--device=/dev/dri` whe An example could be: ```bash #/bin/bash -export DOCKER_IMAGE=intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT +export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT export CONTAINER_NAME=my_container export MODEL_PATH=/llm/models[change to your model path] export SERVICE_MODEL_PATH=/llm/models/chatglm2-6b[a specified model path for running service] @@ -331,11 +328,11 @@ python3 -m fastchat.serve.gradio_web_server This is the user interface that users will interact with. -By following these steps, you will be able to serve your models using the web UI with `BigDL-LLM` as the backend. You can open your browser and chat with a model now. +By following these steps, you will be able to serve your models using the web UI with `IPEX-LLM` as the backend. You can open your browser and chat with a model now. 
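Before opening the browser, you can optionally confirm that the model worker has registered with the controller. A minimal check, assuming FastChat's default controller port `21001` and default Gradio web server port `7860`, could be:

```bash
# Ask the FastChat controller which model workers are currently registered (default port 21001)
curl -X POST http://localhost:21001/list_models
# The Gradio web UI launched above is then typically reachable at http://localhost:7860
```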
#### Serving with OpenAI-Compatible RESTful APIs -To start an OpenAI API server that provides compatible APIs using `BigDL-LLM` backend, you need three main components: an OpenAI API Server that serves the in-coming requests, model workers that host one or more models, and a controller to coordinate the web server and model workers. +To start an OpenAI API server that provides compatible APIs using `IPEX-LLM` backend, you need three main components: an OpenAI API Server that serves the in-coming requests, model workers that host one or more models, and a controller to coordinate the web server and model workers. First, launch the controller @@ -355,7 +352,7 @@ Finally, launch the RESTful API server python3 -m fastchat.serve.openai_api_server --host localhost --port 8000 ``` -## Docker installation guide for BigDL LLM Fine Tuning on CPU +## Docker installation guide for IPEX LLM Fine Tuning on CPU ### 1. Prepare Docker Image @@ -363,10 +360,10 @@ You can download directly from Dockerhub like: ```bash # For standalone -docker pull intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT # For k8s -docker pull intelanalytics/bigdl-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT ``` Or build the image from source: @@ -379,7 +376,7 @@ export HTTPS_PROXY=your_https_proxy docker build \ --build-arg http_proxy=${HTTP_PROXY} \ --build-arg https_proxy=${HTTPS_PROXY} \ - -t intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT \ + -t intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT \ -f ./Dockerfile . # For k8s @@ -389,7 +386,7 @@ export HTTPS_PROXY=your_https_proxy docker build \ --build-arg http_proxy=${HTTP_PROXY} \ --build-arg https_proxy=${HTTPS_PROXY} \ - -t intelanalytics/bigdl-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT \ + -t intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT \ -f ./Dockerfile.k8s . ``` @@ -405,12 +402,12 @@ export HTTPS_PROXY=your_https_proxy docker run -itd \ --net=host \ - --name=bigdl-llm-fintune-qlora-cpu \ + --name=ipex-llm-fintune-qlora-cpu \ -e http_proxy=${HTTP_PROXY} \ -e https_proxy=${HTTPS_PROXY} \ - -v $BASE_MODE_PATH:/bigdl/model \ - -v $DATA_PATH:/bigdl/data/alpaca-cleaned \ - intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT + -v $BASE_MODE_PATH:/ipex_llm/model \ + -v $DATA_PATH:/ipex_llm/data/alpaca-cleaned \ + intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT ``` The download and mount of base model and data to a docker container demonstrates a standard fine-tuning process. You can skip this step for a quick start, and in this way, the fine-tuning codes will automatically download the needed files: @@ -421,10 +418,10 @@ export HTTPS_PROXY=your_https_proxy docker run -itd \ --net=host \ - --name=bigdl-llm-fintune-qlora-cpu \ + --name=ipex-llm-fintune-qlora-cpu \ -e http_proxy=${HTTP_PROXY} \ -e https_proxy=${HTTPS_PROXY} \ - intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT + intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT ``` However, we do recommend you to handle them manually, because the automatical download can be blocked by Internet access and Huggingface authentication etc. according to different environment, and the manual method allows you to fine-tune in a custom way (with different base model and dataset). 
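If you do prepare the files manually, one possible way to fetch a base model and the cleaned Alpaca dataset onto the host before mounting them is sketched below; the exact repo IDs (and the Llama 2 license acceptance on Hugging Face) are assumptions to adapt to your own setup:

```bash
# Download a base model and the alpaca-cleaned dataset to the host, then point
# BASE_MODE_PATH and DATA_PATH at these folders before running `docker run`
pip install -U huggingface_hub
huggingface-cli download meta-llama/Llama-2-7b-hf --local-dir ./Llama-2-7b-hf
huggingface-cli download yahma/alpaca-cleaned --repo-type dataset --local-dir ./alpaca-cleaned
```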
@@ -434,14 +431,14 @@ However, we do recommend you to handle them manually, because the automatical do Enter the running container: ```bash -docker exec -it bigdl-llm-fintune-qlora-cpu bash +docker exec -it ipex-llm-fintune-qlora-cpu bash ``` Then, start QLoRA fine-tuning: If the machine memory is not enough, you can try to set `use_gradient_checkpointing=True`. ```bash -cd /bigdl +cd /ipex_llm bash start-qlora-finetuning-on-cpu.sh ``` @@ -473,16 +470,16 @@ python ./export_merged_model.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH -- Then you can use `./outputs/checkpoint-200-merged` as a normal huggingface transformer model to do inference. -## Docker installation guide for BigDL LLM Fine Tuning on XPU +## Docker installation guide for IPEX LLM Fine Tuning on XPU -The following shows how to fine-tune LLM with Quantization (QLoRA built on BigDL-LLM 4bit optimizations) in a docker environment, which is accelerated by Intel XPU. +The following shows how to fine-tune LLM with Quantization (QLoRA built on IPEX-LLM 4bit optimizations) in a docker environment, which is accelerated by Intel XPU. ### 1. Prepare Docker Image You can download directly from Dockerhub like: ```bash -docker pull intelanalytics/bigdl-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT ``` Or build the image from source: @@ -494,7 +491,7 @@ export HTTPS_PROXY=your_https_proxy docker build \ --build-arg http_proxy=${HTTP_PROXY} \ --build-arg https_proxy=${HTTPS_PROXY} \ - -t intelanalytics/bigdl-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT \ + -t intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT \ -f ./Dockerfile . ``` @@ -512,13 +509,13 @@ docker run -itd \ --net=host \ --device=/dev/dri \ --memory="32G" \ - --name=bigdl-llm-fintune-qlora-xpu \ + --name=ipex-llm-fintune-qlora-xpu \ -e http_proxy=${HTTP_PROXY} \ -e https_proxy=${HTTPS_PROXY} \ -v $BASE_MODE_PATH:/model \ -v $DATA_PATH:/data/alpaca-cleaned \ --shm-size="16g" \ - intelanalytics/bigdl-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT + intelanalytics/ipex-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT ``` The download and mount of base model and data to a docker container demonstrates a standard fine-tuning process. You can skip this step for a quick start, and in this way, the fine-tuning codes will automatically download the needed files: @@ -531,11 +528,11 @@ docker run -itd \ --net=host \ --device=/dev/dri \ --memory="32G" \ - --name=bigdl-llm-fintune-qlora-xpu \ + --name=ipex-llm-fintune-qlora-xpu \ -e http_proxy=${HTTP_PROXY} \ -e https_proxy=${HTTPS_PROXY} \ --shm-size="16g" \ - intelanalytics/bigdl-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT + intelanalytics/ipex-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT ``` However, we do recommend you to handle them manually, because the automatical download can be blocked by Internet access and Huggingface authentication etc. according to different environment, and the manual method allows you to fine-tune in a custom way (with different base model and dataset). 
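Because this image relies on the GPU mapped in with `--device=/dev/dri`, it can be worth confirming the device is visible inside the container before launching fine-tuning, similar to the `sycl-ls` check shown earlier for the inference image (this assumes the oneAPI environment scripts are present in the container, as the bundled fine-tuning scripts expect):

```bash
# Inside the XPU fine-tuning container: list SYCL devices and confirm the GPU shows up
source /opt/intel/oneapi/setvars.sh
sycl-ls
```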
@@ -545,7 +542,7 @@ However, we do recommend you to handle them manually, because the automatical do Enter the running container: ```bash -docker exec -it bigdl-llm-fintune-qlora-xpu bash +docker exec -it ipex-llm-fintune-qlora-xpu bash ``` Then, start QLoRA fine-tuning: diff --git a/docker/llm/finetune/lora/cpu/README.md b/docker/llm/finetune/lora/cpu/README.md index 47ab9d24..bd11e84b 100644 --- a/docker/llm/finetune/lora/cpu/README.md +++ b/docker/llm/finetune/lora/cpu/README.md @@ -2,13 +2,13 @@ [Alpaca Lora](https://github.com/tloen/alpaca-lora/tree/main) uses [low-rank adaption](https://arxiv.org/pdf/2106.09685.pdf) to speed up the finetuning process of base model [Llama2-7b](https://huggingface.co/meta-llama/Llama-2-7b), and tries to reproduce the standard Alpaca, a general finetuned LLM. This is on top of Hugging Face transformers with Pytorch backend, which natively requires a number of expensive GPU resources and takes significant time. -By constract, BigDL here provides a CPU optimization to accelerate the lora finetuning of Llama2-7b, in the power of mixed-precision and distributed training. Detailedly, [Intel OneCCL](https://www.intel.com/content/www/us/en/developer/tools/oneapi/oneccl.html), an available Hugging Face backend, is able to speed up the Pytorch computation with BF16 datatype on CPUs, as well as parallel processing on Kubernetes enabled by [Intel MPI](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html). +By constract, IPEX-LLM here provides a CPU optimization to accelerate the lora finetuning of Llama2-7b, in the power of mixed-precision and distributed training. Detailedly, [Intel OneCCL](https://www.intel.com/content/www/us/en/developer/tools/oneapi/oneccl.html), an available Hugging Face backend, is able to speed up the Pytorch computation with BF16 datatype on CPUs, as well as parallel processing on Kubernetes enabled by [Intel MPI](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html). The architecture is illustrated in the following: ![image](https://llm-assets.readthedocs.io/en/latest/_images/llm-finetune-lora-cpu-k8s.png) -As above, BigDL implements its MPI training with [Kubeflow MPI operator](https://github.com/kubeflow/mpi-operator/tree/master), which encapsulates the deployment as MPIJob CRD, and assists users to handle the construction of a MPI worker cluster on Kubernetes, such as public key distribution, SSH connection, and log collection. +As above, IPEX-LLM implements its MPI training with [Kubeflow MPI operator](https://github.com/kubeflow/mpi-operator/tree/master), which encapsulates the deployment as MPIJob CRD, and assists users to handle the construction of a MPI worker cluster on Kubernetes, such as public key distribution, SSH connection, and log collection. Now, let's go to deploy a Lora finetuning to create a LLM from Llama2-7b. @@ -20,7 +20,7 @@ Follow [here](https://github.com/kubeflow/mpi-operator/tree/master#installation) ### 2. Download Image, Base Model and Finetuning Data -Follow [here](https://github.com/intel-analytics/BigDL/tree/main/docker/llm/finetune/lora/docker#prepare-bigdl-image-for-lora-finetuning) to prepare BigDL Lora Finetuning image in your cluster. +Follow [here](https://github.com/intel-analytics/IPEX-LLM/tree/main/docker/llm/finetune/lora/docker#prepare-ipex-llm-image-for-lora-finetuning) to prepare IPEX-LLM Lora Finetuning image in your cluster. 
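If your cluster nodes pull images directly from Docker Hub, preparing the image typically amounts to pulling the tag referenced in `values.yaml` on each node (or pushing it to a registry your cluster can reach):

```bash
# Pull the LoRA fine-tuning image on each Kubernetes node, or push it to your own registry
docker pull intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT
```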
As finetuning is from a base model, first download [Llama2-7b model from the public download site of Hugging Face](https://huggingface.co/meta-llama/Llama-2-7b). Then, download [cleaned alpaca data](https://raw.githubusercontent.com/tloen/alpaca-lora/main/alpaca_data_cleaned_archive.json), which contains all kinds of general knowledge and has already been cleaned. Next, move the downloaded files to a shared directory on your NFS server. @@ -34,12 +34,12 @@ After preparing parameters in `./kubernetes/values.yaml`, submit the job as befl ```bash cd ./kubernetes -helm install bigdl-lora-finetuning . +helm install ipex-llm-lora-finetuning . ``` ### 4. Check Deployment ```bash -kubectl get all -n bigdl-lora-finetuning # you will see launcher and worker pods running +kubectl get all -n ipex-llm-lora-finetuning # you will see launcher and worker pods running ``` ### 5. Check Finetuning Process @@ -47,8 +47,8 @@ kubectl get all -n bigdl-lora-finetuning # you will see launcher and worker pods After deploying successfully, you can find a launcher pod, and then go inside this pod and check the logs collected from all workers. ```bash -kubectl get all -n bigdl-lora-finetuning # you will see a launcher pod -kubectl exec -it bash -n bigdl-ppml-finetuning # enter launcher pod +kubectl get all -n ipex-llm-lora-finetuning # you will see a launcher pod +kubectl exec -it bash -n ipex-llm-lora-finetuning # enter launcher pod cat launcher.log # display logs collected from other workers ``` diff --git a/docker/llm/finetune/lora/cpu/docker/Dockerfile b/docker/llm/finetune/lora/cpu/docker/Dockerfile index 1d6d919d..4b6f51b9 100644 --- a/docker/llm/finetune/lora/cpu/docker/Dockerfile +++ b/docker/llm/finetune/lora/cpu/docker/Dockerfile @@ -12,13 +12,13 @@ FROM mpioperator/intel as builder ARG http_proxy ARG https_proxy ENV PIP_NO_CACHE_DIR=false -COPY ./requirements.txt /bigdl/requirements.txt +COPY ./requirements.txt /ipex_llm/requirements.txt # add public key COPY --from=key-getter /root/intel-oneapi-archive-keyring.gpg /usr/share/keyrings/intel-oneapi-archive-keyring.gpg RUN echo "deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main " > /etc/apt/sources.list.d/oneAPI.list -RUN mkdir /bigdl/data && mkdir /bigdl/model && \ +RUN mkdir /ipex_llm/data && mkdir /ipex_llm/model && \ # install pytorch 2.0.1 apt-get update && \ apt-get install -y python3-pip python3.9-dev python3-wheel git software-properties-common && \ @@ -29,12 +29,12 @@ RUN mkdir /bigdl/data && mkdir /bigdl/model && \ pip install intel_extension_for_pytorch==2.0.100 && \ pip install oneccl_bind_pt -f https://developer.intel.com/ipex-whl-stable && \ # install transformers etc. - cd /bigdl && \ + cd /ipex_llm && \ git clone https://github.com/huggingface/transformers.git && \ cd transformers && \ git reset --hard 057e1d74733f52817dc05b673a340b4e3ebea08c && \ pip install . 
&& \ - pip install -r /bigdl/requirements.txt && \ + pip install -r /ipex_llm/requirements.txt && \ # install python add-apt-repository ppa:deadsnakes/ppa -y && \ apt-get install -y python3.9 && \ @@ -56,9 +56,9 @@ RUN mkdir /bigdl/data && mkdir /bigdl/model && \ echo " UserKnownHostsFile /dev/null" >> /etc/ssh/ssh_config && \ sed -i 's/#\(StrictModes \).*/\1no/g' /etc/ssh/sshd_config -COPY ./bigdl-lora-finetuing-entrypoint.sh /bigdl/bigdl-lora-finetuing-entrypoint.sh -COPY ./lora_finetune.py /bigdl/lora_finetune.py +COPY ./ipex-llm-lora-finetuing-entrypoint.sh /ipex_llm/ipex-llm-lora-finetuing-entrypoint.sh +COPY ./lora_finetune.py /ipex_llm/lora_finetune.py -RUN chown -R mpiuser /bigdl +RUN chown -R mpiuser /ipex_llm USER mpiuser ENTRYPOINT ["/bin/bash"] diff --git a/docker/llm/finetune/lora/cpu/docker/README.md b/docker/llm/finetune/lora/cpu/docker/README.md index 93013bc0..d93f3f2a 100644 --- a/docker/llm/finetune/lora/cpu/docker/README.md +++ b/docker/llm/finetune/lora/cpu/docker/README.md @@ -1,11 +1,11 @@ ## Fine-tune LLM with One CPU -### 1. Prepare BigDL image for Lora Finetuning +### 1. Prepare IPEX LLM image for Lora Finetuning You can download directly from Dockerhub like: ```bash -docker pull intelanalytics/bigdl-llm-finetune-lora-cpu:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT ``` Or build the image from source: @@ -17,7 +17,7 @@ export HTTPS_PROXY=your_https_proxy docker build \ --build-arg http_proxy=${HTTP_PROXY} \ --build-arg https_proxy=${HTTPS_PROXY} \ - -t intelanalytics/bigdl-llm-finetune-lora-cpu:2.5.0-SNAPSHOT \ + -t intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT \ -f ./Dockerfile . ``` @@ -27,13 +27,13 @@ Here, we try to finetune [Llama2-7b](https://huggingface.co/meta-llama/Llama-2-7 ``` docker run -itd \ - --name=bigdl-llm-fintune-lora-cpu \ + --name=ipex-llm-fintune-lora-cpu \ --cpuset-cpus="your_expected_range_of_cpu_numbers" \ -e STANDALONE_DOCKER=TRUE \ -e WORKER_COUNT_DOCKER=your_worker_count \ - -v your_downloaded_base_model_path:/bigdl/model \ - -v your_downloaded_data_path:/bigdl/data/alpaca_data_cleaned_archive.json \ - intelanalytics/bigdl-llm-finetune-lora-cpu:2.5.0-SNAPSHOT \ + -v your_downloaded_base_model_path:/ipex_llm/model \ + -v your_downloaded_data_path:/ipex_llm/data/alpaca_data_cleaned_archive.json \ + intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT \ bash ``` @@ -44,21 +44,21 @@ You can adjust the configuration according to your own environment. 
After our te Enter the running container: ``` -docker exec -it bigdl-llm-fintune-lora-cpu bash +docker exec -it ipex-llm-fintune-lora-cpu bash ``` Then, run the script to start finetuning: ``` -bash /bigdl/bigdl-lora-finetuing-entrypoint.sh +bash /ipex_llm/ipex-llm-lora-finetuing-entrypoint.sh ``` After minutes, it is expected to get results like: ``` Training Alpaca-LoRA model with params: -base_model: /bigdl/model/ -data_path: /bigdl/data/alpaca_data_cleaned_archive.json +base_model: /ipex_llm/model/ +data_path: /ipex_llm/data/alpaca_data_cleaned_archive.json output_dir: /home/mpiuser/finetuned_model batch_size: 128 micro_batch_size: 8 diff --git a/docker/llm/finetune/lora/cpu/docker/bigdl-lora-finetuing-entrypoint.sh b/docker/llm/finetune/lora/cpu/docker/ipex-llm-lora-finetuing-entrypoint.sh similarity index 86% rename from docker/llm/finetune/lora/cpu/docker/bigdl-lora-finetuing-entrypoint.sh rename to docker/llm/finetune/lora/cpu/docker/ipex-llm-lora-finetuing-entrypoint.sh index 3bd2305a..fab52a47 100644 --- a/docker/llm/finetune/lora/cpu/docker/bigdl-lora-finetuing-entrypoint.sh +++ b/docker/llm/finetune/lora/cpu/docker/ipex-llm-lora-finetuing-entrypoint.sh @@ -15,9 +15,9 @@ then -genv KMP_AFFINITY="granularity=fine,none" \ -genv KMP_BLOCKTIME=1 \ -genv TF_ENABLE_ONEDNN_OPTS=1 \ - python /bigdl/lora_finetune.py \ - --base_model '/bigdl/model/' \ - --data_path "/bigdl/data/alpaca_data_cleaned_archive.json" \ + python /ipex_llm/lora_finetune.py \ + --base_model '/ipex_llm/model/' \ + --data_path "/ipex_llm/data/alpaca_data_cleaned_archive.json" \ --output_dir "/home/mpiuser/finetuned_model" \ --micro_batch_size 8 \ --bf16 @@ -29,7 +29,7 @@ else if [ "$WORKER_ROLE" = "launcher" ] then sed "s/:1/ /g" /etc/mpi/hostfile > /home/mpiuser/hostfile - export DATA_PATH="/bigdl/data/$DATA_SUB_PATH" + export DATA_PATH="/ipex_llm/data/$DATA_SUB_PATH" sleep 10 mpirun \ -n $WORLD_SIZE \ @@ -40,8 +40,8 @@ else -genv KMP_AFFINITY="granularity=fine,none" \ -genv KMP_BLOCKTIME=1 \ -genv TF_ENABLE_ONEDNN_OPTS=1 \ - python /bigdl/lora_finetune.py \ - --base_model '/bigdl/model/' \ + python /ipex_llm/lora_finetune.py \ + --base_model '/ipex_llm/model/' \ --data_path "$DATA_PATH" \ --output_dir "/home/mpiuser/finetuned_model" \ --micro_batch_size $MICRO_BATCH_SIZE \ diff --git a/docker/llm/finetune/lora/cpu/kubernetes/Chart.yaml b/docker/llm/finetune/lora/cpu/kubernetes/Chart.yaml index dead414b..bbb69e9b 100644 --- a/docker/llm/finetune/lora/cpu/kubernetes/Chart.yaml +++ b/docker/llm/finetune/lora/cpu/kubernetes/Chart.yaml @@ -1,6 +1,6 @@ apiVersion: v2 name: trusted-fintune-service -description: A Helm chart for BigDL PPML Trusted BigData Service on Kubernetes +description: A Helm chart for IPEX-LLM Finetuning Service on Kubernetes type: application version: 1.1.27 appVersion: "1.16.0" diff --git a/docker/llm/finetune/lora/cpu/kubernetes/templates/bigdl-lora-finetuning-job.yaml b/docker/llm/finetune/lora/cpu/kubernetes/templates/ipex-llm-lora-finetuning-job.yaml similarity index 75% rename from docker/llm/finetune/lora/cpu/kubernetes/templates/bigdl-lora-finetuning-job.yaml rename to docker/llm/finetune/lora/cpu/kubernetes/templates/ipex-llm-lora-finetuning-job.yaml index 34d7170a..972eda57 100644 --- a/docker/llm/finetune/lora/cpu/kubernetes/templates/bigdl-lora-finetuning-job.yaml +++ b/docker/llm/finetune/lora/cpu/kubernetes/templates/ipex-llm-lora-finetuning-job.yaml @@ -1,8 +1,8 @@ apiVersion: kubeflow.org/v2beta1 kind: MPIJob metadata: - name: bigdl-lora-finetuning-job - namespace: 
bigdl-lora-finetuning + name: ipex-llm-lora-finetuning-job + namespace: ipex-llm-lora-finetuning spec: slotsPerWorker: 1 runPolicy: @@ -20,10 +20,10 @@ spec: claimName: nfs-pvc containers: - image: {{ .Values.imageName }} - name: bigdl-ppml-finetuning-launcher + name: ipex-llm-lora-finetuning-launcher securityContext: runAsUser: 1000 - command: ['sh' , '-c', 'bash /bigdl/bigdl-lora-finetuing-entrypoint.sh'] + command: ['sh' , '-c', 'bash /ipex_llm/ipex-llm-lora-finetuing-entrypoint.sh'] env: - name: WORKER_ROLE value: "launcher" @@ -34,7 +34,7 @@ spec: - name: MASTER_PORT value: "42679" - name: MASTER_ADDR - value: "bigdl-lora-finetuning-job-worker-0.bigdl-lora-finetuning-job-worker" + value: "ipex-llm-lora-finetuning-job-worker-0.ipex-llm-lora-finetuning-job-worker" - name: DATA_SUB_PATH value: "{{ .Values.dataSubPath }}" - name: OMP_NUM_THREADS @@ -46,20 +46,20 @@ spec: volumeMounts: - name: nfs-storage subPath: {{ .Values.modelSubPath }} - mountPath: /bigdl/model + mountPath: /ipex_llm/model - name: nfs-storage subPath: {{ .Values.dataSubPath }} - mountPath: "/bigdl/data/{{ .Values.dataSubPath }}" + mountPath: "/ipex_llm/data/{{ .Values.dataSubPath }}" Worker: replicas: {{ .Values.trainerNum }} template: spec: containers: - image: {{ .Values.imageName }} - name: bigdl-ppml-finetuning-worker + name: ipex-llm-lora-finetuning-worker securityContext: runAsUser: 1000 - command: ['sh' , '-c', 'bash /bigdl/bigdl-lora-finetuing-entrypoint.sh'] + command: ['sh' , '-c', 'bash /ipex_llm/ipex-llm-lora-finetuing-entrypoint.sh'] env: - name: WORKER_ROLE value: "trainer" @@ -70,7 +70,7 @@ spec: - name: MASTER_PORT value: "42679" - name: MASTER_ADDR - value: "bigdl-lora-finetuning-job-worker-0.bigdl-lora-finetuning-job-worker" + value: "ipex-llm-lora-finetuning-job-worker-0.ipex-llm-lora-finetuning-job-worker" - name: LOCAL_POD_NAME valueFrom: fieldRef: @@ -78,10 +78,10 @@ spec: volumeMounts: - name: nfs-storage subPath: {{ .Values.modelSubPath }} - mountPath: /bigdl/model + mountPath: /ipex_llm/model - name: nfs-storage subPath: {{ .Values.dataSubPath }} - mountPath: "/bigdl/data/{{ .Values.dataSubPath }}" + mountPath: "/ipex_llm/data/{{ .Values.dataSubPath }}" resources: requests: cpu: {{ .Values.cpuPerPod }} diff --git a/docker/llm/finetune/qlora/cpu/kubernetes/templates/bigdl-qlora-finetuning-namespace.yaml b/docker/llm/finetune/lora/cpu/kubernetes/templates/ipex-llm-lora-finetuning-namespace.yaml similarity index 55% rename from docker/llm/finetune/qlora/cpu/kubernetes/templates/bigdl-qlora-finetuning-namespace.yaml rename to docker/llm/finetune/lora/cpu/kubernetes/templates/ipex-llm-lora-finetuning-namespace.yaml index c873aa9f..8ca99ec5 100644 --- a/docker/llm/finetune/qlora/cpu/kubernetes/templates/bigdl-qlora-finetuning-namespace.yaml +++ b/docker/llm/finetune/lora/cpu/kubernetes/templates/ipex-llm-lora-finetuning-namespace.yaml @@ -1,4 +1,4 @@ apiVersion: v1 kind: Namespace metadata: - name: bigdl-qlora-finetuning + name: ipex-llm-lora-finetuning diff --git a/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pv.yaml b/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pv.yaml index 63e90ba7..38e8e722 100644 --- a/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pv.yaml +++ b/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pv.yaml @@ -1,8 +1,8 @@ apiVersion: v1 kind: PersistentVolume metadata: - name: nfs-pv-bigdl-lora-finetuning - namespace: bigdl-lora-finetuning + name: nfs-pv-ipex-llm-lora-finetuning + namespace: ipex-llm-lora-finetuning spec: capacity: storage: 15Gi diff 
--git a/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pvc.yaml b/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pvc.yaml index 5c2284eb..a65bcdb9 100644 --- a/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pvc.yaml +++ b/docker/llm/finetune/lora/cpu/kubernetes/templates/nfs-pvc.yaml @@ -2,7 +2,7 @@ kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-pvc - namespace: bigdl-lora-finetuning + namespace: ipex-llm-lora-finetuning spec: accessModes: - ReadWriteOnce diff --git a/docker/llm/finetune/lora/cpu/kubernetes/values.yaml b/docker/llm/finetune/lora/cpu/kubernetes/values.yaml index 36e137b8..b082d0aa 100644 --- a/docker/llm/finetune/lora/cpu/kubernetes/values.yaml +++ b/docker/llm/finetune/lora/cpu/kubernetes/values.yaml @@ -1,4 +1,4 @@ -imageName: intelanalytics/bigdl-llm-finetune-lora-cpu:2.5.0-SNAPSHOT +imageName: intelanalytics/ipex-llm-finetune-lora-cpu:2.5.0-SNAPSHOT trainerNum: 8 microBatchSize: 8 nfsServerIp: your_nfs_server_ip diff --git a/docker/llm/finetune/qlora/cpu/docker/Dockerfile b/docker/llm/finetune/qlora/cpu/docker/Dockerfile index 2cfb1645..3cf2adc2 100644 --- a/docker/llm/finetune/qlora/cpu/docker/Dockerfile +++ b/docker/llm/finetune/qlora/cpu/docker/Dockerfile @@ -18,7 +18,7 @@ ENV TRANSFORMERS_COMMIT_ID=95fe0f5 COPY --from=key-getter /root/intel-oneapi-archive-keyring.gpg /usr/share/keyrings/intel-oneapi-archive-keyring.gpg RUN echo "deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main " > /etc/apt/sources.list.d/oneAPI.list -RUN mkdir -p /bigdl/data && mkdir -p /bigdl/model && \ +RUN mkdir -p /ipex_llm/data && mkdir -p /ipex_llm/model && \ # install pytorch 2.1.0 apt-get update && \ apt-get install -y --no-install-recommends python3-pip python3.9-dev python3-wheel python3.9-distutils git software-properties-common && \ @@ -27,8 +27,8 @@ RUN mkdir -p /bigdl/data && mkdir -p /bigdl/model && \ pip3 install --upgrade pip && \ export PIP_DEFAULT_TIMEOUT=100 && \ pip install --upgrade torch==2.1.0 && \ - # install CPU bigdl-llm - pip3 install --pre --upgrade bigdl-llm[all] && \ + # install CPU ipex-llm + pip3 install --pre --upgrade ipex-llm[all] && \ # install ipex and oneccl pip install https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/cpu/intel_extension_for_pytorch-2.1.0%2Bcpu-cp39-cp39-linux_x86_64.whl && \ pip install oneccl_bind_pt -f https://developer.intel.com/ipex-whl-stable && \ @@ -41,16 +41,16 @@ RUN mkdir -p /bigdl/data && mkdir -p /bigdl/model && \ apt-get update && apt-get install -y curl wget gpg gpg-agent software-properties-common libunwind8-dev && \ # get qlora example code ln -s /usr/bin/python3 /usr/bin/python && \ - cd /bigdl && \ - git clone https://github.com/intel-analytics/BigDL.git && \ - mv BigDL/python/llm/example/CPU/QLoRA-FineTuning/* . && \ + cd /ipex_llm && \ + git clone https://github.com/intel-analytics/IPEX-LLM.git && \ + mv IPEX-LLM/python/llm/example/CPU/QLoRA-FineTuning/* . 
&& \ mkdir -p /GPU/LLM-Finetuning && \ - mv BigDL/python/llm/example/GPU/LLM-Finetuning/common /GPU/LLM-Finetuning/common && \ - rm -r BigDL && \ - chown -R mpiuser /bigdl + mv IPEX-LLM/python/llm/example/GPU/LLM-Finetuning/common /GPU/LLM-Finetuning/common && \ + rm -r IPEX-LLM && \ + chown -R mpiuser /ipex_llm # for standalone -COPY ./start-qlora-finetuning-on-cpu.sh /bigdl/start-qlora-finetuning-on-cpu.sh +COPY ./start-qlora-finetuning-on-cpu.sh /ipex_llm/start-qlora-finetuning-on-cpu.sh USER mpiuser diff --git a/docker/llm/finetune/qlora/cpu/docker/Dockerfile.k8s b/docker/llm/finetune/qlora/cpu/docker/Dockerfile.k8s index d2991985..71a8a5e1 100644 --- a/docker/llm/finetune/qlora/cpu/docker/Dockerfile.k8s +++ b/docker/llm/finetune/qlora/cpu/docker/Dockerfile.k8s @@ -19,7 +19,7 @@ ENV TRANSFORMERS_COMMIT_ID=95fe0f5 COPY --from=key-getter /root/intel-oneapi-archive-keyring.gpg /usr/share/keyrings/intel-oneapi-archive-keyring.gpg RUN echo "deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main " > /etc/apt/sources.list.d/oneAPI.list -RUN mkdir -p /bigdl/data && mkdir -p /bigdl/model && \ +RUN mkdir -p /ipex_llm/data && mkdir -p /ipex_llm/model && \ apt-get update && \ apt install -y --no-install-recommends openssh-server openssh-client libcap2-bin gnupg2 ca-certificates \ python3-pip python3.9-dev python3-wheel python3.9-distutils git software-properties-common && \ @@ -40,8 +40,8 @@ RUN mkdir -p /bigdl/data && mkdir -p /bigdl/model && \ pip3 install --upgrade pip && \ export PIP_DEFAULT_TIMEOUT=100 && \ pip install --upgrade torch==2.1.0 --index-url https://download.pytorch.org/whl/cpu && \ - # install CPU bigdl-llm - pip3 install --pre --upgrade bigdl-llm[all] && \ + # install CPU ipex-llm + pip3 install --pre --upgrade ipex-llm[all] && \ # install ipex and oneccl pip install intel_extension_for_pytorch==2.0.100 && \ pip install oneccl_bind_pt -f https://developer.intel.com/ipex-whl-stable && \ @@ -59,14 +59,14 @@ RUN mkdir -p /bigdl/data && mkdir -p /bigdl/model && \ rm -rf /var/lib/apt/lists/* && \ # get qlora example code ln -s /usr/bin/python3 /usr/bin/python && \ - cd /bigdl && \ - git clone https://github.com/intel-analytics/BigDL.git && \ - mv BigDL/python/llm/example/CPU/QLoRA-FineTuning/* . && \ - rm -r BigDL && \ - chown -R mpiuser /bigdl + cd /ipex_llm && \ + git clone https://github.com/intel-analytics/IPEX-LLM.git && \ + mv IPEX-LLM/python/llm/example/CPU/QLoRA-FineTuning/* . && \ + rm -r IPEX-LLM && \ + chown -R mpiuser /ipex_llm # for k8s -COPY ./bigdl-qlora-finetuing-entrypoint.sh /bigdl/bigdl-qlora-finetuing-entrypoint.sh +COPY ./ipex-llm-qlora-finetuing-entrypoint.sh /ipex_llm/ipex-llm-qlora-finetuing-entrypoint.sh USER mpiuser diff --git a/docker/llm/finetune/qlora/cpu/docker/README.md b/docker/llm/finetune/qlora/cpu/docker/README.md index 98cb1dfe..16e6e11d 100644 --- a/docker/llm/finetune/qlora/cpu/docker/README.md +++ b/docker/llm/finetune/qlora/cpu/docker/README.md @@ -1,6 +1,6 @@ -## Fine-tune LLM with BigDL LLM Container +## Fine-tune LLM with IPEX LLM Container -The following shows how to fine-tune LLM with Quantization (QLoRA built on BigDL-LLM 4bit optimizations) in a docker environment, which is accelerated by Intel CPU. +The following shows how to fine-tune LLM with Quantization (QLoRA built on IPEX-LLM 4bit optimizations) in a docker environment, which is accelerated by Intel CPU. ### 1. 
Prepare Docker Image @@ -8,10 +8,10 @@ You can download directly from Dockerhub like: ```bash # For standalone -docker pull intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT # For k8s -docker pull intelanalytics/bigdl-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT ``` Or build the image from source: @@ -24,7 +24,7 @@ export HTTPS_PROXY=your_https_proxy docker build \ --build-arg http_proxy=${HTTP_PROXY} \ --build-arg https_proxy=${HTTPS_PROXY} \ - -t intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT \ + -t intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT \ -f ./Dockerfile . # For k8s @@ -34,7 +34,7 @@ export HTTPS_PROXY=your_https_proxy docker build \ --build-arg http_proxy=${HTTP_PROXY} \ --build-arg https_proxy=${HTTPS_PROXY} \ - -t intelanalytics/bigdl-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT \ + -t intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT \ -f ./Dockerfile.k8s . ``` @@ -50,12 +50,12 @@ export HTTPS_PROXY=your_https_proxy docker run -itd \ --net=host \ - --name=bigdl-llm-fintune-qlora-cpu \ + --name=ipex-llm-fintune-qlora-cpu \ -e http_proxy=${HTTP_PROXY} \ -e https_proxy=${HTTPS_PROXY} \ - -v $BASE_MODE_PATH:/bigdl/model \ - -v $DATA_PATH:/bigdl/data/alpaca-cleaned \ - intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT + -v $BASE_MODE_PATH:/ipex_llm/model \ + -v $DATA_PATH:/ipex_llm/data/alpaca-cleaned \ + intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT ``` The download and mount of base model and data to a docker container demonstrates a standard fine-tuning process. You can skip this step for a quick start, and in this way, the fine-tuning codes will automatically download the needed files: @@ -66,10 +66,10 @@ export HTTPS_PROXY=your_https_proxy docker run -itd \ --net=host \ - --name=bigdl-llm-fintune-qlora-cpu \ + --name=ipex-llm-fintune-qlora-cpu \ -e http_proxy=${HTTP_PROXY} \ -e https_proxy=${HTTPS_PROXY} \ - intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT + intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT ``` However, we do recommend you to handle them manually, because the automatical download can be blocked by Internet access and Huggingface authentication etc. according to different environment, and the manual method allows you to fine-tune in a custom way (with different base model and dataset). @@ -79,14 +79,14 @@ However, we do recommend you to handle them manually, because the automatical do Enter the running container: ```bash -docker exec -it bigdl-llm-fintune-qlora-cpu bash +docker exec -it ipex-llm-fintune-qlora-cpu bash ``` Then, start QLoRA fine-tuning: If the machine memory is not enough, you can try to set `use_gradient_checkpointing=True`. ```bash -cd /bigdl +cd /ipex_llm bash start-qlora-finetuning-on-cpu.sh ``` @@ -120,19 +120,17 @@ Then you can use `./outputs/checkpoint-200-merged` as a normal huggingface trans ### 4. Start Multi-Porcess Fine-Tuning in One Docker -
- -Multi-process parallelism enables higher performance for QLoRA fine-tuning, e.g. Xeon server series with multi-processor-socket architecture is suitable to run one instance on each QLoRA. This can be done by simply invoke >=2 OneCCL instances in BigDL QLoRA docker: +Multi-process parallelism enables higher performance for QLoRA fine-tuning; for example, a Xeon server with a multi-socket architecture is well suited to running one QLoRA instance per socket. This can be done by simply invoking >=2 OneCCL instances in the IPEX-LLM QLoRA docker image: ```bash docker run -itd \ - --name=bigdl-llm-fintune-qlora-cpu \ + --name=ipex-llm-fintune-qlora-cpu \ --cpuset-cpus="your_expected_range_of_cpu_numbers" \ -e STANDALONE_DOCKER=TRUE \ -e WORKER_COUNT_DOCKER=your_worker_count \ - -v your_downloaded_base_model_path:/bigdl/model \ - -v your_downloaded_data_path:/bigdl/data/alpaca_data_cleaned_archive.json \ - intelanalytics/bigdl-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT + -v your_downloaded_base_model_path:/ipex_llm/model \ + -v your_downloaded_data_path:/ipex_llm/data/alpaca_data_cleaned_archive.json \ + intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.5.0-SNAPSHOT ``` Note that `STANDALONE_DOCKER` is set to **TRUE** here. @@ -145,4 +143,4 @@ bash start-qlora-finetuning-on-cpu.sh ### 5. Start Distributed Fine-Tuning on Kubernetes -Besides multi-process mode, you can also run QLoRA on a kubernetes cluster. please refer [here](https://github.com/intel-analytics/BigDL/blob/main/docker/llm/finetune/qlora/cpu/kubernetes/README.md). +Besides multi-process mode, you can also run QLoRA on a Kubernetes cluster. Please refer [here](https://github.com/intel-analytics/IPEX-LLM/blob/main/docker/llm/finetune/qlora/cpu/kubernetes/README.md). diff --git a/docker/llm/finetune/qlora/cpu/docker/bigdl-qlora-finetuing-entrypoint.sh b/docker/llm/finetune/qlora/cpu/docker/ipex-llm-qlora-finetuing-entrypoint.sh similarity index 87% rename from docker/llm/finetune/qlora/cpu/docker/bigdl-qlora-finetuing-entrypoint.sh rename to docker/llm/finetune/qlora/cpu/docker/ipex-llm-qlora-finetuing-entrypoint.sh index 3ed37dbb..8d468bbd 100644 --- a/docker/llm/finetune/qlora/cpu/docker/bigdl-qlora-finetuing-entrypoint.sh +++ b/docker/llm/finetune/qlora/cpu/docker/ipex-llm-qlora-finetuing-entrypoint.sh @@ -3,8 +3,8 @@ set -x source /opt/intel/oneapi/setvars.sh export CCL_WORKER_COUNT=$WORLD_SIZE -source bigdl-llm-init -t -cd /bigdl/alpaca-qlora +source ipex-llm-init -t +cd /ipex_llm/alpaca-qlora if [ "$WORKER_ROLE" = "launcher" ] then sed "s/:1/ /g" /etc/mpi/hostfile > /home/mpiuser/hostfile @@ -24,9 +24,9 @@ then -genv KMP_AFFINITY="granularity=fine,none" \ -genv KMP_BLOCKTIME=1 \ -genv TF_ENABLE_ONEDNN_OPTS=1 \ - python /bigdl/alpaca-qlora/alpaca_qlora_finetuning_cpu.py \ - --base_model '/bigdl/model' \ - --data_path "/bigdl/data" \ + python /ipex_llm/alpaca-qlora/alpaca_qlora_finetuning_cpu.py \ + --base_model '/ipex_llm/model' \ + --data_path "/ipex_llm/data" \ --output_dir "/home/mpiuser/finetuned_model" \ --batch_size 128 \ --micro_batch_size $MICRO_BATCH_SIZE \ diff --git a/docker/llm/finetune/qlora/cpu/docker/start-qlora-finetuning-on-cpu.sh b/docker/llm/finetune/qlora/cpu/docker/start-qlora-finetuning-on-cpu.sh index 0a428334..90e0d885 100644 --- a/docker/llm/finetune/qlora/cpu/docker/start-qlora-finetuning-on-cpu.sh +++ b/docker/llm/finetune/qlora/cpu/docker/start-qlora-finetuning-on-cpu.sh @@ -1,10 +1,10 @@ #!/bin/bash set -x -cd /bigdl +cd /ipex_llm export USE_XETLA=OFF export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 
source /opt/intel/oneapi/setvars.sh -source bigdl-llm-init -t +source ipex-llm-init -t if [ -d "./model" ]; then diff --git a/docker/llm/finetune/qlora/cpu/kubernetes/Chart.yaml b/docker/llm/finetune/qlora/cpu/kubernetes/Chart.yaml index 3606401c..2c750b4d 100644 --- a/docker/llm/finetune/qlora/cpu/kubernetes/Chart.yaml +++ b/docker/llm/finetune/qlora/cpu/kubernetes/Chart.yaml @@ -1,6 +1,6 @@ apiVersion: v2 -name: bigdl-fintune-service -description: A Helm chart for BigDL Finetune Service on Kubernetes +name: ipex-fintune-service +description: A Helm chart for IPEX-LLM Finetune Service on Kubernetes type: application version: 1.1.27 appVersion: "1.16.0" diff --git a/docker/llm/finetune/qlora/cpu/kubernetes/README.md b/docker/llm/finetune/qlora/cpu/kubernetes/README.md index 73fb6491..4279ea66 100644 --- a/docker/llm/finetune/qlora/cpu/kubernetes/README.md +++ b/docker/llm/finetune/qlora/cpu/kubernetes/README.md @@ -1,12 +1,10 @@ ## Run Distributed QLoRA Fine-Tuning on Kubernetes with OneCCL -![image](https://github.com/intel-analytics/BigDL/assets/60865256/825f47d9-c864-4f39-a331-adb1e3cb528e) - -BigDL here provides a CPU optimization to accelerate the QLoRA finetuning of Llama2-7b, in the power of mixed-precision and distributed training. Detailedly, [Intel OneCCL](https://www.intel.com/content/www/us/en/developer/tools/oneapi/oneccl.html), an available Hugging Face backend, is able to speed up the Pytorch computation with BF16 datatype on CPUs, as well as parallel processing on Kubernetes enabled by [Intel MPI](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html). Moreover, advanaced quantization of BigDL-LLM has been applied to improve memory utilization, which makes CPU large-scale fine-tuning possible with runtime NF4 model storage and BF16 computing types. +IPEX-LLM here provides a CPU optimization to accelerate the QLoRA finetuning of Llama2-7b, in the power of mixed-precision and distributed training. Detailedly, [Intel OneCCL](https://www.intel.com/content/www/us/en/developer/tools/oneapi/oneccl.html), an available Hugging Face backend, is able to speed up the Pytorch computation with BF16 datatype on CPUs, as well as parallel processing on Kubernetes enabled by [Intel MPI](https://www.intel.com/content/www/us/en/developer/tools/oneapi/mpi-library.html). Moreover, advanaced quantization of IPEX-LLM has been applied to improve memory utilization, which makes CPU large-scale fine-tuning possible with runtime NF4 model storage and BF16 computing types. The architecture is illustrated in the following: -As above, BigDL implements its MPI training with [Kubeflow MPI operator](https://github.com/kubeflow/mpi-operator/tree/master), which encapsulates the deployment as MPIJob CRD, and assists users to handle the construction of a MPI worker cluster on Kubernetes, such as public key distribution, SSH connection, and log collection. +As above, IPEX-LLM implements its MPI training with [Kubeflow MPI operator](https://github.com/kubeflow/mpi-operator/tree/master), which encapsulates the deployment as MPIJob CRD, and assists users to handle the construction of a MPI worker cluster on Kubernetes, such as public key distribution, SSH connection, and log collection. Now, let's go to deploy a QLoRA finetuning to create a new LLM from Llama2-7b. @@ -18,7 +16,7 @@ Follow [here](https://github.com/kubeflow/mpi-operator/tree/master#installation) ### 2. 
Download Image, Base Model and Finetuning Data -Follow [here](https://github.com/intel-analytics/BigDL/tree/main/docker/llm/finetune/qlora/cpu/docker#1-prepare-docker-image) to prepare BigDL QLoRA Finetuning image in your cluster. +Follow [here](https://github.com/intel-analytics/IPEX-LLM/tree/main/docker/llm/finetune/qlora/cpu/docker#1-prepare-docker-image) to prepare IPEX-LLM QLoRA Finetuning image in your cluster. As finetuning is from a base model, first download [Llama2-7b model from the public download site of Hugging Face](https://huggingface.co/meta-llama/Llama-2-7b). Then, download [cleaned alpaca data](https://raw.githubusercontent.com/tloen/alpaca-lora/main/alpaca_data_cleaned_archive.json), which contains all kinds of general knowledge and has already been cleaned. Next, move the downloaded files to a shared directory on your NFS server. @@ -32,12 +30,12 @@ After preparing parameters in `./kubernetes/values.yaml`, submit the job as befl ```bash cd ./kubernetes -helm install bigdl-qlora-finetuning . +helm install ipex-llm-qlora-finetuning . ``` ### 4. Check Deployment ```bash -kubectl get all -n bigdl-qlora-finetuning # you will see launcher and worker pods running +kubectl get all -n ipex-llm-qlora-finetuning # you will see launcher and worker pods running ``` ### 5. Check Finetuning Process @@ -45,8 +43,8 @@ kubectl get all -n bigdl-qlora-finetuning # you will see launcher and worker pod After deploying successfully, you can find a launcher pod, and then go inside this pod and check the logs collected from all workers. ```bash -kubectl get all -n bigdl-qlora-finetuning # you will see a launcher pod -kubectl exec -it bash -n bigdl-qlora-finetuning # enter launcher pod +kubectl get all -n ipex-llm-qlora-finetuning # you will see a launcher pod +kubectl exec -it bash -n ipex-llm-qlora-finetuning # enter launcher pod cat launcher.log # display logs collected from other workers ``` diff --git a/docker/llm/finetune/lora/cpu/kubernetes/templates/bigdl-lora-finetuning-namespace.yaml b/docker/llm/finetune/qlora/cpu/kubernetes/templates/ipex-llm-finetuning-namespace.yaml similarity index 54% rename from docker/llm/finetune/lora/cpu/kubernetes/templates/bigdl-lora-finetuning-namespace.yaml rename to docker/llm/finetune/qlora/cpu/kubernetes/templates/ipex-llm-finetuning-namespace.yaml index b521299e..2bb03dc4 100644 --- a/docker/llm/finetune/lora/cpu/kubernetes/templates/bigdl-lora-finetuning-namespace.yaml +++ b/docker/llm/finetune/qlora/cpu/kubernetes/templates/ipex-llm-finetuning-namespace.yaml @@ -1,4 +1,4 @@ apiVersion: v1 kind: Namespace metadata: - name: bigdl-lora-finetuning + name: ipex-llm-qlora-finetuning diff --git a/docker/llm/finetune/qlora/cpu/kubernetes/templates/bigdl-qlora-finetuning-job.yaml b/docker/llm/finetune/qlora/cpu/kubernetes/templates/ipex-llm-qlora-finetuning-job.yaml similarity index 81% rename from docker/llm/finetune/qlora/cpu/kubernetes/templates/bigdl-qlora-finetuning-job.yaml rename to docker/llm/finetune/qlora/cpu/kubernetes/templates/ipex-llm-qlora-finetuning-job.yaml index 71b1cd03..a6fd8477 100644 --- a/docker/llm/finetune/qlora/cpu/kubernetes/templates/bigdl-qlora-finetuning-job.yaml +++ b/docker/llm/finetune/qlora/cpu/kubernetes/templates/ipex-llm-qlora-finetuning-job.yaml @@ -1,8 +1,8 @@ apiVersion: kubeflow.org/v2beta1 kind: MPIJob metadata: - name: bigdl-qlora-finetuning-job - namespace: bigdl-qlora-finetuning + name: ipex-llm-qlora-finetuning-job + namespace: ipex-llm-qlora-finetuning spec: slotsPerWorker: 1 runPolicy: @@ -20,10 +20,10 @@ spec: 
claimName: nfs-pvc containers: - image: {{ .Values.imageName }} - name: bigdl-qlora-finetuning-launcher + name: ipex-llm-qlora-finetuning-launcher securityContext: runAsUser: 1000 - command: ['sh' , '-c', 'bash /bigdl/bigdl-qlora-finetuing-entrypoint.sh'] + command: ['sh' , '-c', 'bash /ipex_llm/ipex-llm-qlora-finetuing-entrypoint.sh'] env: - name: WORKER_ROLE value: "launcher" @@ -34,7 +34,7 @@ spec: - name: MASTER_PORT value: "42679" - name: MASTER_ADDR - value: "bigdl-qlora-finetuning-job-worker-0.bigdl-qlora-finetuning-job-worker" + value: "ipex-llm-qlora-finetuning-job-worker-0.ipex-llm-qlora-finetuning-job-worker" - name: DATA_SUB_PATH value: "{{ .Values.dataSubPath }}" - name: ENABLE_GRADIENT_CHECKPOINT @@ -52,10 +52,10 @@ spec: volumeMounts: - name: nfs-storage subPath: {{ .Values.modelSubPath }} - mountPath: /bigdl/model + mountPath: /ipex_llm/model - name: nfs-storage subPath: {{ .Values.dataSubPath }} - mountPath: "/bigdl/data/{{ .Values.dataSubPath }}" + mountPath: "/ipex_llm/data/{{ .Values.dataSubPath }}" Worker: replicas: {{ .Values.trainerNum }} template: @@ -74,10 +74,10 @@ spec: topologyKey: kubernetes.io/hostname containers: - image: {{ .Values.imageName }} - name: bigdl-qlora-finetuning-worker + name: ipex-llm-qlora-finetuning-worker securityContext: runAsUser: 1000 - command: ['sh' , '-c', 'bash /bigdl/bigdl-qlora-finetuing-entrypoint.sh'] + command: ['sh' , '-c', 'bash /ipex_llm/ipex-llm-qlora-finetuing-entrypoint.sh'] env: - name: WORKER_ROLE value: "trainer" @@ -88,7 +88,7 @@ spec: - name: MASTER_PORT value: "42679" - name: MASTER_ADDR - value: "bigdl-qlora-finetuning-job-worker-0.bigdl-qlora-finetuning-job-worker" + value: "ipex-llm-qlora-finetuning-job-worker-0.ipex-llm-qlora-finetuning-job-worker" - name: ENABLE_GRADIENT_CHECKPOINT value: "{{ .Values.enableGradientCheckpoint }}" - name: http_proxy @@ -102,10 +102,10 @@ spec: volumeMounts: - name: nfs-storage subPath: {{ .Values.modelSubPath }} - mountPath: /bigdl/model + mountPath: /ipex_llm/model - name: nfs-storage subPath: {{ .Values.dataSubPath }} - mountPath: "/bigdl/data/{{ .Values.dataSubPath }}" + mountPath: "/ipex_llm/data/{{ .Values.dataSubPath }}" resources: requests: cpu: 48 diff --git a/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pv.yaml b/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pv.yaml index 14a2126f..fa104391 100644 --- a/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pv.yaml +++ b/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pv.yaml @@ -1,8 +1,8 @@ apiVersion: v1 kind: PersistentVolume metadata: - name: nfs-pv-bigdl-qlora-finetuning - namespace: bigdl-qlora-finetuning + name: nfs-pv-ipex-llm-qlora-finetuning + namespace: ipex-llm-qlora-finetuning spec: capacity: storage: 15Gi diff --git a/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pvc.yaml b/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pvc.yaml index 48ef589d..0a2b0d99 100644 --- a/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pvc.yaml +++ b/docker/llm/finetune/qlora/cpu/kubernetes/templates/nfs-pvc.yaml @@ -2,7 +2,7 @@ kind: PersistentVolumeClaim apiVersion: v1 metadata: name: nfs-pvc - namespace: bigdl-qlora-finetuning + namespace: ipex-llm-qlora-finetuning spec: accessModes: - ReadWriteOnce diff --git a/docker/llm/finetune/qlora/cpu/kubernetes/values.yaml b/docker/llm/finetune/qlora/cpu/kubernetes/values.yaml index b195f203..083f6584 100644 --- a/docker/llm/finetune/qlora/cpu/kubernetes/values.yaml +++ b/docker/llm/finetune/qlora/cpu/kubernetes/values.yaml @@ -1,4 +1,4 
@@ -imageName: intelanalytics/bigdl-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT +imageName: intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.5.0-SNAPSHOT trainerNum: 2 microBatchSize: 8 enableGradientCheckpoint: false # true will save more memory but increase latency diff --git a/docker/llm/finetune/qlora/xpu/docker/Dockerfile b/docker/llm/finetune/qlora/xpu/docker/Dockerfile index 415581ad..25bc65d2 100644 --- a/docker/llm/finetune/qlora/xpu/docker/Dockerfile +++ b/docker/llm/finetune/qlora/xpu/docker/Dockerfile @@ -28,15 +28,15 @@ RUN curl -fsSL https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-P ln -s /usr/bin/python3 /usr/bin/python && \ apt-get install -y python3-pip python3.9-dev python3-wheel python3.9-distutils && \ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \ - # install XPU bigdl-llm - pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu && \ + # install XPU ipex-llm + pip install --pre --upgrade ipex-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu && \ # install huggingface dependencies pip install git+https://github.com/huggingface/transformers.git@${TRANSFORMERS_COMMIT_ID} && \ pip install peft==0.5.0 datasets accelerate==0.23.0 && \ pip install bitsandbytes scipy && \ - git clone https://github.com/intel-analytics/BigDL.git && \ - mv BigDL/python/llm/example/GPU/LLM-Finetuning/common /common && \ - rm -r BigDL && \ - wget https://raw.githubusercontent.com/intel-analytics/BigDL/main/python/llm/example/GPU/LLM-Finetuning/QLoRA/simple-example/qlora_finetuning.py + git clone https://github.com/intel-analytics/IPEX-LLM.git && \ + mv IPEX-LLM/python/llm/example/GPU/LLM-Finetuning/common /common && \ + rm -r IPEX-LLM && \ + wget https://raw.githubusercontent.com/intel-analytics/IPEX-LLM/main/python/llm/example/GPU/LLM-Finetuning/QLoRA/simple-example/qlora_finetuning.py COPY ./start-qlora-finetuning-on-xpu.sh /start-qlora-finetuning-on-xpu.sh diff --git a/docker/llm/finetune/qlora/xpu/docker/README.md b/docker/llm/finetune/qlora/xpu/docker/README.md index 13e5fbab..56926293 100644 --- a/docker/llm/finetune/qlora/xpu/docker/README.md +++ b/docker/llm/finetune/qlora/xpu/docker/README.md @@ -1,13 +1,13 @@ -## Fine-tune LLM with BigDL LLM Container +## Fine-tune LLM with IPEX LLM Container -The following shows how to fine-tune LLM with Quantization (QLoRA built on BigDL-LLM 4bit optimizations) in a docker environment, which is accelerated by Intel XPU. +The following shows how to fine-tune LLM with Quantization (QLoRA built on IPEX-LLM 4bit optimizations) in a docker environment, which is accelerated by Intel XPU. ### 1. Prepare Docker Image You can download directly from Dockerhub like: ```bash -docker pull intelanalytics/bigdl-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT +docker pull intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT ``` Or build the image from source: @@ -19,7 +19,7 @@ export HTTPS_PROXY=your_https_proxy docker build \ --build-arg http_proxy=${HTTP_PROXY} \ --build-arg https_proxy=${HTTPS_PROXY} \ - -t intelanalytics/bigdl-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT \ + -t intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT \ -f ./Dockerfile . 
 ```
@@ -37,13 +37,13 @@ docker run -itd \
    --net=host \
    --device=/dev/dri \
    --memory="32G" \
-   --name=bigdl-llm-fintune-qlora-xpu \
+   --name=ipex-llm-finetune-qlora-xpu \
    -e http_proxy=${HTTP_PROXY} \
    -e https_proxy=${HTTPS_PROXY} \
    -v $BASE_MODE_PATH:/model \
    -v $DATA_PATH:/data/alpaca-cleaned \
    --shm-size="16g" \
-   intelanalytics/bigdl-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT
+   intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT
 ```
 
 The download and mount of base model and data to a docker container demonstrates a standard fine-tuning process. You can skip this step for a quick start, and in this way, the fine-tuning codes will automatically download the needed files:
@@ -56,11 +56,11 @@ docker run -itd \
    --net=host \
    --device=/dev/dri \
    --memory="32G" \
-   --name=bigdl-llm-fintune-qlora-xpu \
+   --name=ipex-llm-finetune-qlora-xpu \
    -e http_proxy=${HTTP_PROXY} \
    -e https_proxy=${HTTPS_PROXY} \
    --shm-size="16g" \
-   intelanalytics/bigdl-llm-fintune-qlora-xpu:2.5.0-SNAPSHOT
+   intelanalytics/ipex-llm-finetune-qlora-xpu:2.5.0-SNAPSHOT
 ```
 
 However, we do recommend you to handle them manually, because the automatical download can be blocked by Internet access and Huggingface authentication etc. according to different environment, and the manual method allows you to fine-tune in a custom way (with different base model and dataset).
@@ -70,7 +70,7 @@ However, we do recommend you to handle them manually, because the automatical do
 Enter the running container:
 
 ```bash
-docker exec -it bigdl-llm-fintune-qlora-xpu bash
+docker exec -it ipex-llm-finetune-qlora-xpu bash
 ```
 
 Then, start QLoRA fine-tuning:
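The hunk above stops at the fine-tuning step. As a reference, a minimal sketch of launching it non-interactively, assuming the container was started with the name used in the run examples above and relying on the `/start-qlora-finetuning-on-xpu.sh` script that the Dockerfile copies into the image (check the script itself for the exact inputs it expects, e.g. the base model mounted at `/model` and the data at `/data/alpaca-cleaned`):

```bash
# Run the bundled QLoRA start script inside the already-running container.
# The container name matches the docker run example above; adjust it if you changed it.
docker exec -it ipex-llm-finetune-qlora-xpu \
  bash -c "bash /start-qlora-finetuning-on-xpu.sh"
```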
diff --git a/docker/llm/inference/cpu/docker/Dockerfile b/docker/llm/inference/cpu/docker/Dockerfile
index c12189fd..f8a302f7 100644
--- a/docker/llm/inference/cpu/docker/Dockerfile
+++ b/docker/llm/inference/cpu/docker/Dockerfile
@@ -24,18 +24,18 @@ RUN env DEBIAN_FRONTEND=noninteractive apt-get update && \
     rm get-pip.py && \
     pip install --upgrade requests argparse urllib3 && \
     pip3 install --no-cache-dir --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu && \
-    pip install --pre --upgrade bigdl-llm[all] && \
-# Download bigdl-llm-tutorial
+    pip install --pre --upgrade ipex-llm[all] && \
+# Download ipex-llm-tutorial
     cd /llm && \
     pip install --upgrade jupyterlab && \
-    git clone https://github.com/intel-analytics/bigdl-llm-tutorial && \
+    git clone https://github.com/intel-analytics/ipex-llm-tutorial && \
     chmod +x /llm/start-notebook.sh && \
 # Download all-in-one benchmark
-    git clone https://github.com/intel-analytics/BigDL && \
-    cp -r ./BigDL/python/llm/dev/benchmark/ ./benchmark && \
+    git clone https://github.com/intel-analytics/IPEX-LLM && \
+    cp -r ./IPEX-LLM/python/llm/dev/benchmark/ ./benchmark && \
 # Copy chat.py script
     pip install --upgrade colorama && \
-    cp -r ./BigDL/python/llm/portable-zip/ ./portable-zip && \
+    cp -r ./IPEX-LLM/python/llm/portable-zip/ ./portable-zip && \
 # Install all-in-one dependencies
     apt-get install -y numactl && \
     pip install --upgrade omegaconf && \
@@ -46,13 +46,13 @@ RUN env DEBIAN_FRONTEND=noninteractive apt-get update && \
 # Add Qwen support
     pip install --upgrade transformers_stream_generator einops && \
 # Copy vLLM-Serving
-    cp -r ./BigDL/python/llm/example/CPU/vLLM-Serving/ ./vLLM-Serving && \
-    rm -rf ./BigDL && \
+    cp -r ./IPEX-LLM/python/llm/example/CPU/vLLM-Serving/ ./vLLM-Serving && \
+    rm -rf ./IPEX-LLM && \
 # Fix vllm service
     pip install pydantic==1.10.11 && \
-# Install bigdl-llm
+# Install ipex-llm
     cd /llm && \
-    pip install --pre --upgrade bigdl-llm[all] && \
+    pip install --pre --upgrade ipex-llm[all] && \
 # Fix CVE-2024-22195
     pip install Jinja2==3.1.3 && \
     pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cpu && \
diff --git a/docker/llm/inference/cpu/docker/README.md b/docker/llm/inference/cpu/docker/README.md
index cbe6c8ae..ba817747 100644
--- a/docker/llm/inference/cpu/docker/README.md
+++ b/docker/llm/inference/cpu/docker/README.md
@@ -1,4 +1,4 @@
-## Build/Use BigDL-LLM cpu image
+## Build/Use IPEX-LLM cpu image
 
 ### Build Image
 ```bash
@@ -6,7 +6,7 @@ docker build \
   --build-arg http_proxy=.. \
   --build-arg https_proxy=.. \
   --build-arg no_proxy=.. \
-  --rm --no-cache -t intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT .
+  --rm --no-cache -t intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT .
 ```
 
@@ -16,7 +16,7 @@ docker build \
 An example could be:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT
 
 sudo docker run -itd \
         --net=host \
@@ -31,7 +31,7 @@ sudo docker run -itd \
 
 After the container is booted, you could get into the container through `docker exec`.
 
-To run inference using `BigDL-LLM` using cpu, you could refer to this [documentation](https://github.com/intel-analytics/BigDL/tree/main/python/llm#cpu-int4).
+To run inference using `IPEX-LLM` on cpu, you could refer to this [documentation](https://github.com/intel-analytics/IPEX-LLM/tree/main/python/llm#cpu-int4).
 
 ### Use chat.py
 
@@ -41,7 +41,7 @@ You can download models and bind the model directory from host machine to contai
 Here is an example:
 ```bash
-export DOCKER_IMAGE=intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT
 export MODEL_PATH=/home/llm/models
 
 sudo docker run -itd \
@@ -65,4 +65,4 @@ In the example above, it can be:
 ```bash
 cd /llm
 python chat.py --model-path /llm/models/MODEL_NAME
-```
\ No newline at end of file
+```
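For a quick try of the renamed image without attaching an interactive shell first, the same `chat.py` entry point can also be driven from the host. A sketch, assuming the container from the example above is running under the hypothetical name `my_container` and a model directory was mounted under `/llm/models`:

```bash
# One-shot invocation of chat.py inside the running inference container.
# my_container and MODEL_NAME are placeholders; replace them with your container and model.
sudo docker exec -it my_container \
  bash -c "cd /llm && python chat.py --model-path /llm/models/MODEL_NAME"
```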
diff --git a/docker/llm/inference/cpu/docker/start-notebook.sh b/docker/llm/inference/cpu/docker/start-notebook.sh
index f3aedda8..85ebcd55 100644
--- a/docker/llm/inference/cpu/docker/start-notebook.sh
+++ b/docker/llm/inference/cpu/docker/start-notebook.sh
@@ -1,7 +1,7 @@
 #!/bin/bash
 
 #
-# Copyright 2016 The BigDL Authors.
+# Copyright 2016 The IPEX-LLM Authors.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -29,4 +29,4 @@ while [ $# -gt 0 ]; do
   shift
 done
 
-jupyter-lab --notebook-dir=/llm/bigdl-llm-tutorial --ip=0.0.0.0 --port=$port --no-browser --NotebookApp.token=$token --allow-root
\ No newline at end of file
+jupyter-lab --notebook-dir=/llm/ipex-llm-tutorial --ip=0.0.0.0 --port=$port --no-browser --NotebookApp.token=$token --allow-root
diff --git a/docker/llm/inference/xpu/docker/Dockerfile b/docker/llm/inference/xpu/docker/Dockerfile
index a74e00f3..266515db 100644
--- a/docker/llm/inference/xpu/docker/Dockerfile
+++ b/docker/llm/inference/xpu/docker/Dockerfile
@@ -20,7 +20,7 @@ RUN curl -fsSL https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-P
     wget -qO - https://repositories.intel.com/graphics/intel-graphics.key | gpg --dearmor --output /usr/share/keyrings/intel-graphics.gpg && \
     echo 'deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc' | tee /etc/apt/sources.list.d/intel.gpu.jammy.list && \
     rm /etc/apt/sources.list.d/intel-graphics.list && \
-    # Install PYTHON 3.9 and BigDL-LLM[xpu]
+    # Install PYTHON 3.9 and IPEX-LLM[xpu]
     ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone && \
     env DEBIAN_FRONTEND=noninteractive apt-get update && \
     apt install software-properties-common libunwind8-dev vim less -y && \
@@ -35,7 +35,7 @@ RUN curl -fsSL https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-P
     python3 get-pip.py && \
     rm get-pip.py && \
     pip install --upgrade requests argparse urllib3 && \
-    pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu && \
+    pip install --pre --upgrade ipex-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu && \
     # Fix Trivy CVE Issues
     pip install transformers==4.36.2 && \
     pip install transformers_stream_generator einops tiktoken && \
@@ -48,6 +48,6 @@ RUN curl -fsSL https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-P
     pip install --upgrade fastapi && \
     pip install --upgrade "uvicorn[standard]" && \
     # Download vLLM-Serving
-    git clone https://github.com/intel-analytics/BigDL && \
-    cp -r ./BigDL/python/llm/example/GPU/vLLM-Serving/ ./vLLM-Serving && \
-    rm -rf ./BigDL
+    git clone https://github.com/intel-analytics/IPEX-LLM && \
+    cp -r ./IPEX-LLM/python/llm/example/GPU/vLLM-Serving/ ./vLLM-Serving && \
+    rm -rf ./IPEX-LLM
diff --git a/docker/llm/inference/xpu/docker/README.md b/docker/llm/inference/xpu/docker/README.md
index e63d751a..c17787e6 100644
--- a/docker/llm/inference/xpu/docker/README.md
+++ b/docker/llm/inference/xpu/docker/README.md
@@ -1,4 +1,4 @@
-## Build/Use BigDL-LLM xpu image
+## Build/Use IPEX-LLM xpu image
 
 ### Build Image
 ```bash
@@ -6,7 +6,7 @@ docker build \
   --build-arg http_proxy=.. \
   --build-arg https_proxy=.. \
   --build-arg no_proxy=.. \
-  --rm --no-cache -t intelanalytics/bigdl-llm-xpu:2.5.0-SNAPSHOT .
+  --rm --no-cache -t intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT .
 ```
 
@@ -17,7 +17,7 @@ To map the `xpu` into the container, you need to specify `--device=/dev/dri` whe
 An example could be:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/bigdl-llm-xpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT
 
 sudo docker run -itd \
         --net=host \
@@ -42,4 +42,4 @@ root@arda-arc12:/# sycl-ls
 ```
 
-To run inference using `BigDL-LLM` using xpu, you could refer to this [documentation](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU).
+To run inference using `IPEX-LLM` on xpu, you could refer to this [documentation](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU).
diff --git a/docker/llm/serving/cpu/docker/Dockerfile b/docker/llm/serving/cpu/docker/Dockerfile
index 8346502d..6c5c4684 100644
--- a/docker/llm/serving/cpu/docker/Dockerfile
+++ b/docker/llm/serving/cpu/docker/Dockerfile
@@ -1,4 +1,4 @@
-FROM intelanalytics/bigdl-llm-cpu:2.5.0-SNAPSHOT
+FROM intelanalytics/ipex-llm-cpu:2.5.0-SNAPSHOT
 
 ARG http_proxy
 ARG https_proxy
@@ -12,7 +12,7 @@ COPY ./model_adapter.py.patch /llm/model_adapter.py.patch
 ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /sbin/tini
 # Install Serving Dependencies
 RUN cd /llm && \
-    pip install --pre --upgrade bigdl-llm[serving] && \
+    pip install --pre --upgrade ipex-llm[serving] && \
     # Fix Trivy CVE Issues
     pip install Jinja2==3.1.3 transformers==4.36.2 gradio==4.19.2 cryptography==42.0.4 && \
     # Fix Qwen model adpater in fastchat
diff --git a/docker/llm/serving/cpu/docker/README.md b/docker/llm/serving/cpu/docker/README.md
index bea11636..ec1b011d 100644
--- a/docker/llm/serving/cpu/docker/README.md
+++ b/docker/llm/serving/cpu/docker/README.md
@@ -1,4 +1,4 @@
-## Build/Use BigDL-LLM-serving cpu image
+## Build/Use IPEX-LLM-serving cpu image
 
 ### Build Image
 ```bash
@@ -6,7 +6,7 @@ docker build \
   --build-arg http_proxy=.. \
   --build-arg https_proxy=.. \
   --build-arg no_proxy=.. \
-  --rm --no-cache -t intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT .
+  --rm --no-cache -t intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT .
 ```
 
 ### Use the image for doing cpu serving
@@ -16,7 +16,7 @@ You could use the following bash script to start the container. Please be noted
 
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
 
 sudo docker run -itd \
         --net=host \
@@ -30,13 +30,13 @@ sudo docker run -itd \
 
 After the container is booted, you could get into the container through `docker exec`.
 
-To run model-serving using `BigDL-LLM` as backend, you can refer to this [document](https://github.com/intel-analytics/BigDL/tree/main/python/llm/src/bigdl/llm/serving).
+To run model-serving using `IPEX-LLM` as backend, you can refer to this [document](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/src/ipex_llm/serving).
 
 Also you can set environment variables and start arguments while running a container to get serving started initially. You may need to boot several containers to support. One controller container and at least one worker container are needed. The api server address(host and port) and controller address are set in controller container, and you need to set the same controller address as above, model path on your machine and worker address in worker container.
 
 To start a controller container:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
 controller_host=localhost
 controller_port=23000
 api_host=localhost
@@ -59,7 +59,7 @@ sudo docker run -itd \
 To start a worker container:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
 export MODEL_PATH=YOUR_MODEL_PATH
 controller_host=localhost
 controller_port=23000
@@ -94,4 +94,4 @@ curl -X POST -H "Content-Type: application/json" -d '{
   "use_beam_search": false,
   "stream": false
 }' http://localhost:8000/v1/completions
-```
\ No newline at end of file
+```
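Once a controller, a worker and the api server are up, a quick way to confirm the serving stack is wired together is to list the registered models before posting completions. A sketch, assuming the api server from the README above is listening on port 8000 and exposes the standard OpenAI-compatible `/v1/models` route:

```bash
# List models registered with the controller via the OpenAI-compatible api server;
# the port matches the api server settings shown in the README above.
curl http://localhost:8000/v1/models
```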
diff --git a/docker/llm/serving/cpu/docker/entrypoint.sh b/docker/llm/serving/cpu/docker/entrypoint.sh
index 36217dd2..2a137499 100644
--- a/docker/llm/serving/cpu/docker/entrypoint.sh
+++ b/docker/llm/serving/cpu/docker/entrypoint.sh
@@ -196,8 +196,8 @@ else
   else
     # Logic for non-controller(worker) mode
     worker_address="http://$worker_host:$worker_port"
-    # Apply optimizations from bigdl-llm
-    source bigdl-llm-init -t
+    # Apply optimizations from ipex-llm
+    source ipex-llm-init -t
     # First check if user have set OMP_NUM_THREADS by themselves
     if [[ -n "${omp_num_threads}" ]]; then
       echo "Setting OMP_NUM_THREADS to its original value: $omp_num_threads"
diff --git a/docker/llm/serving/cpu/kubernetes/README.md b/docker/llm/serving/cpu/kubernetes/README.md
index 5c08e00d..f8d745d3 100644
--- a/docker/llm/serving/cpu/kubernetes/README.md
+++ b/docker/llm/serving/cpu/kubernetes/README.md
@@ -1,8 +1,8 @@
-## Deployment bigdl-llm serving service in K8S environment
+## Deployment ipex-llm serving service in K8S environment
 
 ## Image
 
-To deploy BigDL-LLM-serving cpu in Kubernetes environment, please use this image: `intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT`
+To deploy IPEX-LLM-serving cpu in Kubernetes environment, please use this image: `intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT`
 
 ## Before deployment
 
@@ -10,12 +10,10 @@ To deploy BigDL-LLM-serving cpu in Kubernetes environment, please use this image
 ### Models
 
 In this document, we will use `vicuna-7b-v1.5` as the deployment model.
 
-After downloading the model, please change name from `vicuna-7b-v1.5` to `vicuna-7b-v1.5-bigdl` to use `bigdl-llm` as the backend. The `bigdl-llm` backend will be used if model path contains `bigdl`. Otherwise, the original transformer-backend will be used.
+After downloading the model, please change the name from `vicuna-7b-v1.5` to `vicuna-7b-v1.5-ipex-llm` to use `ipex-llm` as the backend. The `ipex-llm` backend will be used if the model path contains `ipex-llm`. Otherwise, the original transformer-backend will be used.
 
 You can download the model from [here](https://huggingface.co/lmsys/vicuna-7b-v1.5).
 
-For ChatGLM models, users do not need to add `bigdl` into model path. We have already used the `BigDL-LLM` backend for this model.
-
 ### Kubernetes config
 
 We recommend to setup your kubernetes cluster before deployment. Mostly importantly, please set `cpu-management-policy` to `static` by using this [tutorial](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/). Also, it would be great to also set the `topology management policy` to `single-numa-node`.
@@ -67,7 +65,7 @@ We use the following yaml file for controller deployment:
 apiVersion: v1
 kind: Pod
 metadata:
-  name: bigdl-fschat-a1234bd-controller
+  name: ipex-llm-fschat-a1234bd-controller
   labels:
     fastchat-appid: a1234bd
     fastchat-app-type: controller
@@ -75,7 +73,7 @@ spec:
   dnsPolicy: "ClusterFirst"
   containers:
   - name: fastchat-controller # fixed
-    image: intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT
+    image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
     imagePullPolicy: IfNotPresent
     env:
     - name: CONTROLLER_HOST # fixed
@@ -107,7 +105,7 @@ spec:
 apiVersion: v1
 kind: Service
 metadata:
-  name: bigdl-a1234bd-fschat-controller-service
+  name: ipex-llm-a1234bd-fschat-controller-service
 spec:
   # You may also want to change this to use the cluster's feature
   type: NodePort
@@ -133,7 +131,7 @@ We use the following deployment for worker deployment:
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: bigdl-fschat-a1234bd-worker-deployment
+  name: ipex-llm-fschat-a1234bd-worker-deployment
 spec:
   # Change this to the number you want
   replicas: 1
@@ -148,11 +146,11 @@ spec:
       dnsPolicy: "ClusterFirst"
       containers:
       - name: fastchat-worker # fixed
-        image: intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT
+        image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
         imagePullPolicy: IfNotPresent
         env:
         - name: CONTROLLER_HOST # fixed
-          value: bigdl-a1234bd-fschat-controller-service
+          value: ipex-llm-a1234bd-fschat-controller-service
        - name: CONTROLLER_PORT # fixed
          value: "21005"
        - name: WORKER_HOST # fixed
@@ -162,7 +160,7 @@ spec:
        - name: WORKER_PORT # fixed
          value: "21841"
        - name: MODEL_PATH
-          value: "/llm/models/vicuna-7b-v1.5-bigdl/" # change this to your model
+          value: "/llm/models/vicuna-7b-v1.5-ipex-llm/" # change this to your model
        - name: OMP_NUM_THREADS
          value: "16"
        resources:
@@ -190,7 +188,7 @@ You may want to change the `MODEL_PATH` variable in the yaml. Also, please reme
 We have set port using `GRADIO_PORT` envrionment variable in `deployment.yaml`, you can use this command
 
 ```bash
-k port-forward bigdl-fschat-a1234bd-controller --address 0.0.0.0 8002:8002
+k port-forward ipex-llm-fschat-a1234bd-controller --address 0.0.0.0 8002:8002
 ```
 
 Then visit http://YOUR_HOST_IP:8002 to access ui.
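If the UI does not come up, it usually helps to confirm the controller resources exist before port-forwarding. A short check, assuming the objects were created with the names and labels shown above (add `-n <namespace>` if you deployed into a non-default namespace):

```bash
# Verify the controller Pod and Service created from the yaml above.
kubectl get pod ipex-llm-fschat-a1234bd-controller
kubectl get svc ipex-llm-a1234bd-fschat-controller-service
# Worker pods are matched here on the fastchat-appid label, assuming the Deployment reuses it.
kubectl get pods -l fastchat-appid=a1234bd
```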
@@ -209,14 +207,14 @@ First, install openai-python:
 pip install --upgrade openai
 ```
 
-Then, interact with model vicuna-7b-v1.5-bigdl:
+Then, interact with model vicuna-7b-v1.5-ipex-llm:
 
 ```python
 import openai
 openai.api_key = "EMPTY"
 openai.api_base = "http://localhost:8000/v1"
 
-model = "vicuna-7b-v1.5-bigdl"
+model = "vicuna-7b-v1.5-ipex-llm"
 prompt = "Once upon a time"
 
 # create a completion
diff --git a/docker/llm/serving/cpu/kubernetes/deployment.yaml b/docker/llm/serving/cpu/kubernetes/deployment.yaml
index 1c58f811..71c3ad42 100644
--- a/docker/llm/serving/cpu/kubernetes/deployment.yaml
+++ b/docker/llm/serving/cpu/kubernetes/deployment.yaml
@@ -16,7 +16,7 @@ spec:
 apiVersion: v1
 kind: Pod
 metadata:
-  name: bigdl-fschat-a1234bd-controller
+  name: ipex-llm-fschat-a1234bd-controller
   labels:
     fastchat-appid: a1234bd
     fastchat-app-type: controller
@@ -24,7 +24,7 @@ spec:
   dnsPolicy: "ClusterFirst"
   containers:
   - name: fastchat-controller # fixed
-    image: intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT
+    image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
     imagePullPolicy: IfNotPresent
     env:
     - name: CONTROLLER_HOST # fixed
@@ -56,7 +56,7 @@ spec:
 apiVersion: v1
 kind: Service
 metadata:
-  name: bigdl-a1234bd-fschat-controller-service
+  name: ipex-llm-a1234bd-fschat-controller-service
 spec:
   # You may also want to change this to use the cluster's feature
   type: NodePort
@@ -76,7 +76,7 @@ spec:
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: bigdl-fschat-a1234bd-worker-deployment
+  name: ipex-llm-fschat-a1234bd-worker-deployment
 spec:
   # Change this to the number you want
   replicas: 1
@@ -91,11 +91,11 @@ spec:
       dnsPolicy: "ClusterFirst"
       containers:
       - name: fastchat-worker # fixed
-        image: intelanalytics/bigdl-llm-serving-cpu:2.5.0-SNAPSHOT
+        image: intelanalytics/ipex-llm-serving-cpu:2.5.0-SNAPSHOT
         imagePullPolicy: IfNotPresent
         env:
         - name: CONTROLLER_HOST # fixed
-          value: bigdl-a1234bd-fschat-controller-service
+          value: ipex-llm-a1234bd-fschat-controller-service
        - name: CONTROLLER_PORT # fixed
          value: "21005"
        - name: WORKER_HOST # fixed
@@ -105,7 +105,7 @@ spec:
        - name: WORKER_PORT # fixed
          value: "21841"
        - name: MODEL_PATH
-          value: "/llm/models/vicuna-7b-v1.5-bigdl/" # change this to your model
+          value: "/llm/models/vicuna-7b-v1.5-ipex-llm/" # change this to your model
        - name: OMP_NUM_THREADS
          value: "16"
        resources:
@@ -123,4 +123,4 @@ spec:
       volumes:
       - name: llm-models
         persistentVolumeClaim:
-          claimName: models-pvc
\ No newline at end of file
+          claimName: models-pvc
diff --git a/docker/llm/serving/xpu/docker/Dockerfile b/docker/llm/serving/xpu/docker/Dockerfile
index 42922562..87cb85b8 100644
--- a/docker/llm/serving/xpu/docker/Dockerfile
+++ b/docker/llm/serving/xpu/docker/Dockerfile
@@ -1,4 +1,4 @@
-FROM intelanalytics/bigdl-llm-xpu:2.5.0-SNAPSHOT
+FROM intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT
 
 ARG http_proxy
 ARG https_proxy
@@ -10,7 +10,7 @@ COPY ./entrypoint.sh /opt/entrypoint.sh
 
 # Install Serving Dependencies
 RUN cd /llm && \
-    pip install --pre --upgrade bigdl-llm[serving] && \
+    pip install --pre --upgrade ipex-llm[serving] && \
     pip install transformers==4.36.2 gradio==4.19.2 && \
     chmod +x /opt/entrypoint.sh
 
diff --git a/docker/llm/serving/xpu/docker/README.md b/docker/llm/serving/xpu/docker/README.md
index 85b266bc..16822dff 100644
--- a/docker/llm/serving/xpu/docker/README.md
+++ b/docker/llm/serving/xpu/docker/README.md
@@ -1,4 +1,4 @@
-## Build/Use BigDL-LLM-serving xpu image
+## Build/Use IPEX-LLM-serving xpu image
 
 ### Build Image
 ```bash
@@ -6,7 +6,7 @@ docker build \
   --build-arg http_proxy=.. \
   --build-arg https_proxy=.. \
   --build-arg no_proxy=.. \
-  --rm --no-cache -t intelanalytics/bigdl-llm-serving-xpu:2.5.0-SNAPSHOT .
+  --rm --no-cache -t intelanalytics/ipex-llm-serving-xpu:2.5.0-SNAPSHOT .
 ```
 
@@ -18,7 +18,7 @@ To map the `xpu` into the container, you need to specify `--device=/dev/dri` whe
 An example could be:
 ```bash
 #/bin/bash
-export DOCKER_IMAGE=intelanalytics/bigdl-llm-serving-xpu:2.5.0-SNAPSHOT
+export DOCKER_IMAGE=intelanalytics/ipex-llm-serving-xpu:2.5.0-SNAPSHOT
 
 sudo docker run -itd \
         --net=host \
@@ -43,4 +43,4 @@ root@arda-arc12:/# sycl-ls
 ```
 
 After the container is booted, you could get into the container through `docker exec`.
-To run model-serving using `BigDL-LLM` as backend, you can refer to this [document](https://github.com/intel-analytics/BigDL/tree/main/python/llm/src/bigdl/llm/serving).
+To run model-serving using `IPEX-LLM` as backend, you can refer to this [document](https://github.com/intel-analytics/IPEX-LLM/tree/main/python/llm/src/ipex_llm/serving).