Update k8s doc - remove env from yaml (#7670)
This commit is contained in:

parent 48f5144a34
commit 51b8ff3728

1 changed file with 17 additions and 42 deletions

@@ -138,16 +138,15 @@ def train_data_creator(config, batch_size):
### 2. Pull Docker Image

Please pull the BigDL [`bigdl-k8s`](https://hub.docker.com/r/intelanalytics/bigdl-k8s/tags) image (built on top of Spark 3.1.3) from Docker Hub beforehand as follows:

```bash
# For the release version, e.g. 2.2.0
sudo docker pull intelanalytics/bigdl-k8s:version

# For the latest nightly build version
sudo docker pull intelanalytics/bigdl-k8s:latest

# For the release version, e.g. 2.2.0
sudo docker pull intelanalytics/bigdl-k8s:2.2.0
```

In the BigDL K8s Docker image:
* Spark is located at `/opt/spark`. The Spark version is 3.1.3.
* BigDL is located at `/opt/bigdl-VERSION`. For the latest nightly build image, the BigDL version would be `xxx-SNAPSHOT` (e.g. 2.3.0-SNAPSHOT).
* The environment variables for Spark (including SPARK_VERSION and SPARK_HOME) and BigDL (including BIGDL_VERSION and BIGDL_HOME) are already configured in the BigDL K8s Docker image.
* Spark executor containers are scheduled by K8s at runtime and you don't need to create them manually.

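Since these environment variables are baked into the image, a quick sanity check can confirm them before you start a long-running container. This is a minimal sketch, not part of the original tutorial, and it assumes the image lets you run an ad-hoc `bash` command:

```bash
# Hypothetical check: print the preconfigured environment and install locations in the image.
sudo docker run --rm intelanalytics/bigdl-k8s:latest \
  bash -c 'echo "SPARK_HOME=$SPARK_HOME ($SPARK_VERSION)"; echo "BIGDL_HOME=$BIGDL_HOME ($BIGDL_VERSION)"; ls /opt'
```
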
---

## 3. Create BigDL K8s Container

@@ -168,29 +167,25 @@ sudo docker run -itd --net=host \
    -e https_proxy=https://your-proxy-host:your-proxy-port \
    -e RUNTIME_SPARK_MASTER=k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
    -e RUNTIME_K8S_SERVICE_ACCOUNT=spark \
    -e RUNTIME_K8S_SPARK_IMAGE=intelanalytics/bigdl-k8s:latest \
    -e RUNTIME_K8S_SPARK_IMAGE=intelanalytics/bigdl-k8s:version \
    -e RUNTIME_PERSISTENT_VOLUME_CLAIM=nfsvolumeclaim \
    -e RUNTIME_DRIVER_HOST=${RUNTIME_DRIVER_HOST} \
    intelanalytics/bigdl-k8s:latest bash
    intelanalytics/bigdl-k8s:version bash
```

In the script:
* **Please switch the version tag according to the BigDL K8s Docker image you pull.**
* **Please modify the version tag according to the BigDL K8s Docker image you pull.**
* **Please make sure you are mounting the correct Volume path (e.g. NFS) into the container.**
* `--net=host`: use the host network stack for the Docker container.
* `-v /etc/kubernetes:/etc/kubernetes`: specify the path of the Kubernetes configurations to mount into the Docker container.
* `-v /root/.kube:/root/.kube`: specify the path of the Kubernetes installation to mount into the Docker container.
* `-v /path/to/nfsdata:/bigdl/nfsdata`: mount the NFS path on the host into the Docker container at the specified path (e.g. "/bigdl/nfsdata").
* `RUNTIME_SPARK_MASTER`: the URL of the Spark master: `k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>`.
* `RUNTIME_K8S_SERVICE_ACCOUNT`: a string that specifies the service account for the driver pod.
* `RUNTIME_K8S_SERVICE_ACCOUNT`: the service account for the driver pod.
* `RUNTIME_K8S_SPARK_IMAGE`: the name of the BigDL K8s Docker image. Note that you need to change the version accordingly.
* `RUNTIME_PERSISTENT_VOLUME_CLAIM`: a string that specifies the Kubernetes volumeName (e.g. "nfsvolumeclaim").
* `RUNTIME_PERSISTENT_VOLUME_CLAIM`: the Kubernetes volumeName (e.g. "nfsvolumeclaim").
* `RUNTIME_DRIVER_HOST`: the IP address or hostname of the host where the driver runs (only required if you use k8s-client mode); see the sketch after the notes below.

__Notes:__
* The __Client Container__ already contains all the required environment configurations for Spark and BigDL Orca.
* Spark executor containers are scheduled by K8s at runtime and you don't need to create them manually.

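As referenced above, `RUNTIME_DRIVER_HOST` should resolve to the host that will run the driver. A minimal sketch of setting it before `docker run` (this helper line is an assumption here; the same expression is used later in the tutorial's YAML):

```bash
# Assumes the first address reported by `hostname -I` is reachable from the K8s cluster.
export RUNTIME_DRIVER_HOST=$( hostname -I | awk '{print $1}' )
```
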
### 3.2 Launch the K8s Client Container
Once the container is created, a `containerID` is returned, with which you can enter the container using the command below:
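
A typical way to enter it looks like the following sketch (the actual command in the tutorial may differ; `<containerID>` is whatever `docker run` printed):

```bash
# Hypothetical example: attach an interactive shell to the client container.
sudo docker exec -it <containerID> bash
```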
@@ -512,7 +507,7 @@ We define a Kubernetes Deployment in a YAML file. Some fields of the YAML are ex

#### 7.3.1 K8s Client
BigDL has provided an example [orca-tutorial-k8s-client.yaml](https://github.com/intel-analytics/BigDL/blob/main/python/orca/tutorial/pytorch/docker/orca-tutorial-client.yaml) to directly run the Fashion-MNIST example in k8s-client mode.
Note that you need to change the configurations in the YAML file accordingly, including the version of the Docker image, RUNTIME_SPARK_MASTER, BIGDL_VERSION and BIGDL_HOME.
The environment variables for Spark (including SPARK_VERSION and SPARK_HOME) and BigDL (including BIGDL_VERSION and BIGDL_HOME) are already configured in the BigDL K8s Docker image.

You need to uncompress the conda archive on NFS before submitting the job:
```bash
@@ -521,7 +516,7 @@ mkdir environment
tar -xzvf environment.tar.gz --directory environment
```
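
Before submitting, it can help to verify that the unpacked environment is visible at the path the job will use. This is an optional sketch, assuming the NFS share is mounted at `/bigdl/nfsdata` as in this tutorial:

```bash
# The interpreter path below matches the spark.pyspark.python setting used later in the YAML.
ls -l /bigdl/nfsdata/environment/bin/python
/bigdl/nfsdata/environment/bin/python --version
```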

orca-tutorial-k8s-client.yaml
*orca-tutorial-k8s-client.yaml*

```bash
apiVersion: batch/v1
@@ -543,7 +538,7 @@ spec:
                export RUNTIME_DRIVER_HOST=$( hostname -I | awk '{print $1}' );
                ${SPARK_HOME}/bin/spark-submit \
                --master ${RUNTIME_SPARK_MASTER} \
                --deploy-mode ${SPARK_MODE} \
                --deploy-mode client \
                --name orca-k8s-client-tutorial \
                --conf spark.driver.host=${RUNTIME_DRIVER_HOST} \
                --conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
@@ -557,9 +552,9 @@ spec:
                --conf spark.pyspark.python=/bigdl/nfsdata/environment/bin/python \
                --properties-file ${BIGDL_HOME}/conf/spark-bigdl.conf \
                --py-files ${BIGDL_HOME}/python/bigdl-spark_${SPARK_VERSION}-${BIGDL_VERSION}-python-api.zip,/bigdl/nfsdata/model.py \
                --conf spark.kubernetes.executor.deleteOnTermination=True \
                --conf spark.driver.extraClassPath=${BIGDL_HOME}/jars/* \
                --conf spark.executor.extraClassPath=${BIGDL_HOME}/jars/* \
                --conf spark.kubernetes.executor.deleteOnTermination=True \
                --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.nfsvolumeclaim.options.claimName=nfsvolumeclaim \
                --conf spark.kubernetes.driver.volumes.persistentVolumeClaim.nfsvolumeclaim.mount.path=/bigdl/nfsdata/ \
                --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.nfsvolumeclaim.options.claimName=nfsvolumeclaim \
@@ -575,16 +570,6 @@ spec:
            value: intelanalytics/bigdl-k8s:latest
          - name: RUNTIME_SPARK_MASTER
            value: k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>
          - name: SPARK_MODE
            value: client
          - name: SPARK_VERSION
            value: 3.1.3
          - name: SPARK_HOME
            value: /opt/spark
          - name: BIGDL_VERSION
            value: 2.2.0-SNAPSHOT
          - name: BIGDL_HOME
            value: /opt/bigdl-2.2.0-SNAPSHOT
        volumeMounts:
          - name: nfs-storage
            mountPath: /bigdl/nfsdata
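
For reference, a trimmed-down sketch of what the `env` section of the client YAML can look like when the Spark/BigDL variables are taken from the image rather than declared per job. This is an assumption assembled from the context lines above, not a verbatim copy of the tutorial file:

```bash
        env:
          - name: RUNTIME_K8S_SPARK_IMAGE
            value: intelanalytics/bigdl-k8s:latest   # switch to the tag you pulled
          - name: RUNTIME_SPARK_MASTER
            value: k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>
```
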
@@ -626,9 +611,9 @@ kubectl delete job orca-pytorch-job

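A hedged end-to-end sketch of running the client job with `kubectl` (the job name `orca-pytorch-job` matches the tutorial's cleanup command above; the apply/get/logs lines are assumptions, and pod names will vary):

```bash
# Submit the Fashion-MNIST job defined in the client YAML.
kubectl apply -f orca-tutorial-k8s-client.yaml

# Watch the job's pods and follow the driver output.
kubectl get pods -o wide | grep orca
kubectl logs -f <orca-pytorch-job-pod-name>

# Clean up when finished.
kubectl delete job orca-pytorch-job
```
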
#### 7.3.2 K8s Cluster
BigDL has provided an example [orca-tutorial-k8s-cluster.yaml](https://github.com/intel-analytics/BigDL/blob/main/python/orca/tutorial/pytorch/docker/orca-tutorial-cluster.yaml) to run the Fashion-MNIST example in k8s-cluster mode.
Note that you need to change the configurations in the YAML file accordingly, including the version of the Docker image, RUNTIME_SPARK_MASTER, BIGDL_VERSION and BIGDL_HOME.
The environment variables for Spark (including SPARK_VERSION and SPARK_HOME) and BigDL (including BIGDL_VERSION and BIGDL_HOME) are already configured in the BigDL K8s Docker image.

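Submission works the same way as in client mode, except that the driver itself runs in a pod scheduled by K8s. A hedged sketch of checking the driver pod (the application name `orca-k8s-cluster-tutorial` comes from the spark-submit command in the YAML below; the exact pod name will differ):

```bash
# Apply the cluster-mode YAML, then locate and inspect the driver pod.
kubectl apply -f orca-tutorial-k8s-cluster.yaml
kubectl get pods | grep orca-k8s-cluster-tutorial
kubectl logs -f <driver-pod-name>

# Optional spot check: the driver pod inherits SPARK_HOME and BIGDL_HOME from the image.
kubectl exec -it <driver-pod-name> -- bash -c 'echo $SPARK_HOME $BIGDL_HOME'
```
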
orca-tutorial-k8s-cluster.yaml
*orca-tutorial-k8s-cluster.yaml*

```bash
apiVersion: batch/v1
@@ -650,7 +635,7 @@ spec:
                ${SPARK_HOME}/bin/spark-submit \
                --master ${RUNTIME_SPARK_MASTER} \
                --name orca-k8s-cluster-tutorial \
                --deploy-mode ${SPARK_MODE} \
                --deploy-mode cluster \
                --conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
                --conf spark.kubernetes.authenticate.driver.serviceAccountName=${RUNTIME_K8S_SERVICE_ACCOUNT} \
                --num-executors 2 \
@@ -685,16 +670,6 @@ spec:
            value: k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>
          - name: RUNTIME_K8S_SERVICE_ACCOUNT
            value: spark
          - name: SPARK_MODE
            value: cluster
          - name: SPARK_VERSION
            value: 3.1.3
          - name: SPARK_HOME
            value: /opt/spark
          - name: BIGDL_VERSION
            value: 2.2.0-SNAPSHOT
          - name: BIGDL_HOME
            value: /opt/bigdl-2.2.0-SNAPSHOT
        volumeMounts:
          - name: nfs-storage
            mountPath: /bigdl/nfsdata