Update k8s command (#7532)
* remove redundant conf
* update command
* update command
* format
* change to num executor
* fix
* minor
* fix
* modify cores
* remove pythonhome
* meet review
* minor
* rephase
* minor
* minor
* update yarn master
* update args
parent 7727b4c9ba
commit 2e1d977e08

2 changed files with 78 additions and 77 deletions
@@ -159,12 +159,6 @@ sudo docker run -itd --net=host \
-e RUNTIME_PERSISTENT_VOLUME_CLAIM=nfsvolumeclaim \
-e RUNTIME_DRIVER_HOST=x.x.x.x \
-e RUNTIME_DRIVER_PORT=54321 \
- -e RUNTIME_EXECUTOR_INSTANCES=2 \
- -e RUNTIME_EXECUTOR_CORES=4 \
- -e RUNTIME_EXECUTOR_MEMORY=2g \
- -e RUNTIME_TOTAL_EXECUTOR_CORES=8 \
- -e RUNTIME_DRIVER_CORES=2 \
- -e RUNTIME_DRIVER_MEMORY=2g \
intelanalytics/bigdl-k8s:latest bash
```
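For quick reference, after this change the launch command reduces to something like the sketch below (options earlier in the `docker run` command fall outside this hunk and are omitted; `x.x.x.x` stays a placeholder for the driver host):

```bash
sudo docker run -itd --net=host \
    -e RUNTIME_PERSISTENT_VOLUME_CLAIM=nfsvolumeclaim \
    -e RUNTIME_DRIVER_HOST=x.x.x.x \
    -e RUNTIME_DRIVER_PORT=54321 \
    intelanalytics/bigdl-k8s:latest bash
```

The executor and driver resource settings move out of the container environment and into the `spark-submit` arguments shown later in this diff.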
@@ -177,22 +171,16 @@ In the script:
* `-v /path/to/nfsdata:/bigdl/nfsdata`: mount NFS path on the host into the container as the specified path (e.g. "/bigdl/nfsdata").
* `NOTEBOOK_PORT`: an integer that specifies the port number for the Notebook. This is not necessary if you don't use notebook.
* `NOTEBOOK_TOKEN`: a string that specifies the token for Notebook. This is not necessary if you don't use notebook.
- * `RUNTIME_SPARK_MASTER`: a URL format that specifies the Spark master: k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>.
+ * `RUNTIME_SPARK_MASTER`: a URL format that specifies the Spark master: `k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>`.
* `RUNTIME_K8S_SERVICE_ACCOUNT`: a string that specifies the service account for the driver pod.
* `RUNTIME_K8S_SPARK_IMAGE`: the name of the BigDL K8s Docker image.
* `RUNTIME_PERSISTENT_VOLUME_CLAIM`: a string that specifies the Kubernetes volumeName (e.g. "nfsvolumeclaim").
* `RUNTIME_DRIVER_HOST`: a URL format that specifies the driver localhost (only required if you use k8s-client mode).
* `RUNTIME_DRIVER_PORT`: a string that specifies the driver port (only required if you use k8s-client mode).
- * `RUNTIME_EXECUTOR_INSTANCES`: an integer that specifies the number of executors.
- * `RUNTIME_EXECUTOR_CORES`: an integer that specifies the number of cores for each executor.
- * `RUNTIME_EXECUTOR_MEMORY`: a string that specifies the memory for each executor.
- * `RUNTIME_TOTAL_EXECUTOR_CORES`: an integer that specifies the number of cores for all executors.
- * `RUNTIME_DRIVER_CORES`: an integer that specifies the number of cores for the driver node.
- * `RUNTIME_DRIVER_MEMORY`: a string that specifies the memory for the driver node.

__Notes:__
- * The __Client Container__ contains all the required environment except K8s configurations.
- * You don't need to create Spark executor containers manually, which are scheduled by K8s at runtime.
+ * The __Client Container__ already contains all the required environment configurations for Spark and BigDL Orca.
+ * Spark executor containers are scheduled by K8s at runtime and you don't need to create them manually.

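As a hedged aside, one way to find the `<k8s-apiserver-host>:<k8s-apiserver-port>` part of `RUNTIME_SPARK_MASTER` (assuming `kubectl` is configured on the host) is:

```bash
# prints the control plane endpoint, e.g. https://<k8s-apiserver-host>:<k8s-apiserver-port>
kubectl cluster-info
```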
### 2.3 Launch the K8s Client Container
@@ -209,7 +197,7 @@ In the launched BigDL K8s **Client Container**, please setup the environment fol

- See [here](../Overview/install.md#install-anaconda) to install conda and prepare the Python environment.

- - See [here](../Overview/install.md#to-install-orca-for-spark3) to install BigDL Orca in the created conda environment.
+ - See [here](../Overview/install.md#to-install-orca-for-spark3) to install BigDL Orca in the created conda environment. *Note that if you use [`spark-submit`](#use-spark-submit), please __skip__ this step and __DO NOT__ install BigDL Orca with the pip install command in the conda environment.*

- You should install all the other Python libraries that you need in your program in the conda environment as well. `torch` and `torchvision` are needed to run the Fashion-MNIST example we provide:
```bash
@@ -339,34 +327,42 @@ python /bigdl/nfsdata/train.py --cluster_mode k8s-cluster --data_dir /bigdl/nfsd

### 6.2 Use `spark-submit`

- Set the cluster_mode to "bigdl-submit" in `init_orca_context`.
- ```python
- init_orca_context(cluster_mode="spark-submit")
- ```
If you prefer to use `spark-submit`, please follow the steps below to prepare the environment in the __Client Container__.

- Pack the current activate conda environment to an archive in the __Client Container__:
- ```bash
- conda pack -o environment.tar.gz
- ```
+ 1. Set the cluster_mode to "spark-submit" in `init_orca_context`.
+ ```python
+ sc = init_orca_context(cluster_mode="spark-submit")
+ ```
+
+ 2. Download the requirement file(s) from [here](https://github.com/intel-analytics/BigDL/tree/main/python/requirements/orca) and install the required Python libraries of BigDL Orca according to your needs.
+ ```bash
+ pip install -r /path/to/requirements.txt
+ ```
+ Note that we recommend you do **NOT** install BigDL Orca with the pip install command in the conda environment if you use spark-submit, to avoid possible conflicts.
+
+ 3. Pack the currently active conda environment to an archive before submitting the example:
+ ```bash
+ conda pack -o environment.tar.gz
+ ```
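Taken together, steps 2 and 3 amount to the following sketch, run inside the activated conda environment (the environment name `bigdl` and the requirements path are placeholder assumptions):

```bash
conda activate bigdl                       # the conda environment prepared earlier
pip install -r /path/to/requirements.txt   # step 2: install Orca's Python dependencies
conda pack -o environment.tar.gz           # step 3: archive the environment for the cluster
```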

Some runtime configurations for Spark are as follows:

* `--master`: a URL format that specifies the Spark master: k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port>.
* `--name`: the name of the Spark application.
* `--conf spark.kubernetes.container.image`: the name of the BigDL K8s Docker image.
* `--conf spark.kubernetes.authenticate.driver.serviceAccountName`: the service account for the driver pod.
- * `--conf spark.executor.instances`: the number of executors.
- * `--executor-memory`: the memory for each executor.
- * `--driver-memory`: the memory for the driver node.
+ * `--num-executors`: the number of executors.
+ * `--executor-cores`: the number of cores for each executor.
+ * `--total-executor-cores`: the total number of executor cores.
+ * `--executor-memory`: the memory for each executor.
+ * `--driver-cores`: the number of cores for the driver.
+ * `--driver-memory`: the memory for the driver.
* `--properties-file`: the BigDL configuration properties to be uploaded to K8s.
* `--py-files`: the extra Python dependency files to be uploaded to K8s.
* `--archives`: the conda archive to be uploaded to K8s.
* `--conf spark.driver.extraClassPath`: upload and register BigDL jar files to the driver's classpath.
* `--conf spark.executor.extraClassPath`: upload and register BigDL jar files to the executors' classpath.
* `--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName`: specify the claim name of `persistentVolumeClaim` to mount `persistentVolume` into executor pods.
- * `--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path`: specify the path to be mounted as `persistentVolumeClaim` to executor pods.
+ * `--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path`: specify the path to be mounted as `persistentVolumeClaim` into executor pods.
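The submit scripts below read the `RUNTIME_*` variables that the `docker run -e ...` flags shown earlier inject into the Client Container. If you are reproducing the scripts elsewhere, a sketch of setting them manually (every value here is an assumption; the service account name in particular depends on your cluster):

```bash
export RUNTIME_K8S_SPARK_IMAGE=intelanalytics/bigdl-k8s:latest
export RUNTIME_K8S_SERVICE_ACCOUNT=spark                 # assumed service account name
export RUNTIME_PERSISTENT_VOLUME_CLAIM=nfsvolumeclaim
export RUNTIME_DRIVER_HOST=x.x.x.x                       # placeholder for the driver host
export RUNTIME_DRIVER_PORT=54321
```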


#### 6.2.1 K8s Client
@@ -378,19 +374,17 @@ ${SPARK_HOME}/bin/spark-submit \
--name orca-k8s-client-tutorial \
--conf spark.driver.host=${RUNTIME_DRIVER_HOST} \
--conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=${RUNTIME_K8S_SERVICE_ACCOUNT} \
- --conf spark.executor.instances=${RUNTIME_EXECUTOR_INSTANCES} \
- --driver-cores ${RUNTIME_DRIVER_CORES} \
- --driver-memory ${RUNTIME_DRIVER_MEMORY} \
- --executor-cores ${RUNTIME_EXECUTOR_CORES} \
- --executor-memory ${RUNTIME_EXECUTOR_MEMORY} \
- --total-executor-cores ${RUNTIME_TOTAL_EXECUTOR_CORES} \
- --properties-file ${BIGDL_HOME}/conf/spark-bigdl.conf \
+ --num-executors 2 \
+ --executor-cores 4 \
+ --total-executor-cores 8 \
+ --executor-memory 2g \
+ --driver-cores 2 \
+ --driver-memory 2g \
- --archives /path/to/environment.tar.gz#environment \
--conf spark.pyspark.driver.python=python \
--conf spark.pyspark.python=./environment/bin/python \
+ --archives /path/to/environment.tar.gz#environment \
+ --properties-file ${BIGDL_HOME}/conf/spark-bigdl.conf \
- --py-files ${BIGDL_HOME}/python/bigdl-spark_${SPARK_VERSION}-${BIGDL_VERSION}-python-api.zip,/path/to/train.py,/path/to/model.py \
+ --py-files ${BIGDL_HOME}/python/bigdl-spark_${SPARK_VERSION}-${BIGDL_VERSION}-python-api.zip,/path/to/model.py \
--conf spark.driver.extraClassPath=${BIGDL_HOME}/jars/* \
--conf spark.executor.extraClassPath=${BIGDL_HOME}/jars/* \
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
@@ -429,18 +423,18 @@ ${SPARK_HOME}/bin/spark-submit \
--name orca-k8s-cluster-tutorial \
--conf spark.kubernetes.container.image=${RUNTIME_K8S_SPARK_IMAGE} \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=${RUNTIME_K8S_SERVICE_ACCOUNT} \
- --conf spark.executor.instances=${RUNTIME_EXECUTOR_INSTANCES} \
+ --num-executors 2 \
+ --executor-cores 4 \
+ --total-executor-cores 8 \
+ --executor-memory 2g \
+ --driver-cores 2 \
+ --driver-memory 2g \
--archives file:///bigdl/nfsdata/environment.tar.gz#environment \
--conf spark.pyspark.driver.python=environment/bin/python \
--conf spark.pyspark.python=environment/bin/python \
- --conf spark.executorEnv.PYTHONHOME=environment \
--conf spark.kubernetes.file.upload.path=/bigdl/nfsdata \
- --executor-cores ${RUNTIME_EXECUTOR_CORES} \
- --executor-memory ${RUNTIME_EXECUTOR_MEMORY} \
- --total-executor-cores ${RUNTIME_TOTAL_EXECUTOR_CORES} \
- --driver-cores ${RUNTIME_DRIVER_CORES} \
- --driver-memory ${RUNTIME_DRIVER_MEMORY} \
--properties-file ${BIGDL_HOME}/conf/spark-bigdl.conf \
- --py-files local://${BIGDL_HOME}/python/bigdl-spark_3.1.2-2.1.0-SNAPSHOT-python-api.zip,file:///bigdl/nfsdata/train.py,file:///bigdl/nfsdata/model.py \
+ --py-files ${BIGDL_HOME}/python/bigdl-spark_${SPARK_VERSION}-${BIGDL_VERSION}-python-api.zip,file:///bigdl/nfsdata/train.py,file:///bigdl/nfsdata/model.py \
--conf spark.driver.extraClassPath=local://${BIGDL_HOME}/jars/* \
--conf spark.executor.extraClassPath=local://${BIGDL_HOME}/jars/* \
--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName=${RUNTIME_PERSISTENT_VOLUME_CLAIM} \
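After submitting in k8s-cluster mode, the driver runs in a pod instead of the Client Container, so its output is checked with `kubectl`. A sketch (Spark on K8s normally derives the driver pod name from the application name, so the exact name is an assumption):

```bash
# list the pods of this application; the driver pod name typically ends with "-driver"
kubectl get pods | grep orca-k8s-cluster-tutorial
# follow the driver log, substituting the real pod name from the command above
kubectl logs -f <driver-pod-name>
```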
@@ -452,9 +446,12 @@ ${SPARK_HOME}/bin/spark-submit \

In the `spark-submit` script:
* `deploy-mode`: set it to `cluster` when running programs on k8s-cluster mode.
- * `spark.pyspark.python`: sset the Python location in conda archive as each executor's Python environment.
- * `spark.executorEnv.PYTHONHOME`: the search path of Python libraries on executor pods.
- * `spark.kubernetes.file.upload.path`: the path to store files at spark submit side in k8s-cluster mode.
+ * `--conf spark.kubernetes.authenticate.driver.serviceAccountName`: the service account for the driver pod.
+ * `--conf spark.pyspark.driver.python`: set the Python location in conda archive as the driver's Python environment.
+ * `--conf spark.pyspark.python`: also set the Python location in conda archive as each executor's Python environment.
+ * `--conf spark.kubernetes.file.upload.path`: the path to store files at spark submit side in k8s-cluster mode.
+ * `--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.options.claimName`: specify the claim name of `persistentVolumeClaim` to mount `persistentVolume` into the driver pod.
+ * `--conf spark.kubernetes.driver.volumes.persistentVolumeClaim.${RUNTIME_PERSISTENT_VOLUME_CLAIM}.mount.path`: specify the path to be mounted as `persistentVolumeClaim` into the driver pod.


### 6.3 Use Kubernetes Deployment (with Conda Archive)
@@ -91,7 +91,7 @@ __Note__:
### 2.2 Install Python Libraries
- See [here](../Overview/install.md#install-anaconda) to install conda and prepare the Python environment on the __Client Node__.

- - See [here](../Overview/install.md#install-bigdl-orca) to install BigDL Orca in the created conda environment. Note that if you use [`spark-submit`](#use-spark-submit), please skip this step and __DO NOT__ install BigDL Orca.
+ - See [here](../Overview/install.md#install-bigdl-orca) to install BigDL Orca in the created conda environment. *Note that if you use [`spark-submit`](#use-spark-submit), please __skip__ this step and __DO NOT__ install BigDL Orca with the pip install command in the conda environment.*

- You should install all the other Python libraries that you need in your program in the conda environment as well. `torch` and `torchvision` are needed to run the Fashion-MNIST example:
```bash
@@ -233,10 +233,12 @@ conda pack -o environment.tar.gz

Some runtime configurations for Spark are as follows:

- * `--executor-memory`: the memory for each executor.
- * `--driver-memory`: the memory for the driver node.
- * `--executor-cores`: the number of cores for each executor.
+ * `--master`: the spark master, set it to "yarn".
+ * `--num-executors`: the number of executors.
+ * `--executor-cores`: the number of cores for each executor.
+ * `--executor-memory`: the memory for each executor.
+ * `--driver-cores`: the number of cores for the driver.
+ * `--driver-memory`: the memory for the driver.
* `--py-files`: the extra Python dependency files to be uploaded to YARN.
* `--archives`: the conda archive to be uploaded to YARN.
@@ -246,10 +248,11 @@ Submit and run the example for `yarn-client` mode following the `bigdl-submit` s
bigdl-submit \
--master yarn \
--deploy-mode client \
- --executor-memory 2g \
- --driver-memory 2g \
- --executor-cores 4 \
+ --num-executors 2 \
+ --executor-cores 4 \
+ --executor-memory 2g \
+ --driver-cores 2 \
+ --driver-memory 2g \
--py-files model.py \
--archives /path/to/environment.tar.gz#environment \
--conf spark.pyspark.driver.python=/path/to/python \
@@ -257,7 +260,6 @@ bigdl-submit \
train.py --cluster_mode bigdl-submit --data_dir hdfs://path/to/remote/data
```
In the `bigdl-submit` script:
- * `--master`: the spark master, set it to "yarn".
* `--deploy-mode`: set it to `client` when running programs on yarn-client mode.
* `--conf spark.pyspark.driver.python`: set the active Python location on __Client Node__ as the driver's Python environment. You can find it by running `which python`.
* `--conf spark.pyspark.python`: set the Python location in conda archive as each executor's Python environment.
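A small sketch of the `which python` lookup mentioned above (the printed path is an example):

```bash
# run on the Client Node with the conda environment activated
which python
# e.g. ~/anaconda3/envs/bigdl/bin/python; pass this path to
# --conf spark.pyspark.driver.python in the bigdl-submit script
```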
@@ -269,10 +271,11 @@ Submit and run the program for `yarn-cluster` mode following the `bigdl-submit` 
bigdl-submit \
--master yarn \
--deploy-mode cluster \
- --executor-memory 2g \
- --driver-memory 2g \
- --executor-cores 4 \
+ --num-executors 2 \
+ --executor-cores 4 \
+ --executor-memory 2g \
+ --driver-cores 2 \
+ --driver-memory 2g \
--py-files model.py \
--archives /path/to/environment.tar.gz#environment \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=environment/bin/python \
@@ -280,7 +283,6 @@ bigdl-submit \
train.py --cluster_mode bigdl-submit --data_dir hdfs://path/to/remote/data
```
In the `bigdl-submit` script:
- * `--master`: the spark master, set it to "yarn".
* `--deploy-mode`: set it to `cluster` when running programs on yarn-cluster mode.
* `--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON`: set the Python location in conda archive as the Python environment of the Application Master.
* `--conf spark.executorEnv.PYSPARK_PYTHON`: also set the Python location in conda archive as each executor's Python environment. The Application Master and the executors will all use the archive for the Python environment.
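In yarn-cluster mode the driver output goes to the YARN container logs rather than to the Client Node console. A sketch of retrieving it (the application ID is a placeholder; take the real one from the submission output or the ResourceManager UI):

```bash
# fetch the aggregated logs of a running or finished application
yarn logs -applicationId application_1234567890123_0001
```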
@@ -294,11 +296,11 @@ If you prefer to use `spark-submit` instead of `bigdl-submit`, please follow the
sc = init_orca_context(cluster_mode="spark-submit")
```

- 2. Download the requirement file from [here](https://github.com/intel-analytics/BigDL/tree/main/python/requirements/orca) and install the required Python libraries of BigDL Orca according to your needs.
+ 2. Download the requirement file(s) from [here](https://github.com/intel-analytics/BigDL/tree/main/python/requirements/orca) and install the required Python libraries of BigDL Orca according to your needs.
```bash
pip install -r /path/to/requirements.txt
```
- Note that you are recommended **NOT** to install BigDL Orca in the conda environment if you use spark-submit to avoid possible conflicts.
+ Note that we recommend you do **NOT** install BigDL Orca with the pip install command in the conda environment if you use spark-submit, to avoid possible conflicts.

3. Pack the currently active conda environment to an archive before submitting the example:
```bash
@@ -307,22 +309,24 @@ If you prefer to use `spark-submit` instead of `bigdl-submit`, please follow the

4. Download the BigDL assembly package from [here](../Overview/install.html#download-bigdl-orca) and unzip it. Then set up the environment variables `${BIGDL_HOME}` and `${BIGDL_VERSION}`.
```bash
- export BIGDL_HOME=/path/to/unzipped_BigDL # the folder path where you extract the BigDL package
export BIGDL_VERSION="downloaded BigDL version"
+ export BIGDL_HOME=/path/to/unzipped_BigDL # the folder path where you extract the BigDL package
```

5. Download and extract [Spark](https://archive.apache.org/dist/spark/). BigDL is currently released for [Spark 2.4](https://archive.apache.org/dist/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz) and [Spark 3.1](https://archive.apache.org/dist/spark/spark-3.1.3/spark-3.1.3-bin-hadoop2.7.tgz). Make sure the version of your downloaded Spark matches the one that your downloaded BigDL is released with. Then set up the environment variables `${SPARK_HOME}` and `${SPARK_VERSION}`.
```bash
- export SPARK_HOME=/path/to/uncompressed_spark # the folder path where you extract the Spark package
export SPARK_VERSION="downloaded Spark version"
+ export SPARK_HOME=/path/to/uncompressed_spark # the folder path where you extract the Spark package
```
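A quick sanity check after steps 4 and 5 (a sketch; it assumes the extracted BigDL package contains the `conf` folder referenced by the submit scripts below):

```bash
ls ${BIGDL_HOME}/conf/spark-bigdl.conf     # the file passed to --properties-file below
${SPARK_HOME}/bin/spark-submit --version   # confirm Spark matches the BigDL release
```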

Some runtime configurations for Spark are as follows:

- * `--executor-memory`: the memory for each executor.
- * `--driver-memory`: the memory for the driver node.
- * `--executor-cores`: the number of cores for each executor.
+ * `--master`: the spark master, set it to "yarn".
+ * `--num-executors`: the number of executors.
+ * `--executor-cores`: the number of cores for each executor.
+ * `--executor-memory`: the memory for each executor.
+ * `--driver-cores`: the number of cores for the driver.
+ * `--driver-memory`: the memory for the driver.
* `--py-files`: the extra Python dependency files to be uploaded to YARN.
* `--archives`: the conda archive to be uploaded to YARN.
* `--properties-file`: the BigDL configuration properties to be uploaded to YARN.
@@ -334,10 +338,11 @@ Submit and run the program for `yarn-client` mode following the `spark-submit` s
${SPARK_HOME}/bin/spark-submit \
--master yarn \
--deploy-mode client \
- --executor-memory 2g \
- --driver-memory 2g \
- --executor-cores 4 \
+ --num-executors 2 \
+ --executor-cores 4 \
+ --executor-memory 2g \
+ --driver-cores 2 \
+ --driver-memory 2g \
--archives /path/to/environment.tar.gz#environment \
--properties-file ${BIGDL_HOME}/conf/spark-bigdl.conf \
--conf spark.pyspark.driver.python=/path/to/python \
@@ -347,7 +352,6 @@ ${SPARK_HOME}/bin/spark-submit \
train.py --cluster_mode spark-submit --data_dir hdfs://path/to/remote/data
```
In the `spark-submit` script:
- * `--master`: the spark master, set it to "yarn".
* `--deploy-mode`: set it to `client` when running programs on yarn-client mode.
* `--conf spark.pyspark.driver.python`: set the active Python location on __Client Node__ as the driver's Python environment. You can find the location by running `which python`.
* `--conf spark.pyspark.python`: set the Python location in conda archive as each executor's Python environment.
@@ -358,10 +362,11 @@ Submit and run the program for `yarn-cluster` mode following the `spark-submit` 
${SPARK_HOME}/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
- --executor-memory 2g \
- --driver-memory 2g \
- --executor-cores 4 \
+ --num-executors 2 \
+ --executor-cores 4 \
+ --executor-memory 2g \
+ --driver-cores 2 \
+ --driver-memory 2g \
--archives /path/to/environment.tar.gz#environment \
--properties-file ${BIGDL_HOME}/conf/spark-bigdl.conf \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=environment/bin/python \
@@ -371,7 +376,6 @@ ${SPARK_HOME}/bin/spark-submit \
train.py --cluster_mode spark-submit --data_dir hdfs://path/to/remote/data
```
In the `spark-submit` script:
- * `--master`: the spark master, set it to "yarn".
* `--deploy-mode`: set it to `cluster` when running programs on yarn-cluster mode.
* `--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON`: set the Python location in conda archive as the Python environment of the Application Master.
* `--conf spark.executorEnv.PYSPARK_PYTHON`: also set the Python location in conda archive as each executor's Python environment. The Application Master and the executors will all use the archive for the Python environment.