Update spark-submit section in yarn tutorial (#7209)

* update
* update
2023-01-10 09:55:30 +08:00
commit af9cdc6edd
parent c4874f35c8
1 changed file with 8 additions and 7 deletions
--- a/docs/readthedocs/source/doc/Orca/Tutorial/yarn.md
+++ b/docs/readthedocs/source/doc/Orca/Tutorial/yarn.md
@@ -295,26 +295,27 @@ If you prefer to use `spark-submit` instead of `bigdl-submit`, please follow the
    sc = init_orca_context(cluster_mode="spark-submit")
    ```

-2. Download requirement file [here](https://github.com/intel-analytics/BigDL/tree/main/python/requirements/orca) and install required Python libraries of BigDL Orca according to your needs.
+2. Download the requirement file from [here](https://github.com/intel-analytics/BigDL/tree/main/python/requirements/orca) and install the required Python libraries of BigDL Orca according to your needs.
    ```bash
    pip install -r /path/to/requirements.txt
    ```
+    Note that installing BigDL Orca in the conda environment is **NOT** recommended if you use spark-submit, to avoid possible conflicts.

 3. Pack the currently active conda environment into an archive before submitting the example:
    ```bash
    conda pack -o environment.tar.gz
    ```

-4. Download and extract [Spark](https://archive.apache.org/dist/spark/). Then setup the environment variables `${SPARK_HOME}` and `${SPARK_VERSION}`.
+4. Download the BigDL assembly package from [here](../Overview/install.html#download-bigdl-orca) and unzip it. Then set up the environment variables `${BIGDL_HOME}` and `${BIGDL_VERSION}`.
    ```bash
-    export SPARK_HOME=/path/to/spark # the folder path where you extract the Spark package
-    export SPARK_VERSION="downloaded spark version"
+    export BIGDL_HOME=/path/to/unzipped_BigDL  # the folder path where you extract the BigDL package
+    export BIGDL_VERSION="downloaded BigDL version"
    ```

-5. Refer to [here](../Overview/install.html#download-bigdl-orca) to download and unzip a BigDL assembly package. Make sure the Spark version of your downloaded BigDL matches your downloaded Spark. Then setup the environment variables `${BIGDL_HOME}` and `${BIGDL_VERSION}`.
+5. Download and extract [Spark](https://archive.apache.org/dist/spark/). BigDL is currently released for [Spark 2.4](https://archive.apache.org/dist/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz) and [Spark 3.1](https://archive.apache.org/dist/spark/spark-3.1.3/spark-3.1.3-bin-hadoop2.7.tgz). Make sure the version of your downloaded Spark matches the one that your downloaded BigDL is released with. Then set up the environment variables `${SPARK_HOME}` and `${SPARK_VERSION}`.
    ```bash
-    export BIGDL_HOME=/path/to/unzipped_BigDL
-    export BIGDL_VERSION="downloaded BigDL version"
+    export SPARK_HOME=/path/to/uncompressed_spark  # the folder path where you extract the Spark package
+    export SPARK_VERSION="downloaded Spark version"
    ```

 Some runtime configurations for Spark are as follows: