Merge Orca quick starts and how to (#7133)

* add tf2 to howto
* update tf2
* remove
* modify sidebar
* remove quickstart
* minor
2022-12-30 16:05:15 +08:00
commit 264451c6bd
parent 6375e36f00
13 changed files with 50 additions and 104 deletions
--- a/docs/readthedocs/source/_templates/sidebar_quicklinks.html
+++ b/docs/readthedocs/source/_templates/sidebar_quicklinks.html
@@ -10,19 +10,13 @@
                </label>
                <ul class="bigdl-quicklinks-section-nav">
                    <li>
-                        <a href="doc/Orca/QuickStart/orca-tf2keras-quickstart.html">TensorFlow 2 Quickstart</a>
+                        <a href="doc/Orca/Howto/tf2keras-quickstart.html">Scale TensorFlow 2 Applications</a>
                    </li>
 		    <li>
-                        <a href="doc/Orca/QuickStart/orca-pytorch-quickstart.html">PyTorch Quickstart</a>
+                        <a href="doc/Orca/Howto/pytorch-quickstart.html">Scale PyTorch Applications</a>
                    </li>
 		    <li>
-                        <a href="doc/Orca/QuickStart/ray-quickstart.html">RayOnSpark Quickstart</a>
-                    </li>
-		    <li>
-                        <a href="doc/Orca/QuickStart/orca-tf-quickstart.html">TensorFlow 1.15 Quickstart</a>
-                    </li>
-		    <li>
-                        <a href="doc/Orca/QuickStart/orca-keras-quickstart.html">Keras 2.3 Quickstart</a>
+                        <a href="doc/Orca/Howto/ray-quickstart.html">Run Ray programs on Big Data clusters</a>
                    </li>
                </ul>
            </li>
--- a/docs/readthedocs/source/_toc.yml
+++ b/docs/readthedocs/source/_toc.yml
@@ -38,23 +38,19 @@ subtrees:
                - file: doc/Orca/Overview/distributed-training-inference
                - file: doc/Orca/Overview/distributed-tuning
                - file: doc/Orca/Overview/ray
-          - file: doc/Orca/QuickStart/index
-            title: "Quickstarts"
-            subtrees:
-              - entries:
-                - file: doc/Orca/QuickStart/orca-tf2keras-quickstart
-                - file: doc/Orca/QuickStart/orca-pytorch-quickstart
-                - file: doc/Orca/QuickStart/ray-quickstart
-                - file: doc/Orca/QuickStart/orca-tf-quickstart
-                - file: doc/Orca/QuickStart/orca-keras-quickstart
          - file: doc/Orca/Howto/index
            title: "How-to Guides"
            subtrees:
              - entries:
+                - file: doc/Orca/Howto/tf2keras-quickstart
+                - file: doc/Orca/Howto/pytorch-quickstart
+                - file: doc/Orca/Howto/ray-quickstart
                - file: doc/Orca/Howto/spark-dataframe
                - file: doc/Orca/Howto/xshards-pandas
-                - file: doc/Orca/Howto/orca-autoestimator-pytorch-quickstart
-                - file: doc/Orca/Howto/orca-autoxgboost-quickstart
+                - file: doc/Orca/Howto/autoestimator-pytorch-quickstart
+                - file: doc/Orca/Howto/autoxgboost-quickstart
+                - file: doc/Orca/Howto/tf1-quickstart
+                - file: doc/Orca/Howto/tf1keras-quickstart
          - file: doc/Orca/Tutorial/index
            title: "Tutorials"
            subtrees:
--- a/docs/readthedocs/source/doc/Orca/Howto/orca-autoestimator-pytorch-quickstart.md
+++ b/docs/readthedocs/source/doc/Orca/Howto/orca-autoestimator-pytorch-quickstart.md
--- a/docs/readthedocs/source/doc/Orca/Howto/orca-autoxgboost-quickstart.md
+++ b/docs/readthedocs/source/doc/Orca/Howto/orca-autoxgboost-quickstart.md
--- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart-bigdl.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart-bigdl.md
@@ -40,9 +40,9 @@ elif cluster_mode == "yarn":  # For Hadoop/YARN cluster
        "spark.driver.extraJavaOptions": "-Dbigdl.failure.retryTimes=1"})
 ```

-This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
+This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.

-**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
+**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.

 ### Step 2: Define the Model

--- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart-ray.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart-ray.md
@@ -31,9 +31,9 @@ elif cluster_mode == "yarn":  # For Hadoop/YARN cluster
    init_orca_context(cluster_mode="yarn", cores=2, num_nodes=2, memory="10g", driver_memory="10g", driver_cores=1)
 ```

-This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
+This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.

-**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
+**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.

 ### Step 2: Define the Model

--- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart.md
@@ -1,4 +1,4 @@
-# PyTorch Quickstart
+# Scale PyTorch Applications

 ---

@@ -33,9 +33,9 @@ elif cluster_mode == "yarn":  # For Hadoop/YARN cluster
    init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1)
 ```

-This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
+This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.

-**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
+**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.

 ### Step 2: Define the Model

--- a/docs/readthedocs/source/doc/Orca/QuickStart/ray-quickstart.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/ray-quickstart.md
@@ -1,4 +1,4 @@
-# RayOnSpark Quickstart
+# Run Ray programs on Big Data clusters

 ---

@@ -33,7 +33,7 @@ elif cluster_mode == "yarn":  # For Hadoop/YARN cluster
    sc = init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1, init_ray_on_spark=True)
 ```

-This is the only place where you need to specify local or distributed mode. See [here](./../Overview/ray.md#initialize) for more RayOnSpark related arguments when you `init_orca_context`.
+This is the only place where you need to specify local or distributed mode. See [here](../Overview/ray.md#initialize) for more RayOnSpark related arguments when you `init_orca_context`.

 By default, the Ray cluster would be launched using Spark barrier execution mode, you can turn it off via the configurations of `OrcaContext`:

@@ -43,9 +43,9 @@ from bigdl.orca import OrcaContext
 OrcaContext.barrier_mode = False
 ```

-View [Orca Context](./../Overview/orca-context.md) for more details.
+View [Orca Context](../Overview/orca-context.md) for more details.

-**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
+**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.

 You can retrieve the information of the Ray cluster via `OrcaContext`:

--- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf-quickstart.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf-quickstart.md
@@ -1,4 +1,4 @@
-# TensorFlow 1.15 Quickstart
+# Scale TensorFlow 1.15 Applications

 ---

@@ -6,7 +6,7 @@

 ---

-**In this guide we will describe how to scale out _TensorFlow 1.15_ programs using Orca in 4 simple steps.** (_[Keras 2.3](./orca-keras-quickstart.md) and [TensorFlow 2](./orca-tf2keras-quickstart.md) guides are also available._)
+**In this guide we will describe how to scale out _TensorFlow 1.15_ programs using Orca in 4 simple steps.**

 ### Step 0: Prepare Environment

@@ -35,9 +35,9 @@ elif cluster_mode == "yarn":  # For Hadoop/YARN cluster
    dataset_dir = "hdfs:///tensorflow_datasets"
 ```

-This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
+This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.

-**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).
+**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).

 ### Step 2: Define the Model

--- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-keras-quickstart.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-keras-quickstart.md
@@ -1,4 +1,4 @@
-# Keras 2.3 Quickstart
+# Scale Keras 2.3 Applications

 ---

@@ -6,7 +6,7 @@

 ---

-**In this guide we will describe how to scale out _Keras 2.3_ programs using Orca in 4 simple steps.** (_[TensorFlow 1.5](./orca-tf-quickstart.md) and [TensorFlow 2](./orca-tf2keras-quickstart.md) guides are also available._)
+**In this guide we will describe how to scale out _Keras 2.3_ programs using Orca in 4 simple steps.**


 ### Step 0: Prepare Environment
@@ -38,9 +38,9 @@ elif cluster_mode == "yarn":  # For Hadoop/YARN cluster
    dataset_dir = "hdfs:///tensorflow_datasets"
 ```

-This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
+This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.

-**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).
+**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).

 ### Step 2: Define the Model

--- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf2keras-quickstart.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf2keras-quickstart.md
@@ -1,4 +1,4 @@
-# TensorFlow 2 Quickstart
+# Scale TensorFlow2 Applications

 ---

@@ -6,15 +6,16 @@

 ---

-**In this guide we will describe how to to scale out _TensorFlow 2_ programs using Orca in 4 simple steps.** (_[TensorFlow 1.5](./orca-tf-quickstart.md) and [Keras 2.3](./orca-keras-quickstart.md) guides are also available._)
+**In this guide we will describe how to scale out _TensorFlow 2_ programs using Orca in 4 simple steps.**

 ### Step 0: Prepare Environment

-We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.
+We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/install.md) for more details.

 ```bash
 conda create -n py37 python=3.7  # "py37" is conda environment name, you can use any name you like.
 conda activate py37
+
 pip install bigdl-orca[ray]
 pip install tensorflow
 ```
@@ -24,20 +25,20 @@ pip install tensorflow
 from bigdl.orca import init_orca_context, stop_orca_context

 if cluster_mode == "local":  # For local machine
-    init_orca_context(cluster_mode="local", cores=4, memory="10g")
+    init_orca_context(cluster_mode="local", cores=4, memory="4g")
 elif cluster_mode == "k8s":  # For K8s cluster
-    init_orca_context(cluster_mode="k8s", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1)
+    init_orca_context(cluster_mode="k8s", num_nodes=2, cores=2, memory="4g")
 elif cluster_mode == "yarn":  # For Hadoop/YARN cluster
-    init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1)
+    init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="4g")
 ```

-This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
+This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.

-**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
+Please check the tutorials if you want to run on [Kubernetes](../Tutorial/k8s.md) or [Hadoop/YARN](../Tutorial/yarn.md) clusters.

 ### Step 2: Define the Model

-You can then define the Keras model in the _Creator Function_ using the standard TensroFlow 2 APIs.
+You can then define the Keras model in the _Creator Function_ using the standard TensorFlow 2 Keras APIs.

 ```python
 import tensorflow as tf
@@ -61,9 +62,9 @@ def model_creator(config):
                  metrics=['accuracy'])
    return model
 ```
-### Step 3: Define Train Dataset
+### Step 3: Define the Dataset

-You can define the dataset in the _Creator Function_ using standard [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) APIs. Orca also supports [Spark DataFrame](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [Orca XShards](../Overview/data-parallel-processing.md).
+You can define the dataset in the _Creator Function_ using standard [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) APIs. Orca also supports [Spark DataFrame](./spark-dataframe.md) and [Orca XShards](./xshards-pandas.md).


 ```python
@@ -93,7 +94,7 @@ def val_data_creator(config, batch_size):

 ### Step 4: Fit with Orca Estimator

-First, create an Estimator.
+First, create an Orca Estimator for TensorFlow 2.

 ```python
 from bigdl.orca.learn.tf2 import Estimator
@@ -110,17 +111,16 @@ stats = est.fit(train_data_creator,
                steps_per_epoch=60000 // batch_size,
                validation_data=val_data_creator,
                validation_steps=10000 // batch_size)
-                
-est.save("/tmp/mnist_keras.ckpt")

 stats = est.evaluate(val_data_creator, num_steps=10000 // batch_size)
-est.shutdown()
 print(stats)
+
+est.shutdown()
 ```

 ### Step 5: Save and Load the Model

-Orca TF2 Estimator supports two formats to save and load the entire model (**TensorFlow SavedModel and Keras H5 Format**). The recommended format is SavedModel, which is the default format when you use `estimator.save()`.
+Orca TensorFlow 2 Estimator supports two formats to save and load the entire model (**TensorFlow SavedModel and Keras H5 Format**). The recommended format is SavedModel, which is the default format when you use `estimator.save()`.

 You could also save the model to Keras H5 format by passing `save_format='h5'` or a filename that ends in `.h5` or `.keras` to `estimator.save()`.

@@ -130,22 +130,22 @@ You could also save the model to Keras H5 format by passing `save_format='h5'`

 ```python
 # save model in SavedModel format
-est.save("/tmp/cifar10_model")
+est.save("lenet_model")

 # load model
-est.load("/tmp/cifar10_model")
+est.load("lenet_model")
 ```

 **2. HDF5 format**

 ```python
 # save model in H5 format
-est.save("/tmp/cifar10_model.h5", save_format='h5')
+est.save("lenet_model.h5", save_format='h5')

 # load model
-est.load("/tmp/cifar10_model.h5")
+est.load("lenet_model.h5")
 ```

-That's it, the same code can run seamlessly in your local laptop and to distribute K8s or Hadoop cluster.
+That's it, the same code can run seamlessly on your local laptop and scale to [Kubernetes](../Tutorial/k8s.md) or [Hadoop/YARN](../Tutorial/yarn.md) clusters.

 **Note:** You should call `stop_orca_context()` when your program finishes.
--- a/docs/readthedocs/source/doc/Orca/QuickStart/index.md
+++ b/docs/readthedocs/source/doc/Orca/QuickStart/index.md
@@ -1,43 +0,0 @@
-# Orca Quickstarts
-
-
- [**TensorFlow 2 Quickstart**](./orca-tf2keras-quickstart.html)
-
-    > ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf2_keras_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf2_keras_lenet_mnist.ipynb)
-
-    In this guide we will describe how to scale out TensorFlow 2 programs using Orca in 5 simple steps.
-
---------------------------
-
- [**PyTorch Quickstart**](./orca-pytorch-quickstart.html)
-
-    > ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/pytorch_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/pytorch_lenet_mnist_spark.ipynb)
-
-    In this guide we will describe how to scale out PyTorch programs using Orca in 5 simple steps.
-
---------------------------
-
- [**RayOnSpark Quickstart**](./ray-quickstart.html)
-
-    > ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/ray_parameter_server.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/ray_parameter_server.ipynb)
-
-    In this guide we will describe how to use RayOnSpark to directly run Ray programs on Big Data clusters in 2 simple steps.
-
---------------------------
-
- [**TensorFlow 1.15 Quickstart**](./orca-tf-quickstart.html)
-
-    > ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf_lenet_mnist.ipynb)
-
-    In this guide we will describe how to scale out TensorFlow 1.15 programs using Orca in 4 simple steps.
-
---------------------------
-
- [**Keras 2.3 Quickstart**](./orca-keras-quickstart.html)
-
-    > ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/keras_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/keras_lenet_mnist.ipynb)
-
-    In this guide we will describe how to scale out Keras 2.3 programs using Orca in 4 simple steps.
-
---------------------------
-
--- a/docs/readthedocs/source/doc/Orca/index.rst
+++ b/docs/readthedocs/source/doc/Orca/index.rst
@@ -30,7 +30,6 @@ Most AI projects start with a Python notebook running on a single laptop; howeve

        +++

-        :bdg-link:`Quickstarts <./QuickStart/index.html>` |
        :bdg-link:`How-to Guides <./Howto/index.html>` |
        :bdg-link:`Tutorials <./Tutorial/index.html>`
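
Reviewer note: this commit moves the Orca quickstart pages under `doc/Orca/Howto/` and rewrites `./../` link prefixes as `../`, so links that still point at the old locations may need updating. The sketch below is one possible way to migrate such links, not part of the commit. `RENAMES` and `migrate_link` are hypothetical names introduced here. The first three and last two table entries are read off the sidebar and `_toc.yml` hunks; the TensorFlow 1.15 and Keras 2.3 entries are an assumption inferred from the new `tf1-quickstart` and `tf1keras-quickstart` toc entries.

```python
import posixpath

# Old page stem -> new page stem (paths relative to docs/readthedocs/source).
RENAMES = {
    "doc/Orca/QuickStart/orca-tf2keras-quickstart": "doc/Orca/Howto/tf2keras-quickstart",
    "doc/Orca/QuickStart/orca-pytorch-quickstart": "doc/Orca/Howto/pytorch-quickstart",
    "doc/Orca/QuickStart/ray-quickstart": "doc/Orca/Howto/ray-quickstart",
    # Assumption: inferred from the page titles, not stated in the diff.
    "doc/Orca/QuickStart/orca-tf-quickstart": "doc/Orca/Howto/tf1-quickstart",
    "doc/Orca/QuickStart/orca-keras-quickstart": "doc/Orca/Howto/tf1keras-quickstart",
    "doc/Orca/Howto/orca-autoestimator-pytorch-quickstart": "doc/Orca/Howto/autoestimator-pytorch-quickstart",
    "doc/Orca/Howto/orca-autoxgboost-quickstart": "doc/Orca/Howto/autoxgboost-quickstart",
}


def migrate_link(target: str) -> str:
    """Normalise a source-root-relative link and apply the rename table.

    The file extension (.md or .html) and any '#fragment' are preserved.
    Links that are not in the table are only normalised.
    """
    path, sep, fragment = target.partition("#")
    # normpath drops redundant '.' segments, e.g. 'a/./b' -> 'a/b'.
    stem, ext = posixpath.splitext(posixpath.normpath(path))
    return RENAMES.get(stem, stem) + ext + sep + fragment


if __name__ == "__main__":
    print(migrate_link("doc/Orca/QuickStart/orca-pytorch-quickstart.html"))
    # The same normalisation explains the './../' -> '../' edits in the hunks.
    print(posixpath.normpath("./../Overview/orca-context.md"))
```

Keeping the mapping in a plain dictionary makes the inferred entries easy to correct once the actual file renames are confirmed.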