Merge Orca quick starts and how to (#7133)

* add tf2 to howto

* update tf2

* remove

* modify sidebar

* remove quickstart

* minor
Kai Huang 2022-12-30 16:05:15 +08:00 committed by GitHub
parent 6375e36f00
commit 264451c6bd
13 changed files with 50 additions and 104 deletions

@ -10,19 +10,13 @@
</label>
<ul class="bigdl-quicklinks-section-nav">
<li>
<a href="doc/Orca/QuickStart/orca-tf2keras-quickstart.html">TensorFlow 2 Quickstart</a>
<a href="doc/Orca/Howto/tf2keras-quickstart.html">Scale TensorFlow 2 Applications</a>
</li>
<li>
<a href="doc/Orca/QuickStart/orca-pytorch-quickstart.html">PyTorch Quickstart</a>
<a href="doc/Orca/Howto/pytorch-quickstart.html">Scale PyTorch Applications</a>
</li>
<li>
<a href="doc/Orca/QuickStart/ray-quickstart.html">RayOnSpark Quickstart</a>
</li>
<li>
<a href="doc/Orca/QuickStart/orca-tf-quickstart.html">TensorFlow 1.15 Quickstart</a>
</li>
<li>
<a href="doc/Orca/QuickStart/orca-keras-quickstart.html">Keras 2.3 Quickstart</a>
<a href="doc/Orca/Howto/ray-quickstart.html">Run Ray programs on Big Data clusters</a>
</li>
</ul>
</li>

@ -38,23 +38,19 @@ subtrees:
- file: doc/Orca/Overview/distributed-training-inference
- file: doc/Orca/Overview/distributed-tuning
- file: doc/Orca/Overview/ray
- file: doc/Orca/QuickStart/index
title: "Quickstarts"
subtrees:
- entries:
- file: doc/Orca/QuickStart/orca-tf2keras-quickstart
- file: doc/Orca/QuickStart/orca-pytorch-quickstart
- file: doc/Orca/QuickStart/ray-quickstart
- file: doc/Orca/QuickStart/orca-tf-quickstart
- file: doc/Orca/QuickStart/orca-keras-quickstart
- file: doc/Orca/Howto/index
title: "How-to Guides"
subtrees:
- entries:
- file: doc/Orca/Howto/tf2keras-quickstart
- file: doc/Orca/Howto/pytorch-quickstart
- file: doc/Orca/Howto/ray-quickstart
- file: doc/Orca/Howto/spark-dataframe
- file: doc/Orca/Howto/xshards-pandas
- file: doc/Orca/Howto/orca-autoestimator-pytorch-quickstart
- file: doc/Orca/Howto/orca-autoxgboost-quickstart
- file: doc/Orca/Howto/autoestimator-pytorch-quickstart
- file: doc/Orca/Howto/autoxgboost-quickstart
- file: doc/Orca/Howto/tf1-quickstart
- file: doc/Orca/Howto/tf1keras-quickstart
- file: doc/Orca/Tutorial/index
title: "Tutorials"
subtrees:

@ -40,9 +40,9 @@ elif cluster_mode == "yarn": # For Hadoop/YARN cluster
"spark.driver.extraJavaOptions": "-Dbigdl.failure.retryTimes=1"})
```
This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.
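The link edits in these hunks only drop a redundant leading `./`; they should not change where the links point. A minimal check using the Python standard library, assuming for illustration that the page lives under `doc/Orca/Howto` (the `base` value and the `resolve` helper are hypothetical):

```python
import posixpath

# Assumed directory of the page that contains the links; illustrative only.
base = "doc/Orca/Howto"

old_link = "./../Overview/orca-context.md"
new_link = "../Overview/orca-context.md"


def resolve(link):
    """Resolve a relative link against the page directory."""
    return posixpath.normpath(posixpath.join(base, link))


old_target = resolve(old_link)
new_target = resolve(new_link)

# Both spellings normalize to the same target path.
print(old_target == new_target)
```

The same reasoning applies to the `./../../UserGuide/hadoop.md` links, since `./` is a no-op path segment.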
### Step 2: Define the Model

@ -31,9 +31,9 @@ elif cluster_mode == "yarn": # For Hadoop/YARN cluster
init_orca_context(cluster_mode="yarn", cores=2, num_nodes=2, memory="10g", driver_memory="10g", driver_cores=1)
```
This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.
### Step 2: Define the Model

@ -1,4 +1,4 @@
# PyTorch Quickstart
# Scale PyTorch Applications
---
@ -33,9 +33,9 @@ elif cluster_mode == "yarn": # For Hadoop/YARN cluster
init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1)
```
This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.
### Step 2: Define the Model

@ -1,4 +1,4 @@
# RayOnSpark Quickstart
# Run Ray programs on Big Data clusters
---
@ -33,7 +33,7 @@ elif cluster_mode == "yarn": # For Hadoop/YARN cluster
sc = init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1, init_ray_on_spark=True)
```
This is the only place where you need to specify local or distributed mode. See [here](./../Overview/ray.md#initialize) for more RayOnSpark related arguments when you `init_orca_context`.
This is the only place where you need to specify local or distributed mode. See [here](../Overview/ray.md#initialize) for more RayOnSpark related arguments when you `init_orca_context`.
By default, the Ray cluster is launched using Spark barrier execution mode; you can turn it off via the `OrcaContext` configuration:
@ -43,9 +43,9 @@ from bigdl.orca import OrcaContext
OrcaContext.barrier_mode = False
```
View [Orca Context](./../Overview/orca-context.md) for more details.
View [Orca Context](../Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.
You can retrieve the information of the Ray cluster via `OrcaContext`:

@ -1,4 +1,4 @@
# TensorFlow 1.15 Quickstart
# Scale TensorFlow 1.15 Applications
---
@ -6,7 +6,7 @@
---
**In this guide we will describe how to scale out _TensorFlow 1.15_ programs using Orca in 4 simple steps.** (_[Keras 2.3](./orca-keras-quickstart.md) and [TensorFlow 2](./orca-tf2keras-quickstart.md) guides are also available._)
**In this guide we will describe how to scale out _TensorFlow 1.15_ programs using Orca in 4 simple steps.**
### Step 0: Prepare Environment
@ -35,9 +35,9 @@ elif cluster_mode == "yarn": # For Hadoop/YARN cluster
dataset_dir = "hdfs:///tensorflow_datasets"
```
This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).
### Step 2: Define the Model

@ -1,4 +1,4 @@
# Keras 2.3 Quickstart
# Scale Keras 2.3 Applications
---
@ -6,7 +6,7 @@
---
**In this guide we will describe how to scale out _Keras 2.3_ programs using Orca in 4 simple steps.** (_[TensorFlow 1.5](./orca-tf-quickstart.md) and [TensorFlow 2](./orca-tf2keras-quickstart.md) guides are also available._)
**In this guide we will describe how to scale out _Keras 2.3_ programs using Orca in 4 simple steps.**
### Step 0: Prepare Environment
@ -38,9 +38,9 @@ elif cluster_mode == "yarn": # For Hadoop/YARN cluster
dataset_dir = "hdfs:///tensorflow_datasets"
```
This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md).
### Step 2: Define the Model

@ -1,4 +1,4 @@
# TensorFlow 2 Quickstart
# Scale TensorFlow 2 Applications
---
@ -6,15 +6,16 @@
---
**In this guide we will describe how to to scale out _TensorFlow 2_ programs using Orca in 4 simple steps.** (_[TensorFlow 1.5](./orca-tf-quickstart.md) and [Keras 2.3](./orca-keras-quickstart.md) guides are also available._)
**In this guide we will describe how to scale out _TensorFlow 2_ programs using Orca in 4 simple steps.**
### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/install.md) for more details.
```bash
conda create -n py37 python=3.7 # "py37" is conda environment name, you can use any name you like.
conda activate py37
pip install bigdl-orca[ray]
pip install tensorflow
```
@ -24,20 +25,20 @@ pip install tensorflow
from bigdl.orca import init_orca_context, stop_orca_context
if cluster_mode == "local": # For local machine
init_orca_context(cluster_mode="local", cores=4, memory="10g")
init_orca_context(cluster_mode="local", cores=4, memory="4g")
elif cluster_mode == "k8s": # For K8s cluster
init_orca_context(cluster_mode="k8s", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1)
init_orca_context(cluster_mode="k8s", num_nodes=2, cores=2, memory="4g")
elif cluster_mode == "yarn": # For Hadoop/YARN cluster
init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1)
init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="4g")
```
This is the only place where you need to specify local or distributed mode. View [Orca Context](./../Overview/orca-context.md) for more details.
This is the only place where you need to specify local or distributed mode. View [Orca Context](../Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
Please check the tutorials if you want to run on [Kubernetes](../Tutorial/k8s.md) or [Hadoop/YARN](../Tutorial/yarn.md) clusters.
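Because the cluster mode is the only thing that changes between environments, the per-mode arguments can be kept in one place. The sketch below mirrors the values in the updated snippet; `orca_context_kwargs` is a hypothetical helper for illustration and is not part of `bigdl.orca`.

```python
def orca_context_kwargs(cluster_mode):
    """Return the init_orca_context keyword arguments used in this guide.

    Hypothetical helper; the values mirror the snippet in Step 1.
    """
    if cluster_mode == "local":
        return {"cluster_mode": "local", "cores": 4, "memory": "4g"}
    if cluster_mode in ("k8s", "yarn"):
        return {
            "cluster_mode": cluster_mode,
            "num_nodes": 2,
            "cores": 2,
            "memory": "4g",
        }
    raise ValueError(f"unsupported cluster_mode: {cluster_mode!r}")


kwargs = orca_context_kwargs("yarn")
# With bigdl-orca installed, this would be passed as init_orca_context(**kwargs).
print(kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to review resource settings without touching the training code.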
### Step 2: Define the Model
You can then define the Keras model in the _Creator Function_ using the standard TensroFlow 2 APIs.
You can then define the Keras model in the _Creator Function_ using the standard TensorFlow 2 Keras APIs.
```python
import tensorflow as tf
@ -61,9 +62,9 @@ def model_creator(config):
metrics=['accuracy'])
return model
```
### Step 3: Define Train Dataset
### Step 3: Define the Dataset
You can define the dataset in the _Creator Function_ using standard [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) APIs. Orca also supports [Spark DataFrame](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [Orca XShards](../Overview/data-parallel-processing.md).
You can define the dataset in the _Creator Function_ using standard [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) APIs. Orca also supports [Spark DataFrame](./spark-dataframe.md) and [Orca XShards](./xshards-pandas.md).
```python
@ -93,7 +94,7 @@ def val_data_creator(config, batch_size):
### Step 4: Fit with Orca Estimator
First, create an Estimator.
First, create an Orca Estimator for TensorFlow 2.
```python
from bigdl.orca.learn.tf2 import Estimator
@ -110,17 +111,16 @@ stats = est.fit(train_data_creator,
steps_per_epoch=60000 // batch_size,
validation_data=val_data_creator,
validation_steps=10000 // batch_size)
est.save("/tmp/mnist_keras.ckpt")
stats = est.evaluate(val_data_creator, num_steps=10000 // batch_size)
est.shutdown()
print(stats)
est.shutdown()
```
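The `steps_per_epoch` and `validation_steps` arguments above use floor division, so a final partial batch is not counted. A small worked example, assuming `batch_size = 320` purely for illustration (the guide's actual value is not shown in this hunk):

```python
train_size, val_size = 60000, 10000  # dataset sizes used in the fit call above
batch_size = 320  # assumed value for illustration

steps_per_epoch = train_size // batch_size
validation_steps = val_size // batch_size

# Samples left over after the last full training batch.
leftover = train_size - steps_per_epoch * batch_size

print(steps_per_epoch, validation_steps, leftover)  # 187 31 160
```

Choosing a batch size that divides the dataset size evenly avoids the leftover samples.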
### Step 5: Save and Load the Model
Orca TF2 Estimator supports two formats to save and load the entire model (**TensorFlow SavedModel and Keras H5 Format**). The recommended format is SavedModel, which is the default format when you use `estimator.save()`.
Orca TensorFlow 2 Estimator supports two formats to save and load the entire model (**TensorFlow SavedModel and Keras H5 Format**). The recommended format is SavedModel, which is the default format when you use `estimator.save()`.
You could also save the model to Keras H5 format by passing `save_format='h5'` or a filename that ends in `.h5` or `.keras` to `estimator.save()`.
@ -130,22 +130,22 @@ You could also save the model to Keras H5 format by passing `save_format='h5'`
```python
# save model in SavedModel format
est.save("/tmp/cifar10_model")
est.save("lenet_model")
# load model
est.load("/tmp/cifar10_model")
est.load("lenet_model")
```
**2. HDF5 format**
```python
# save model in H5 format
est.save("/tmp/cifar10_model.h5", save_format='h5')
est.save("lenet_model.h5", save_format='h5')
# load model
est.load("/tmp/cifar10_model.h5")
est.load("lenet_model.h5")
```
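The format selection rule described above (explicit `save_format`, otherwise the filename suffix, otherwise SavedModel) can be sketched as follows. This is a hypothetical helper for illustration, not Orca's implementation; the `"tf"` label for SavedModel follows the Keras `save_format` naming.

```python
def infer_save_format(path, save_format=None):
    """Guess which format a save call would use for the given path.

    Hypothetical helper; it only mirrors the rule stated in the guide.
    """
    if save_format is not None:
        return save_format
    if path.endswith((".h5", ".keras")):
        return "h5"
    return "tf"  # SavedModel, the default format


print(infer_save_format("lenet_model"))
print(infer_save_format("lenet_model.h5"))
```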
That's it, the same code can run seamlessly in your local laptop and to distribute K8s or Hadoop cluster.
That's it, the same code can run seamlessly on your local laptop and scale to [Kubernetes](../Tutorial/k8s.md) or [Hadoop/YARN](../Tutorial/yarn.md) clusters.
**Note:** You should call `stop_orca_context()` when your program finishes.

@ -1,43 +0,0 @@
# Orca Quickstarts
- [**TensorFlow 2 Quickstart**](./orca-tf2keras-quickstart.html)
> ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf2_keras_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf2_keras_lenet_mnist.ipynb)
In this guide we will describe how to scale out TensorFlow 2 programs using Orca in 5 simple steps.
---------------------------
- [**PyTorch Quickstart**](./orca-pytorch-quickstart.html)
> ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/pytorch_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/pytorch_lenet_mnist_spark.ipynb)
In this guide we will describe how to scale out PyTorch programs using Orca in 5 simple steps.
---------------------------
- [**RayOnSpark Quickstart**](./ray-quickstart.html)
> ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/ray_parameter_server.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/ray_parameter_server.ipynb)
In this guide we will describe how to use RayOnSpark to directly run Ray programs on Big Data clusters in 2 simple steps.
---------------------------
- [**TensorFlow 1.15 Quickstart**](./orca-tf-quickstart.html)
> ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf_lenet_mnist.ipynb)
In this guide we will describe how to scale out TensorFlow 1.15 programs using Orca in 4 simple steps.
---------------------------
- [**Keras 2.3 Quickstart**](./orca-keras-quickstart.html)
> ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/keras_lenet_mnist.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/keras_lenet_mnist.ipynb)
In this guide we will describe how to scale out Keras 2.3 programs using Orca in 4 simple steps.
---------------------------

@ -30,7 +30,6 @@ Most AI projects start with a Python notebook running on a single laptop; howeve
+++
:bdg-link:`Quickstarts <./QuickStart/index.html>` |
:bdg-link:`How-to Guides <./Howto/index.html>` |
:bdg-link:`Tutorials <./Tutorial/index.html>`