Update scala docs (#3248)

* Update scala.md

* Update scala.md

* Update scala.md

* Update scala.md

* Update scala.md

* Update scala.md
Xin Qiu 2021-10-26 09:16:22 +08:00 committed by GitHub
parent 3c316e0071
commit a3dc9a0158

@@ -2,22 +2,22 @@
--- ---
### **1. Try Analytics Zoo Examples** ### **1. Try BigDL Examples**
This section will show you how to download Analytics Zoo prebuild packages and run the build-in examples. This section will show you how to download BigDL prebuild packages and run the build-in examples.
#### **1.1 Download and config** #### **1.1 Download and config**
You can download the Analytics Zoo official releases and nightly build from the [Release Page](../release.md). After extracting the prebuild package, you need to set environment variables **ANALYTICS_ZOO_HOME** and **SPARK_HOME** as follows: You can download the BigDL official releases and nightly build from the [Release Page](../release.md). After extracting the prebuild package, you need to set environment variables **BIGDL_HOME** and **SPARK_HOME** as follows:
```bash ```bash
export SPARK_HOME=folder path where you extract the Spark package export SPARK_HOME=folder path where you extract the Spark package
export ANALYTICS_ZOO_HOME=folder path where you extract the Analytics Zoo package export BIGDL_HOME=folder path where you extract the BigDL package
``` ```
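Before launching anything, it can help to confirm that both variables are set and point at real directories. The following is a minimal sketch, not part of the BigDL distribution: the function name `check_home_dirs` is hypothetical, and it assumes a bash shell (it relies on bash indirect expansion).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BigDL): verify that each named
# environment variable is set and refers to an existing directory.
check_home_dirs() {
  local name value status=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable called $name.
    value="${!name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name does not point to a directory: $value" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage: check_home_dirs SPARK_HOME BIGDL_HOME && echo "environment looks OK"
```

The function reports every problem it finds rather than stopping at the first one, so a single run shows all misconfigured variables.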
#### **1.2 Use Spark interactive shell** #### **1.2 Use Spark interactive shell**
You can try Analytics Zoo using the Spark interactive shell as follows: You can try BigDL using the Spark interactive shell as follows:
```bash ```bash
${ANALYTICS_ZOO_HOME}/bin/spark-shell-with-zoo.sh --master local[2] ${BIGDL_HOME}/bin/spark-shell-with-bigdl.sh --master local[2]
``` ```
You will then see a welcome message like below: You will then see a welcome message like below:
@@ -27,7 +27,7 @@ Welcome to
____ __ ____ __
/ __/__ ___ _____/ /__ / __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/ _\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.4.3 /___/ .__/\_,_/_/ /_/\_\ version 2.4.6
/_/ /_/
Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112) Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
@@ -35,11 +35,11 @@ Type in expressions to have them evaluated.
Type :help for more information. Type :help for more information.
``` ```
Before you try Analytics Zoo APIs, you should use `initNNcontext` to verify your environment: Before you try BigDL APIs, you should use `initNNcontext` to verify your environment:
```scala ```scala
scala> import com.intel.analytics.zoo.common.NNContext scala> import com.intel.analytics.bigdl.dllib.NNContext
import com.intel.analytics.zoo.common.NNContext import com.intel.analytics.bigdl.dllib.NNContext
scala> val sc = NNContext.initNNContext("Run Example") scala> val sc = NNContext.initNNContext("Run Example")
2021-01-26 10:19:52 WARN SparkContext:66 - Using an existing SparkContext; some configuration may not take effect. 2021-01-26 10:19:52 WARN SparkContext:66 - Using an existing SparkContext; some configuration may not take effect.
@@ -47,95 +47,100 @@ scala> val sc = NNContext.initNNContext("Run Example")
sc: org.apache.spark.SparkContext = org.apache.spark.SparkContext@487f025 sc: org.apache.spark.SparkContext = org.apache.spark.SparkContext@487f025
``` ```
#### **1.3 Run Analytics Zoo examples** #### **1.3 Run BigDL examples**
You can run an Analytics Zoo example, e.g., the [Wide & Deep Recommendation](https://github.com/intel-analytics/analytics-zoo/tree/master/zoo/src/main/scala/com/intel/analytics/zoo/examples/recommendation), as a standard Spark program (running in either local mode or cluster mode) as follows: You can run an BigDL example, e.g., the [Lenet](https://github.com/intel-analytics/BigDL/tree/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/models/lenet), as a standard Spark program (running in either local mode or cluster mode) as follows:
1. Download Census Income Dataset to `./data/census` from [here](https://archive.ics.uci.edu/ml/datasets/Census+Income). 1. You can download the MNIST Data from [here](http://yann.lecun.com/exdb/mnist/). Unzip all the
files and put them in one folder(e.g. mnist).
There're four files. **train-images-idx3-ubyte** contains train images,
**train-labels-idx1-ubyte** is train label file, **t10k-images-idx3-ubyte** has validation images
and **t10k-labels-idx1-ubyte** contains validation labels. For more detail, please refer to the
download page.
After you uncompress the gzip files, these files may be renamed by some uncompress tools, e.g. **train-images-idx3-ubyte** is renamed
to **train-images.idx3-ubyte**. Please change the name back before you run the example.
2. Run the following command: 2. Run the following command:
```bash ```bash
# Spark local mode # Spark local mode
${ANALYTICS_ZOO_HOME}/bin/spark-submit-scala-with-zoo.sh \ ${BIGDL_HOME}/bin/spark-submit-scala-with-bigdl.sh \
--master local[2] \ --master local[2] \
--class com.intel.analytics.zoo.examples.recommendation.WideAndDeepExample \ --class com.intel.analytics.bigdl.dllib.models.lenet.Train \
dist/lib/analytics-zoo-bigdl_0.12.1-spark_2.4.3-0.9.0-jar-with-dependencies.jar \ #change to your jar file if your download is not spark_2.4.3-0.9.0 ${BIGDL_HOME}/jars/bigdl-dllib-spark_2.4.6-0.14.0-SNAPSHOT-jar-with-dependencies.jar \ #change to your jar file if your download is not the same version
--inputDir ./data/census \ -f ./data/mnist \
--batchSize 320 \ -b 320 \
--maxEpoch 20 \ -e 20
--dataset census
# Spark standalone mode # Spark standalone mode
${ANALYTICS_ZOO_HOME}/bin/spark-submit-scala-with-zoo.sh \ ${BIGDL_HOME}/bin/spark-submit-scala-with-bigdl.sh \
--master spark://... \ #add your spark master address --master spark://... \ #add your spark master address
--executor-cores cores_per_executor \ --executor-cores 2 \
--total-executor-cores total_cores_for_the_job \ --total-executor-cores 4 \
--class com.intel.analytics.zoo.examples.recommendation.WideAndDeepExample \ --class com.intel.analytics.bigdl.dllib.models.lenet.Train \
dist/lib/analytics-zoo-bigdl_0.12.1-spark_2.4.3-0.9.0-jar-with-dependencies.jar \ #change to your jar file if your download is not spark_2.4.3-0.9.0 ${BIGDL_HOME}/jars/bigdl-dllib-spark_2.4.6-0.14.0-SNAPSHOT-jar-with-dependencies.jar \ #change to your jar file if your download is not the same version
--inputDir ./data/census \ -f ./data/mnist \
--batchSize 320 \ -b 320 \
--maxEpoch 20 \ -e 20
--dataset census
# Spark yarn client mode, please make sure the right HADOOP_CONF_DIR is set # Spark yarn client mode, please make sure the right HADOOP_CONF_DIR is set
${ANALYTICS_ZOO_HOME}/bin/spark-submit-scala-with-zoo.sh \ ${BIGDL_HOME}/bin/spark-submit-scala-with-bigdl.sh \
--master yarn \ --master yarn \
--deploy-mode client \ --deploy-mode client \
--executor-cores cores_per_executor \ --executor-cores 2 \
--num-executors executors_number \ --num-executors 2 \
--class com.intel.analytics.zoo.examples.recommendation.WideAndDeepExample \ --class com.intel.analytics.bigdl.dllib.models.lenet.Train \
dist/lib/analytics-zoo-bigdl_0.12.1-spark_2.4.3-0.9.0-jar-with-dependencies.jar \ #change to your jar file if your download is not spark_2.4.3-0.9.0 ${BIGDL_HOME}/jars/bigdl-dllib-spark_2.4.6-0.14.0-SNAPSHOT-jar-with-dependencies.jar \ #change to your jar file if your download is not the same version
--inputDir ./data/census \ -f ./data/mnist \
--batchSize 320 \ -b 320 \
--maxEpoch 20 \ -e 20
--dataset census
# Spark yarn cluster mode, please make sure the right HADOOP_CONF_DIR is set # Spark yarn cluster mode, please make sure the right HADOOP_CONF_DIR is set
${ANALYTICS_ZOO_HOME}/bin/spark-submit-scala-with-zoo.sh \ ${BIGDL_HOME}/bin/spark-submit-scala-with-bigdl.sh \
--master yarn \ --master yarn \
--deploy-mode cluster \ --deploy-mode cluster \
--executor-cores cores_per_executor \ --executor-cores 2 \
--num-executors executors_number \ --num-executors 2 \
--class com.intel.analytics.zoo.examples.recommendation.WideAndDeepExample \ --class com.intel.analytics.bigdl.dllib.models.lenet.Train \
dist/lib/analytics-zoo-bigdl_0.12.1-spark_2.4.3-0.9.0-jar-with-dependencies.jar \ #change to your jar file if your download is not spark_2.4.3-0.9.0 ${BIGDL_HOME}/jars/bigdl-dllib-spark_2.4.6-0.14.0-SNAPSHOT-jar-with-dependencies.jar \ #change to your jar file if your download is not the same version
--inputDir ./data/census \ -f ./data/mnist \
--batchSize 320 \ -b 320 \
--maxEpoch 20 \ -e 20
--dataset census
``` ```
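Two manual steps in the instructions above lend themselves to small shell helpers: restoring MNIST file names that an uncompress tool changed, and finding the dllib jar when the downloaded version differs from the one hard-coded in the commands. The sketch below is illustrative only. The function names `restore_mnist_names` and `find_dllib_jar` are hypothetical and do not ship with BigDL. It assumes the only rename is `-idx` becoming `.idx` (the one case mentioned in step 1), and that the jar follows the `bigdl-dllib-spark_*-jar-with-dependencies.jar` naming used in the commands.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; neither ships with BigDL.

# Rename files such as train-images.idx3-ubyte back to train-images-idx3-ubyte.
# Assumes the only change made by the uncompress tool is "-idx" -> ".idx".
restore_mnist_names() {
  local dir="$1" f base
  for f in "$dir"/*.idx?-ubyte; do
    [ -e "$f" ] || continue            # the glob matched nothing
    base="$(basename "$f")"
    mv -- "$f" "$dir/${base/.idx/-idx}"
  done
}

# Print the first dllib jar-with-dependencies found under <home>/jars.
find_dllib_jar() {
  local home="$1" jar
  for jar in "$home"/jars/bigdl-dllib-spark_*-jar-with-dependencies.jar; do
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  echo "no bigdl-dllib jar found under $home/jars" >&2
  return 1
}

# Usage:
#   restore_mnist_names ./data/mnist
#   JAR="$(find_dllib_jar "$BIGDL_HOME")"
```

The path printed by `find_dllib_jar` can then be passed as the application jar argument of the submit script in place of the hard-coded file name.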
--- ---
### **2. Build Analytics Zoo Applications** ### **2. Build BigDL Applications**
This section will show you how to build your own deep learning project with Analytics Zoo. This section will show you how to build your own deep learning project with BigDL.
#### **2.1 Add Analytics Zoo dependency** #### **2.1 Add BigDL dependency**
##### **2.1.1 official Release** ##### **2.1.1 official Release**
Currently, Analytics Zoo releases are hosted on maven central; below is an example to add the Analytics Zoo dependency to your own project: Currently, BigDL releases are hosted on maven central; below is an example to add the BigDL dllib dependency to your own project:
```xml ```xml
<dependency> <dependency>
<groupId>com.intel.analytics.zoo</groupId> <groupId>com.intel.analytics.bigdl</groupId>
<artifactId>analytics-zoo-bigdl_0.12.1-spark_2.4.3</artifactId> <artifactId>bigdl-dllib-spark_2.4.6</artifactId>
<version>0.9.0</version> <version>2.0.0</version>
</dependency> </dependency>
``` ```
You can find the other SPARK version [here](https://search.maven.org/search?q=analytics-zoo-bigdl), such as `spark_2.1.1`, `spark_2.2.1`, `spark_2.3.1`, `spark_3.0.0`. You can find the other SPARK version [here](https://search.maven.org/search?q=bigdl-dllib), such as `spark_3.1.2`.
SBT developers can use SBT developers can use
```sbt ```sbt
libraryDependencies += "com.intel.analytics.zoo" % "analytics-zoo-bigdl_0.12.1-spark_2.4.3" % "0.9.0" libraryDependencies += "com.intel.analytics.bigdl" % "bigdl-dllib-spark_2.4.6" % "2.0.0"
``` ```
##### **2.1.2 Nightly Build** ##### **2.1.2 Nightly Build**
Currently, Analytics Zoo nightly build is hosted on [SonaType](https://oss.sonatype.org/content/groups/public/com/intel/analytics/zoo/). Currently, BigDL nightly build is hosted on [SonaType](https://oss.sonatype.org/content/groups/public/com/intel/analytics/bigdl/).
To link your application with the latest Analytics Zoo nightly build, you should add some dependencies like [official releases](#11-official-release), but change `0.9.0` to the snapshot version (such as 0.10.0-snapshot), and add below repository to your pom.xml. To link your application with the latest BigDL nightly build, you should add some dependencies like [official releases](#11-official-release), but change `2.0.0` to the snapshot version (such as 0.14.0-snapshot), and add below repository to your pom.xml.
```xml ```xml
@@ -159,5 +164,5 @@ resolvers += "ossrh repository" at "https://oss.sonatype.org/content/repositorie
#### **2.2 Build a Scala project** #### **2.2 Build a Scala project**
To enable Analytics Zoo in project, you should add Analytics Zoo to your project's dependencies using maven or sbt. Here is a [simple MLP example](https://github.com/intel-analytics/zoo-tutorials/tree/master/scala/SimpleMlp) to show you how to use Analytics Zoo to build your own deep learning project using maven or sbt, and how to run the simple example in IDEA and spark-submit. To enable BigDL in project, you should add BigDL to your project's dependencies using maven or sbt. Here is a [simple MLP example](https://github.com/intel-analytics/zoo-tutorials/tree/master/scala/SimpleMlp) to show you how to use BigDL to build your own deep learning project using maven or sbt, and how to run the simple example in IDEA and spark-submit.