ipex-llm/docs/readthedocs/source/doc/Orca/Overview/orca.md
Shengsheng Huang f2e4c40cee change the readthedocs theme and reorg the sections (#6056)
* refactor toc

* refactor toc

* Change to pydata-sphinx-theme and update packages requirement list for ReadtheDocs

* Remove customized css for old theme

* Add index page to each top bar section and limit dropdown maximum to be 4

* Use js to change 'More' to 'Libraries'

* Add custom.css to conf.py for further css changes

* Add BigDL logo and search bar

* refactor toc

* refactor toc and add overview

* refactor toc and add overview

* refactor toc and add overview

* refactor get started

* add paper and video section

* add videos

* add grid columns in landing page

* add document roadmap to index

* reapply search bar and github icon commit

* reorg orca and chronos sections

* Test: weaken ads by js

* update: change left attrbute

* update: add comments

* update: change opacity to 0.7

* Remove useless theme template override for old theme

* Add sidebar releases component in the home page

* Remove sidebar search and restore top nav search button

* Add BigDL handouts

* Add back to homepage button to pages except from the home page

* Update releases contents & styles in left sidebar

* Add version badge to the top bar

* Test: weaken ads by js

* update: add comments

* remove landing page contents

* rfix chronos install

* refactor install

* refactor chronos section titles

* refactor nano index

* change chronos landing

* revise chronos landing page

* add document navigator to nano landing page

* revise install landing page

* Improve css of versions in sidebar

* Make handouts image pointing to a page in new tab

* add win guide to install

* add dliib installation

* revise title bar

* rename index files

* add index page for user guide

* add dllib and orca API

* update user guide landing page

* refactor side bar

* Remove extra style configuration of card components & make different card usage consistent

* Remove extra styles for Nano how-to guides

* Remove extra styles for Chronos how-to guides

* Remove dark mode for now

* Update index page description

* Add decision tree for choosing BigDL libraries in index page

* add dllib models api, revise core layers formats

* Change primary & info color in light mode

* Restyle card components

* Restructure Chronos landing page

* Update card style

* Update BigDL library selection decision tree

* Fix failed Chronos tutorials filter

* refactor PPML documents

* refactor and add friesian documents

* add friesian arch diagram

* update landing pages and fill key features guide index page

* Restyle link card component

* Style video frames in PPML sections

* Adjust Nano landing page

* put api docs to the last in index for convinience

* Make badge horizontal padding smaller & small changes

* Change the second letter of all header titles to be small capitalizd

* Small changes on Chronos index page

* Revise decision tree to make it smaller

* Update: try to change the position of ads.

* Bugfix: deleted nonexist file config

* Update: update ad JS/CSS/config

* Update: change ad.

* Update: delete my template and change files.

* Update: change chronos installation table color.

* Update: change table font color to --pst-color-primary-text

* Remove old contents in landing page sidebar

* Restyle badge for usage in card footer again

* Add quicklinks template on landing page sidebar

* add quick links

* Add scala logo

* move tf, pytorch out of the link

* change orca key features cards

* fix typo

* fix a mistake in wording

* Restyle badge for card footer

* Update decision tree

* Remove useless html templates

* add more api docs and update tutorials in dllib

* update chronos install using new style

* merge changes in nano doc from master

* fix quickstart links in sidebar quicklinks

* Make tables responsive

* Fix overflow in api doc

* Fix list indents problems in [User guide] section

* Further fixes to nested bullets contents in [User Guide] section

* Fix strange title in Nano 5-min doc

* Fix list indent problems in [DLlib] section

* Fix misnumbered list problems and other small fixes for [Chronos] section

* Fix list indent problems and other small fixes for [Friesian] section

* Fix list indent problem and other small fixes for [PPML] section

* Fix list indent problem for developer guide

* Fix list indent problem for [Cluster Serving] section

* fix dllib links

* Fix wrong relative link in section landing page

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
Co-authored-by: Juntao Luo <1072087358@qq.com>
2022-10-18 15:35:31 +08:00

2 KiB

Orca in 5 minutes

Overview

Most AI projects start with a Python notebook running on a single laptop; however, one usually needs to go through a mountain of pains to scale it to handle larger data set in a distributed fashion. The Orca library seamlessly scales out your single node Python notebook across large clusters (so as to process distributed Big Data).


Tensorflow Bite-sized Example

This section uses TensorFlow 1.15, and you should install TensorFlow before running this example:

pip install tensorflow==1.15

First, initialize Orca Context:

from bigdl.orca import init_orca_context, OrcaContext

# cluster_mode can be "local", "k8s" or "yarn"
sc = init_orca_context(cluster_mode="local", cores=4, memory="10g", num_nodes=1)

Next, perform data-parallel processing in Orca (supporting standard Spark Dataframes, TensorFlow Dataset, PyTorch DataLoader, Pandas, etc.):

from pyspark.sql.functions import array

spark = OrcaContext.get_spark_session()
df = spark.read.parquet(file_path)
df = df.withColumn('user', array('user')) \
       .withColumn('item', array('item'))

Finally, use sklearn-style Estimator APIs in Orca to perform distributed TensorFlow, PyTorch, Keras and BigDL training and inference:

from tensorflow import keras
from bigdl.orca.learn.tf.estimator import Estimator

user = keras.layers.Input(shape=[1])
item = keras.layers.Input(shape=[1])
feat = keras.layers.concatenate([user, item], axis=1)
predictions = keras.layers.Dense(2, activation='softmax')(feat)
model = keras.models.Model(inputs=[user, item], outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

est = Estimator.from_keras(keras_model=model)
est.fit(data=df,
        batch_size=64,
        epochs=4,
        feature_cols=['user', 'item'],
        label_cols=['label'])