Change the ReadtheDocs theme and reorganize the sections (#6056)

* refactor toc
* refactor toc
* Change to pydata-sphinx-theme and update packages requirement list for ReadtheDocs
* Remove customized css for old theme
* Add index page to each top bar section and limit dropdown maximum to be 4
* Use js to change 'More' to 'Libraries'
* Add custom.css to conf.py for further css changes
* Add BigDL logo and search bar
* refactor toc
* refactor toc and add overview
* refactor toc and add overview
* refactor toc and add overview
* refactor get started
* add paper and video section
* add videos
* add grid columns in landing page
* add document roadmap to index
* reapply search bar and github icon commit
* reorg orca and chronos sections
* Test: weaken ads by js
* update: change left attribute
* update: add comments
* update: change opacity to 0.7
* Remove useless theme template override for old theme
* Add sidebar releases component in the home page
* Remove sidebar search and restore top nav search button
* Add BigDL handouts
* Add back to homepage button to pages except from the home page
* Update releases contents & styles in left sidebar
* Add version badge to the top bar
* Test: weaken ads by js
* update: add comments
* remove landing page contents
* fix chronos install
* refactor install
* refactor chronos section titles
* refactor nano index
* change chronos landing
* revise chronos landing page
* add document navigator to nano landing page
* revise install landing page
* Improve css of versions in sidebar
* Make handouts image point to a page in a new tab
* add win guide to install
* add dllib installation
* revise title bar
* rename index files
* add index page for user guide
* add dllib and orca API
* update user guide landing page
* refactor side bar
* Remove extra style configuration of card components & make different card usage consistent
* Remove extra styles for Nano how-to guides
* Remove extra styles for Chronos how-to guides
* Remove dark mode for now
* Update index page description
* Add decision tree for choosing BigDL libraries in index page
* add dllib models api, revise core layers formats
* Change primary & info color in light mode
* Restyle card components
* Restructure Chronos landing page
* Update card style
* Update BigDL library selection decision tree
* Fix failed Chronos tutorials filter
* refactor PPML documents
* refactor and add friesian documents
* add friesian arch diagram
* update landing pages and fill key features guide index page
* Restyle link card component
* Style video frames in PPML sections
* Adjust Nano landing page
* put api docs to the last in index for convenience
* Make badge horizontal padding smaller & small changes
* Change the second letter of all header titles to be lowercase
* Small changes on Chronos index page
* Revise decision tree to make it smaller
* Update: try to change the position of ads.
* Bugfix: deleted nonexistent file config
* Update: update ad JS/CSS/config
* Update: change ad.
* Update: delete my template and change files.
* Update: change chronos installation table color.
* Update: change table font color to --pst-color-primary-text
* Remove old contents in landing page sidebar
* Restyle badge for usage in card footer again
* Add quicklinks template on landing page sidebar
* add quick links
* Add scala logo
* move tf, pytorch out of the link
* change orca key features cards
* fix typo
* fix a mistake in wording
* Restyle badge for card footer
* Update decision tree
* Remove useless html templates
* add more api docs and update tutorials in dllib
* update chronos install using new style
* merge changes in nano doc from master
* fix quickstart links in sidebar quicklinks
* Make tables responsive
* Fix overflow in api doc
* Fix list indent problems in [User Guide] section
* Further fixes to nested bullet contents in [User Guide] section
* Fix strange title in Nano 5-min doc
* Fix list indent problems in [DLlib] section
* Fix misnumbered list problems and other small fixes for [Chronos] section
* Fix list indent problems and other small fixes for [Friesian] section
* Fix list indent problem and other small fixes for [PPML] section
* Fix list indent problem for developer guide
* Fix list indent problem for [Cluster Serving] section
* fix dllib links
* Fix wrong relative link in section landing page

Co-authored-by: Yuwen Hu <yuwen.hu@intel.com>
Co-authored-by: Juntao Luo <1072087358@qq.com>
BIN  docs/readthedocs/image/friesian_architecture.png  (new file, 214 KiB)
BIN  docs/readthedocs/image/scala_logo.png  (new file, 795 B)
@ -11,9 +11,9 @@ cloudpickle==2.1.0
ray[tune]==1.9.2
ray==1.9.2
torch==1.9.0
Pygments==2.6.1
Pygments==2.7
setuptools==41.0.1
docutils==0.17
docutils==0.17.1
mock==1.0.1
pillow==5.4.1
sphinx==4.5.0

@ -21,7 +21,6 @@ alabaster>=0.7,<0.8,!=0.7.5
commonmark==0.8.1
recommonmark==0.5.0
readthedocs-sphinx-ext<2.2
sphinx_rtd_theme==1.0.0
scikit-learn==1.0.2
pystan==2.19.1.1
prophet==1.0.1

@ -41,3 +40,4 @@ nbsphinx==0.8.9
ipython==7.34.0
sphinx-design==0.2.0
nbsphinx-link==1.3.0
pydata-sphinx-theme==0.11.0
|
@ -11,7 +11,7 @@
|
|||
}
|
||||
|
||||
#table-1 tr, td{
|
||||
background-color: rgb(240, 241, 245);
|
||||
background-color: var(--pst-color-on-surface);
|
||||
height: 30px;
|
||||
border-width: 2px;
|
||||
border-style: solid;
|
||||
|
|
@ -26,7 +26,7 @@
|
|||
#table-1 td{
|
||||
font-size: 16px;
|
||||
font-family: Verdana;
|
||||
color: rgb(15, 24, 33);
|
||||
color: var(--pst-color-text-base);
|
||||
text-align: center;
|
||||
/* height: 56px;
|
||||
line-height: 56px; */
|
||||
@ -1,65 +1,63 @@
|
|||
/*Extends the docstring signature box.*/
|
||||
.rst-content dl:not(.docutils) dt {
|
||||
display: block;
|
||||
padding: 10px;
|
||||
word-wrap: break-word;
|
||||
padding-right: 100px;
|
||||
}
|
||||
/*Lists in an admonition note do not have awkward whitespace below.*/
|
||||
.rst-content .admonition-note .section ul {
|
||||
margin-bottom: 0px;
|
||||
}
|
||||
/*Properties become blue (classmethod, staticmethod, property)*/
|
||||
.rst-content dl dt em.property {
|
||||
color: #2980b9;
|
||||
text-transform: uppercase;
|
||||
/* change primary & info color for light mode*/
|
||||
html[data-theme="light"] {
|
||||
--pst-color-primary: rgb(1, 113, 195);
|
||||
--pst-color-info: rgb(1, 113, 195);
|
||||
}
|
||||
|
||||
.rst-content .section ol p,
|
||||
.rst-content .section ul p {
|
||||
margin-bottom: 0px;
|
||||
/* extra css variables */
|
||||
:root {
|
||||
--pst-color-info-tiny-opacity: rgba(1, 113, 195, 0.1);
|
||||
--pst-color-info-low-opacity: rgba(1, 113, 195, 0.25);
|
||||
}
|
||||
|
||||
div.sphx-glr-bigcontainer {
|
||||
display: inline-block;
|
||||
width: 100%;
|
||||
|
||||
/* align items in the left part of header to the ground*/
|
||||
.bd-header #navbar-start {
|
||||
align-items: end;
|
||||
}
|
||||
|
||||
td.tune-colab,
|
||||
th.tune-colab {
|
||||
border: 1px solid #dddddd;
|
||||
text-align: left;
|
||||
padding: 8px;
|
||||
/* for version badge, possible for other badges*/
|
||||
.version-badge{
|
||||
border: 1px solid var(--pst-color-primary);
|
||||
border-radius: 0.25rem;
|
||||
color: var(--pst-color-primary);
|
||||
padding: 0.1rem 0.25rem;
|
||||
font-size: var(--pst-font-size-milli);
|
||||
}
|
||||
|
||||
/* Adjustment to Sphinx Book Theme */
|
||||
.table td {
|
||||
/* Remove row spacing */
|
||||
padding: 0;
|
||||
/* for card components */
|
||||
.bd-content .sd-card {
|
||||
border: none;
|
||||
border-left: .2rem solid var(--pst-color-info-low-opacity);
|
||||
}
|
||||
|
||||
table {
|
||||
/* Force full width for all table */
|
||||
width: 136% !important;
|
||||
.bd-content .sd-card .sd-card-header{
|
||||
background-color: var(--pst-color-info-tiny-opacity);
|
||||
border: none;
|
||||
}
|
||||
|
||||
img.inline-figure {
|
||||
/* Override the display: block for img */
|
||||
display: inherit !important;
|
||||
.bigdl-link-card:hover{
|
||||
border-left: .2rem solid var(--pst-color-info);
|
||||
}
|
||||
|
||||
#version-warning-banner {
|
||||
/* Make version warning clickable */
|
||||
z-index: 1;
|
||||
/* for sphinx-design badge components (customized for usage in card footer)*/
|
||||
.sd-badge{
|
||||
padding: .35em 0em;
|
||||
font-size: 0.9em;
|
||||
}
|
||||
|
||||
span.rst-current-version > span.fa.fa-book {
|
||||
/* Move the book icon away from the top right
|
||||
* corner of the version flyout menu */
|
||||
margin: 10px 0px 0px 5px;
|
||||
/* for landing page side bar */
|
||||
.bigdl-quicklinks-section-nav{
|
||||
padding-bottom: 0.5rem;
|
||||
padding-left: 1rem;
|
||||
}
|
||||
|
||||
/* Adjustment to Version block */
|
||||
.rst-versions {
|
||||
z-index: 1200 !important;
|
||||
.bigdl-quicklinks-section-title{
|
||||
color: var(--pst-color-primary);
|
||||
}
|
||||
|
||||
/* force long parameter definition (which occupy a whole line)
|
||||
to break in api documents for class/method */
|
||||
.sig-object{
|
||||
overflow-wrap: break-word;
|
||||
}
|
||||
|
|
@ -232,8 +232,8 @@ function refresh_cmd(){
|
|||
|
||||
//set the color of selected buttons
|
||||
function set_color(id){
|
||||
$("#"+id).parent().css("background-color","rgb(74, 106, 237)");
|
||||
$("#"+id).css("color","white");
|
||||
$("#"+id).parent().css("background-color","var(--pst-color-primary)");
|
||||
$("#"+id).css("color","var(--pst-color-primary-text)");
|
||||
$("#"+id).addClass("isset");
|
||||
}
|
||||
|
||||
|
|
@ -241,7 +241,7 @@ function set_color(id){
|
|||
function reset_color(list){
|
||||
for (btn in list){
|
||||
$("#"+list[btn]).parent().css("background-color","transparent");
|
||||
$("#"+list[btn]).css("color","black");
|
||||
$("#"+list[btn]).css("color","var(--pst-color-text-base)");
|
||||
$("#"+list[btn]).removeClass("isset");
|
||||
}
|
||||
}
|
||||
|
|
@ -254,7 +254,7 @@ function disable(list){
|
|||
}
|
||||
reset_color(list);
|
||||
for(btn in list){
|
||||
$("#"+list[btn]).parent().css("background-color","rgb(133, 133, 133)");
|
||||
$("#"+list[btn]).parent().css("background-color","var(--pst-color-muted)");
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -303,14 +303,14 @@ $(document).on('click',"button",function(){
|
|||
$(document).on({
|
||||
mouseenter: function () {
|
||||
if($(this).prop("disabled")!=true){
|
||||
$(this).parent().css("background-color","rgb(74, 106, 237)");
|
||||
$(this).css("color","white");
|
||||
$(this).parent().css("background-color","var(--pst-color-primary)");
|
||||
$(this).css("color","var(--pst-color-primary-text)");
|
||||
}
|
||||
},
|
||||
mouseleave: function () {
|
||||
if(!$(this).hasClass("isset") && $(this).prop("disabled")!=true){
|
||||
$(this).parent().css("background-color","transparent");
|
||||
$(this).css("color","black");
|
||||
$(this).css("color","var(--pst-color-text-base)");
|
||||
}
|
||||
}
|
||||
}, "button");
|
||||
|
|
|
|||
|
|
@ -24,8 +24,9 @@ function disCheck(ids){
|
|||
//event when click the checkboxes
|
||||
$(".checkboxes").click(function(){
|
||||
//get all checked values
|
||||
//class checkboxes is specified to avoid selecting toctree checkboxes (arrows)
|
||||
var vals = [];
|
||||
$('input:checkbox:checked').each(function (index, item) {
|
||||
$('.checkboxes:input:checkbox:checked').each(function (index, item) {
|
||||
vals.push($(this).val());
|
||||
});
|
||||
|
||||
|
|
|
|||
docs/readthedocs/source/_static/js/custom.js (new file, 26 lines)
@ -0,0 +1,26 @@
$(document).ready(function(){
    // $('.btn.dropdown-toggle.nav-item').text('Libraries'); // change text for dropdown menu in header from More to Libraries

    // hide the original left sidebar ads display
    $('#ethical-ad-placement').css({
        "display":"none"
    });

    // manually add the ads to the end of content
    $(".bd-article").append(
        "<br />\
        <div style='display:flex;justify-content:center;'>\
            <div\
                id='ethical-ad-placement'\
                class='horizontal'\
                data-ea-publisher='readthedocs'\
                data-ea-type='image'\
            ></div>\
        </div>"
    );

    // make tables responsive
    $("table").wrap(
        "<div style='overflow-x:auto;'></div>"
    );
})
|
||||
|
|
@ -1,60 +0,0 @@
|
|||
<!--
|
||||
Copyright 2016 The BigDL Authors.
|
||||
|
||||
Licensed under the Apache License, Version 2.0 (the "License");
|
||||
you may not use this file except in compliance with the License.
|
||||
You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
See the License for the specific language governing permissions and
|
||||
limitations under the License.
|
||||
|
||||
the following code is adapted from https://github.com/readthedocs/sphinx_rtd_theme/
|
||||
|
||||
The MIT License (MIT)
|
||||
|
||||
Copyright (c) 2013-2018 Dave Snider, Read the Docs, Inc. & contributors
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy of
|
||||
this software and associated documentation files (the "Software"), to deal in
|
||||
the Software without restriction, including without limitation the rights to
|
||||
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
|
||||
the Software, and to permit persons to whom the Software is furnished to do so,
|
||||
subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
|
||||
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
|
||||
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
|
||||
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
|
||||
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
-->
|
||||
|
||||
{%- extends "sphinx_rtd_theme/breadcrumbs.html" %}
|
||||
|
||||
<!--Change "Edit on Github" button on top-right corner to "Edit this page" in every page-->
|
||||
{%- block breadcrumbs_aside %}
|
||||
<li class="wy-breadcrumbs-aside">
|
||||
{%- if hasdoc(pagename) and display_vcs_links %}
|
||||
{%- if display_github %}
|
||||
{%- if check_meta and 'github_url' in meta %}
|
||||
<!-- User defined GitHub URL -->
|
||||
<a href="{{ meta['github_url'] }}" class="fa fa-github"> {{ _('Edit this page') }}</a>
|
||||
{%- else %}
|
||||
<a href="https://{{ github_host|default("github.com") }}/{{ github_user }}/{{ github_repo }}/{{ theme_vcs_pageview_mode or "blob" }}/{{ github_version }}{{ conf_py_path }}{{ pagename }}{{ page_source_suffix }}" class="fa fa-github"> {{ _('Edit this page') }}</a>
|
||||
{%- endif %}
|
||||
{%- elif show_source and source_url_prefix %}
|
||||
<a href="{{ source_url_prefix }}{{ pagename }}{{ page_source_suffix }}">{{ _('View page source') }}</a>
|
||||
{%- elif show_source and has_source and sourcename %}
|
||||
<a href="{{ pathto('_sources/' + sourcename, true)|e }}" rel="nofollow"> {{ _('View page source') }}</a>
|
||||
{%- endif %}
|
||||
{%- endif %}
|
||||
</li>
|
||||
{%- endblock %}
|
||||
|
|
@ -0,0 +1,6 @@
|
|||
{% set home_href = pathto(master_doc) %}
|
||||
<div>
|
||||
<a href="{{ home_href }}">
|
||||
<strong>Back to Homepage ↵</strong>
|
||||
</a>
|
||||
</div>
|
||||
docs/readthedocs/source/_templates/sidebar_quicklinks.html (new file, 68 lines)
|
|
@ -0,0 +1,68 @@
|
|||
<nav class="bd-links">
|
||||
<p class="bd-links__title">Quick Links</p>
|
||||
<div class="navbar-nav">
|
||||
<strong class="bigdl-quicklinks-section-title">Orca QuickStart</Q></strong>
|
||||
<ul class="nav bigdl-quicklinks-section-nav">
|
||||
<li>
|
||||
<a href="doc/UseCase/spark-dataframe.html">Use Spark Dataframe for Deep Learning</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Orca/QuickStart/orca-pytorch-distributed-quickstart.html">Distributed PyTorch using Orca</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Orca/QuickStart/orca-autoxgboost-quickstart.html">Use AutoXGBoost to tune XGBoost parameters automatically</a>
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
<strong class="bigdl-quicklinks-section-title">Nano QuickStart</strong>
|
||||
<ul class="nav bigdl-quicklinks-section-nav" >
|
||||
<li>
|
||||
<a href="doc/Nano/QuickStart/pytorch_train_quickstart.html">PyTorch Training Acceleration</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Nano/QuickStart/pytorch_quantization_inc_onnx.html">PyTorch Inference Quantization with ONNXRuntime Acceleration </a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Nano/QuickStart/pytorch_openvino.html">PyTorch Inference Acceleration using OpenVINO</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Nano/QuickStart/tensorflow_train_quickstart.html">Tensorflow Training Acceleration</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Nano/QuickStart/tensorflow_quantization_quickstart.html">Tensorflow Quantization Acceleration</a>
|
||||
</li>
|
||||
</ul>
|
||||
<strong class="bigdl-quicklinks-section-title">DLlib QuickStart</strong>
|
||||
<ul class="nav bigdl-quicklinks-section-nav" >
|
||||
<li>
|
||||
<a href="doc/DLlib/QuickStart/python-getting-started.html">Python QuickStart</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/DLlib/QuickStart/scala-getting-started.html">Scala QuickStart</a>
|
||||
</li>
|
||||
</ul>
|
||||
<strong class="bigdl-quicklinks-section-title">Chronos QuickStart</strong>
|
||||
<ul class="nav bigdl-quicklinks-section-nav" >
|
||||
<li>
|
||||
<a href="doc/Chronos/QuickStart/chronos-tsdataset-forecaster-quickstart.html">Basic Forecasting</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Chronos/QuickStart/chronos-autotsest-quickstart.html">Forecasting using AutoML</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/Chronos/QuickStart/chronos-anomaly-detector.html">Anomaly Detection</a>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<strong class="bigdl-quicklinks-section-title">PPML QuickStart</strong>
|
||||
<ul class="nav bigdl-quicklinks-section-nav" >
|
||||
<li>
|
||||
<a href="doc/PPML/Overview/quicktour.html">Hello World Example</a>
|
||||
</li>
|
||||
<li>
|
||||
<a href="doc/PPML/QuickStart/end-to-end.html">End-to-End Example</a>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
</div>
|
||||
</nav>
|
||||
docs/readthedocs/source/_templates/version_badge.html (new file, 3 lines)
|
|
@ -0,0 +1,3 @@
|
|||
<div class="version-badge" style="margin-bottom: 2px;">
|
||||
{{ release }}
|
||||
</div>
|
||||
|
|
@ -1,66 +1,186 @@
|
|||
root: index
|
||||
subtrees:
|
||||
- caption: Quick Start
|
||||
entries:
|
||||
- file: doc/Orca/QuickStart/orca-tf-quickstart
|
||||
- file: doc/Orca/QuickStart/orca-keras-quickstart
|
||||
- file: doc/Orca/QuickStart/orca-tf2keras-quickstart
|
||||
- file: doc/Orca/QuickStart/orca-pytorch-quickstart
|
||||
- file: doc/Ray/QuickStart/ray-quickstart
|
||||
|
||||
- caption: User Guide
|
||||
entries:
|
||||
- entries:
|
||||
- file: doc/UserGuide/index
|
||||
title: 'User guide'
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/UserGuide/python
|
||||
- file: doc/UserGuide/scala
|
||||
- file: doc/UserGuide/win
|
||||
- file: doc/UserGuide/colab
|
||||
- file: doc/UserGuide/docker
|
||||
- file: doc/UserGuide/colab
|
||||
- file: doc/UserGuide/hadoop
|
||||
- file: doc/UserGuide/k8s
|
||||
- file: doc/UserGuide/databricks
|
||||
- file: doc/UserGuide/develop
|
||||
- file: doc/UserGuide/known_issues
|
||||
|
||||
- caption: Nano
|
||||
entries:
|
||||
- file: doc/Nano/Overview/nano
|
||||
- file: doc/Nano/QuickStart/pytorch_train
|
||||
- file: doc/Nano/QuickStart/pytorch_inference
|
||||
- file: doc/Nano/QuickStart/tensorflow_train
|
||||
- file: doc/Nano/QuickStart/tensorflow_inference
|
||||
- file: doc/Nano/QuickStart/hpo
|
||||
- file: doc/Nano/QuickStart/index
|
||||
- file: doc/Nano/Howto/index
|
||||
- file: doc/Nano/Overview/known_issues
|
||||
|
||||
- caption: DLlib
|
||||
entries:
|
||||
- file: doc/DLlib/Overview/dllib
|
||||
- file: doc/DLlib/Overview/keras-api
|
||||
- file: doc/DLlib/Overview/nnframes
|
||||
- entries:
|
||||
- file: doc/Application/powered-by
|
||||
title: "Powered by"
|
||||
|
||||
- caption: Orca
|
||||
entries:
|
||||
|
||||
- entries:
|
||||
- file: doc/Orca/index
|
||||
title: "Orca"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Orca/Overview/orca
|
||||
title: "Orca User Guide"
|
||||
title: "Orca in 5 miniutes"
|
||||
- file: doc/Orca/Overview/install
|
||||
title: "Installation"
|
||||
- file: doc/Orca/Overview/index
|
||||
title: "Key Features"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Orca/Overview/orca-context
|
||||
- file: doc/Orca/Overview/data-parallel-processing
|
||||
- file: doc/Orca/Overview/distributed-training-inference
|
||||
- file: doc/Orca/Overview/distributed-tuning
|
||||
- file: doc/Ray/Overview/ray
|
||||
- file: doc/Orca/Overview/ray
|
||||
- file: doc/Orca/QuickStart/index
|
||||
title: "Tutorials"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/UseCase/spark-dataframe
|
||||
- file: doc/UseCase/xshards-pandas
|
||||
- file: doc/Orca/QuickStart/ray-quickstart
|
||||
- file: doc/Orca/QuickStart/orca-pytorch-distributed-quickstart
|
||||
- file: doc/Orca/QuickStart/orca-autoestimator-pytorch-quickstart
|
||||
- file: doc/Orca/QuickStart/orca-autoxgboost-quickstart
|
||||
- file: doc/Orca/Overview/known_issues
|
||||
title: "Tips and Known Issues"
|
||||
- file: doc/PythonAPI/Orca/index
|
||||
title: "API Reference"
|
||||
|
||||
- caption: Chronos
|
||||
entries:
|
||||
- file: doc/Chronos/Overview/chronos
|
||||
|
||||
|
||||
- entries:
|
||||
- file: doc/Nano/index
|
||||
title: "Nano"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Nano/Overview/nano
|
||||
title: "Nano in 5 minutes"
|
||||
- file: doc/Nano/Overview/install
|
||||
title: "Installation"
|
||||
- file: doc/Nano/Overview/index
|
||||
title: "Key Features"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Nano/Overview/pytorch_train
|
||||
- file: doc/Nano/Overview/pytorch_inference
|
||||
- file: doc/Nano/Overview/tensorflow_train
|
||||
- file: doc/Nano/Overview/tensorflow_inference
|
||||
- file: doc/Nano/Overview/hpo
|
||||
- file: doc/Nano/QuickStart/index
|
||||
title: "Tutorials"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Nano/QuickStart/pytorch_train_quickstart
|
||||
- file: doc/Nano/QuickStart/pytorch_onnxruntime
|
||||
- file: doc/Nano/QuickStart/pytorch_openvino
|
||||
- file: doc/Nano/QuickStart/pytorch_quantization_inc_onnx
|
||||
- file: doc/Nano/QuickStart/pytorch_quantization_inc
|
||||
- file: doc/Nano/QuickStart/pytorch_quantization_openvino
|
||||
- file: doc/Nano/QuickStart/tensorflow_train_quickstart
|
||||
- file: doc/Nano/QuickStart/tensorflow_embedding
|
||||
- file: doc/Nano/QuickStart/tensorflow_quantization_quickstart
|
||||
- file: doc/Nano/Howto/index
|
||||
title: "How-to Guides"
|
||||
- file: doc/Nano/Overview/known_issues
|
||||
title: "Tips and Known Issues"
|
||||
- file: doc/PythonAPI/Nano/index
|
||||
title: "API Reference"
|
||||
|
||||
|
||||
|
||||
- entries:
|
||||
- file: doc/DLlib/index
|
||||
title: "DLlib"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/DLlib/Overview/dllib
|
||||
title: "DLLib in 5 minutes"
|
||||
- file: doc/DLlib/Overview/install
|
||||
title: "Installation"
|
||||
- file: doc/DLlib/Overview/index
|
||||
title: "Key Features"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/DLlib/Overview/keras-api
|
||||
- file: doc/DLlib/Overview/nnframes
|
||||
- file: doc/DLlib/Overview/visualization
|
||||
title: "Visualization"
|
||||
- file: doc/DLlib/QuickStart/index
|
||||
title: "Tutorials"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/DLlib/QuickStart/python-getting-started
|
||||
title: "Python Quick Start"
|
||||
- file: doc/DLlib/QuickStart/scala-getting-started
|
||||
title: "Scala Quick Start"
|
||||
- file: doc/PythonAPI/DLlib/index
|
||||
title: "API Reference"
|
||||
|
||||
- entries:
|
||||
- file: doc/Chronos/index
|
||||
title: "Chronos"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Chronos/Overview/quick-tour
|
||||
- file: doc/Chronos/Howto/index
|
||||
- file: doc/Chronos/QuickStart/index
|
||||
title: "Chronos in 5 minutes"
|
||||
- file: doc/Chronos/Overview/install
|
||||
title: "Installation"
|
||||
- file: doc/Chronos/Overview/deep_dive
|
||||
title: "Key Features"
|
||||
- file: doc/Chronos/Howto/index
|
||||
title: "How-to Guides"
|
||||
- file: doc/Chronos/QuickStart/index
|
||||
title: "Tutorials"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Chronos/QuickStart/chronos-tsdataset-forecaster-quickstart
|
||||
- file: doc/Chronos/QuickStart/chronos-autotsest-quickstart
|
||||
- file: doc/Chronos/QuickStart/chronos-anomaly-detector
|
||||
- file: doc/Chronos/Overview/chronos_known_issue
|
||||
title: "Tips and Known Issues"
|
||||
- file: doc/PythonAPI/Chronos/index
|
||||
title: "API Reference"
|
||||
|
||||
- caption: PPML
|
||||
entries:
|
||||
- entries:
|
||||
- file: doc/Friesian/index
|
||||
title: "Friesian"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Friesian/intro
|
||||
title: "Introduction"
|
||||
- file: doc/Friesian/serving
|
||||
title: "Serving"
|
||||
- file: doc/Friesian/examples
|
||||
title: "Use Cases"
|
||||
- file: doc/PythonAPI/Friesian/index
|
||||
title: "API Reference"
|
||||
|
||||
- entries:
|
||||
- file: doc/PPML/index
|
||||
title: "PPML"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/PPML/Overview/intro
|
||||
title: "PPML Introduction"
|
||||
- file: doc/PPML/Overview/userguide
|
||||
title: 'User Guide'
|
||||
- file: doc/PPML/Overview/examples
|
||||
title: "Tutorials"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/PPML/Overview/quicktour
|
||||
- file: doc/PPML/QuickStart/end-to-end
|
||||
- file: doc/PPML/Overview/misc
|
||||
title: "Advanced Topics"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/PPML/Overview/ppml
|
||||
- file: doc/PPML/Overview/trusted_big_data_analytics_and_ml
|
||||
- file: doc/PPML/Overview/trusted_fl
|
||||
|
|
@ -72,34 +192,33 @@ subtrees:
|
|||
- file: doc/PPML/QuickStart/tpc-ds_with_sparksql_on_k8s
|
||||
- file: doc/PPML/Overview/azure_ppml
|
||||
|
||||
- caption: Serving
|
||||
entries:
|
||||
|
||||
- entries:
|
||||
- file: doc/UserGuide/develop
|
||||
title: "Developer guide"
|
||||
|
||||
|
||||
- entries:
|
||||
- file: doc/Serving/index
|
||||
title: "Cluster serving"
|
||||
subtrees:
|
||||
- entries:
|
||||
- file: doc/Serving/Overview/serving.md
|
||||
title: "User Guide"
|
||||
- file: doc/Serving/QuickStart/serving-quickstart
|
||||
title: "Serving in 5 miniutes"
|
||||
- file: doc/Serving/ProgrammingGuide/serving-installation
|
||||
- file: doc/Serving/ProgrammingGuide/serving-start
|
||||
- file: doc/Serving/ProgrammingGuide/serving-inference
|
||||
- file: doc/Serving/Example/example
|
||||
title: "Examples"
|
||||
- file: doc/Serving/FAQ/faq
|
||||
- file: doc/Serving/FAQ/contribute-guide
|
||||
|
||||
- caption: Common Use Case
|
||||
entries:
|
||||
- file: doc/Orca/QuickStart/orca-pytorch-distributed-quickstart
|
||||
- file: doc/UseCase/spark-dataframe
|
||||
- file: doc/UseCase/xshards-pandas
|
||||
- file: doc/Orca/QuickStart/orca-autoestimator-pytorch-quickstart
|
||||
- file: doc/Orca/QuickStart/orca-autoxgboost-quickstart
|
||||
|
||||
- caption: Python API
|
||||
entries:
|
||||
- file: doc/PythonAPI/Orca/orca
|
||||
- file: doc/PythonAPI/Friesian/feature
|
||||
- file: doc/PythonAPI/Chronos/index
|
||||
- file: doc/PythonAPI/Nano/index
|
||||
|
||||
- caption: Real-World Application
|
||||
entries:
|
||||
- entries:
|
||||
- file: doc/Application/presentations
|
||||
title: "Presentations"
|
||||
|
||||
- entries:
|
||||
- file: doc/Application/blogs
|
||||
- file: doc/Application/powered-by
|
||||
|
|
|
|||
|
|
@ -31,19 +31,39 @@ sys.path.insert(0, os.path.abspath("../../../python/serving/src/"))
|
|||
sys.path.insert(0, os.path.abspath("../../../python/nano/src/"))
|
||||
|
||||
# -- Project information -----------------------------------------------------
|
||||
import sphinx_rtd_theme
|
||||
html_theme = "sphinx_rtd_theme"
|
||||
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
|
||||
#html_theme = "sphinx_book_theme"
|
||||
html_theme = "pydata_sphinx_theme"
|
||||
html_theme_options = {
|
||||
"repository_url": "https://github.com/intel-analytics/BigDL",
|
||||
"use_repository_button": True,
|
||||
"use_issues_button": True,
|
||||
"use_edit_page_button": True,
|
||||
"path_to_docs": "doc/source",
|
||||
"home_page_in_toc": True,
|
||||
"header_links_before_dropdown": 8,
|
||||
"icon_links": [
|
||||
{
|
||||
"name": "GitHub Repository for BigDL",
|
||||
"url": "https://github.com/intel-analytics/BigDL",
|
||||
"icon": "fa-brands fa-square-github",
|
||||
"type": "fontawesome",
|
||||
}
|
||||
],
|
||||
"navbar_start": ["navbar-logo.html", "version_badge.html"],
|
||||
"navbar_end": ["navbar-icon-links.html"], # remove dark mode for now
|
||||
}
|
||||
|
||||
# add search bar to side bar
|
||||
html_sidebars = {
|
||||
"index": [
|
||||
"sidebar_quicklinks.html"
|
||||
],
|
||||
"**": ["sidebar_backbutton.html","sidebar-nav-bs.html"]
|
||||
}
|
||||
|
||||
# remove dark mode for now
|
||||
html_context = {
|
||||
"default_mode": "light"
|
||||
}
|
||||
|
||||
html_logo = "../image/bigdl_logo.png"
|
||||
|
||||
# hard code it for now, may change it to read from installed bigdl
|
||||
release = "latest"
|
||||
|
||||
# The suffix of source filenames.
|
||||
from recommonmark.parser import CommonMarkParser
|
||||
source_suffix = {'.rst': 'restructuredtext',
|
||||
|
|
@ -92,7 +112,8 @@ extensions = [
|
|||
'sphinx_external_toc',
|
||||
'sphinx_design',
|
||||
'nbsphinx',
|
||||
'nbsphinx_link'
|
||||
'nbsphinx_link',
|
||||
'sphinx.ext.graphviz' # for embedded graphviz diagram
|
||||
]
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
|
|
@ -136,6 +157,13 @@ exclude_patterns = ['_build']
|
|||
# relative to this directory. They are copied after the builtin static files,
|
||||
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||
html_static_path = ['_static']
|
||||
# add js/css for customizing each page
|
||||
html_js_files = [
|
||||
'js/custom.js',
|
||||
]
|
||||
html_css_files = [
|
||||
'css/custom.css',
|
||||
]
|
||||
|
||||
# Custom sidebar templates, must be a dictionary that maps document names
|
||||
# to template names.
|
||||
|
|
@ -247,3 +275,6 @@ def setup(app):
|
|||
|
||||
# disable notebook execution
|
||||
nbsphinx_execute = 'never'
|
||||
|
||||
# make output of graphviz diagram to svg
|
||||
graphviz_output_format = 'svg'
|
||||
docs/readthedocs/source/doc/Application/index.rst (new file, 2 lines)
|
|
@ -0,0 +1,2 @@
|
|||
Real-World Application
|
||||
=========================
|
||||
|
|
@ -97,15 +97,15 @@ After the Jupyter Notebook service is successfully started, you can connect to t
|
|||
You should shut down the BigDL Docker container after using it.
|
||||
1. First, use `ctrl+p+q` to quit the container when you are still in it.
|
||||
2. Then, you can list all the active Docker containers by command line:
|
||||
```bash
|
||||
sudo docker ps
|
||||
```
|
||||
You will see your docker containers:
|
||||
```bash
|
||||
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
|
||||
40de2cdad025 chronos-nightly:b1 "/opt/work/" 3 hours ago Up 3 hours upbeat_al
|
||||
```
|
||||
```bash
|
||||
sudo docker ps
|
||||
```
|
||||
You will see your docker containers:
|
||||
```bash
|
||||
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
|
||||
40de2cdad025 chronos-nightly:b1 "/opt/work/" 3 hours ago Up 3 hours upbeat_al
|
||||
```
|
||||
3. Shut down the corresponding docker container by its ID:
|
||||
```bash
|
||||
sudo docker rm -f 40de2cdad025
|
||||
```
|
||||
```bash
|
||||
sudo docker rm -f 40de2cdad025
|
||||
```
|
||||
|
|
|
|||
|
|
@ -1 +1 @@
[SVG source diff omitted: a documentation illustration is redrawn — the canvas grows from 535x368 to 1320x990 and the palette changes from #4472C4/#70AD47/#FF0000 to the new theme colors #0171C3/#28A745/#DC3545. Before: 2.5 KiB, after: 3 KiB.]
@ -1 +1 @@
[SVG source diff omitted: a second illustration is redrawn — the canvas grows from 551x416 to 1320x990 and the palette changes from #4472C4/#FFC000 to #0171C3/#EE9040. Before: 2.2 KiB, after: 2.7 KiB.]
[Image diff: before 4.9 KiB, after 5.9 KiB.]
|
|
@ -1,4 +1,4 @@
|
|||
# Time Series Anomaly Detection Overview
|
||||
# Anomaly Detection
|
||||
|
||||
Anomaly Detection detects abnormal samples in a given time series. _Chronos_ provides a set of unsupervised anomaly detectors.
|
||||
|
||||
|
|
@ -23,7 +23,7 @@ DBScanDetector uses the DBSCAN clustering algorithm for anomaly detection.
|
|||
|
||||
```eval_rst
|
||||
.. note::
|
||||
Users may install `scikit-learn-intelex` to accelerate this detector. Chronos will detect if `scikit-learn-intelex` is installed to decide if using it. More details please refer to: https://intel.github.io/scikit-learn-intelex/installation.html
|
||||
Users may install ``scikit-learn-intelex`` to accelerate this detector. Chronos will detect if ``scikit-learn-intelex`` is installed to decide if using it. More details please refer to: https://intel.github.io/scikit-learn-intelex/installation.html
|
||||
```
|
||||
|
||||
View anomaly detection [notebook][AIOps_anomaly_detect_unsupervised] and [DBScanDetector API Doc](../../PythonAPI/Chronos/anomaly_detectors.html#chronos-model-anomaly-dbscan-detector) for more details.
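
For orientation, here is a minimal usage sketch added by the editor; the import path and the constructor arguments are assumptions patterned on the API document linked above, so please verify them there.

```python
import numpy as np
from bigdl.chronos.detector.anomaly import DBScanDetector  # import path assumed

# a toy 1-D series with one obvious outlier injected
y = np.sin(np.arange(400) / 10.0)
y[100] = 10.0

# eps / min_samples are ordinary DBSCAN hyper-parameters (values are illustrative)
detector = DBScanDetector(eps=0.3, min_samples=6)
detector.fit(y)
print(detector.anomaly_indexes())  # indexes of samples flagged as anomalous
```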
|
||||
|
|
|
|||
|
|
@ -1,4 +1,4 @@
|
|||
# Time Series Processing and Feature Engineering Overview
|
||||
# Data Processing and Feature Engineering
|
||||
|
||||
Time series data is a special data formulation with its specific operations. _Chronos_ provides [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) as a time series dataset abstract for data processing (e.g. impute, deduplicate, resample, scale/unscale, roll sampling) and auto feature engineering (e.g. datetime feature, aggregation feature). Chronos also provides [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) with same(or similar) API for distributed and parallelized data preprocessing on large data.
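
A minimal sketch of the typical `TSDataset` workflow, added for illustration; the column names and the particular processing calls are assumptions, so treat this as a pattern rather than a verbatim recipe.

```python
import pandas as pd
from bigdl.chronos.data import TSDataset  # import path assumed

# a toy dataframe with one datetime column and one target column
df = pd.DataFrame({
    "timestamp": pd.date_range("2022-01-01", periods=200, freq="H"),
    "value": range(200),
})

tsdata = TSDataset.from_pandas(df, dt_col="timestamp", target_col="value")
tsdata.impute()          # fill missing values
tsdata.gen_dt_feature()  # auto-generate datetime features (assumed helper name)
```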
|
||||
|
||||
|
|
@ -176,7 +176,7 @@ Other than historical target data and other extra feature provided by users, som
|
|||
A time series dataset needs to be sampled and exported as a numpy ndarray or dataloader before it can be used in machine learning and deep learning models (e.g. forecasters, anomaly detectors, auto models, etc.).
|
||||
```eval_rst
|
||||
.. warning::
|
||||
You don't need to call any sampling or exporting methods introduced in this section when using `AutoTSEstimator`.
|
||||
You don't need to call any sampling or exporting methods introduced in this section when using ``AutoTSEstimator``.
|
||||
```
|
||||
### **6.1 Roll sampling**
|
||||
Roll sampling (or sliding window sampling) is useful when you want to train a RR type supervised deep learning forecasting model. It works as the [diagram](#RR-forecast-image) shows.
|
||||
|
|
@ -187,11 +187,11 @@ Please refer to the API doc [`roll`](../../PythonAPI/Chronos/tsdataset.html#bigd
|
|||
|
||||
```eval_rst
|
||||
.. note::
|
||||
**Difference between `roll` and `to_torch_data_loader`**:
|
||||
**Difference between** ``roll`` **and** ``to_torch_data_loader``:
|
||||
|
||||
`.roll(...)` performs the rolling before RR forecasters/auto models training while `.to_torch_data_loader(...)` performs rolling during the training.
|
||||
``.roll(...)`` performs the rolling before RR forecasters/auto models training while ``.to_torch_data_loader(...)`` performs rolling during the training.
|
||||
|
||||
It is fine to use either of them when you have a relatively small dataset (less than 1G). `.to_torch_data_loader(...)` is recommended when you have a large dataset (larger than 1G) to save memory usage.
|
||||
It is fine to use either of them when you have a relatively small dataset (less than 1G). ``.to_torch_data_loader(...)`` is recommended when you have a large dataset (larger than 1G) to save memory usage.
|
||||
```
|
||||
|
||||
```eval_rst
|
||||
|
|
|
|||
|
|
@ -1,4 +1,4 @@
|
|||
# Time Series Forecasting Overview
|
||||
# Time Series Forecasting
|
||||
|
||||
_Chronos_ provides both deep learning/machine learning models and traditional statistical models for forecasting.
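
As a quick, editor-added illustration of the deep learning side, here is a hedged sketch using `TCNForecaster`; the constructor arguments mirror a univariate lookback-24 / horizon-4 setup and are assumptions, so consult the forecaster API documentation for the exact signature.

```python
import numpy as np
from bigdl.chronos.forecaster.tcn_forecaster import TCNForecaster  # import path assumed

# toy rolled data: 100 samples, lookback 24, horizon 4, a single feature
x = np.random.rand(100, 24, 1).astype(np.float32)
y = np.random.rand(100, 4, 1).astype(np.float32)

forecaster = TCNForecaster(past_seq_len=24, future_seq_len=4,
                           input_feature_num=1, output_feature_num=1)
forecaster.fit((x, y), epochs=1)
pred = forecaster.predict(x)  # expected shape: (100, 4, 1)
```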
|
||||
|
||||
|
|
@ -67,11 +67,11 @@ For AutoTS Pipeline, we will leverage `AutoTSEstimator`, `TSPipeline` and prefer
|
|||
3. Use the returned `TSPipeline` for further development.
|
||||
```eval_rst
|
||||
.. warning::
|
||||
`AutoTSTrainer` workflow has been deprecated, no feature updates or performance improvement will be carried out. Users of `AutoTSTrainer` may refer to `Chronos API doc <https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Chronos/autots.html>`_.
|
||||
``AutoTSTrainer`` workflow has been deprecated, no feature updates or performance improvement will be carried out. Users of ``AutoTSTrainer`` may refer to `Chronos API doc <https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Chronos/autots.html>`_.
|
||||
```
|
||||
```eval_rst
|
||||
.. note::
|
||||
`AutoTSEstimator` currently only support pytorch backend.
|
||||
``AutoTSEstimator`` currently only supports the PyTorch backend.
|
||||
```
|
||||
View [Quick Start](../QuickStart/chronos-autotsest-quickstart.html) for a more detailed example.
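
To make the three steps above concrete, here is a heavily condensed, editor-added sketch; the import paths, the search-space helper and the argument names are assumptions patterned on the quickstart referenced above, so verify them against that page before use.

```python
import pandas as pd
from bigdl.orca import init_orca_context, stop_orca_context
from bigdl.orca.automl import hp                    # search-space helpers (path assumed)
from bigdl.chronos.data import TSDataset
from bigdl.chronos.autots import AutoTSEstimator    # import path assumed

init_orca_context(cores=4)  # step 0: start the distributed backend

# toy data split into train/val/test TSDatasets
df = pd.DataFrame({"timestamp": pd.date_range("2022-01-01", periods=1000, freq="H"),
                   "value": range(1000)})
tsdata_train, tsdata_val, tsdata_test = TSDataset.from_pandas(
    df, dt_col="timestamp", target_col="value",
    with_split=True, val_ratio=0.1, test_ratio=0.1)

# steps 1 & 2: create an AutoTSEstimator and fit it to get a TSPipeline
autoest = AutoTSEstimator(model="tcn",
                          past_seq_len=hp.randint(12, 48),  # placeholder search space
                          future_seq_len=4)
tspipeline = autoest.fit(data=tsdata_train, validation_data=tsdata_val)

# step 3: use the returned TSPipeline for further development
pred = tspipeline.predict(tsdata_test)

stop_orca_context()
```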
|
||||
|
||||
|
|
@ -147,7 +147,7 @@ Detailed information please refer to [TSPipeline API doc](../../PythonAPI/Chrono
|
|||
|
||||
```eval_rst
|
||||
.. note::
|
||||
`init_orca_context` is not needed if you just use the trained TSPipeline for inference, evaluation or incremental fitting.
|
||||
``init_orca_context`` is not needed if you just use the trained TSPipeline for inference, evaluation or incremental fitting.
|
||||
```
|
||||
```eval_rst
|
||||
.. note::
|
||||
|
|
@ -199,7 +199,7 @@ View Network Traffic multivariate multistep Prediction [notebook][network_traffi
|
|||
```eval_rst
|
||||
.. note::
|
||||
**Additional Dependencies**:
|
||||
You need to install `bigdl-nano[tensorflow]` to enable this built-in model.
|
||||
You need to install ``bigdl-nano[tensorflow]`` to enable this built-in model.
|
||||
|
||||
``pip install bigdl-nano[tensorflow]``
|
||||
```
|
||||
|
|
@ -221,7 +221,7 @@ View High-dimensional Electricity Data Forecasting [example][run_electricity] an
|
|||
```eval_rst
|
||||
.. note::
|
||||
**Additional Dependencies**:
|
||||
You need to install `pmdarima` to enable this built-in model.
|
||||
You need to install ``pmdarima`` to enable this built-in model.
|
||||
|
||||
``pip install pmdarima==1.8.5``
|
||||
```
|
||||
|
|
|
|||
|
|
@ -1,59 +1,30 @@
|
|||
# Chronos User Guide
|
||||
|
||||
### **1. Overview**
|
||||
_BigDL-Chronos_ (_Chronos_ for short) is an application framework for building a fast, accurate and scalable time series analysis application.
|
||||
|
||||
You can use _Chronos_ to:
|
||||
|
||||
```eval_rst
|
||||
.. grid:: 3
|
||||
:gutter: 1
|
||||
|
||||
.. grid-item-card::
|
||||
:class-footer: sd-bg-light
|
||||
|
||||
**Forecasting**
|
||||
^^^
|
||||
|
||||
.. image:: ../Image/forecasting.svg
|
||||
:width: 200
|
||||
:alt: Alternative text
|
||||
|
||||
+++
|
||||
|
||||
Predict future using history data.
|
||||
|
||||
.. grid-item-card::
|
||||
:class-footer: sd-bg-light
|
||||
|
||||
**Anomaly Detection**
|
||||
^^^
|
||||
|
||||
.. image:: ../Image/anomaly_detection.svg
|
||||
:width: 200
|
||||
:alt: Alternative text
|
||||
|
||||
+++
|
||||
|
||||
Discover unexpected items in data.
|
||||
|
||||
.. grid-item-card::
|
||||
:class-footer: sd-bg-light
|
||||
|
||||
**Simulation**
|
||||
^^^
|
||||
|
||||
.. image:: ../Image/simulation.svg
|
||||
:width: 200
|
||||
:alt: Alternative text
|
||||
|
||||
+++
|
||||
|
||||
Generate similar data as history data.
|
||||
```
|
||||
# Chronos Installation
|
||||
|
||||
---
|
||||
### **2. Install**
|
||||
|
||||
#### **OS and Python version requirement**
|
||||
|
||||
|
||||
```eval_rst
|
||||
.. note::
|
||||
**Supported OS**:
|
||||
|
||||
Chronos is thoroughly tested on Ubuntu (16.04/18.04/20.04), and should work fine on CentOS. If you are a Windows user, the most convenient way to use Chronos on a Windows laptop is probably through WSL2 (see https://docs.microsoft.com/en-us/windows/wsl/setup/environment) or an Ubuntu virtual machine.
|
||||
```
|
||||
```eval_rst
|
||||
.. note::
|
||||
**Supported Python Version**:
|
||||
|
||||
Chronos only supports Python 3.7.2 ~ latest 3.7.x. We are validating more Python versions.
|
||||
```
|
||||
|
||||
|
||||
|
||||
#### **Install using Conda**
|
||||
|
||||
We recommend using conda to manage the Chronos python environment. For more information about Conda, refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
|
||||
Select your preferences in the panel below to find the proper install command. Then run the install command as the example shown below.
|
||||
|
||||
|
||||
```eval_rst
|
||||
.. raw:: html
|
||||
|
|
@ -136,92 +107,17 @@ You can use _Chronos_ to:
|
|||
|
||||
</br>
|
||||
|
||||
#### **2.1 Pypi**
|
||||
When you install `bigdl-chronos` from PyPI, we recommend installing it within a conda virtual environment. To install Conda, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
|
||||
|
||||
```bash
|
||||
# create a conda environment for chronos
|
||||
conda create -n my_env python=3.7 setuptools=58.0.4
|
||||
conda activate my_env
|
||||
# click the installation panel above to find which installation option to use
|
||||
pip install --pre --upgrade bigdl-chronos[pytorch] # or other options you may want to use
|
||||
|
||||
# select your preference in above panel to find the proper command to replace the below command, e.g.
|
||||
pip install --pre --upgrade bigdl-chronos[pytorch]
|
||||
|
||||
# init bigdl-nano to enable local accelerations
|
||||
source bigdl-nano-init # accelerate the conda env
|
||||
```
|
||||
|
||||
#### **2.2 OS and Python version requirement**
|
||||
|
||||
```eval_rst
|
||||
.. note::
|
||||
**Supported OS**:
|
||||
|
||||
Chronos is thoroughly tested on Ubuntu (16.04/18.04/20.04), and should work fine on CentOS. If you are a Windows user, the most convenient way to use Chronos on a Windows laptop is probably through WSL2 (see https://docs.microsoft.com/en-us/windows/wsl/setup/environment) or an Ubuntu virtual machine.
|
||||
```
|
||||
```eval_rst
|
||||
.. note::
|
||||
**Supported Python Version**:
|
||||
|
||||
Chronos only supports Python 3.7.2 ~ latest 3.7.x. We are validating more Python versions.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
|
||||
### **3. Which document to see?**
|
||||
|
||||
```eval_rst
|
||||
.. grid:: 2
|
||||
:gutter: 1
|
||||
|
||||
.. grid-item-card::
|
||||
:class-footer: sd-bg-light
|
||||
|
||||
**Quick Tour**
|
||||
^^^
|
||||
|
||||
You may understand the basic usage of Chronos' components and learn to write the first runnable application in this quick tour page.
|
||||
|
||||
+++
|
||||
`Quick Tour <./quick-tour.html>`_
|
||||
|
||||
.. grid-item-card::
|
||||
:class-footer: sd-bg-light
|
||||
|
||||
**User Guides**
|
||||
^^^
|
||||
|
||||
Our user guides provide you with in-depth information, concepts and knowledge about Chronos.
|
||||
|
||||
+++
|
||||
|
||||
`Data <./data_processing_feature_engineering.html>`_ /
|
||||
`Forecast <./forecasting.html>`_ /
|
||||
`Detect <./anomaly_detection.html>`_ /
|
||||
`Simulate <./simulation.html>`_
|
||||
|
||||
.. grid:: 2
|
||||
:gutter: 1
|
||||
|
||||
.. grid-item-card::
|
||||
:class-footer: sd-bg-light
|
||||
|
||||
**How-to-Guide** / **Example**
|
||||
^^^
|
||||
|
||||
If you meet specific problems during usage, the how-to guides are a good place to check.
Examples provide short, high-quality use cases that you can emulate in your own work.
|
||||
|
||||
+++
|
||||
|
||||
`How-to-Guide <../Howto/index.html>`_ / `Example <../QuickStart/index.html>`_
|
||||
|
||||
.. grid-item-card::
|
||||
:class-footer: sd-bg-light
|
||||
|
||||
**API Document**
|
||||
^^^
|
||||
|
||||
API Document provides you with a detailed description of the Chronos APIs.
|
||||
|
||||
+++
|
||||
|
||||
`API Document <../../PythonAPI/Chronos/index.html>`_
|
||||
|
||||
```
|
||||
|
|
@ -1,15 +1,11 @@
|
|||
Chronos Quick Tour
|
||||
======================
|
||||
=================================
|
||||
Welcome to Chronos for building a fast, accurate and scalable time series analysis application🎉! Start with our quick tour to understand some critical concepts and how to use them to tackle your tasks.
|
||||
|
||||
.. grid:: 1 1 1 1
|
||||
|
||||
.. grid-item-card::
|
||||
:text-align: center
|
||||
:shadow: none
|
||||
:class-header: sd-bg-light
|
||||
:class-footer: sd-bg-light
|
||||
:class-card: sd-mb-2
|
||||
|
||||
**Data processing**
|
||||
^^^
|
||||
|
|
@ -22,13 +18,11 @@ Welcome to Chronos for building a fast, accurate and scalable time series analys
|
|||
|
||||
Get Started
|
||||
|
||||
.. grid:: 1 1 3 3
|
||||
.. grid:: 1 3 3 3
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
:text-align: center
|
||||
:shadow: none
|
||||
:class-header: sd-bg-light
|
||||
:class-footer: sd-bg-light
|
||||
:class-card: sd-mb-2
|
||||
|
||||
**Forecasting**
|
||||
|
|
@ -44,9 +38,6 @@ Welcome to Chronos for building a fast, accurate and scalable time series analys
|
|||
|
||||
.. grid-item-card::
|
||||
:text-align: center
|
||||
:shadow: none
|
||||
:class-header: sd-bg-light
|
||||
:class-footer: sd-bg-light
|
||||
:class-card: sd-mb-2
|
||||
|
||||
**Anomaly Detection**
|
||||
|
|
@ -62,9 +53,6 @@ Welcome to Chronos for building a fast, accurate and scalable time series analys
|
|||
|
||||
.. grid-item-card::
|
||||
:text-align: center
|
||||
:shadow: none
|
||||
:class-header: sd-bg-light
|
||||
:class-footer: sd-bg-light
|
||||
:class-card: sd-mb-2
|
||||
|
||||
**Simulation**
|
||||
|
|
@ -104,7 +92,7 @@ In Chronos, we provide a ``TSDataset`` (and a ``XShardsTSDataset`` to handle lar
|
|||
|
||||
|
||||
.. grid:: 2
|
||||
:gutter: 1
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
|
|
@ -192,7 +180,7 @@ For time series forecasting, we also provide an ``AutoTSEstimator`` for distribu
|
|||
stop_orca_context()
|
||||
|
||||
.. grid:: 3
|
||||
:gutter: 1
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
|
|
@ -246,7 +234,7 @@ To import a specific detector, you may use {algorithm name} + "Detector", and ca
|
|||
anomaly_indexes = detector.anomaly_indexes()
|
||||
|
||||
.. grid:: 3
|
||||
:gutter: 1
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
|
|
@ -280,7 +268,7 @@ Simulator(experimental)
|
|||
Simulator is still under activate development with unstable API.
|
||||
|
||||
.. grid:: 2
|
||||
:gutter: 1
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
|
|
|
|||
|
|
@ -1,10 +1,10 @@
|
|||
# Generate Synthetic Sequential Data Overview
|
||||
# Synthetic Data Generation
|
||||
|
||||
Chronos provides simulators to generate synthetic time series data for users who want to conquer limited data access in a deep learning/machine learning project or only want to generate some synthetic data to play with.
|
||||
|
||||
```eval_rst
|
||||
.. note::
|
||||
DPGANSimulator is the only simulator chronos provides at the moment, more simulators are on their way.
|
||||
``DPGANSimulator`` is the only simulator chronos provides at the moment, more simulators are on their way.
|
||||
```
|
||||
|
||||
## **1. DPGANSimulator**
|
||||
|
|
|
|||
|
|
@ -1,4 +1,4 @@
|
|||
# Speed up Chronos built-in models/customized time-series models
|
||||
# Accelerated Training and Inference
|
||||
|
||||
Chronos provides transparent acceleration for Chronos built-in models and customized time-series models. In this deep-dive page, we will introduce how to enable/disable them.
|
||||
|
||||
|
|
@ -80,7 +80,7 @@ Typically, throughput and latency is a trade-off pair. We have three optimizatio
|
|||
.. note::
|
||||
**Additional Dependencies**:
|
||||
|
||||
You need to install `neural-compressor` to enable quantization related methods.
|
||||
You need to install ``neural-compressor`` to enable quantization related methods.
|
||||
|
||||
``pip install neural-compressor==1.8.1``
|
||||
```
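
As a purely illustrative sketch of the accelerated-inference path discussed on this page: `predict_with_onnx` is the forecaster method referenced elsewhere in this document, while `TCNForecaster`, its argument values and the random data below are placeholders chosen for the example.

```python
# Sketch only: fit a built-in forecaster, then run onnxruntime-backed inference.
# The forecaster choice, argument values and random data are placeholders.
import numpy as np
from bigdl.chronos.forecaster import TCNForecaster

x = np.random.randn(100, 24, 1).astype(np.float32)  # (samples, lookback, features)
y = np.random.randn(100, 1, 1).astype(np.float32)   # (samples, horizon, targets)

forecaster = TCNForecaster(past_seq_len=24, future_seq_len=1,
                           input_feature_num=1, output_feature_num=1)
forecaster.fit((x, y), epochs=1)
pred = forecaster.predict_with_onnx(x)  # onnxruntime inference (single node only)
```

Quantization-related methods additionally require the `neural-compressor` install shown above.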

@@ -1,56 +1,7 @@
# Useful Functionalities Overview
# Distributed Processing

#### **1. AutoML Visualization**

AutoML visualization provides two kinds of visualization. You may use them while fitting on auto models or the AutoTS pipeline.
* During the searching process, the visualizations of each trial are shown and updated every 30 seconds. (Monitor view)
* After the searching process, a leaderboard of each trial's configs and metrics is shown. (Leaderboard view)

**Note**: AutoML visualization is based on tensorboard and tensorboardx. They should be installed properly before the training starts.

<span id="monitor_view">**Monitor view**</span>

Before training, start the tensorboard server through

```bash
tensorboard --logdir=<logs_dir>/<name>
```

`logs_dir` is the log directory you set for your predictor (e.g. `AutoTSEstimator`, `AutoTCN`, etc.). `name` is the name parameter you set for your predictor.

The data in the SCALARS tag will be updated every 30 seconds for users to see the training progress.



After training, start the tensorboard server through

```bash
tensorboard --logdir=<logs_dir>/<name>_leaderboard/
```

where `logs_dir` and `name` are the same as stated in [Monitor view](#monitor_view).

A dashboard of each trial's configs and metrics is shown in the SCALARS tag.



A leaderboard of each trial's configs and metrics is shown in the HPARAMS tag.



**Use visualization in Jupyter Notebook**

You can enable a tensorboard view in jupyter notebook by the following code.

```python
%load_ext tensorboard
# for scalar view
%tensorboard --logdir <logs_dir>/<name>/
# for leaderboard view
%tensorboard --logdir <logs_dir>/<name>_leaderboard/
```

#### **2. Distributed training**
#### **Distributed training**
LSTM, TCN and Seq2seq users can easily train their forecasters in a distributed fashion to **handle an extra large dataset and utilize a cluster**. The functionality is powered by Project Orca.
```python
f = Forecaster(..., distributed=True)

@@ -59,10 +10,10 @@ f.predict(...)
f.to_local()  # collect the forecaster to a single node
f.predict_with_onnx(...)  # onnxruntime only supports a single node
```
#### **3. XShardsTSDataset**
#### **Distributed Data processing: XShardsTSDataset**
```eval_rst
.. warning::
    `XShardsTSDataset` is still experimental.
    ``XShardsTSDataset`` is still experimental.
```
`TSDataset` is a single-thread lib with reasonable speed on large datasets (~10G). When you handle an extra large dataset or have limited memory on a single node, `XShardsTSDataset` can be used to provide exactly the same functionality and usage as `TSDataset` in a distributed fashion.
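
A rough sketch of the distributed flow this paragraph describes. The import paths and the `from_xshards`-style constructor below are assumptions made for illustration, not verbatim Chronos API; only the overall shape (XShards in, TSDataset-like operations out) follows the text above.

```python
# Hypothetical sketch: import paths, constructor and column names are assumptions.
from bigdl.orca.data.pandas import read_csv                  # XShards of pandas DataFrames
from bigdl.chronos.data.experimental import XShardsTSDataset

shards = read_csv("my_large_dataset_*.csv")                  # partitioned across the cluster
tsdata = XShardsTSDataset.from_xshards(
    shards, dt_col="timestamp", target_col="value", id_col="id")
tsdata.roll(lookback=24, horizon=1)                          # same usage as TSDataset
```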

@@ -0,0 +1,49 @@
# AutoML Visualization

AutoML visualization provides two kinds of visualization. You may use them while fitting on auto models or the AutoTS pipeline.
* During the searching process, the visualizations of each trial are shown and updated every 30 seconds. (Monitor view)
* After the searching process, a leaderboard of each trial's configs and metrics is shown. (Leaderboard view)

**Note**: AutoML visualization is based on tensorboard and tensorboardx. They should be installed properly before the training starts.

<span id="monitor_view">**Monitor view**</span>

Before training, start the tensorboard server through

```bash
tensorboard --logdir=<logs_dir>/<name>
```

`logs_dir` is the log directory you set for your predictor (e.g. `AutoTSEstimator`, `AutoTCN`, etc.). `name` is the name parameter you set for your predictor.

The data in the SCALARS tag will be updated every 30 seconds for users to see the training progress.



After training, start the tensorboard server through

```bash
tensorboard --logdir=<logs_dir>/<name>_leaderboard/
```

where `logs_dir` and `name` are the same as stated in [Monitor view](#monitor_view).

A dashboard of each trial's configs and metrics is shown in the SCALARS tag.



A leaderboard of each trial's configs and metrics is shown in the HPARAMS tag.



**Use visualization in Jupyter Notebook**

You can enable a tensorboard view in jupyter notebook by the following code.

```python
%load_ext tensorboard
# for scalar view
%tensorboard --logdir <logs_dir>/<name>/
# for leaderboard view
%tensorboard --logdir <logs_dir>/<name>_leaderboard/
```

@@ -8,7 +8,7 @@

**In this guide we will demonstrate how to use _Chronos TSDataset_ and _Chronos Forecaster_ for time series processing and forecasting in 4 simple steps.**

### **Step 0: Prepare Environment**
### Step 0: Prepare Environment

We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/chronos.html#install) for more details.

docs/readthedocs/source/doc/Chronos/index.rst (new file)
@@ -0,0 +1,89 @@
BigDL-Chronos
========================

**BigDL-Chronos** (**Chronos** for short) is an application framework for building a fast, accurate and scalable time series analysis application.

You can use **Chronos** for:

.. grid:: 1 3 3 3

    .. grid-item::

        .. image:: ./Image/forecasting.svg
            :alt: Forecasting example diagram

        **Forecasting:** Predict the future using history data.

    .. grid-item::

        .. image:: ./Image/anomaly_detection.svg
            :alt: Anomaly Detection example diagram

        **Anomaly Detection:** Discover unexpected items in data.

    .. grid-item::

        .. image:: ./Image/simulation.svg
            :alt: Simulation example diagram

        **Simulation:** Generate data similar to the history data.

-------

.. grid:: 1 2 2 2
    :gutter: 2

    .. grid-item-card::

        **Get Started**
        ^^^

        You may understand the basic usage of Chronos' components and learn to write your first runnable application in this quick tour page.

        +++
        :bdg-link:`Chronos in 5 minutes <./Overview/quick-tour.html>` |
        :bdg-link:`Installation <./Overview/install.html>`

    .. grid-item-card::

        **Key Features Guide**
        ^^^

        Our user guides provide you with in-depth information, concepts and knowledge about Chronos.

        +++
        :bdg-link:`Data <./Overview/data_processing_feature_engineering.html>` |
        :bdg-link:`Forecast <./Overview/forecasting.html>` |
        :bdg-link:`Detect <./Overview/anomaly_detection.html>` |
        :bdg-link:`Simulate <./Overview/simulation.html>`

    .. grid-item-card::

        **How-to-Guide** / **Tutorials**
        ^^^

        If you run into specific problems during usage, the how-to guides are a good place to check.
        Examples provide short, high-quality use cases that you can emulate in your own work.

        +++
        :bdg-link:`How-to-Guide <./Howto/index.html>` | :bdg-link:`Example <./QuickStart/index.html>`

    .. grid-item-card::

        **API Document**
        ^^^

        API Document provides you with a detailed description of the Chronos APIs.

        +++
        :bdg-link:`API Document <../PythonAPI/Chronos/index.html>`

.. toctree::
    :hidden:

    BigDL-Chronos Document <self>
BIN docs/readthedocs/source/doc/DLlib/Image/tensorboard-histo1.png (new file, 170 KiB)
BIN docs/readthedocs/source/doc/DLlib/Image/tensorboard-histo2.png (new file, 143 KiB)
BIN docs/readthedocs/source/doc/DLlib/Image/tensorboard-scalar.png (new file, 86 KiB)
@@ -1,6 +1,6 @@
# DLlib User Guide
# DLlib in 5 minutes

## 1. Overview
## Overview

DLlib is a distributed deep learning library for Apache Spark; with DLlib, users can write their deep learning applications as standard Spark programs (using either Scala or Python APIs).

@@ -9,36 +9,30 @@ It includes the functionalities of the [original BigDL](https://github.com/intel
* [Keras-like API](keras-api.md)
* [Spark ML pipeline support](nnframes.md)

## 2. Scala user guide
### 2.1 Install and Run
Please refer to the [scala guide](../../UserGuide/scala.md) for details.

---

### 2.2 Get started
---
## Scala Example

This section shows a single example of how to use DLlib to build a deep learning application on Spark, using Keras APIs.

---
#### **LeNet Model on MNIST using Keras-Style API**

This tutorial is an explanation of what is happening in the [lenet](https://github.com/intel-analytics/BigDL/tree/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/keras) example.

A bigdl-dllib program starts with initialization as follows.
````scala
val conf = Engine.createSparkConf()
    .setAppName("Train Lenet on MNIST")
    .set("spark.task.maxFailures", "1")
val sc = new SparkContext(conf)
Engine.init
````

After the initialization, we need to:

1. Load train and validation data by _**creating the [```DataSet```](https://github.com/intel-analytics/BigDL/blob/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/feature/dataset/DataSet.scala)**_ (e.g., ````SampleToGreyImg````, ````GreyImgNormalizer```` and ````GreyImgToBatch````):

    ````scala
    val trainSet = (if (sc.isDefined) {
        DataSet.array(load(trainData, trainLabel), sc.get, param.nodeNumber)
    } else {

@@ -49,10 +43,10 @@ After the initialization, we need to:
    val validationSet = DataSet.array(load(validationData, validationLabel), sc) ->
        BytesToGreyImg(28, 28) -> GreyImgNormalizer(testMean, testStd) -> GreyImgToBatch(
        param.batchSize)
    ````

2. We then define the LeNet model using the Keras-style API:
    ````scala
    val input = Input(inputShape = Shape(28, 28, 1))
    val reshape = Reshape(Array(1, 28, 28)).inputs(input)
    val conv1 = Convolution2D(6, 5, 5, activation = "tanh").setName("conv1_5x5").inputs(reshape)

@@ -66,75 +60,22 @@ After the initialization, we need to:
    ````

3. After that, we configure the learning process. Set the ````optimization method```` and the ````Criterion```` (which, given input and target, computes gradient per given loss function):
    ````scala
    model.compile(optimizer = optimMethod,
        loss = ClassNLLCriterion[Float](logProbAsInput = false),
        metrics = Array(new Top1Accuracy[Float](), new Top5Accuracy[Float](), new Loss[Float]))
    ````

Finally we _**train the model**_ by calling ````model.fit````:
````scala
model.fit(trainSet, nbEpoch = param.maxEpoch, validationData = validationSet)
````

---

## 3. Python user guide
## Python Example

### 3.1 Install

#### 3.1.1 Official Release

Run the command below to install _bigdl-dllib_.

```bash
conda create -n my_env python=3.7
conda activate my_env
pip install bigdl-dllib
```

#### 3.1.2 Nightly build

You can install the latest nightly build of bigdl-dllib as follows:
```bash
pip install --pre --upgrade bigdl-dllib
```

### 3.2 Run

#### **3.2.1 Interactive Shell**

You may test if the installation is successful using the interactive Python shell as follows:

* Type `python` in the command line to start a REPL.
* Try to run the example code below to verify the installation:

```python
from bigdl.dllib.utils.nncontext import *

sc = init_nncontext()  # Initiation of bigdl-dllib on the underlying cluster.
```

#### **3.2.2 Jupyter Notebook**

You can start the Jupyter notebook as you normally do using the following command and run bigdl-dllib programs directly in a Jupyter notebook:

```bash
jupyter notebook --notebook-dir=./ --ip=* --no-browser
```

#### **3.2.3 Python Script**

You can directly write bigdl-dllib programs in a Python file (e.g. script.py) and run it in the command line as a normal Python program:

```bash
python script.py
```
---
### 3.3 Get started

#### **NN Context**
#### **Initialize NN Context**

`NNContext` is the main entry for provisioning the dllib program on the underlying cluster (such as a K8s or Hadoop cluster), or just on a single laptop.

@@ -158,15 +99,15 @@ This tutorial describes the [Autograd](https://github.com/intel-analytics/BigDL/
The example first does the initialization using `init_nncontext()`:
```python
sc = init_nncontext()
```

It then generates the input data X_, Y_:

```python
data_len = 1000
X_ = np.random.uniform(0, 1, (1000, 2))
Y_ = ((2 * X_).sum(1) + 0.4).reshape([data_len, 1])
```

It then defines the custom loss:

@@ -179,18 +120,18 @@ def mean_absolute_error(y_true, y_pred):

After that, the example creates the model as follows and sets the criterion as the custom loss:
```python
a = Input(shape=(2,))
b = Dense(1)(a)
c = Lambda(function=add_one_func)(b)
model = Model(input=a, output=c)

model.compile(optimizer=SGD(learningrate=1e-2),
              loss=mean_absolute_error)
```
Finally the example trains the model by calling `model.fit`:

```python
model.fit(x=X_,
          y=Y_,
          batch_size=32,
          nb_epoch=int(options.nb_epoch),
docs/readthedocs/source/doc/DLlib/Overview/index.rst (new file)
@@ -0,0 +1,6 @@
DLLib Key Features
================================

* `Keras-like API <keras-api.html>`_
* `Spark ML Pipeline Support <nnframes.html>`_
* `Visualization <visualization.html>`_
docs/readthedocs/source/doc/DLlib/Overview/install.md (new file)
@@ -0,0 +1,41 @@
# Installation

## Scala

Refer to the [BigDL Install guide for Scala](../../UserGuide/scala.md).

## Python

### Install a Stable Release

Run the command below to install _bigdl-dllib_.

```bash
conda create -n my_env python=3.7
conda activate my_env
pip install bigdl-dllib
```

### Install the Nightly Build Version

You can install the latest nightly build of bigdl-dllib as follows:
```bash
pip install --pre --upgrade bigdl-dllib
```

### Verify your install

You may verify that the installation is successful using the interactive Python shell as follows:

* Type `python` in the command line to start a REPL.
* Try to run the example code below to verify the installation:

```python
from bigdl.dllib.utils.nncontext import *

sc = init_nncontext()  # Initiation of bigdl-dllib on the underlying cluster.
```

@@ -1,216 +0,0 @@
# Python DLLib Getting Start Guide

@@ -1,301 +0,0 @@
# DLLib Getting Start Guide

## 1. Creating dev environment

#### Scala project (maven & sbt)

- **Maven**

To use BigDL DLLib to build your own deep learning application, you can use maven to create your project and add bigdl-dllib to your dependency. Please add the code below to your pom.xml to add BigDL DLLib as your dependency:
```
<dependency>
    <groupId>com.intel.analytics.bigdl</groupId>
    <artifactId>bigdl-dllib-spark_2.4.6</artifactId>
    <version>0.14.0</version>
</dependency>
```

- **SBT**
```
libraryDependencies += "com.intel.analytics.bigdl" % "bigdl-dllib-spark_2.4.6" % "0.14.0"
```
For more information about how to add the BigDL dependency, please refer to https://bigdl.readthedocs.io/en/latest/doc/UserGuide/scala.html#build-a-scala-project

#### IDE (IntelliJ)
Open up IntelliJ and click File => Open

Navigate to your project. If you have added BigDL DLLib as a dependency in your pom.xml, the IDE will automatically download it from maven and you are able to run your application.

For more details about how to set up an IDE for a BigDL project, please refer to https://bigdl-project.github.io/master/#ScalaUserGuide/install-build-src/#setup-ide

## 2. Code initialization
```NNContext``` is the main entry for provisioning the dllib program on the underlying cluster (such as a K8s or Hadoop cluster), or just on a single laptop.

It is recommended to initialize `NNContext` at the beginning of your program:
```
import com.intel.analytics.bigdl.dllib.NNContext
import com.intel.analytics.bigdl.dllib.keras.Model
import com.intel.analytics.bigdl.dllib.keras.models.Models
import com.intel.analytics.bigdl.dllib.keras.optimizers.Adam
import com.intel.analytics.bigdl.dllib.nn.ClassNLLCriterion
import com.intel.analytics.bigdl.dllib.utils.Shape
import com.intel.analytics.bigdl.dllib.keras.layers._
import com.intel.analytics.bigdl.numeric.NumericFloat
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.DoubleType

val sc = NNContext.initNNContext("dllib_demo")
```
For more information about ```NNContext```, please refer to [NNContext](https://bigdl.readthedocs.io/en/latest/doc/DLlib/Overview/dllib.html#nn-context)

## 3. Distributed Data Loading

#### Using Spark Dataframe APIs
DLlib supports Spark Dataframes as the input to the distributed training, and as the input/output of the distributed inference. Consequently, the user can easily process a large-scale dataset using Apache Spark, and directly apply AI models on the distributed (and possibly in-memory) Dataframes without data conversion or serialization.

We create a Spark session so we can use the Spark API to load and process the data:
```
val spark = new SQLContext(sc)
```

1. We can use the Spark API to load the data into a Spark DataFrame, e.g. read a csv file into a Spark DataFrame:
```
val path = "pima-indians-diabetes.data.csv"
val df = spark.read.options(Map("inferSchema"->"true","delimiter"->",")).csv(path)
    .toDF("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age", "class")
```

If the feature column for the model is a Spark ML Vector, please assemble the related columns into a Vector and pass it to the model, e.g.:
```
val assembler = new VectorAssembler()
    .setInputCols(Array("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age"))
    .setOutputCol("features")
val assembleredDF = assembler.transform(df)
val df2 = assembleredDF.withColumn("label",col("class").cast(DoubleType) + lit(1))
```

2. If the training data is images, we can use the DLLib api to load the images into a Spark DataFrame, e.g.:
```
val createLabel = udf { row: Row =>
    if (new Path(row.getString(0)).getName.contains("cat")) 1 else 2
}
val imagePath = "cats_dogs/"
val imgDF = NNImageReader.readImages(imagePath, sc)
```

It will load the images and generate feature tensors automatically. We also need to generate the labels ourselves, e.g.:
```
val df = imgDF.withColumn("label", createLabel(col("image")))
```

Then split the Spark DataFrame into a training part and a validation part:
```
val Array(trainDF, valDF) = df.randomSplit(Array(0.8, 0.2))
```

## 4. Model Definition

#### Using Keras-like APIs

To define a model, you can use the [Keras Style API](https://bigdl.readthedocs.io/en/latest/doc/DLlib/Overview/keras-api.html).
```
val x1 = Input(Shape(8))
val dense1 = Dense(12, activation="relu").inputs(x1)
val dense2 = Dense(8, activation="relu").inputs(dense1)
val dense3 = Dense(2).inputs(dense2)
val dmodel = Model(x1, dense3)
```

After creating the model, you will have to decide which loss function to use in training.

Now you can use the `compile` function of the model to set the loss function and optimization method.
```
dmodel.compile(optimizer = new Adam(), loss = ClassNLLCriterion())
```

Now the model is built and ready to train.

## 5. Distributed Model Training
Now you can use `fit` to begin the training; please set the label columns. Model evaluation can be performed periodically during training.
1. If the dataframe is generated using Spark apis, you also need to set the feature columns, e.g.:
```
model.fit(x=trainDF, batchSize=4, nbEpoch = 2,
    featureCols = Array("feature1"), labelCols = Array("label"), valX=valDF)
```
Note: the above model accepts a single input (column `feature1`) and a single output (column `label`).

If your model accepts multiple inputs (e.g. columns `f1`, `f2`, `f3`), please set the features as below:
```
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
    featureCols = Array("f1", "f2", "f3"), labelCols = Array("label"))
```

Similarly, if the model accepts multiple outputs (e.g. columns `label1`, `label2`), please set the label columns as below:
```
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
    featureCols = Array("f1", "f2", "f3"), labelCols = Array("label1", "label2"))
```

2. If the dataframe is generated using DLLib `NNImageReader`, we don't need to set `featureCols`; we can set `transform` to configure how to process the images before training, e.g.:
```
val transformers = transforms.Compose(Array(ImageResize(50, 50),
    ImageMirror()))
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
    labelCols = Array("label"), transform = transformers)
```
For more details about how to use the DLLib keras api to train image data, you may want to refer to [ImageClassification](https://github.com/intel-analytics/BigDL/blob/main/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/keras/ImageClassification.scala)

## 6. Model saving and loading
When training is finished, you may need to save the final model for later use.

BigDL allows you to save your BigDL model on the local filesystem, HDFS, or Amazon S3.
- **save**
```
val modelPath = "/tmp/demo/keras.model"
dmodel.saveModel(modelPath)
```

- **load**
```
val loadModel = Models.loadModel(modelPath)

val preDF2 = loadModel.predict(valDF, featureCols = Array("features"), predictionCol = "predict")
```

You may want to refer to [Save/Load](https://bigdl.readthedocs.io/en/latest/doc/DLlib/Overview/keras-api.html#save)

## 7. Distributed evaluation and inference
After training finishes, you can then use the trained model for prediction or evaluation.

- **inference**
1. For a dataframe generated by the Spark API, please set `featureCols`:
```
dmodel.predict(trainDF, featureCols = Array("features"), predictionCol = "predict")
```
2. For a dataframe generated by `NNImageReader`, there is no need to set `featureCols` and you can set `transform` if needed:
```
model.predict(imgDF, predictionCol = "predict", transform = transformers)
```

- **evaluation**
Similarly, for a dataframe generated by the Spark API, the code is as below:
```
dmodel.evaluate(trainDF, batchSize = 4, featureCols = Array("features"),
    labelCols = Array("label"))
```

For a dataframe generated by `NNImageReader`:
```
model.evaluate(imgDF, batchSize = 1, labelCols = Array("label"), transform = transformers)
```

## 8. Checkpointing and resuming training
You can configure periodically taking snapshots of the model.
```
val cpPath = "/tmp/demo/cp"
dmodel.setCheckpoint(cpPath, overWrite=false)
```
You can also set ```overWrite``` to ```true``` to enable overwriting any existing snapshot files.

After training stops, you can resume from any saved point. Choose one of the model snapshots to resume (saved in the checkpoint path; for details see Checkpointing). Use Models.loadModel to load the model snapshot into a model object.
```
val loadModel = Models.loadModel(path)
```

## 9. Monitor your training

- **Tensorboard**

BigDL provides a convenient way to monitor/visualize your training progress. It writes the statistics collected during training/validation. The saved summary can be viewed via TensorBoard.

In order to take effect, it needs to be called before fit.
```
dmodel.setTensorBoard("./", "dllib_demo")
```
For more details, please refer to [visualization](visualization.md)

## 10. Transfer learning and finetuning

- **freeze and trainable**
BigDL DLLib supports excluding some layers of the model from training.
```
dmodel.freeze(layer_names)
```
Layers that match the given names will be frozen. If a layer is frozen, its parameters (weight/bias, if they exist) are not changed during the training process.

BigDL DLLib also supports unFreeze operations. The parameters of the layers that match the given names will be trained (updated) during the training process.
```
dmodel.unFreeze(layer_names)
```
For more information, you may refer to [freeze](freeze.md)

## 11. Hyperparameter tuning
- **optimizer**

DLLib supports a list of optimization methods.
For more details, please refer to [optimization](optim-Methods.md)

- **learning rate scheduler**

DLLib supports a list of learning rate schedulers.
For more details, please refer to [lr_scheduler](learningrate-Scheduler.md)

- **batch size**

DLLib supports setting the batch size during training and prediction. We can adjust the batch size to tune the model's accuracy.

- **regularizer**

DLLib supports a list of regularizers.
For more details, please refer to [regularizer](regularizers.md)

- **clipping**

DLLib supports gradient clipping operations.
For more details, please refer to [gradient_clip](clipping.md)

## 12. Running program
You can run a bigdl-dllib program as a standard Spark program (running on either a local machine or a distributed cluster) as follows:
```
# Spark local mode
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
  --master local[2] \
  --class class_name \
  jar_path

# Spark standalone mode
## ${SPARK_HOME}/sbin/start-master.sh
## check master URL from http://localhost:8080
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
  --master spark://... \
  --executor-cores cores_per_executor \
  --total-executor-cores total_cores_for_the_job \
  --class class_name \
  jar_path

# Spark yarn client mode
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
  --master yarn \
  --deploy-mode client \
  --executor-cores cores_per_executor \
  --num-executors executors_number \
  --class class_name \
  jar_path

# Spark yarn cluster mode
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
  --master yarn \
  --deploy-mode cluster \
  --executor-cores cores_per_executor \
  --num-executors executors_number \
  --class class_name \
  jar_path
```
For more details about how to run a BigDL scala application, please refer to https://bigdl.readthedocs.io/en/latest/doc/UserGuide/scala.html

@@ -31,10 +31,10 @@ After that, navigate to the TensorBoard dashboard using a browser. You can find
* **Visualizations in TensorBoard**

    Within the TensorBoard dashboard, you will be able to read the visualizations of each run, including the “Loss” and “Throughput” curves under the SCALARS tab (as illustrated below):
    
    

    And “weights”, “bias”, “gradientWeights” and “gradientBias” under the DISTRIBUTIONS and HISTOGRAMS tabs (as illustrated below):
    
    
    
    

---

docs/readthedocs/source/doc/DLlib/QuickStart/index.md (new file)
@@ -0,0 +1,9 @@
# DLlib Tutorial

- [**Python Quickstart Notebook**](./python-getting-started.html)

    > [Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/branch-2.0/python/dllib/colab-notebook/dllib_keras_api.ipynb) [View source on GitHub](https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/dllib/colab-notebook/dllib_keras_api.ipynb)

    In this guide we will demonstrate how to use _DLlib keras style api_ and _DLlib NNClassifier_ for classification.

@@ -0,0 +1,218 @@
# DLLib Python Getting Start Guide

## 1. Code initialization
```nncontext``` is the main entry for provisioning the dllib program on the underlying cluster (such as a K8s or Hadoop cluster), or just on a single laptop.

It is recommended to initialize `nncontext` at the beginning of your program:
```
from bigdl.dllib.nncontext import *
sc = init_nncontext()
```
For more information about ```nncontext```, please refer to [nncontext](../Overview/dllib.md#initialize-nn-context)

## 2. Distributed Data Loading

#### Using Spark Dataframe APIs
DLlib supports Spark Dataframes as the input to the distributed training, and as the input/output of the distributed inference. Consequently, the user can easily process a large-scale dataset using Apache Spark, and directly apply AI models on the distributed (and possibly in-memory) Dataframes without data conversion or serialization.

We create a Spark session so we can use the Spark API to load and process the data:
```
spark = SQLContext(sc)
```

1. We can use the Spark API to load the data into a Spark DataFrame, e.g. read a csv file into a Spark DataFrame:
```
path = "pima-indians-diabetes.data.csv"
spark.read.csv(path)
```

If the feature column for the model is a Spark ML Vector, please assemble the related columns into a Vector and pass it to the model, e.g.:
```
from pyspark.ml.feature import VectorAssembler
vecAssembler = VectorAssembler(outputCol="features")
vecAssembler.setInputCols(["num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age"])
assemble_df = vecAssembler.transform(df)
assemble_df.withColumn("label", col("class").cast(DoubleType) + lit(1))
```

2. If the training data is images, we can use the DLLib api to load the images into a Spark DataFrame, e.g.:
```
imgPath = "cats_dogs/"
imageDF = NNImageReader.readImages(imgPath, sc)
```

It will load the images and generate feature tensors automatically. We also need to generate the labels ourselves, e.g.:
```
labelDF = imageDF.withColumn("name", getName(col("image"))) \
    .withColumn("label", getLabel(col('name')))
```
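
`getName` and `getLabel` are left undefined in the snippet above; one possible (hypothetical) way to define them with plain PySpark UDFs could be:

```python
# Hypothetical helpers for the snippet above: derive the file name from the image
# struct's "origin" field and map it to a numeric label (e.g. cats vs. dogs).
import os
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType, DoubleType

getName = udf(lambda image: os.path.basename(image.origin), StringType())
getLabel = udf(lambda name: 1.0 if "cat" in name else 2.0, DoubleType())
```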

Then split the Spark DataFrame into a training part and a validation part:
```
(trainingDF, validationDF) = labelDF.randomSplit([0.9, 0.1])
```

## 3. Model Definition

#### Using Keras-like APIs

To define a model, you can use the [Keras Style API](../Overview/keras-api.md).
```
x1 = Input(shape=[8])
dense1 = Dense(12, activation="relu")(x1)
dense2 = Dense(8, activation="relu")(dense1)
dense3 = Dense(2)(dense2)
dmodel = Model(input=x1, output=dense3)
```

After creating the model, you will have to decide which loss function to use in training.

Now you can use the `compile` function of the model to set the loss function and optimization method.
```
dmodel.compile(optimizer = "adam", loss = "sparse_categorical_crossentropy")
```

Now the model is built and ready to train.

## 4. Distributed Model Training
Now you can use `fit` to begin the training; please set the label columns. Model evaluation can be performed periodically during training.
1. If the dataframe is generated using Spark apis, you also need to set the feature columns, e.g.:
```
model.fit(df, feature_cols=["features"], label_cols=["label"], batch_size=4, nb_epoch=1)
```
Note: the above model accepts a single input (column `features`) and a single output (column `label`).

If your model accepts multiple inputs (e.g. columns `f1`, `f2`, `f3`), please set the features as below:
```
model.fit(df, feature_cols=["f1", "f2"], label_cols=["label"], batch_size=4, nb_epoch=1)
```

Similarly, if the model accepts multiple outputs (e.g. columns `label1`, `label2`), please set the label columns as below:
```
model.fit(df, feature_cols=["features"], label_cols=["l1", "l2"], batch_size=4, nb_epoch=1)
```

2. If the dataframe is generated using DLLib `NNImageReader`, we don't need to set `feature_cols`; we can set `transform` to configure how to process the images before training, e.g.:
```
from bigdl.dllib.feature.image import transforms
transformers = transforms.Compose([ImageResize(50, 50), ImageMirror()])
model.fit(image_df, label_cols=["label"], batch_size=1, nb_epoch=1, transform=transformers)
```
For more details about how to use the DLLib keras api to train image data, you may want to refer to [ImageClassification](https://github.com/intel-analytics/BigDL/tree/main/python/dllib/examples/keras/image_classification.py)

## 5. Model saving and loading
When training is finished, you may need to save the final model for later use.

BigDL allows you to save your BigDL model on the local filesystem, HDFS, or Amazon S3.
- **save**
```
modelPath = "/tmp/demo/keras.model"
dmodel.saveModel(modelPath)
```

- **load**
```
loadModel = Model.loadModel(modelPath)
preDF = loadModel.predict(df, feature_cols=["features"], prediction_col="predict")
```

You may want to refer to [Save/Load](../Overview/keras-api.html#save)

## 6. Distributed evaluation and inference
After training finishes, you can then use the trained model for prediction or evaluation.

- **inference**
1. For a dataframe generated by the Spark API, please set `feature_cols` and `prediction_col`:
```
dmodel.predict(df, feature_cols=["features"], prediction_col="predict")
```
2. For a dataframe generated by `NNImageReader`, please set `prediction_col` and you can set `transform` if needed:
```
model.predict(df, prediction_col="predict", transform=transformers)
```

- **evaluation**

Similarly, for a dataframe generated by the Spark API, the code is as below:
```
dmodel.evaluate(df, batch_size=4, feature_cols=["features"], label_cols=["label"])
```

For a dataframe generated by `NNImageReader`:
```
model.evaluate(image_df, batch_size=1, label_cols=["label"], transform=transformers)
```

## 7. Checkpointing and resuming training
You can configure periodically taking snapshots of the model.
```
cpPath = "/tmp/demo/cp"
dmodel.set_checkpoint(cpPath)
```
You can also set ```over_write``` to ```true``` to enable overwriting any existing snapshot files.

After training stops, you can resume from any saved point. Choose one of the model snapshots to resume (saved in the checkpoint path; for details see Checkpointing). Use Models.loadModel to load the model snapshot into a model object.
```
loadModel = Model.loadModel(path)
```

## 8. Monitor your training

- **Tensorboard**

BigDL provides a convenient way to monitor/visualize your training progress. It writes the statistics collected during training/validation. The saved summary can be viewed via TensorBoard.

In order to take effect, it needs to be called before fit.
```
dmodel.set_tensorboard("./", "dllib_demo")
```
For more details, please refer to [visualization](../Overview/visualization.md)

## 9. Transfer learning and finetuning

- **freeze and trainable**

BigDL DLLib supports excluding some layers of the model from training.
```
dmodel.freeze(layer_names)
```
Layers that match the given names will be frozen. If a layer is frozen, its parameters (weight/bias, if they exist) are not changed during the training process.

BigDL DLLib also supports unFreeze operations. The parameters of the layers that match the given names will be trained (updated) during the training process.
```
dmodel.unFreeze(layer_names)
```
For more information, you may refer to [freeze](../../PythonAPI/DLlib/freeze.md)

## 10. Hyperparameter tuning
- **optimizer**

DLLib supports a list of optimization methods.
For more details, please refer to [optimization](../../PythonAPI/DLlib/optim-Methods.md)

- **learning rate scheduler**

DLLib supports a list of learning rate schedulers.
For more details, please refer to [lr_scheduler](../../PythonAPI/DLlib/learningrate-Scheduler.md)

- **batch size**

DLLib supports setting the batch size during training and prediction. We can adjust the batch size to tune the model's accuracy (a short sketch follows this list).

- **regularizer**

DLLib supports a list of regularizers.
For more details, please refer to [regularizer](../../PythonAPI/DLlib/regularizers.md)

- **clipping**

DLLib supports gradient clipping operations.
For more details, please refer to [gradient_clip](../../PythonAPI/DLlib/clipping.md)
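
A short illustrative sketch of the batch-size knob mentioned in the list above, reusing the `fit` and `evaluate` calls from the earlier sections of this guide (`model` and `df` are the objects built there; the values are placeholders):

```python
# Placeholder sketch: rerun the same training with different batch sizes and compare.
for bs in (16, 32, 64):
    model.fit(df, feature_cols=["features"], label_cols=["label"],
              batch_size=bs, nb_epoch=1)
    model.evaluate(df, batch_size=bs, feature_cols=["features"], label_cols=["label"])
```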
|
||||
|
||||
## 11. Running program
|
||||
```
|
||||
python you_app_code.py
|
||||
```
|
||||
|
|
@ -0,0 +1,303 @@
|
|||
# DLLib Scala Getting Start Guide
|
||||
|
||||
## 1. Creating dev environment
|
||||
|
||||
#### Scala project (maven & sbt)
|
||||
|
||||
- **Maven**
|
||||
|
||||
To use BigDL DLLib to build your own deep learning application, you can use maven to create your project and add bigdl-dllib to your dependency. Please add below code to your pom.xml to add BigDL DLLib as your dependency:
|
||||
```
|
||||
<dependency>
|
||||
<groupId>com.intel.analytics.bigdl</groupId>
|
||||
<artifactId>bigdl-dllib-spark_2.4.6</artifactId>
|
||||
<version>0.14.0</version>
|
||||
</dependency>
|
||||
```
|
||||
|
||||
- **SBT**
|
||||
```
|
||||
libraryDependencies += "com.intel.analytics.bigdl" % "bigdl-dllib-spark_2.4.6" % "0.14.0"
|
||||
```
|
||||
For more information about how to add BigDL dependency, please refer [scala docs](../../UserGuide/scala.md#build-a-scala-project)
|
||||
|
||||
#### IDE (Intelij)
|
||||
Open up IntelliJ and click File => Open
|
||||
|
||||
Navigate to your project. If you have add BigDL DLLib as dependency in your pom.xml.
|
||||
The IDE will automatically download it from maven and you are able to run your application.
|
||||
|
||||
For more details about how to setup IDE for BigDL project, please refer [IDE Setup Guide](../../UserGuide/develop.html#id2)
|
||||
|
||||
|
||||
## 2. Code initialization
|
||||
```NNContext``` is the main entry for provisioning the dllib program on the underlying cluster (such as K8s or Hadoop cluster), or just on a single laptop.
|
||||
|
||||
It is recommended to initialize `NNContext` at the beginning of your program:
|
||||
```
|
||||
import com.intel.analytics.bigdl.dllib.NNContext
|
||||
import com.intel.analytics.bigdl.dllib.keras.Model
|
||||
import com.intel.analytics.bigdl.dllib.keras.models.Models
|
||||
import com.intel.analytics.bigdl.dllib.keras.optimizers.Adam
|
||||
import com.intel.analytics.bigdl.dllib.nn.ClassNLLCriterion
|
||||
import com.intel.analytics.bigdl.dllib.utils.Shape
|
||||
import com.intel.analytics.bigdl.dllib.keras.layers._
|
||||
import com.intel.analytics.bigdl.numeric.NumericFloat
|
||||
import org.apache.spark.ml.feature.VectorAssembler
|
||||
import org.apache.spark.sql.SQLContext
|
||||
import org.apache.spark.sql.functions._
|
||||
import org.apache.spark.sql.types.DoubleType
|
||||
|
||||
val sc = NNContext.initNNContext("dllib_demo")
|
||||
```
|
||||
For more information about ```NNContext```, please refer to [NNContext](../Overview/dllib.md#initialize-nn-context)
|
||||
|
||||
## 3. Distributed Data Loading
|
||||
|
||||
#### Using Spark Dataframe APIs
|
||||
DLlib supports Spark Dataframes as the input to the distributed training, and as
|
||||
the input/output of the distributed inference. Consequently, the user can easily
|
||||
process large-scale dataset using Apache Spark, and directly apply AI models on
|
||||
the distributed (and possibly in-memory) Dataframes without data conversion or serialization
|
||||
|
||||
We create Spark session so we can use Spark API to load and process the data
|
||||
```
|
||||
val spark = new SQLContext(sc)
|
||||
```
|
||||
|
||||
1. We can use Spark API to load the data into Spark DataFrame, eg. read csv file into Spark DataFrame
|
||||
```
|
||||
val path = "pima-indians-diabetes.data.csv"
|
||||
val df = spark.read.options(Map("inferSchema"->"true","delimiter"->",")).csv(path)
|
||||
.toDF("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age", "class")
|
||||
```
|
||||
|
||||
If the model expects the feature column to be a Spark ML Vector, please assemble the related columns into a Vector and pass it to the model, e.g.:
|
||||
```
|
||||
val assembler = new VectorAssembler()
|
||||
.setInputCols(Array("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age"))
|
||||
.setOutputCol("features")
|
||||
val assembledDF = assembler.transform(df)
val df2 = assembledDF.withColumn("label", col("class").cast(DoubleType) + lit(1))
|
||||
```
|
||||
|
||||
2. If the training data are images, we can use the DLLib API to load the images into a Spark DataFrame, e.g.:
|
||||
```
|
||||
// extra imports needed by this snippet (NNImageReader comes from the DLlib nnframes package)
import org.apache.spark.sql.Row
import org.apache.hadoop.fs.Path
import com.intel.analytics.bigdl.dllib.nnframes.NNImageReader

val createLabel = udf { row: Row =>
|
||||
if (new Path(row.getString(0)).getName.contains("cat")) 1 else 2
|
||||
}
|
||||
val imagePath = "cats_dogs/"
|
||||
val imgDF = NNImageReader.readImages(imagePath, sc)
|
||||
```
|
||||
|
||||
It will load the images and generate the feature tensors automatically. We also need to generate the labels ourselves, e.g.:
|
||||
```
|
||||
val df = imgDF.withColumn("label", createLabel(col("image")))
|
||||
```
|
||||
|
||||
Then split the Spark DataFrame into a training part and a validation part:
|
||||
```
|
||||
val Array(trainDF, valDF) = df.randomSplit(Array(0.8, 0.2))
|
||||
```
|
||||
|
||||
## 4. Model Definition
|
||||
|
||||
#### Using Keras-like APIs
|
||||
|
||||
To define a model, you can use the [Keras Style API](../Overview/keras-api.md).
|
||||
```
|
||||
val x1 = Input(Shape(8))
|
||||
val dense1 = Dense(12, activation="relu").inputs(x1)
|
||||
val dense2 = Dense(8, activation="relu").inputs(dense1)
|
||||
val dense3 = Dense(2).inputs(dense2)
|
||||
val dmodel = Model(x1, dense3)
|
||||
```
|
||||
|
||||
After creating the model, you will have to decide which loss function to use in training.
|
||||
|
||||
Now you can use the model's `compile` function to set the loss function and optimization method:
|
||||
```
|
||||
dmodel.compile(optimizer = new Adam(), loss = ClassNLLCriterion())
|
||||
```
|
||||
|
||||
Now the model is built and ready to train.
|
||||
|
||||
## 5. Distributed Model Training
|
||||
Now you can use `fit` to begin the training; please set the label columns. Model evaluation can be performed periodically during training.

1. If the dataframe is generated using Spark APIs, you also need to set the feature columns, e.g.:
|
||||
```
|
||||
dmodel.fit(x=trainDF, batchSize=4, nbEpoch = 2,
      featureCols = Array("features"), labelCols = Array("label"), valX=valDF)
|
||||
```
|
||||
Note: the above model accepts a single input (column `features`) and a single output (column `label`).
|
||||
|
||||
If your model accepts multiple inputs (e.g. columns `f1`, `f2`, `f3`), please set the feature columns as below:
|
||||
```
|
||||
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
|
||||
featureCols = Array("f1", "f2", "f3"), labelCols = Array("label"))
|
||||
```
|
||||
|
||||
Similarly, if the model accepts multiple outputs (e.g. columns `label1`, `label2`), please set the label columns as below:
|
||||
```
|
||||
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
|
||||
featureCols = Array("f1", "f2", "f3"), labelCols = Array("label1", "label2"))
|
||||
```
|
||||
|
||||
2. If the dataframe is generated using the DLLib `NNImageReader`, we don't need to set `featureCols`; instead, we can set `transform` to configure how the images are processed before training, e.g.:
|
||||
```
|
||||
val transformers = transforms.Compose(Array(ImageResize(50, 50),
|
||||
ImageMirror()))
|
||||
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
|
||||
labelCols = Array("label"), transform = transformers)
|
||||
```
|
||||
For more details about how to use the DLLib Keras API to train on image data, you may want to refer to [ImageClassification](https://github.com/intel-analytics/BigDL/blob/main/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/keras/ImageClassification.scala)
|
||||
|
||||
## 6. Model saving and loading
|
||||
When training is finished, you may need to save the final model for later use.
|
||||
|
||||
BigDL allows you to save your BigDL model to the local filesystem, HDFS, or Amazon S3.
|
||||
- **save**
|
||||
```
|
||||
val modelPath = "/tmp/demo/keras.model"
|
||||
dmodel.saveModel(modelPath)
|
||||
```
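
Since the same `saveModel` call also takes an HDFS or S3 URI, saving to a remote filesystem only changes the path; a minimal sketch with a hypothetical HDFS namenode address:
```
// hypothetical HDFS destination; adjust the namenode address and path for your cluster
val hdfsModelPath = "hdfs://namenode:9000/tmp/demo/keras.model"
dmodel.saveModel(hdfsModelPath)
```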
|
||||
|
||||
- **load**
|
||||
```
|
||||
val loadModel = Models.loadModel(modelPath)
|
||||
|
||||
val preDF2 = loadModel.predict(valDF, featureCols = Array("features"), predictionCol = "predict")
|
||||
```
|
||||
|
||||
You may want to refer to [Save/Load](../Overview/keras-api.html#save)
|
||||
|
||||
## 7. Distributed evaluation and inference
|
||||
After training finishes, you can then use the trained model for prediction or evaluation.
|
||||
|
||||
- **inference**
|
||||
1. For a dataframe generated by the Spark API, please set `featureCols`:
|
||||
```
|
||||
dmodel.predict(trainDF, featureCols = Array("features"), predictionCol = "predict")
|
||||
```
|
||||
2. For a dataframe generated by `NNImageReader`, there is no need to set `featureCols`, and you can set `transform` if needed:
|
||||
```
|
||||
model.predict(imgDF, predictionCol = "predict", transform = transformers)
|
||||
```
|
||||
|
||||
- **evaluation**
|
||||
|
||||
Similarly, for a dataframe generated by the Spark API, the code is as below:
|
||||
```
|
||||
dmodel.evaluate(trainDF, batchSize = 4, featureCols = Array("features"),
|
||||
labelCols = Array("label"))
|
||||
```
|
||||
|
||||
For dataframe generated by `NNImageReader`:
|
||||
```
|
||||
model.evaluate(imgDF, batchSize = 1, labelCols = Array("label"), transform = transformers)
|
||||
```
|
||||
|
||||
## 8. Checkpointing and resuming training
|
||||
You can configure the training to periodically take snapshots of the model:
|
||||
```
|
||||
val cpPath = "/tmp/demo/cp"
|
||||
dmodel.setCheckpoint(cpPath, overWrite=false)
|
||||
```
|
||||
You can also set ```overWrite``` to ```true``` to enable overwriting any existing snapshot files.
|
||||
|
||||
After training stops, you can resume from any saved point. Choose one of the model snapshots to resume from (saved in the checkpoint path; see Checkpointing for details). Use `Models.loadModel` to load the model snapshot into a model object.
|
||||
```
|
||||
val loadModel = Models.loadModel(path)
|
||||
```
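
Once loaded, you can continue training from the restored weights; a minimal sketch, assuming the loaded model can be compiled and fitted just like the original model and reusing the dataframes from the earlier steps:
```
// re-attach the optimizer/loss and keep training from the snapshot
loadModel.compile(optimizer = new Adam(), loss = ClassNLLCriterion())
loadModel.fit(x = trainDF, batchSize = 4, nbEpoch = 2,
  featureCols = Array("features"), labelCols = Array("label"), valX = valDF)
```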
|
||||
|
||||
## 9. Monitor your training
|
||||
|
||||
- **Tensorboard**
|
||||
|
||||
BigDL provides a convenient way to monitor/visualize your training progress. It writes the statistics collected during training/validation, and the saved summary can be viewed via TensorBoard.
|
||||
|
||||
In order to take effect, it needs to be called before `fit`:
|
||||
```
|
||||
dmodel.setTensorBoard("./", "dllib_demo")
|
||||
```
|
||||
For more details, please refer to [visualization](../Overview/visualization.md)
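
Besides viewing the summary in TensorBoard, you may also be able to read the collected statistics back in code; a sketch, assuming the Keras-style model exposes a `getTrainSummary` accessor with the standard `"Loss"` tag (check the visualization docs for the exact API):
```
// read back the loss statistics that were written for TensorBoard (assumed accessor)
val trainLoss = dmodel.getTrainSummary("Loss")
trainLoss.take(5).foreach(record => println(record))
```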
|
||||
|
||||
## 10. Transfer learning and finetuning
|
||||
|
||||
- **freeze and trainable**
|
||||
|
||||
BigDL DLLib supports excluding some layers of the model from training:
|
||||
```
|
||||
dmodel.freeze(layer_names)
|
||||
```
|
||||
Layers that match the given names will be frozen. If a layer is frozen, its parameters (weight/bias, if they exist) are not updated during training.
|
||||
|
||||
BigDL DLLib also supports unFreeze operations. The parameters of the layers that match the given names will be trained (updated) during training:
|
||||
```
|
||||
dmodel.unFreeze(layer_names)
|
||||
```
|
||||
For more information, you may refer to [freeze](../../PythonAPI/DLlib/freeze.md)
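
To freeze layers by name, the layers need explicit names to match; a minimal sketch, assuming layers can be named with `setName` when the model is defined (the name `"dense_1"` below is just an illustration):
```
// name a layer explicitly when defining the model so it can be frozen by name
val in = Input(Shape(8))
val hidden = Dense(12, activation="relu").setName("dense_1").inputs(in)
val out = Dense(2).inputs(hidden)
val model2 = Model(in, out)

model2.freeze("dense_1")    // "dense_1" is now excluded from weight updates
model2.unFreeze("dense_1")  // and can be made trainable again later
```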
|
||||
|
||||
## 11. Hyperparameter tuning
|
||||
- **optimizer**
|
||||
|
||||
DLLib supports a list of optimization methods.
|
||||
For more details, please refer to [optimization](../../PythonAPI/DLlib/optim-Methods.md)
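
For example, a configured optimizer can be passed to `compile`; a minimal sketch, assuming the `Adam` constructor accepts a learning-rate argument named `lr` (check the optimizer API docs for the exact parameter name):
```
// use Adam with a custom learning rate instead of the default
dmodel.compile(optimizer = new Adam(lr = 1e-4), loss = ClassNLLCriterion())
```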
|
||||
|
||||
- **learning rate scheduler**
|
||||
|
||||
DLLib supports a list of learning rate schedulers.
For more details, please refer to [lr_scheduler](../../PythonAPI/DLlib/learningrate-Scheduler.md)
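
As a sketch of how a scheduler might be wired in, the classic BigDL `SGD` optimizer takes a `learningRateSchedule`; this assumes `com.intel.analytics.bigdl.dllib.optim.SGD` and its `Poly` schedule are available under these names and signatures:
```
import com.intel.analytics.bigdl.dllib.optim.SGD

// polynomial learning-rate decay over 1000 iterations (assumed signature)
val sgd = new SGD[Float](learningRate = 0.01,
  learningRateSchedule = SGD.Poly(0.5, 1000))
dmodel.compile(optimizer = sgd, loss = ClassNLLCriterion())
```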
|
||||
|
||||
- **batch size**
|
||||
|
||||
DLLib supports setting the batch size during training and prediction. We can adjust the batch size to tune the model's accuracy.
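
For instance, the `batchSize` argument shown in the `fit` and `evaluate` calls above is the knob to adjust; reusing the earlier training call with a larger batch size:
```
// same training call as before, only with a larger batch size
dmodel.fit(x = trainDF, batchSize = 16, nbEpoch = 2,
  featureCols = Array("features"), labelCols = Array("label"))
```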
|
||||
|
||||
- **regularizer**
|
||||
|
||||
DLLib supports a list of regularizers.
|
||||
For more details, please refer to [regularizer](../../PythonAPI/DLlib/regularizers.md)
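
Regularizers are typically attached to layers when the model is defined; a minimal sketch, assuming the Keras-style `Dense` layer accepts a `wRegularizer` argument and that `L2Regularizer` lives in the `dllib.optim` package:
```
import com.intel.analytics.bigdl.dllib.optim.L2Regularizer

// apply L2 weight decay to a layer's weights (assumed parameter name: wRegularizer)
val regularized = Dense(12, activation="relu",
  wRegularizer = new L2Regularizer[Float](0.001)).inputs(x1)
```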
|
||||
|
||||
- **clipping**
|
||||
|
||||
DLLib supports gradient clipping operations.
|
||||
For more details, please refer to [gradient_clip](../../PythonAPI/DLlib/clipping.md)
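
Clipping is configured on the model before training; a sketch, assuming the Keras-style model exposes `setGradientClippingByL2Norm` and `setConstantGradientClipping` (call them after `compile` and before `fit`):
```
// clip gradients by their global L2 norm (assumed method name)
dmodel.setGradientClippingByL2Norm(2.0f)
// alternatively, clip each gradient to a constant range:
// dmodel.setConstantGradientClipping(-0.5f, 0.5f)
```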
|
||||
|
||||
## 12. Running program
|
||||
You can run a bigdl-dllib program as a standard Spark program (running on either a local machine or a distributed cluster) as follows:
|
||||
```
|
||||
# Spark local mode
|
||||
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
|
||||
--master local[2] \
|
||||
--class class_name \
|
||||
jar_path
|
||||
|
||||
# Spark standalone mode
|
||||
## ${SPARK_HOME}/sbin/start-master.sh
|
||||
## check master URL from http://localhost:8080
|
||||
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
|
||||
--master spark://... \
|
||||
--executor-cores cores_per_executor \
|
||||
--total-executor-cores total_cores_for_the_job \
|
||||
--class class_name \
|
||||
jar_path
|
||||
|
||||
# Spark yarn client mode
|
||||
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
|
||||
--master yarn \
|
||||
--deploy-mode client \
|
||||
--executor-cores cores_per_executor \
|
||||
--num-executors executors_number \
|
||||
--class class_name \
|
||||
jar_path
|
||||
|
||||
# Spark yarn cluster mode
|
||||
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
|
||||
--master yarn \
|
||||
--deploy-mode cluster \
|
||||
--executor-cores cores_per_executor \
|
||||
--num-executors executors_number \
|
||||
--class class_name \
|
||||
jar_path
|
||||
```
|
||||
For more details about how to run a BigDL Scala application, please refer to the [Scala UserGuide](../../UserGuide/scala.md)
|
||||
docs/readthedocs/source/doc/DLlib/index.rst
|
|
@ -0,0 +1,62 @@
|
|||
BigDL-DLlib
|
||||
=========================
|
||||
|
||||
**BigDL-DLlib** (or **DLlib** for short) is a distributed deep learning library for Apache Spark; with DLlib, users can write their deep learning applications as standard Spark programs (using either Scala or Python APIs).
|
||||
|
||||
-------
|
||||
|
||||
|
||||
.. grid:: 1 2 2 2
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Get Started**
|
||||
^^^
|
||||
|
||||
Documents in this section help you get started quickly with DLlib.
|
||||
|
||||
+++
|
||||
:bdg-link:`DLlib in 5 minutes <./Overview/dllib.html>` |
|
||||
:bdg-link:`Installation <./Overview/install.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Key Features Guide**
|
||||
^^^
|
||||
|
||||
Each guide in this section provides you with in-depth information, concepts and knowledge about DLlib key features.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`Keras-Like API <./Overview/keras-api.html>` |
|
||||
:bdg-link:`Spark ML Pipeline <./Overview/nnframes.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Examples**
|
||||
^^^
|
||||
|
||||
DLLib Examples and Tutorials.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`Tutorials <./QuickStart/index.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**API Document**
|
||||
^^^
|
||||
|
||||
API Document provides detailed description of DLLib APIs.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`API Document <../PythonAPI/DLlib/index.html>`
|
||||
|
||||
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
BigDL-DLlib Document <self>
|
||||
|
||||
docs/readthedocs/source/doc/Friesian/examples.md
|
|
@ -0,0 +1,70 @@
|
|||
### Use Cases
|
||||
|
||||
|
||||
- **Train a DeepFM model using recsys data**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/deep_fm)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Run DeepRec with BigDL**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/deeprec)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Train DIEN using the Amazon Book Reviews dataset**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/dien)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Preprocess the Criteo dataset for DLRM Model**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/dlrm)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Train a LightGBM model using the Twitter dataset**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/lightGBM)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Running Friesian listwise example**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/listwise_ranking)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Multi-task Recommendation with BigDL**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/multi_task)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Train an NCF model on MovieLens**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/ncf)
|
||||
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Offline Recall with Faiss on Spark**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/recall)
|
||||
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Recommend items using Friesian-Serving Framework**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/serving)
|
||||
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Train a two tower model using recsys data**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/two_tower)
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Preprocess the Criteo dataset for WideAndDeep Model**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/wnd)
|
||||
|
||||
|
||||
---------------------------
|
||||
|
||||
- **Train an XGBoost model using the Twitter dataset**
|
||||
>[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/xgb)
|
||||
|
||||
docs/readthedocs/source/doc/Friesian/index.rst
|
|
@ -0,0 +1,66 @@
|
|||
BigDL-Friesian
|
||||
=========================
|
||||
|
||||
|
||||
|
||||
BigDL Friesian is an application framework for building optimized large-scale recommender solutions. The recommending workflows built on top of Friesian can seamlessly scale out to distributed big data clusters in the production environment.
|
||||
|
||||
Friesian provides end-to-end support for three typical stages in a modern recommendation system:
|
||||
|
||||
- Offline stage: distributed feature engineering and model training.
|
||||
- Nearline stage: Feature and model updates.
|
||||
- Online stage: Recall and ranking.
|
||||
|
||||
-------
|
||||
|
||||
.. grid:: 1 2 2 2
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Get Started**
|
||||
^^^
|
||||
|
||||
Documents in this section help you get started quickly with Friesian.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`Introduction <./intro.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Key Features Guide**
|
||||
^^^
|
||||
|
||||
Each guide in this section provides you with in-depth information, concepts and knowledge about Friesian key features.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`Serving <./serving.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Use Cases**
|
||||
^^^
|
||||
|
||||
Use Cases and Examples.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`Use Cases <./examples.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**API Document**
|
||||
^^^
|
||||
|
||||
API Document provides detailed descriptions of Friesian APIs.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`API Document <../PythonAPI/Friesian/index.html>`
|
||||
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
BigDL-Friesian Document <self>
|
||||
docs/readthedocs/source/doc/Friesian/intro.rst
|
|
@ -0,0 +1,17 @@
|
|||
Friesian Introduction
|
||||
==========================
|
||||
|
||||
BigDL Friesian is an application framework for building optimized large-scale recommender solutions. The recommending workflows built on top of Friesian can seamlessly scale out to distributed big data clusters in the production environment.
|
||||
|
||||
Friesian provides end-to-end support for three typical stages in a modern recommendation system:
|
||||
|
||||
- Offline stage: distributed feature engineering and model training.
|
||||
- Nearline stage: Feature and model updates.
|
||||
- Online stage: Recall and ranking.
|
||||
|
||||
The overall architecture of Friesian is shown in the following diagram:
|
||||
|
||||
|
||||
.. image:: ../../../image/friesian_architecture.png
|
||||
|
||||
|
||||
docs/readthedocs/source/doc/Friesian/serving.md
|
|
@ -0,0 +1,600 @@
|
|||
## Serving Recommendation Framework
|
||||
|
||||
### Architecture of the serving pipelines
|
||||
|
||||
The diagram below demonstrates the components of the Friesian serving system, which typically consists of three stages:
|
||||
|
||||
- Offline: Preprocess the data to get user/item DNN features and user/item Embedding features. Then use the embedding features and embedding model to get embedding vectors.
|
||||
- Nearline: Retrieve user/item profiles and keep them in the Key-Value store. Retrieve item embedding vectors and build the faiss index. Make updates to the profiles from time to time.
|
||||
- Online: Trigger the recommendation process whenever a user request comes in. The recall service generates candidates from millions of items based on embeddings, and the deep learning model ranks the candidates for the final recommendation results.
|
||||
|
||||

|
||||
|
||||
|
||||
### Services and APIs
|
||||
The Friesian serving system consists of 4 types of services:
|
||||
- Ranking Service: performs model inference and returns the results.
|
||||
- `rpc doPredict(Content) returns (Prediction) {}`
|
||||
  - Input: The `encodedStr` is a Base64 string encoded from a BigDL [Activity](https://github.com/intel-analytics/BigDL/blob/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/nn/abstractnn/Activity.scala) serialized byte array.
|
||||
```bash
|
||||
message Content {
|
||||
string encodedStr = 1;
|
||||
}
|
||||
```
|
||||
- Output: The `predictStr` is a Base64 string encoded from a bigdl [Activity](https://github.com/intel-analytics/BigDL/blob/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/nn/abstractnn/Activity.scala) (the inference result) serialized byte array.
|
||||
```bash
|
||||
message Prediction {
|
||||
string predictStr = 1;
|
||||
}
|
||||
```
|
||||
- Feature Service: searches user embeddings, user features or item features in Redis, and returns the features.
|
||||
- `rpc getUserFeatures(IDs) returns (Features) {}` and `rpc getItemFeatures(IDs) returns (Features) {}`
|
||||
- Input: The user/item id list for searching.
|
||||
```bash
|
||||
message IDs {
|
||||
repeated int32 ID = 1;
|
||||
}
|
||||
```
|
||||
  - Output: `colNames` is a string list of the column names. `b64Feature` is a list of Base64 strings, each encoded from a Java serialized array of objects. `ID` is a list of ids corresponding to `b64Feature`.
|
||||
```bash
|
||||
message Features {
|
||||
repeated string colNames = 1;
|
||||
repeated string b64Feature = 2;
|
||||
repeated int32 ID = 3;
|
||||
}
|
||||
```
|
||||
- Recall Service: searches item candidates in the built faiss index and returns candidates id list.
|
||||
- `rpc searchCandidates(Query) returns (Candidates) {}`
|
||||
- Input: `userID` is the id of the user to search similar item candidates. `k` is the number of candidates.
|
||||
```bash
|
||||
message Query {
|
||||
int32 userID = 1;
|
||||
int32 k = 2;
|
||||
}
|
||||
```
|
||||
- Output: `candidate` is the list of ids of item candidates.
|
||||
```bash
|
||||
message Candidates {
|
||||
repeated int32 candidate = 1;
|
||||
}
|
||||
```
|
||||
- Recommender Service: gets candidates from the recall service, calls the feature service to get the user's and the candidate items' features, then sorts the inference results from the ranking service and returns the top `recommendNum` items.
|
||||
- `rpc getRecommendIDs(RecommendRequest) returns (RecommendIDProbs) {}`
|
||||
- Input: `ID` is a list of user ids to recommend. `recommendNum` is the number of items to recommend. `candidateNum` is the number of generated candidates to inference in ranking service.
|
||||
```bash
|
||||
message RecommendRequest {
|
||||
int32 recommendNum = 1;
|
||||
int32 candidateNum = 2;
|
||||
repeated int32 ID = 3;
|
||||
}
|
||||
```
|
||||
- Output: `IDProbList` is a list of results corresponding to user `ID` in input. Each `IDProbs` consists of `ID` and `prob`, `ID` is the list of item ids, and `prob` is the corresponding probability.
|
||||
```bash
|
||||
message RecommendIDProbs {
|
||||
repeated IDProbs IDProbList = 1;
|
||||
}
|
||||
message IDProbs {
|
||||
repeated int32 ID = 1;
|
||||
repeated float prob = 2;
|
||||
}
|
||||
```
|
||||
|
||||
### Quick Start
|
||||
You can run Friesian Serving Recommendation Framework using the official Docker images.
|
||||
|
||||
You can follow the steps below to run the WnD demo.
|
||||
|
||||
1. Pull the docker image from Docker Hub
|
||||
```bash
|
||||
docker pull intelanalytics/friesian-grpc:0.0.2
|
||||
```
|
||||
|
||||
2. Run & enter docker container
|
||||
```bash
|
||||
docker run -itd --name friesian --net=host intelanalytics/friesian-grpc:0.0.2
|
||||
docker exec -it friesian bash
|
||||
```
|
||||
|
||||
3. Add vec_feature_user_prediction.parquet, vec_feature_item_prediction.parquet, the wnd model,
wnd_item.parquet and wnd_user.parquet to the container (you can check [the schema of the parquet files](#schema-of-the-parquet-files))
|
||||
|
||||
4. Start ranking service
|
||||
```bash
|
||||
export OMP_NUM_THREADS=1
|
||||
java -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.ranking.RankingServer -c config_ranking.yaml > logs/inf.log 2>&1 &
|
||||
```
|
||||
|
||||
5. Start feature service for recommender service
|
||||
```bash
|
||||
./redis-5.0.5/src/redis-server &
|
||||
java -Dspark.master=local[*] -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.feature.FeatureServer -c config_feature.yaml > logs/feature.log 2>&1 &
|
||||
```
|
||||
|
||||
6. Start feature service for recall service
|
||||
```bash
|
||||
java -Dspark.master=local[*] -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.feature.FeatureServer -c config_feature_vec.yaml > logs/fea_recall.log 2>&1 &
|
||||
```
|
||||
|
||||
7. Start recall service
|
||||
```bash
|
||||
java -Dspark.master=local[*] -Dspark.driver.maxResultSize=2G -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recall.RecallServer -c config_recall.yaml > logs/vec.log 2>&1 &
|
||||
```
|
||||
|
||||
8. Start recommender service
|
||||
```bash
|
||||
java -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recommender.RecommenderServer -c config_recommender.yaml > logs/rec.log 2>&1 &
|
||||
```
|
||||
|
||||
9. Check if the services are running
|
||||
```bash
|
||||
ps aux|grep friesian
|
||||
```
|
||||
You will see 5 processes starting with 'java'.
|
||||
|
||||
10. Run client to test
|
||||
```bash
|
||||
java -Dspark.master=local[*] -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recommender.RecommenderMultiThreadClient -target localhost:8980 -dataDir wnd_user.parquet -k 50 -clientNum 4 -testNum 2
|
||||
```
|
||||
11. Close services
|
||||
```bash
|
||||
ps aux | grep friesian   # find the pids of the services
kill <pid>               # kill the service that should be closed
|
||||
```
|
||||
|
||||
### Schema of the parquet files
|
||||
|
||||
#### The schema of the user and item embedding files
|
||||
The embedding parquet files should contain at least 2 columns: the id column and the prediction column.
The id column should be IntegerType, and the column name should be specified in the config files.
The prediction column should be of DenseVector type, and you can convert your existing embedding vectors using PySpark:
|
||||
```python
|
||||
from pyspark.sql import SparkSession
|
||||
from pyspark.sql.functions import udf, col
|
||||
from pyspark.ml.linalg import VectorUDT, DenseVector
|
||||
|
||||
spark = SparkSession.builder \
|
||||
.master("local[*]") \
|
||||
.config("spark.driver.memory", "2g") \
|
||||
.getOrCreate()
|
||||
|
||||
df = spark.read.parquet("data_path")
|
||||
|
||||
def trans_densevector(data):
|
||||
return DenseVector(data)
|
||||
|
||||
vector_udf = udf(lambda x: trans_densevector(x), VectorUDT())
|
||||
# suppose the embedding column (ArrayType(FloatType,true)) is the existing user/item embedding.
|
||||
df = df.withColumn("prediction", vector_udf(col("embedding")))
|
||||
df.write.parquet("output_file_path", mode="overwrite")
|
||||
```
|
||||
|
||||
#### The schema of the recommendation model feature files
|
||||
The feature parquet files should contain at least 2 columns: the id column and the feature columns.
The feature columns can be int, float, double, long, or an array of int, float, double or long.
|
||||
Here is an example of the WideAndDeep model feature.
|
||||
```bash
|
||||
+-------------+--------+--------+----------+--------------------------------+---------------------------------+------------+-----------+---------+----------------------+-----------------------------+
|
||||
|present_media|language|tweet_id|tweet_type|engaged_with_user_follower_count|engaged_with_user_following_count|len_hashtags|len_domains|len_links|present_media_language|engaged_with_user_is_verified|
|
||||
+-------------+--------+--------+----------+--------------------------------+---------------------------------+------------+-----------+---------+----------------------+-----------------------------+
|
||||
| 9| 43| 924| 2| 6| 3| 0.0| 0.1| 0.1| 45| 1|
|
||||
| 0| 6| 4741724| 2| 3| 3| 0.0| 0.0| 0.0| 527| 0|
|
||||
+-------------+--------+--------+----------+--------------------------------+---------------------------------+------------+-----------+---------+----------------------+-----------------------------+
|
||||
```
|
||||
|
||||
### The data schema in Redis
|
||||
The user features, item features and user embedding vectors are saved in Redis.
|
||||
The data saved in Redis is a key-value set.
|
||||
|
||||
#### Key in Redis
|
||||
The key in Redis consists of 3 parts: key prefix, data type, and data id.
|
||||
- Key prefix is `redisKeyPrefix` specified in the feature service config file.
|
||||
- Data type is one of `user` or `item`.
|
||||
- Data id is the value of `userIDColumn` or `itemIDColumn`.
|
||||
Here is an example of key: `2tower_user:29`
|
||||
|
||||
#### Value in Redis
|
||||
A row in the input parquet file is converted to a Java array of objects, then serialized into a byte array and encoded into a Base64 string.
|
||||
|
||||
#### Data schema entry
|
||||
Every key prefix and data type combination has its data schema entry to save the corresponding column names. The key of the schema entry is `keyPrefix + dataType`, such as `2tower_user`. The value of the schema entry is a string of column names separated by `,`, such as `enaging_user_follower_count,enaging_user_following_count,enaging_user_is_verified`.
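
To make the encoding concrete, the sketch below decodes one Redis value back into its column values using plain JVM APIs (written in Scala here; the key `2tower_user:29` and the Jedis call are only illustrative assumptions):
```scala
import java.io.{ByteArrayInputStream, ObjectInputStream}
import java.util.Base64

// decode a Base64-encoded, Java-serialized Array[Object] back into its fields
def decodeValue(b64: String): Array[AnyRef] = {
  val bytes = Base64.getDecoder.decode(b64)
  val ois = new ObjectInputStream(new ByteArrayInputStream(bytes))
  try ois.readObject().asInstanceOf[Array[AnyRef]] finally ois.close()
}

// pair the decoded values with the column names stored in the schema entry
val colNames = "enaging_user_follower_count,enaging_user_following_count,enaging_user_is_verified".split(",")
// val values = decodeValue(jedis.get("2tower_user:29"))   // fetch from Redis (illustrative)
// colNames.zip(values).foreach { case (n, v) => println(s"$n = $v") }
```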
|
||||
|
||||
### Config for different service
|
||||
You can pass some important information to services using `-c config.yaml`
|
||||
```bash
|
||||
java -Dspark.master=local[*] -Dspark.driver.maxResultSize=2G -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recall.RecallServer -c config_recall.yaml
|
||||
```
|
||||
|
||||
#### Ranking Service Config
|
||||
Config with example:
|
||||
```yaml
|
||||
# Default: 8980, which port to create the server
|
||||
servicePort: 8083
|
||||
|
||||
# Default: 0, open a port for prometheus monitoring tool, if set, user can check the
|
||||
# performance using prometheus
|
||||
monitorPort: 1234
|
||||
|
||||
# model path must be provided
|
||||
modelPath: /home/yina/Documents/model/recys2021/wnd_813/recsys_wnd
|
||||
|
||||
# default: null, savedmodel input list if the model is tf savedmodel. If not provided, the inputs
|
||||
# of the savedmodel will be arranged in alphabetical order
|
||||
savedModelInputs: serving_default_input_1:0, serving_default_input_2:0, serving_default_input_3:0, serving_default_input_4:0, serving_default_input_5:0, serving_default_input_6:0, serving_default_input_7:0, serving_default_input_8:0, serving_default_input_9:0, serving_default_input_10:0, serving_default_input_11:0, serving_default_input_12:0, serving_default_input_13:0
|
||||
|
||||
# default: 1, number of models used in inference service
|
||||
modelParallelism: 4
|
||||
```
|
||||
|
||||
##### Feature Service Config
|
||||
Config with example:
|
||||
1. load data into redis. Search data from redis
|
||||
```yaml
|
||||
### Basic setting
|
||||
# Default: 8980, which port to create the server
|
||||
servicePort: 8082
|
||||
|
||||
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
|
||||
# performance using prometheus
|
||||
monitorPort: 1235
|
||||
|
||||
# 'kv' or 'inference' default: kv
|
||||
serviceType: kv
|
||||
|
||||
# default: false, if need to load initial data to redis, set true
|
||||
loadInitialData: true
|
||||
|
||||
# default: "", prefix for redis key
|
||||
redisKeyPrefix:
|
||||
|
||||
# default: 0, item slot type on redis cluster. 0 means slot number use the default value 16384, 1 means all keys save to same slot, 2 means use the last character of id as hash tag.
|
||||
redisClusterItemSlotType: 2
|
||||
|
||||
# default: null, if loadInitialData=true, initialUserDataPath or initialItemDataPath must be
|
||||
# provided. Only support parquet file
|
||||
initialUserDataPath: /home/yina/Documents/data/recsys/preprocess_output/wnd_user.parquet
|
||||
initialItemDataPath: /home/yina/Documents/data/recsys/preprocess_output/wnd_exp1/wnd_item.parquet
|
||||
|
||||
# default: null, if loadInitialData=true and initialUserDataPath != null, userIDColumn and
|
||||
# userFeatureColumns must be provided
|
||||
userIDColumn: enaging_user_id
|
||||
userFeatureColumns: enaging_user_follower_count,enaging_user_following_count
|
||||
|
||||
# default: null, if loadInitialData=true and initialItemDataPath != null, userIDColumn and
|
||||
# userFeatureColumns must be provided
|
||||
itemIDColumn: tweet_id
|
||||
itemFeatureColumns: present_media, language, tweet_id, hashtags, present_links, present_domains, tweet_type, engaged_with_user_follower_count,engaged_with_user_following_count, len_hashtags, len_domains, len_links, present_media_language, tweet_id_engaged_with_user_id
|
||||
|
||||
# default: null, user model path or item model path must be provided if serviceType
|
||||
# contains 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
|
||||
# be ignored
|
||||
# userModelPath:
|
||||
|
||||
# default: null, user model path or item model path must be provided if serviceType
|
||||
# contains 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
|
||||
# be ignored
|
||||
# itemModelPath:
|
||||
|
||||
# default: 1, number of models used for inference
|
||||
# modelParallelism:
|
||||
|
||||
### Redis Configuration
|
||||
# default: localhost:6379
|
||||
# redisUrl:
|
||||
|
||||
# default: 256, JedisPoolMaxTotal
|
||||
# redisPoolMaxTotal:
|
||||
```
|
||||
|
||||
2. load user features into redis. Get features from redis, use model at 'userModelPath' to do
|
||||
inference and get the user embedding
|
||||
```yaml
|
||||
### Basic setting
|
||||
# Default: 8980, which port to create the server
|
||||
servicePort: 8085
|
||||
|
||||
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
|
||||
# performance using prometheus
|
||||
monitorPort: 1236
|
||||
|
||||
# 'kv' or 'inference' default: kv
|
||||
serviceType: kv, inference
|
||||
|
||||
# default: false, if need to load initial data to redis, set true
|
||||
loadInitialData: true
|
||||
|
||||
# default: ""
|
||||
redisKeyPrefix: 2tower_
|
||||
|
||||
# default: 0, item slot type on redis cluster. 0 means slot number use the default value 16384, 1 means all keys save to same slot, 2 means use the last character of id as hash tag.
|
||||
redisClusterItemSlotType: 2
|
||||
|
||||
# default: null, if loadInitialData=true, initialDataPath must be provided. Only support parquet
|
||||
# file
|
||||
initialUserDataPath: /home/yina/Documents/data/recsys/preprocess_output/guoqiong/vec_feature_user.parquet
|
||||
# initialItemDataPath:
|
||||
|
||||
# default: null, if loadInitialData=true and initialUserDataPath != null, userIDColumn and
|
||||
# userFeatureColumns must be provided
|
||||
#userIDColumn: user
|
||||
userIDColumn: enaging_user_id
|
||||
userFeatureColumns: user
|
||||
|
||||
# default: null, if loadInitialData=true and initialItemDataPath != null, userIDColumn and
|
||||
# userFeatureColumns must be provided
|
||||
# itemIDColumn:
|
||||
# itemFeatureColumns:
|
||||
|
||||
# default: null, user model path or item model path must be provided if serviceType
|
||||
# includes 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
|
||||
# be ignored
|
||||
userModelPath: /home/yina/Documents/model/recys2021/2tower/guoqiong/user-model
|
||||
|
||||
# default: null, user model path or item model path must be provided if serviceType
|
||||
# contains 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
|
||||
# be ignored
|
||||
# itemModelPath:
|
||||
|
||||
# default: 1, number of models used for inference
|
||||
# modelParallelism:
|
||||
|
||||
### Redis Configuration
|
||||
# default: localhost:6379
|
||||
# redisUrl:
|
||||
|
||||
# default: 256, JedisPoolMaxTotal
|
||||
# redisPoolMaxTotal:
|
||||
```
|
||||
|
||||
#### Recall Service Config
|
||||
Config with example:
|
||||
|
||||
1. load initial item vector from vec_feature_item.parquet and item-model to build faiss index.
|
||||
```yaml
|
||||
# Default: 8980, which port to create the server
|
||||
servicePort: 8084
|
||||
|
||||
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
|
||||
# performance using prometheus
|
||||
monitorPort: 1238
|
||||
|
||||
# default: 128, the dimensionality of the embedding vectors
|
||||
indexDim: 50
|
||||
|
||||
# default: false, if load saved index, set true
|
||||
# loadSavedIndex: true
|
||||
|
||||
# default: false, if true, the built index will be saved to indexPath. Ignored when
|
||||
# loadSavedIndex=true
|
||||
saveBuiltIndex: true
|
||||
|
||||
# default: null, path to saved index path, must be provided if loadSavedIndex=true
|
||||
indexPath: ./2tower_item_full.idx
|
||||
|
||||
# default: false
|
||||
getFeatureFromFeatureService: true
|
||||
|
||||
# default: localhost:8980, feature service target
|
||||
featureServiceURL: localhost:8085
|
||||
|
||||
itemIDColumn: tweet_id
|
||||
itemFeatureColumns: item
|
||||
|
||||
# default: null, user model path must be provided if getFeatureFromFeatureService=false
|
||||
# userModelPath:
|
||||
|
||||
# default: null, item model path must be provided if loadSavedIndex=false and initialDataPath is
|
||||
# not orca predict result
|
||||
itemModelPath: /home/yina/Documents/model/recys2021/2tower/guoqiong/item-model
|
||||
|
||||
# default: null, Only support parquet file
|
||||
initialDataPath: /home/yina/Documents/data/recsys/preprocess_output/guoqiong/vec_feature_item.parquet
|
||||
|
||||
# default: 1, number of models used in inference service
|
||||
modelParallelism: 1
|
||||
```
|
||||
|
||||
2. load existing faiss index
|
||||
```yaml
|
||||
# Default: 8980, which port to create the server
|
||||
servicePort: 8084
|
||||
|
||||
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
|
||||
# performance using prometheus
|
||||
monitorPort: 1238
|
||||
|
||||
# default: 128, the dimensionality of the embedding vectors
|
||||
# indexDim:
|
||||
|
||||
# default: false, if load saved index, set true
|
||||
loadSavedIndex: true
|
||||
|
||||
# default: null, path to saved index path, must be provided if loadSavedIndex=true
|
||||
indexPath: ./2tower_item_full.idx
|
||||
|
||||
# default: false
|
||||
getFeatureFromFeatureService: true
|
||||
|
||||
# default: localhost:8980, feature service target
|
||||
featureServiceURL: localhost:8085
|
||||
|
||||
# itemIDColumn:
|
||||
# itemFeatureColumns:
|
||||
|
||||
# default: null, user model path must be provided if getFeatureFromFeatureService=false
|
||||
# userModelPath:
|
||||
|
||||
# default: null, item model path must be provided if loadSavedIndex=false and initialDataPath is
|
||||
# not orca predict result
|
||||
# itemModelPath:
|
||||
|
||||
# default: null, Only support parquet file
|
||||
# initialDataPath:
|
||||
|
||||
# default: 1, number of models used in inference service
|
||||
# modelParallelism:
|
||||
```
|
||||
#### Recommender Service Config
|
||||
Config with example:
|
||||
|
||||
```yaml
|
||||
# Default: 8980, which port to create the server
|
||||
servicePort: 8980
|
||||
|
||||
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
|
||||
# performance using prometheus
|
||||
monitorPort: 1237
|
||||
|
||||
# default: null, must be provided, item column name
|
||||
itemIDColumn: tweet_id
|
||||
|
||||
# default: null, must be provided, column names for inference, order related.
|
||||
inferenceColumns: present_media_language, present_media, tweet_type, language, hashtags, present_links, present_domains, tweet_id_engaged_with_user_id, engaged_with_user_follower_count, engaged_with_user_following_count, enaging_user_follower_count, enaging_user_following_count, len_hashtags, len_domains, len_links
|
||||
|
||||
# default: 0, if set, ranking service request will be divided
|
||||
inferenceBatch: 0
|
||||
|
||||
# default: localhost:8980, recall service target
|
||||
recallServiceURL: localhost:8084
|
||||
|
||||
# default: localhost:8980, feature service target
|
||||
featureServiceURL: localhost:8082
|
||||
|
||||
# default: localhost:8980, inference service target
|
||||
rankingServiceURL: localhost:8083
|
||||
```
|
||||
|
||||
### Run Java Client
|
||||
|
||||
#### Generate proto java files
|
||||
You should initialize a Maven project and use the proto files in the [friesian gRPC project](https://github.com/analytics-zoo/friesian/tree/recsys-grpc/src/main/proto).
Make sure to add the following extensions and plugins to your pom.xml, and replace
*protocExecutable* with your own protoc executable.
|
||||
```xml
|
||||
<build>
|
||||
<extensions>
|
||||
<extension>
|
||||
<groupId>kr.motd.maven</groupId>
|
||||
<artifactId>os-maven-plugin</artifactId>
|
||||
<version>1.6.2</version>
|
||||
</extension>
|
||||
</extensions>
|
||||
<plugins>
|
||||
<plugin>
|
||||
<groupId>org.apache.maven.plugins</groupId>
|
||||
<artifactId>maven-compiler-plugin</artifactId>
|
||||
<version>3.8.0</version>
|
||||
<configuration>
|
||||
<source>8</source>
|
||||
<target>8</target>
|
||||
</configuration>
|
||||
</plugin>
|
||||
<plugin>
|
||||
<groupId>org.xolstice.maven.plugins</groupId>
|
||||
<artifactId>protobuf-maven-plugin</artifactId>
|
||||
<version>0.6.1</version>
|
||||
<configuration>
|
||||
<protocArtifact>com.google.protobuf:protoc:3.12.0:exe:${os.detected.classifier}</protocArtifact>
|
||||
<pluginId>grpc-java</pluginId>
|
||||
<pluginArtifact>io.grpc:protoc-gen-grpc-java:1.37.0:exe:${os.detected.classifier}</pluginArtifact>
|
||||
<protocExecutable>/home/yina/Documents/protoc/bin/protoc</protocExecutable>
|
||||
</configuration>
|
||||
<executions>
|
||||
<execution>
|
||||
<goals>
|
||||
<goal>compile</goal>
|
||||
<goal>compile-custom</goal>
|
||||
</goals>
|
||||
</execution>
|
||||
</executions>
|
||||
</plugin>
|
||||
</plugins>
|
||||
</build>
|
||||
```
|
||||
Then you can generate the gRPC files with
|
||||
```bash
|
||||
mvn clean install
|
||||
```
|
||||
#### Call recommend service function using blocking stub
|
||||
You can check the [Recommend service client example](https://github.com/analytics-zoo/friesian/blob/recsys-grpc/src/main/java/grpc/recommend/RecommendClient.java) on Github
|
||||
|
||||
```java
|
||||
import com.intel.analytics.bigdl.friesian.serving.grpc.generated.recommender.RecommenderGrpc;
import com.intel.analytics.bigdl.friesian.serving.grpc.generated.recommender.RecommenderProto.*;
// additional imports needed to make this snippet compile
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.StatusRuntimeException;
import org.apache.log4j.Logger;  // a log4j logger is assumed here
|
||||
|
||||
public class RecommendClient {
    private static final Logger logger = Logger.getLogger(RecommendClient.class);

    public static void main(String[] args) {
        String targetURL = "localhost:8980";  // address of the recommender service
        // Create a channel
        ManagedChannel channel = ManagedChannelBuilder.forTarget(targetURL).usePlaintext().build();
|
||||
// Init a recommend service blocking stub
|
||||
RecommenderGrpc.RecommenderBlockingStub blockingStub = RecommenderGrpc.newBlockingStub(channel);
|
||||
// Construct a request
|
||||
int[] userIds = new int[]{1};
|
||||
int candidateNum = 50;
|
||||
int recommendNum = 10;
|
||||
RecommendRequest.Builder request = RecommendRequest.newBuilder();
|
||||
for (int id : userIds) {
|
||||
request.addID(id);
|
||||
}
|
||||
request.setCandidateNum(candidateNum);
|
||||
request.setRecommendNum(recommendNum);
|
||||
RecommendIDProbs recommendIDProbs = null;
|
||||
try {
|
||||
recommendIDProbs = blockingStub.getRecommendIDs(request.build());
|
||||
logger.info(recommendIDProbs.getIDProbListList());
|
||||
} catch (StatusRuntimeException e) {
|
||||
logger.warn("RPC failed: " + e.getStatus().toString());
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Run Python Client
|
||||
Install the python packages listed below (you may encounter a [pyspark error](https://stackoverflow.com/questions/58700384/how-to-fix-typeerror-an-integer-is-required-got-type-bytes-error-when-tryin) if you have Python>=3.8 installed; try downgrading to Python<=3.7 and trying again).
|
||||
```bash
|
||||
pip install jupyter notebook==6.1.4 grpcio grpcio-tools pandas fastparquet pyarrow
|
||||
```
|
||||
After your services are running successfully, you can do the following.
|
||||
|
||||
#### Generate proto python files
|
||||
Generate the files with
|
||||
```bash
|
||||
python -m grpc_tools.protoc -I../../protos --python_out=<path_to_output_folder> --grpc_python_out=<path_to_output_folder> <path_to_friesian>/src/main/proto/*.proto
|
||||
```
|
||||
|
||||
#### Call recommend service function using blocking stub
|
||||
You can check the [Recommend service client example](https://github.com/analytics-zoo/friesian/blob/recsys-grpc/Serving/WideDeep/recommend_client.ipynb) on Github
|
||||
```python
|
||||
import grpc
# generated from the proto files (see "Generate proto python files" above)
import recommender_pb2
import recommender_pb2_grpc

# create a channel
|
||||
channel = grpc.insecure_channel('localhost:8980')
|
||||
# create a recommend service stub
|
||||
stub = recommender_pb2_grpc.RecommenderStub(channel)
|
||||
request = recommender_pb2.RecommendRequest(recommendNum=10, candidateNum=50, ID=[36407])
|
||||
results = stub.getRecommendIDs(request)
|
||||
print(results.IDProbList)
|
||||
|
||||
```
|
||||
### Scale-out for Big Data
|
||||
#### Redis Cluster
|
||||
For a large data set, a standalone Redis instance may not have enough memory to store the whole data set; data sharding and Redis Cluster are supported to handle this. You only need to set up a Redis Cluster to get it working.
|
||||
|
||||
First, start N Redis instances on N machines.
|
||||
```
|
||||
redis-server --cluster-enabled yes --cluster-config-file nodes-0.conf --cluster-node-timeout 50000 --appendonly no --save "" --logfile 0.log --daemonize yes --protected-mode no --port 6379
|
||||
```
|
||||
On each machine, choose a different port and start another M instances (M>=1) as the slave nodes of the above N instances.
|
||||
|
||||
Then, call the initialization command on one machine; if you chose M=1 above, use `--cluster-replicas 1`:
|
||||
```
|
||||
redis-cli --cluster create 172.168.3.115:6379 172.168.3.115:6380 172.168.3.116:6379 172.168.3.116:6380 172.168.3.117:6379 172.168.3.117:6380 --cluster-replicas 1
|
||||
```
|
||||
and the Redis cluster will be ready.
|
||||
|
||||
#### Scale Service with Envoy
|
||||
Each of the services can be scaled out. It is recommended to use the same resources, e.g. a single machine with the same CPU and memory, to test which service is the bottleneck. From empirical observations, vector search and inference usually are.
|
||||
|
||||
##### How to run envoy:
|
||||
1. [Download](https://www.envoyproxy.io/docs/envoy/latest/start/install) and deploy Envoy (Docker is used as an example below):
|
||||
* download: `docker pull envoyproxy/envoy-dev:21df5e8676a0f705709f0b3ed90fc2dbbd63cfc5`
|
||||
2. run command: `docker run --rm -it -p 9082:9082 -p 9090:9090 envoyproxy/envoy-dev:79ade4aebd02cf15bd934d6d58e90aa03ef6909e --config-yaml "$(cat path/to/service-specific-envoy.yaml)" --parent-shutdown-time-s 1000000`
|
||||
3. validate: run `netstat -tnlp` to see if the envoy process is listening to the corresponding port in the envoy config file.
|
||||
4. For details on Envoy and a sample procedure, read [envoy](envoy.md).
|
||||
docs/readthedocs/source/doc/GetStarted/index.rst
|
|
@ -0,0 +1,6 @@
|
|||
User Guide
|
||||
=========================
|
||||
|
||||
|
||||
Getting Started
|
||||
===========================================
|
||||
docs/readthedocs/source/doc/GetStarted/install.rst
|
|
@ -0,0 +1,2 @@
|
|||
Install Locally
|
||||
=========================
|
||||
docs/readthedocs/source/doc/GetStarted/paper.md
|
|
@ -0,0 +1,28 @@
|
|||
# Paper
|
||||
|
||||
|
||||
## Paper
|
||||
|
||||
* Dai, Jason Jinquan, et al. "BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022. [paper](https://arxiv.org/ftp/arxiv/papers/2204/2204.01715.pdf) [video]() [demo]()
|
||||
|
||||
* Dai, Jason Jinquan, et al. "BigDL: A distributed deep learning framework for big data." Proceedings of the ACM Symposium on Cloud Computing. 2019. [paper](https://arxiv.org/abs/1804.05839)
|
||||
|
||||
|
||||
|
||||
|
||||
## Citing
|
||||
If you've found BigDL useful for your project, you may cite the [paper](https://arxiv.org/abs/1804.05839) as follows:
|
||||
|
||||
```
|
||||
@inproceedings{SOCC2019_BIGDL,
|
||||
title={BigDL: A Distributed Deep Learning Framework for Big Data},
|
||||
author={Dai, Jason (Jinquan) and Wang, Yiheng and Qiu, Xin and Ding, Ding and Zhang, Yao and Wang, Yanzhang and Jia, Xianyan and Zhang, Li (Cherry) and Wan, Yan and Li, Zhichao and Wang, Jiao and Huang, Shengsheng and Wu, Zhongyuan and Wang, Yang and Yang, Yuhao and She, Bowen and Shi, Dongjie and Lu, Qi and Huang, Kai and Song, Guoqiong},
|
||||
booktitle={Proceedings of the ACM Symposium on Cloud Computing},
|
||||
publisher={Association for Computing Machinery},
|
||||
pages={50--60},
|
||||
year={2019},
|
||||
series={SoCC'19},
|
||||
doi={10.1145/3357223.3362707},
|
||||
url={https://arxiv.org/pdf/1804.05839.pdf}
|
||||
}
|
||||
```
|
||||
docs/readthedocs/source/doc/GetStarted/usecase.rst
|
|
@ -0,0 +1,2 @@
|
|||
Use Cases
|
||||
============================
|
||||
docs/readthedocs/source/doc/GetStarted/videos.md
|
|
@ -1,4 +1,4 @@
|
|||
AutoML Overview
|
||||
AutoML
|
||||
***************
|
||||
|
||||
Nano provides built-in AutoML support through hyperparameter optimization.
|
||||
docs/readthedocs/source/doc/Nano/Overview/index.rst
|
|
@ -0,0 +1,8 @@
|
|||
Nano Key Features
|
||||
================================
|
||||
|
||||
* `PyTorch Training <pytorch_train.html>`_
|
||||
* `PyTorch Inference <pytorch_inference.html>`_
|
||||
* `Tensorflow Training <tensorflow_train.html>`_
|
||||
* `Tensorflow Inference <tensorflow_inference.html>`_
|
||||
* `AutoML <hpo.html>`_
|
||||
docs/readthedocs/source/doc/Nano/Overview/install.md
|
|
@ -0,0 +1,36 @@
|
|||
# Nano Installation
|
||||
|
||||
Note: For windows users, we recommend using Windows Subsystem for Linux 2 (WSL2) to run BigDL-Nano. Please refer to [Nano Windows install guide](../Howto/windows_guide.md) for instructions.
|
||||
|
||||
|
||||
BigDL-Nano can be installed using pip and we recommend installing BigDL-Nano in a conda environment.
|
||||
|
||||
For PyTorch Users, you can install bigdl-nano along with some dependencies specific to PyTorch using the following commands.
|
||||
|
||||
```bash
|
||||
conda create -n env
|
||||
conda activate env
|
||||
pip install bigdl-nano[pytorch]
|
||||
```
|
||||
|
||||
For TensorFlow users, you can install bigdl-nano along with some dependencies specific to TensorFlow using the following commands.
|
||||
|
||||
```bash
|
||||
conda create -n env
|
||||
conda activate env
|
||||
pip install bigdl-nano[tensorflow]
|
||||
```
|
||||
|
||||
After installing bigdl-nano, you can run the following command to setup a few environment variables.
|
||||
|
||||
```bash
|
||||
source bigdl-nano-init
|
||||
```
|
||||
|
||||
The `bigdl-nano-init` script will export a few environment variables according to your hardware to maximize performance.
|
||||
|
||||
In a conda environment, `source bigdl-nano-init` will also be added to `$CONDA_PREFIX/etc/conda/activate.d/`, which will automatically run when you activate your current environment.
|
||||
|
||||
In a pure pip environment, you need to run `source bigdl-nano-init` every time you open a new shell to get optimal performance and run `source bigdl-nano-unset-env` if you want to unset these environment variables.
|
||||
|
||||
---
|
||||
|
|
@ -1,49 +1,11 @@
|
|||
# Nano User Guide
|
||||
# Nano in 5 minutes
|
||||
|
||||
## **1. Overview**
|
||||
BigDL-Nano is a Python package to transparently accelerate PyTorch and TensorFlow applications on Intel hardware. It provides a unified and easy-to-use API for several optimization techniques and tools, so that users can only apply a few lines of code changes to make their PyTorch or TensorFlow code run faster.
|
||||
|
||||
BigDL Nano is a Python package to transparently accelerate PyTorch and TensorFlow applications on Intel hardware. It provides a unified and easy-to-use API for several optimization techniques and tools, so that users can only apply a few lines of code changes to make their PyTorch or TensorFlow code run faster.
|
||||
----
|
||||
|
||||
---
|
||||
## **2. Install**
|
||||
|
||||
Note: For windows users, we recommend using Windows Subsystem for Linux 2 (WSL2) to run BigDL-Nano. Please refer to [Nano Windows install guide](../Howto/windows_guide.md) for instructions.
|
||||
|
||||
BigDL-Nano can be installed using pip and we recommend installing BigDL-Nano in a conda environment.
|
||||
|
||||
For PyTorch Users, you can install bigdl-nano along with some dependencies specific to PyTorch using the following commands.
|
||||
|
||||
```bash
|
||||
conda create -n env
|
||||
conda activate env
|
||||
pip install bigdl-nano[pytorch]
|
||||
```
|
||||
|
||||
For TensorFlow users, you can install bigdl-nano along with some dependencies specific to TensorFlow using the following commands.
|
||||
|
||||
```bash
|
||||
conda create -n env
|
||||
conda activate env
|
||||
pip install bigdl-nano[tensorflow]
|
||||
```
|
||||
|
||||
After installing bigdl-nano, you can run the following command to setup a few environment variables.
|
||||
|
||||
```bash
|
||||
source bigdl-nano-init
|
||||
```
|
||||
|
||||
The `bigdl-nano-init` scripts will export a few environment variable according to your hardware to maximize performance.
|
||||
|
||||
In a conda environment, `source bigdl-nano-init` will also be added to `$CONDA_PREFIX/etc/conda/activate.d/`, which will automaticly run when you activate your current environment.
|
||||
|
||||
In a pure pip environment, you need to run `source bigdl-nano-init` every time you open a new shell to get optimal performance and run `source bigdl-nano-unset-env` if you want to unset these environment variables.
|
||||
|
||||
---
|
||||
|
||||
## **3. Get Started**
|
||||
|
||||
#### **3.1 PyTorch**
|
||||
### **PyTorch Bite-sized Example**
|
||||
|
||||
BigDL-Nano supports both PyTorch and PyTorch Lightning models and most optimizations require only changing a few "import" lines in your code and adding a few flags.
|
||||
|
||||
|
|
@ -74,7 +36,8 @@ MyNano(use_ipex=True, num_processes=2).train()
|
|||
|
||||
For more details on the BigDL-Nano's PyTorch usage, please refer to the [PyTorch Training](../QuickStart/pytorch_train.md) and [PyTorch Inference](../QuickStart/pytorch_inference.md) page.
|
||||
|
||||
### **3.2 TensorFlow**
|
||||
|
||||
### **TensorFlow Bite-sized Example**
|
||||
|
||||
BigDL-Nano supports `tensorflow.keras` API and most optimizations require only changing a few "import" lines in your code and adding a few flags.
|
||||
|
||||
|
|
@ -104,4 +67,4 @@ model.compile(optimizer='adam',
|
|||
model.fit(x_train, y_train, epochs=5, num_processes=4)
|
||||
```
|
||||
|
||||
For more details on the BigDL-Nano's PyTorch usage, please refer to the [TensorFlow Training](../QuickStart/tensorflow_train.md) and [TensorFlow Inference](../QuickStart/tensorflow_inference.md) page.
|
||||
For more details on the BigDL-Nano's Tensorflow usage, please refer to the [TensorFlow Training](../QuickStart/tensorflow_train.md) and [TensorFlow Inference](../QuickStart/tensorflow_inference.md) page.
|
||||
|
|
|
|||
|
|
@ -1,4 +1,4 @@
|
|||
# BigDL-Nano PyTorch Inference Overview
|
||||
# PyTorch Inference
|
||||
|
||||
BigDL-Nano provides several APIs which can help users easily apply optimizations on inference pipelines to improve latency and throughput. Currently, performance accelerations are achieved by integrating extra runtimes as inference backend engines or using quantization methods on full-precision trained models to reduce computation during inference. InferenceOptimizer (`bigdl.nano.pytorch.InferenceOptimizer`) provides the APIs for all optimizations that you need for inference.
|
||||
|
||||
|
|
@ -1,4 +1,4 @@
|
|||
# BigDL-Nano PyTorch Training Overview
|
||||
# PyTorch Training
|
||||
|
||||
BigDL-Nano can be used to accelerate PyTorch or PyTorch-Lightning applications on training workloads. The optimizations in BigDL-Nano are delivered through an extended version of PyTorch-Lightning `Trainer`. These optimizations are either enabled by default or can be easily turned on by setting a parameter or calling a method.
|
||||
|
||||
|
|
@ -1,4 +1,5 @@
|
|||
# BigDL-Nano TensorFlow Inference Overview
|
||||
# TensorFlow Inference
|
||||
|
||||
BigDL-Nano provides several APIs which can help users easily apply optimizations on inference pipelines to improve latency and throughput. Currently, performance accelerations are achieved by integrating extra runtimes as inference backend engines or using quantization methods on full-precision trained models to reduce computation during inference. Keras Model (`bigdl.nano.tf.keras.Model`) and Sequential (`bigdl.nano.tf.keras.Sequential`) provides the APIs for all optimizations that you need for inference.
|
||||
|
||||
For quantization, BigDL-Nano provides only post-training quantization in `Model.quantize()` for users to infer with models of 8-bit precision. Quantization-Aware Training is not available for now. Model conversion to 16-bit like BF16, and FP16 will be coming soon.
|
||||
|
|
@ -1,4 +1,4 @@
|
|||
# BigDL-Nano TensorFlow Training Overview
|
||||
# TensorFlow Training
|
||||
|
||||
BigDL-Nano can be used to accelerate TensorFlow Keras applications on training workloads. The optimizations in BigDL-Nano are delivered through BigDL-Nano's `Model` and `Sequential` classes, which have identical APIs with `tf.keras.Model` and `tf.keras.Sequential`. For most cases, you can just replace your `tf.keras.Model` with `bigdl.nano.tf.keras.Model` and `tf.keras.Sequential` with `bigdl.nano.tf.keras.Sequential` to benefit from BigDL-Nano.
|
||||
|
||||
docs/readthedocs/source/doc/Nano/Overview/userguide.rst
docs/readthedocs/source/doc/Nano/index.rst
|
|
@ -0,0 +1,63 @@
|
|||
BigDL-Nano
|
||||
=========================
|
||||
|
||||
**BigDL-Nano** (or **Nano** for short) is a Python package to transparently accelerate PyTorch and TensorFlow applications on Intel hardware. It provides a unified and easy-to-use API for several optimization techniques and tools, so that users can only apply a few lines of code changes to make their PyTorch or TensorFlow code run faster.
|
||||
|
||||
-------
|
||||
|
||||
|
||||
.. grid:: 1 2 2 2
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Get Started**
|
||||
^^^
|
||||
|
||||
Documents in this section help you get started quickly with Nano.
|
||||
|
||||
+++
|
||||
:bdg-link:`Nano in 5 minutes <./Overview/nano.html>` |
|
||||
:bdg-link:`Installation <./Overview/install.html>` |
|
||||
:bdg-link:`Tutorials <./QuickStart/index.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Key Features Guide**
|
||||
^^^
|
||||
|
||||
Each guide in this section provides you with in-depth information, concepts and knowledge about Nano key features.
|
||||
|
||||
+++
|
||||
|
||||
:bdg:`PyTorch` :bdg-link:`Infer <./Overview/pytorch_inference.html>` :bdg-link:`Train <./Overview/pytorch_train.html>` |
|
||||
:bdg:`TensorFlow` :bdg-link:`Infer <./Overview/tensorflow_inference.html>` :bdg-link:`Train <./Overview/tensorflow_train.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**How-to Guide**
|
||||
^^^
|
||||
|
||||
How-to guides provide bite-sized, actionable examples of how to use specific Nano features, unlike our tutorials, which are full-length examples each implementing a full usage scenario.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`How-to-Guide <./Howto/index.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**API Document**
|
||||
^^^
|
||||
|
||||
API Document provides detailed description of Nano APIs.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`API Document <../PythonAPI/Nano/index.html>`
|
||||
|
||||
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
BigDL-Nano Document <self>
|
||||
|
|
@ -4,23 +4,6 @@
|
|||
|
||||
**Orca `AutoEstimator` provides similar APIs as Orca `Estimator` for distributed hyper-parameter tuning.**
|
||||
|
||||
### **Install**
|
||||
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the Python environment.
|
||||
```bash
|
||||
conda create -n bigdl-orca-automl python=3.7 # "bigdl-orca-automl" is conda environment name, you can use any name you like.
|
||||
conda activate bigdl-orca-automl
|
||||
pip install bigdl-orca[automl]
|
||||
````
|
||||
You can install the latest release version of BigDL Orca as follows:
|
||||
```bash
|
||||
pip install --pre --upgrade bigdl-orca[automl]
|
||||
```
|
||||
_Note that with extra key of [automl], `pip` will automatically install the additional dependencies for distributed hyper-parameter tuning,
|
||||
including `ray[tune]==1.9.2`, `scikit-learn`, `tensorboard`, `xgboost`._
|
||||
|
||||
To use [Pytorch Estimator](#pytorch-autoestimator), you need to install Pytorch with `pip install torch==1.8.1`.
|
||||
|
||||
To use [TensorFlow/Keras AutoEstimator](#tensorflow-keras-autoestimator), you need to install Tensorflow with `pip install tensorflow==1.15.0`.
|
||||
|
||||
|
||||
### **1. AutoEstimator**
|
||||
|
|
|
|||
docs/readthedocs/source/doc/Orca/Overview/getstarted.rst (new file)
|
|
@ -0,0 +1,2 @@
|
|||
Orca Key Features
|
||||
=================================
|
||||
docs/readthedocs/source/doc/Orca/Overview/index.rst (new file)
|
|
@ -0,0 +1,8 @@
|
|||
Orca Key Features
|
||||
=================================
|
||||
|
||||
* `Orca Context <orca-context.html>`_
|
||||
* `Distributed Data Processing <data-parallel-processing.html>`_
|
||||
* `Distributed Training and Inference <distributed-training-inference.html>`_
|
||||
* `Distributed Hyper Parameter Tuning <distributed-tuning.html>`_
|
||||
* `RayOnSpark <ray.html>`_
|
||||
docs/readthedocs/source/doc/Orca/Overview/install.md (new file)
|
|
@ -0,0 +1,45 @@
|
|||
# Installation
|
||||
|
||||
|
||||
## To use Distributed Data processing, training, and/or inference
|
||||
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the Python environment.
|
||||
```bash
|
||||
conda create -n py37 python=3.7 # "py37" is conda environment name, you can use any name you like.
|
||||
conda activate py37
|
||||
pip install bigdl-orca
|
||||
```
|
||||
|
||||
You can install the nightly build version of bigdl-orca using
|
||||
```bash
|
||||
pip install --pre --upgrade bigdl-orca
|
||||
```
|
||||
|
||||
## To use RayOnSpark
|
||||
|
||||
There are some additional dependencies required for running [RayOnSpark](ray.md). Use the extra key `[ray]` to install them.
|
||||
|
||||
```bash
|
||||
pip install bigdl-orca[ray]
|
||||
```
|
||||
|
||||
or to install nightly build, use
|
||||
```bash
|
||||
pip install --pre --upgrade bigdl-orca[ray]
|
||||
```
|
||||
|
||||
## To use Orca AutoML
|
||||
|
||||
There are some additional dependencies required for Orca AutoML support. Use the extra key `[automl]` to install them.
|
||||
|
||||
```bash
|
||||
pip install bigdl-orca[automl]
|
||||
```
|
||||
|
||||
|
||||
_Note that with the extra key `[automl]`, `pip` will automatically install the additional dependencies for distributed hyper-parameter tuning, including `ray[tune]==1.9.2`, `scikit-learn`, `tensorboard` and `xgboost`._
|
||||
|
||||
To use the [PyTorch AutoEstimator](#pytorch-autoestimator), you need to install PyTorch with `pip install torch==1.8.1`.

To use the [TensorFlow/Keras AutoEstimator](#tensorflow-keras-autoestimator), you need to install TensorFlow with `pip install tensorflow==1.15.0`.
|
||||
|
||||
|
|
@ -85,3 +85,43 @@ To solve this issue, you need to set the path of `libhdfs.so` in Cloudera to the
|
|||
# For yarn-cluster mode
|
||||
spark-submit --conf spark.executorEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64 \
|
||||
--conf spark.yarn.appMasterEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64
|
||||
|
||||
|
||||
### **Spark Dynamic Allocation**
|
||||
|
||||
By design, BigDL does not support Spark Dynamic Allocation mode, and needs to allocate fixed resources for deep learning model training. Thus if your environment has already configured Spark Dynamic Allocation, or stipulated that Spark Dynamic Allocation must be used, you may encounter the following error:
|
||||
|
||||
> **requirement failed: Engine.init: spark.dynamicAllocation.maxExecutors and spark.dynamicAllocation.minExecutors must be identical in dynamic allocation for BigDL**
|
||||
>
|
||||
|
||||
Here we provide a workaround for running BigDL under Spark Dynamic Allocation mode.
|
||||
|
||||
For `spark-submit` cluster mode, the first solution is to disable the Spark Dynamic Allocation mode in `SparkConf` when you submit your application as follows:
|
||||
|
||||
```bash
|
||||
spark-submit --conf spark.dynamicAllocation.enabled=false
|
||||
```
|
||||
|
||||
Otherwise, if you cannot set this configuration due to your cluster settings, you can set `spark.dynamicAllocation.minExecutors` equal to `spark.dynamicAllocation.maxExecutors` as follows:
|
||||
|
||||
```bash
|
||||
spark-submit --conf spark.dynamicAllocation.enabled=true \
             --conf spark.dynamicAllocation.minExecutors=2 \
             --conf spark.dynamicAllocation.maxExecutors=2
|
||||
```
|
||||
|
||||
For other cluster modes, such as `yarn` and `k8s`, our program will initiate `SparkContext` for you, and the Spark Dynamic Allocation mode is disabled by default. Thus, you generally wouldn't encounter such a problem.
|
||||
|
||||
If you are using Spark Dynamic Allocation, you have to disable barrier execution mode at the very beginning of your application as follows:
|
||||
|
||||
```python
|
||||
from bigdl.orca import OrcaContext
|
||||
|
||||
OrcaContext.barrier_mode = False
|
||||
```
|
||||
|
||||
For Spark Dynamic Allocation mode, you are also recommended to manually set `num_ray_nodes` and `ray_node_cpu_cores` equal to `spark.dynamicAllocation.minExecutors` and `spark.executor.cores` respectively. You can specify `num_ray_nodes` and `ray_node_cpu_cores` in `init_orca_context` as follows:
|
||||
|
||||
```python
|
||||
init_orca_context(..., num_ray_nodes=2, ray_node_cpu_cores=4)
|
||||
```
|
||||
|
|
|
|||
|
|
@ -1,30 +1,12 @@
|
|||
# The Orca Library
|
||||
# Orca in 5 minutes
|
||||
|
||||
## 1. Overview
|
||||
### Overview
|
||||
|
||||
Most AI projects start with a Python notebook running on a single laptop; however, one usually needs to go through a mountain of pains to scale it to handle larger data set in a distributed fashion. The _**Orca**_ library seamlessly scales out your single node Python notebook across large clusters (so as to process distributed Big Data).
|
||||
|
||||
## 2. Install
|
||||
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the Python environment.
|
||||
```bash
|
||||
conda create -n py37 python=3.7 # "py37" is conda environment name, you can use any name you like.
|
||||
conda activate py37
|
||||
pip install bigdl-orca
|
||||
```
|
||||
---
|
||||
|
||||
When installing bigdl-orca with pip, you can specify the extras key `[ray]` to additionally install the additional dependencies
|
||||
essential for running [RayOnSpark](../../Ray/Overview/ray.md)
|
||||
```bash
|
||||
pip install bigdl-orca[ray]
|
||||
```
|
||||
|
||||
You can install bigdl-orca nightly release version using
|
||||
```bash
|
||||
pip install --pre --upgrade bigdl-orca
|
||||
pip install --pre --upgrade bigdl-orca[ray]
|
||||
```
|
||||
|
||||
## 3. Run
|
||||
### **Tensorflow Bite-sized Example**
|
||||
|
||||
This section uses TensorFlow 1.15, and you should install TensorFlow before running this example:
|
||||
```bash
|
||||
|
|
@ -73,8 +55,3 @@ est.fit(data=df,
|
|||
feature_cols=['user', 'item'],
|
||||
label_cols=['label'])
|
||||
```
|
||||
|
||||
## Get Started
|
||||
|
||||
See [TensorFlow](../QuickStart/orca-tf-quickstart.md) and [PyTorch](../QuickStart/orca-pytorch-quickstart.md) quickstart for more details.
|
||||
|
||||
|
|
|
|||
|
|
@ -3,7 +3,7 @@
|
|||
---
|
||||
|
||||
[Ray](https://github.com/ray-project/ray) is an open source distributed framework for emerging AI applications.
|
||||
With the _**RayOnSpark**_ support packaged in [BigDL Orca](../../Orca/Overview/orca.md),
|
||||
With the _**RayOnSpark**_ support packaged in [BigDL Orca](../Overview/orca.md),
|
||||
Users can seamlessly integrate Ray applications into the big data processing pipeline on the underlying Big Data cluster
|
||||
(such as [Hadoop/YARN](../../UserGuide/hadoop.md) or [K8s](../../UserGuide/k8s.md)).
|
||||
|
||||
|
|
@ -23,7 +23,7 @@ conda activate py37
|
|||
pip install bigdl-orca[ray]
|
||||
```
|
||||
|
||||
View [Python User Guide](../../UserGuide/python.html#install) and [Orca User Guide](../../Orca/Overview/orca.md) for more installation instructions.
|
||||
View [Python User Guide](../../UserGuide/python.html#install) and [Orca User Guide](../Overview/orca.md) for more installation instructions.
|
||||
|
||||
---
|
||||
### **2. Initialize**
|
||||
|
|
@ -58,7 +58,7 @@ from bigdl.orca import OrcaContext
|
|||
OrcaContext.barrier_mode = False
|
||||
```
|
||||
|
||||
View [Orca Context](../../Orca/Overview/orca-context.md) for more details.
|
||||
View [Orca Context](../Overview/orca-context.md) for more details.
|
||||
|
||||
---
|
||||
### **3. Run**
|
||||
|
|
@ -82,7 +82,7 @@ View [Orca Context](../../Orca/Overview/orca-context.md) for more details.
|
|||
print(ray.get([c.increment.remote() for c in counters]))
|
||||
```
|
||||
|
||||
- You can retrieve the information of the Ray cluster via [`OrcaContext`](../../Orca/Overview/orca-context.md):
|
||||
- You can retrieve the information of the Ray cluster via [`OrcaContext`](../Overview/orca-context.md):
|
||||
|
||||
```python
|
||||
from bigdl.orca import OrcaContext
|
||||
docs/readthedocs/source/doc/Orca/QuickStart/index.md (new file)
|
|
@ -0,0 +1,44 @@
|
|||
# Orca Tutorial
|
||||
|
||||
|
||||
- [**Orca TensorFlow 1.15 Quickstart**](./orca-tf-quickstart.html)
|
||||
|
||||
> [Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf_lenet_mnist.ipynb) [View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf_lenet_mnist.ipynb)
|
||||
|
||||
In this guide we will describe how to scale out TensorFlow 1.15 programs using Orca in 4 simple steps.
|
||||
|
||||
---------------------------
|
||||
|
||||
- [**Orca TensorFlow 2 Quickstart**](./orca-tf2keras-quickstart.html)
|
||||
|
||||
> [Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf2_keras_lenet_mnist.ipynb) [View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/tf2_keras_lenet_mnist.ipynb)
|
||||
|
||||
In this guide we will describe how to scale out TensorFlow 2 programs using Orca in 4 simple steps.
|
||||
|
||||
---------------------------
|
||||
|
||||
- [**Orca Keras 2.3 Quickstart**](./orca-keras-quickstart.html)
|
||||
|
||||
> [Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/keras_lenet_mnist.ipynb) [View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/keras_lenet_mnist.ipynb)
|
||||
|
||||
In this guide we will describe how to scale out Keras 2.3 programs using Orca in 4 simple steps.
|
||||
|
||||
---------------------------
|
||||
|
||||
|
||||
- [**Orca PyTorch Quickstart**](./orca-pytorch-quickstart.html)
|
||||
|
||||
> [Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/pytorch_lenet_mnist.ipynb) [View source on GitHub](https://github.com/intel-analytics/BigDL/blob/main/python/orca/colab-notebook/quickstart/pytorch_lenet_mnist.ipynb)
|
||||
|
||||
In this guide we will describe how to scale out PyTorch programs using Orca in 4 simple steps.
|
||||
|
||||
---------------------------
|
||||
|
||||
- [**Orca RayOnSpark Quickstart**](./ray-quickstart.html)
|
||||
|
||||
> [Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/branch-2.0/python/orca/colab-notebook/quickstart/ray_parameter_server.ipynb) [View source on GitHub](https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/orca/colab-notebook/quickstart/ray_parameter_server.ipynb)
|
||||
|
||||
In this guide, we will describe how to use RayOnSpark to directly run Ray programs on Big Data clusters in 2 simple steps.
|
||||
|
||||
---------------------------
|
||||
|
||||
|
|
@ -33,7 +33,7 @@ elif cluster_mode == "yarn": # For Hadoop/YARN cluster
|
|||
sc = init_orca_context(cluster_mode="yarn", num_nodes=2, cores=2, memory="10g", driver_memory="10g", driver_cores=1, init_ray_on_spark=True)
|
||||
```
|
||||
|
||||
This is the only place where you need to specify local or distributed mode. See [here](./../../Ray/Overview/ray.md#initialize) for more RayOnSpark related arguments when you `init_orca_context`.
|
||||
This is the only place where you need to specify local or distributed mode. See [here](./../Overview/ray.md#initialize) for more RayOnSpark related arguments when you `init_orca_context`.
|
||||
|
||||
By default, the Ray cluster is launched using Spark barrier execution mode; you can turn it off via the configurations of `OrcaContext`:
|
||||
|
||||
|
|
@ -43,7 +43,7 @@ from bigdl.orca import OrcaContext
|
|||
OrcaContext.barrier_mode = False
|
||||
```
|
||||
|
||||
View [Orca Context](./../../Orca/Overview/orca-context.md) for more details.
|
||||
View [Orca Context](./../Overview/orca-context.md) for more details.
|
||||
|
||||
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details.
|
||||
|
||||
docs/readthedocs/source/doc/Orca/index.rst (new file)
|
|
@ -0,0 +1,63 @@
|
|||
BigDL-Orca
|
||||
=========================
|
||||
|
||||
Most AI projects start with a Python notebook running on a single laptop; however, one usually needs to go through a mountain of pains to scale it to handle larger data set in a distributed fashion. The **BigDL-Orca** (or **Orca** for short) library seamlessly scales out your single node Python notebook across large clusters (so as to process distributed Big Data).
|
||||
|
||||
|
||||
-------
|
||||
|
||||
|
||||
.. grid:: 1 2 2 2
|
||||
:gutter: 2
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Get Started**
|
||||
^^^
|
||||
|
||||
Documents in this section help you get started quickly with Orca.
|
||||
|
||||
+++
|
||||
:bdg-link:`Orca in 5 minutes <./Overview/orca.html>` |
|
||||
:bdg-link:`Installation <./Overview/install.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Key Features Guide**
|
||||
^^^
|
||||
|
||||
Each guide in this section provides you with in-depth information, concepts and knowledge about Orca key features.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`Data <./Overview/data-parallel-processing.html>` |
|
||||
:bdg-link:`Estimator <./Overview/distributed-training-inference.html>` |
|
||||
:bdg-link:`RayOnSpark <./Overview/ray.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**Tutorials**
|
||||
^^^
|
||||
|
||||
Orca Tutorials and Examples.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`Tutorials <./QuickStart/index.html>`
|
||||
|
||||
.. grid-item-card::
|
||||
|
||||
**API Document**
|
||||
^^^
|
||||
|
||||
API Document provides detailed description of Orca APIs.
|
||||
|
||||
+++
|
||||
|
||||
:bdg-link:`API Document <../PythonAPI/Orca/index.html>`
|
||||
|
||||
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
BigDL-Orca Document <self>
|
||||
|
|
@ -37,17 +37,18 @@ On `Subscribe` page, input your subscription, your Azure container registry, you
|
|||
|
||||
* Go to your Azure container registry, check `Repositories`, and find `intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene`
|
||||
* Login to the created VM. Then login to your Azure container registry, pull BigDL PPML image using this command:
|
||||
```bash
|
||||
docker pull myContainerRegistry/intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene
|
||||
```
|
||||
```bash
|
||||
docker pull myContainerRegistry/intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene
|
||||
```
|
||||
* Start container of this image
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
export LOCAL_IP=YOUR_LOCAL_IP
|
||||
export DOCKER_IMAGE=intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene
|
||||
```bash
|
||||
#!/bin/bash
|
||||
|
||||
sudo docker run -itd \
|
||||
export LOCAL_IP=YOUR_LOCAL_IP
|
||||
export DOCKER_IMAGE=intel_corporation/bigdl-ppml-trusted-big-data-ml-python-graphene
|
||||
|
||||
sudo docker run -itd \
|
||||
--privileged \
|
||||
--net=host \
|
||||
--cpuset-cpus="0-5" \
|
||||
|
|
@ -60,7 +61,8 @@ sudo docker run -itd \
|
|||
-e LOCAL_IP=$LOCAL_IP \
|
||||
-e SGX_MEM_SIZE=64G \
|
||||
$DOCKER_IMAGE bash
|
||||
```
|
||||
|
||||
```
|
||||
|
||||
### 2.3 Create AKS (Azure Kubernetes Service) or use an existing AKS
|
||||
First, login to your client VM and enter your BigDL PPML container:
|
||||
|
|
@ -89,34 +91,35 @@ You can check the information by running:
|
|||
/ppml/trusted-big-data-ml/azure/create-aks.sh --help
|
||||
```
|
||||
|
||||
## 2.4 Create Azure Data Lake Store Gen 2
|
||||
### 2.4.1 Create Data Lake Storage account or use an existing one.
|
||||
### 2.4 Create Azure Data Lake Store Gen 2
|
||||
#### 2.4.1 Create Data Lake Storage account or use an existing one.
|
||||
The example command to create Data Lake store is as below:
|
||||
```bash
|
||||
az dls account create --account myDataLakeAccount --location myLocation --resource-group myResourceGroup
|
||||
```
|
||||
* Create Container to put user data
|
||||
Example command to create container
|
||||
```bash
|
||||
az storage fs create -n myFS --account-name myDataLakeAccount --auth-mode login
|
||||
```
|
||||
* Create folder, upload file/folder
|
||||
Example command to create folder:
|
||||
```bash
|
||||
az storage fs directory create -n myDirectory -f myFS --account-name myDataLakeAccount --auth-mode login
|
||||
```
|
||||
|
||||
Example command to upload file
|
||||
```bash
|
||||
az storage fs file upload -s "path/to/file" -p myDirectory/file -f myFS --account-name myDataLakeAccount --auth-mode login
|
||||
```
|
||||
Example command to upload directory
|
||||
```bash
|
||||
az storage fs directory upload -f myFS --account-name myDataLakeAccount -s "path/to/directory" -d myDirectory --recursive
|
||||
```
|
||||
### 2.4.2 Access data in Hadoop through ABFS(Azure Blob Filesystem) driver
|
||||
Example command to create container
|
||||
```bash
|
||||
az storage fs create -n myFS --account-name myDataLakeAccount --auth-mode login
|
||||
```
|
||||
* Create folder, upload file/folder
|
||||
|
||||
Example command to create folder
|
||||
```bash
|
||||
az storage fs directory create -n myDirectory -f myFS --account-name myDataLakeAccount --auth-mode login
|
||||
```
|
||||
Example command to upload file
|
||||
```bash
|
||||
az storage fs file upload -s "path/to/file" -p myDirectory/file -f myFS --account-name myDataLakeAccount --auth-mode login
|
||||
```
|
||||
Example command to upload directory
|
||||
```bash
|
||||
az storage fs directory upload -f myFS --account-name myDataLakeAccount -s "path/to/directory" -d myDirectory --recursive
|
||||
```
|
||||
#### 2.4.2 Access data in Hadoop through ABFS(Azure Blob Filesystem) driver
|
||||
You can access Data Lake Storage through the Hadoop filesystem using a URI of the form `abfs[s]://file_system@account_name.dfs.core.windows.net/<path>/<path>/<file_name>`.
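For example, a file uploaded to `myDirectory` above could be read in Spark roughly as follows. This is only a sketch: it assumes an existing `SparkSession` and that ABFS authentication (see the next subsection) has already been configured.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# hypothetical path built from the objects created above:
# container myFS, storage account myDataLakeAccount, folder myDirectory
path = "abfss://myFS@myDataLakeAccount.dfs.core.windows.net/myDirectory/file"
df = spark.read.option("header", "true").csv(path)
```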
|
||||
#### Authentication
|
||||
##### Authentication
|
||||
The ABFS driver supports two forms of authentication so that the Hadoop application may securely access resources contained within a Data Lake Storage Gen2 capable account.
|
||||
- Shared Key: This permits users to access ALL resources in the account. The key is encrypted and stored in the Hadoop configuration.
|
||||
|
||||
|
|
@ -124,13 +127,13 @@ The ABFS driver supports two forms of authentication so that the Hadoop applicat
|
|||
|
||||
By default, in our solution, we use shared key authentication.
|
||||
- Get Access key list of the storage account:
|
||||
```bash
|
||||
az storage account keys list -g MyResourceGroup -n myDataLakeAccount
|
||||
```
|
||||
```bash
|
||||
az storage account keys list -g MyResourceGroup -n myDataLakeAccount
|
||||
```
|
||||
Use one of the keys for authentication.
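As a sketch, the key can then be supplied to Spark through the standard Hadoop ABFS property; the property name below follows the generic `hadoop-azure` convention rather than anything BigDL-specific, so adapt it to however your job is actually configured.

```python
from pyspark import SparkConf

# myDataLakeAccount is the storage account created earlier;
# replace <access-key> with one of the keys listed by the command above
conf = SparkConf().set(
    "spark.hadoop.fs.azure.account.key.myDataLakeAccount.dfs.core.windows.net",
    "<access-key>")
```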
|
||||
|
||||
## 2.5 Create Azure Key Vault
|
||||
### 2.5.1 Create or use an existing Azure Key Vault
|
||||
### 2.5 Create Azure Key Vault
|
||||
#### 2.5.1 Create or use an existing Azure Key Vault
|
||||
Example command to create key vault
|
||||
```bash
|
||||
az keyvault create -n myKeyVault -g myResourceGroup -l location
|
||||
|
|
@ -142,29 +145,30 @@ Take note of the following properties for use in the next section:
|
|||
* The name of your Azure key vault resource
|
||||
* The Azure tenant ID that the subscription belongs to
|
||||
|
||||
### 2.5.2 Set access policy for the client VM
|
||||
#### 2.5.2 Set access policy for the client VM
|
||||
* Run such command to get the system identity:
|
||||
```bash
|
||||
az vm identity assign -g myResourceGroup -n myVM
|
||||
```
|
||||
The output would be like this:
|
||||
```bash
|
||||
{
|
||||
```bash
|
||||
az vm identity assign -g myResourceGroup -n myVM
|
||||
```
|
||||
The output would be like this:
|
||||
```bash
|
||||
{
|
||||
"systemAssignedIdentity": "ff5505d6-8f72-4b99-af68-baff0fbd20f5",
|
||||
"userAssignedIdentities": {}
|
||||
}
|
||||
```
|
||||
Take note of the systemAssignedIdentity of the client VM.
|
||||
}
|
||||
```
|
||||
Take note of the systemAssignedIdentity of the client VM.
|
||||
|
||||
* Set access policy for client VM
|
||||
Example command:
|
||||
```bash
|
||||
az keyvault set-policy --name myKeyVault --object-id <mySystemAssignedIdentity> --secret-permissions all --key-permissions all --certificate-permissions all
|
||||
```
|
||||
|
||||
### 2.5.3 AKS access Key Vault
|
||||
#### 2.5.3.1 Set access for AKS VM ScaleSet
|
||||
##### a. Find your VM ScaleSet in your AKS, and assign system managed identity to VM ScaleSet.
|
||||
Example command:
|
||||
```bash
|
||||
az keyvault set-policy --name myKeyVault --object-id <mySystemAssignedIdentity> --secret-permissions all --key-permissions all --certificate-permissions all
|
||||
```
|
||||
|
||||
#### 2.5.3 AKS access Key Vault
|
||||
##### 2.5.3.1 Set access for AKS VM ScaleSet
|
||||
###### a. Find your VM ScaleSet in your AKS, and assign system managed identity to VM ScaleSet.
|
||||
```bash
|
||||
az vm identity assign -g myResourceGroup -n myAKSVMSS
|
||||
```
|
||||
|
|
@ -179,50 +183,53 @@ userAssignedIdentities:
|
|||
principalId: xxxxx
|
||||
```
|
||||
Take note of principalId of the first line as System Managed Identity of your VMSS.
|
||||
##### b. Set access policy for AKS VM ScaleSet
|
||||
###### b. Set access policy for AKS VM ScaleSet
|
||||
Example command:
|
||||
```bash
|
||||
az keyvault set-policy --name myKeyVault --object-id <systemManagedIdentityOfVMSS> --secret-permissions get --key-permissions all
|
||||
```
|
||||
#### 2.5.3.2 Set access for AKS
|
||||
##### a. Enable Azure Key Vault Provider for Secrets Store CSI Driver support
|
||||
##### 2.5.3.2 Set access for AKS
|
||||
###### a. Enable Azure Key Vault Provider for Secrets Store CSI Driver support
|
||||
Example command:
|
||||
```bash
|
||||
az aks enable-addons --addons azure-keyvault-secrets-provider --name myAKSCluster --resource-group myResourceGroup
|
||||
```
|
||||
* Verify the Azure Key Vault Provider for Secrets Store CSI Driver installation
|
||||
Example command:
|
||||
```bash
|
||||
kubectl get pods -n kube-system -l 'app in (secrets-store-csi-driver, secrets-store-provider-azure)'
|
||||
```
|
||||
Be sure that a Secrets Store CSI Driver pod and an Azure Key Vault Provider pod are running on each node in your cluster's node pools.
|
||||
|
||||
Example command:
|
||||
```bash
|
||||
kubectl get pods -n kube-system -l 'app in (secrets-store-csi-driver, secrets-store-provider-azure)'
|
||||
```
|
||||
Be sure that a Secrets Store CSI Driver pod and an Azure Key Vault Provider pod are running on each node in your cluster's node pools.
|
||||
* Enable the Azure Key Vault Provider for Secrets Store CSI Driver to keep track of secret updates in the key vault
|
||||
```bash
|
||||
az aks update -g myResourceGroup -n myAKSCluster --enable-secret-rotation
|
||||
```
|
||||
#### b. Provide an identity to access the Azure Key Vault
|
||||
```bash
|
||||
az aks update -g myResourceGroup -n myAKSCluster --enable-secret-rotation
|
||||
```
|
||||
###### b. Provide an identity to access the Azure Key Vault
|
||||
There are several ways to provide identity for Azure Key Vault Provider for Secrets Store CSI Driver to access Azure Key Vault: `An Azure Active Directory pod identity`, `user-assigned identity` or `system-assigned managed identity`. In our solution, we use user-assigned managed identity.
|
||||
* Enable managed identity in AKS
|
||||
```bash
|
||||
az aks update -g myResourceGroup -n myAKSCluster --enable-managed-identity
|
||||
```
|
||||
```bash
|
||||
az aks update -g myResourceGroup -n myAKSCluster --enable-managed-identity
|
||||
```
|
||||
* Get user-assigned managed identity that you created when you enabled a managed identity on your AKS cluster
|
||||
Run:
|
||||
```bash
|
||||
az aks show -g myResourceGroup -n myAKSCluster --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId -o tsv
|
||||
```
|
||||
The output would be like:
|
||||
```bash
|
||||
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
|
||||
```
|
||||
Take note of this output as your user-assigned managed identity of Azure KeyVault Secrets Provider
|
||||
|
||||
Run:
|
||||
```bash
|
||||
az aks show -g myResourceGroup -n myAKSCluster --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId -o tsv
|
||||
```
|
||||
The output would be like:
|
||||
```bash
|
||||
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
|
||||
```
|
||||
Take note of this output as your user-assigned managed identity of Azure KeyVault Secrets Provider
|
||||
* Grant your user-assigned managed identity permissions that enable it to read your key vault and view its contents
|
||||
Example command:
|
||||
```bash
|
||||
az keyvault set-policy -n myKeyVault --key-permissions get --spn xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
|
||||
az keyvault set-policy -n myKeyVault --secret-permissions get --spn xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
|
||||
```
|
||||
#### c. Create a SecretProviderClass to access your Key Vault
|
||||
|
||||
Example command:
|
||||
```bash
|
||||
az keyvault set-policy -n myKeyVault --key-permissions get --spn xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
|
||||
az keyvault set-policy -n myKeyVault --secret-permissions get --spn xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
|
||||
```
|
||||
###### c. Create a SecretProviderClass to access your Key Vault
|
||||
On your client docker container, edit `/ppml/trusted-big-data-ml/azure/secretProviderClass.yaml` file, modify `<client-id>` to your user-assigned managed identity of Azure KeyVault Secrets Provider, and modify `<key-vault-name>` and `<tenant-id>` to your real key vault name and tenant id.
|
||||
|
||||
Then run:
|
||||
|
|
|
|||
docs/readthedocs/source/doc/PPML/Overview/examples.rst (new file)
|
|
@ -0,0 +1,8 @@
|
|||
Tutorials & Examples
|
||||
=====================================
|
||||
|
||||
* `A Hello World Example <../Overview/quicktour.html>`__ is a very simple example for getting started.
|
||||
|
||||
* `PPML e2e Example <../QuickStart/end-to-end.html>`__ introduces the end-to-end PPML workflow using SimpleQuery as an example.
|
||||
|
||||
* You can also find Trusted Data Analysis, Trusted ML, Trusted DL and Trusted FL examples in `more examples <https://github.com/intel-analytics/BigDL/tree/main/ppml/docs/examples.md>`__.
|
||||
docs/readthedocs/source/doc/PPML/Overview/intro.md (new file)
|
|
@ -0,0 +1,35 @@
|
|||
# PPML Introduction
|
||||
|
||||
## 1. What is BigDL PPML?
|
||||
|
||||
<video src="https://user-images.githubusercontent.com/61072813/184758908-da01f8ea-8f52-4300-9736-8c5ee981d4c0.mp4" width="100%" controls></video>
|
||||
|
||||
---
|
||||
|
||||
Protecting data privacy and confidentiality is critical in a world where data is everywhere. In recent years, more and more countries have enacted data privacy legislation or are expected to pass comprehensive legislation to protect data privacy; the importance of privacy and data protection is increasingly recognized.
|
||||
|
||||
To better protect sensitive data, it's necessary to ensure security for all dimensions of data lifecycle: data at rest, data in transit, and data in use. Data being transferred on a network is `in transit`, data in storage is `at rest`, and data being processed is `in use`.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://user-images.githubusercontent.com/61072813/177720405-60297d62-d186-4633-8b5f-ff4876cc96d6.png" alt="data lifecycle" width='390px' height='260px'/>
|
||||
</p>
|
||||
|
||||
To protect data in transit, enterprises often choose to encrypt sensitive data prior to moving it, or to use encrypted connections (HTTPS, SSL, TLS, FTPS, etc.) to protect the contents of data in transit. To protect data at rest, enterprises can simply encrypt sensitive files prior to storing them or choose to encrypt the storage drive itself. However, the third state, data in use, has always been a weakly protected target. There are three emerging solutions that seek to reduce the data-in-use attack surface: homomorphic encryption, multi-party computation, and confidential computing.
|
||||
|
||||
Among these security technologies, [Confidential computing](https://www.intel.com/content/www/us/en/security/confidential-computing.html) protects data in use by performing computation in a hardware-based [Trusted Execution Environment (TEE)](https://en.wikipedia.org/wiki/Trusted_execution_environment). [Intel® SGX](https://www.intel.com/content/www/us/en/developer/tools/software-guard-extensions/overview.html) is Intel's Trusted Execution Environment (TEE), offering hardware-based memory encryption that isolates specific application code and data in memory. [Intel® TDX](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html) is Intel's next-generation Trusted Execution Environment (TEE), introducing new architectural elements to help deploy hardware-isolated virtual machines (VMs) called trust domains (TDs).
|
||||
|
||||
[PPML](https://bigdl.readthedocs.io/en/latest/doc/PPML/Overview/ppml.html) (Privacy Preserving Machine Learning) in [BigDL 2.0](https://github.com/intel-analytics/BigDL) provides a Trusted Cluster Environment for secure Big Data & AI applications, even in an untrusted cloud environment. By combining Intel Software Guard Extensions (SGX) with several other security technologies (e.g., attestation, key management service, private set intersection, federated learning, homomorphic encryption, etc.), BigDL PPML ensures end-to-end security for entire distributed workflows, such as Apache Spark, Apache Flink, XGBoost, TensorFlow, PyTorch, etc.
|
||||
|
||||
|
||||
## 2. Why BigDL PPML?
|
||||
PPML allows organizations to explore powerful AI techniques while working to minimize the security risks associated with handling large amounts of sensitive data. PPML protects data at rest, in transit and in use: compute and memory protected by SGX Enclaves, storage (e.g., data and model) protected by encryption, network communication protected by remote attestation and Transport Layer Security (TLS), and optional Federated Learning support.
|
||||
|
||||
<p align="left">
|
||||
<img src="https://user-images.githubusercontent.com/61072813/177922914-f670111c-e174-40d2-b95a-aafe92485024.png" alt="data lifecycle" width='600px' />
|
||||
</p>
|
||||
|
||||
With BigDL PPML, you can run trusted Big Data & AI applications
|
||||
- **Trusted Spark SQL & Dataframe**: with the trusted Big Data analytics and ML/DL support, users can run standard Spark data analysis (such as Spark SQL, Dataframe, MLlib, etc.) in a secure and trusted fashion.
|
||||
- **Trusted ML (Machine Learning)**: with the trusted Big Data analytics and ML/DL support, users can run distributed machine learning (such as MLlib, XGBoost) in a secure and trusted fashion.
|
||||
- **Trusted DL (Deep Learning)**: with the trusted Big Data analytics and ML/DL support, users can run distributed deep learning (such as BigDL, Orca, Nano, DLlib) in a secure and trusted fashion.
|
||||
- **Trusted FL (Federated Learning)**: with PSI (Private Set Intersection), Secured Aggregation and trusted federated learning support, users can build a unified model across different parties without compromising privacy, even if these parties have different datasets or features.
|
||||
docs/readthedocs/source/doc/PPML/Overview/misc.rst (new file)
|
|
@ -0,0 +1,14 @@
|
|||
Advanced Topic
|
||||
====================
|
||||
|
||||
|
||||
* `Privacy Preserving Machine Learning (PPML) User Guide <ppml.html>`_
|
||||
* `Trusted Big Data Analytics and ML <trusted_big_data_analytics_and_ml.html>`_
|
||||
* `Trusted FL (Federated Learning) <trusted_fl.html>`_
|
||||
* `Secure Your Services <../QuickStart/secure_your_services.html>`_
|
||||
* `Building Linux Kernel from Source with SGX Enabled <../QuickStart/build_kernel_with_sgx.html>`_
|
||||
* `Deploy the Intel SGX Device Plugin for Kubernetes <../QuickStart/deploy_intel_sgx_device_plugin_for_kubernetes.html>`_
|
||||
* `Trusted Cluster Serving with Graphene on Kubernetes <../QuickStart/trusted-serving-on-k8s-guide.html>`_
|
||||
* `TPC-H with Trusted SparkSQL on Kubernetes <../QuickStart/tpc-h_with_sparksql_on_k8s.html>`_
|
||||
* `TPC-DS with Trusted SparkSQL on Kubernetes <../QuickStart/tpc-ds_with_sparksql_on_k8s.html>`_
|
||||
* `Privacy Preserving Machine Learning (PPML) on Azure User Guide <azure_ppml.html>`_
|
||||
|
|
@ -230,29 +230,30 @@ Follow the guide below to run Spark on Kubernetes manually. Alternatively, you c
|
|||
|
||||
1. Enter `BigDL/ppml/trusted-big-data-ml/python/docker-graphene` dir. Refer to the previous section about [preparing data, keys and passwords](#2221-start-ppml-container). Then run the following commands to generate your enclave key and add it to your Kubernetes cluster as a secret.
|
||||
|
||||
```bash
|
||||
kubectl apply -f keys/keys.yaml
|
||||
kubectl apply -f password/password.yaml
|
||||
cd kubernetes
|
||||
bash enclave-key-to-secret.sh
|
||||
```
|
||||
```bash
|
||||
kubectl apply -f keys/keys.yaml
|
||||
kubectl apply -f password/password.yaml
|
||||
cd kubernetes
|
||||
bash enclave-key-to-secret.sh
|
||||
```
|
||||
2. Create the [RBAC(Role-based access control)](https://spark.apache.org/docs/latest/running-on-kubernetes.html#rbac) :
|
||||
|
||||
```bash
|
||||
kubectl create serviceaccount spark
|
||||
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
|
||||
```
|
||||
```bash
|
||||
kubectl create serviceaccount spark
|
||||
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
|
||||
```
|
||||
|
||||
3. Generate K8s config file, modify `YOUR_DIR` to the location you want to store the config:
|
||||
|
||||
```bash
|
||||
kubectl config view --flatten --minify > /YOUR_DIR/kubeconfig
|
||||
```
|
||||
```bash
|
||||
kubectl config view --flatten --minify > /YOUR_DIR/kubeconfig
|
||||
```
|
||||
|
||||
4. Create K8s secret, the secret created `YOUR_SECRET` should be the same as the password you specified in step 1:
|
||||
|
||||
```bash
|
||||
kubectl create secret generic spark-secret --from-literal secret=YOUR_SECRET
|
||||
```
|
||||
```bash
|
||||
kubectl create secret generic spark-secret --from-literal secret=YOUR_SECRET
|
||||
```
|
||||
|
||||
##### 2.2.3.2 Start the client container
|
||||
|
||||
|
|
@ -309,21 +310,21 @@ sudo docker run -itd \
|
|||
|
||||
1. Run `docker exec -it spark-local-k8s-client bash` to enter the container. Then run the following command to init the Spark local K8s client.
|
||||
|
||||
```bash
|
||||
./init.sh
|
||||
```
|
||||
```bash
|
||||
./init.sh
|
||||
```
|
||||
|
||||
2. We assume you have a working Network File System (NFS) configured for your Kubernetes cluster. Configure the `nfsvolumeclaim` on the last line to the name of the Persistent Volume Claim (PVC) of your NFS. Please prepare the following and put them in your NFS directory:
|
||||
|
||||
- The data (in a directory called `data`)
|
||||
- The kubeconfig file.
|
||||
- The data (in a directory called `data`)
|
||||
- The kubeconfig file.
|
||||
|
||||
3. Run the following command to start Spark-Pi example. When the application runs in `cluster` mode, you can run ` kubectl get pod ` to get the name and status of your K8s pod(e.g., driver-xxxx). Then you can run ` kubectl logs -f driver-xxxx ` to get the output of your application.
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
secure_password=`openssl rsautl -inkey /ppml/trusted-big-data-ml/work/password/key.txt -decrypt </ppml/trusted-big-data-ml/work/password/output.bin` && \
|
||||
export TF_MKL_ALLOC_MAX_BYTES=10737418240 && \
|
||||
```bash
|
||||
#!/bin/bash
|
||||
secure_password=`openssl rsautl -inkey /ppml/trusted-big-data-ml/work/password/key.txt -decrypt </ppml/trusted-big-data-ml/work/password/output.bin` && \
|
||||
export TF_MKL_ALLOC_MAX_BYTES=10737418240 && \
|
||||
export SPARK_LOCAL_IP=$LOCAL_IP && \
|
||||
/opt/jdk8/bin/java \
|
||||
-cp '/ppml/trusted-big-data-ml/work/spark-3.1.2/conf/:/ppml/trusted-big-data-ml/work/spark-3.1.2/jars/*' \
|
||||
|
|
@ -377,7 +378,7 @@ export TF_MKL_ALLOC_MAX_BYTES=10737418240 && \
|
|||
--class org.apache.spark.examples.SparkPi \
|
||||
--verbose \
|
||||
local:///ppml/trusted-big-data-ml/work/spark-3.1.2/examples/jars/spark-examples_2.12-3.1.2.jar 100 2>&1 | tee spark-pi-sgx-$SPARK_MODE.log
|
||||
```
|
||||
```
|
||||
|
||||
You can run your own Spark application after changing `--class` and jar path.
|
||||
|
||||
|
|
|
|||
docs/readthedocs/source/doc/PPML/Overview/quicktour.md (new file)
|
|
@ -0,0 +1,92 @@
|
|||
# A Hello World Example
|
||||
|
||||
|
||||
In this section, you can get started by running a simple native Python HelloWorld program and a simple native Spark Pi program locally in a BigDL PPML client container, to get an initial understanding of how to use PPML.
|
||||
|
||||
|
||||
|
||||
## a. Prepare Keys
|
||||
|
||||
* generate ssl_key
|
||||
|
||||
Download scripts from [here](https://github.com/intel-analytics/BigDL).
|
||||
|
||||
```
|
||||
cd BigDL/ppml/
|
||||
sudo bash scripts/generate-keys.sh
|
||||
```
|
||||
This script will generate keys under the `keys/` folder.
|
||||
|
||||
* generate enclave-key.pem
|
||||
|
||||
```
|
||||
openssl genrsa -3 -out enclave-key.pem 3072
|
||||
```
|
||||
This command generates a file `enclave-key.pem`, which is used to sign the image.
|
||||
|
||||
|
||||
## b. Start the BigDL PPML client container
|
||||
|
||||
```
|
||||
#!/bin/bash
|
||||
|
||||
# ENCLAVE_KEY_PATH means the absolute path to the "enclave-key.pem" in step a
|
||||
# KEYS_PATH means the absolute path to the keys folder in step a
|
||||
# LOCAL_IP means your local IP address.
|
||||
export ENCLAVE_KEY_PATH=YOUR_LOCAL_ENCLAVE_KEY_PATH
|
||||
export KEYS_PATH=YOUR_LOCAL_KEYS_PATH
|
||||
export LOCAL_IP=YOUR_LOCAL_IP
|
||||
export DOCKER_IMAGE=intelanalytics/bigdl-ppml-trusted-big-data-ml-python-graphene:devel
|
||||
|
||||
sudo docker pull $DOCKER_IMAGE
|
||||
|
||||
sudo docker run -itd \
|
||||
--privileged \
|
||||
--net=host \
|
||||
--cpuset-cpus="0-5" \
|
||||
--oom-kill-disable \
|
||||
--device=/dev/gsgx \
|
||||
--device=/dev/sgx/enclave \
|
||||
--device=/dev/sgx/provision \
|
||||
-v $ENCLAVE_KEY_PATH:/graphene/Pal/src/host/Linux-SGX/signer/enclave-key.pem \
|
||||
-v /var/run/aesmd/aesm.socket:/var/run/aesmd/aesm.socket \
|
||||
-v $KEYS_PATH:/ppml/trusted-big-data-ml/work/keys \
|
||||
--name=bigdl-ppml-client-local \
|
||||
-e LOCAL_IP=$LOCAL_IP \
|
||||
-e SGX_MEM_SIZE=64G \
|
||||
$DOCKER_IMAGE bash
|
||||
```
|
||||
|
||||
## c. Run Python HelloWorld in BigDL PPML Client Container
|
||||
|
||||
Run the [script](https://github.com/intel-analytics/BigDL/blob/main/ppml/trusted-big-data-ml/python/docker-graphene/start-scripts/start-python-helloworld-sgx.sh) to run trusted [Python HelloWorld](https://github.com/intel-analytics/BigDL/blob/main/ppml/trusted-big-data-ml/python/docker-graphene/examples/helloworld.py) in BigDL PPML client container:
|
||||
```
|
||||
sudo docker exec -it bigdl-ppml-client-local bash work/start-scripts/start-python-helloworld-sgx.sh
|
||||
```
|
||||
Check the log:
|
||||
```
|
||||
sudo docker exec -it bigdl-ppml-client-local cat /ppml/trusted-big-data-ml/test-helloworld-sgx.log | egrep "Hello World"
|
||||
```
|
||||
The result should look something like this:
|
||||
> Hello World
|
||||
|
||||
|
||||
## d. Run Spark Pi in BigDL PPML Client Container
|
||||
|
||||
Run the [script](https://github.com/intel-analytics/BigDL/blob/main/ppml/trusted-big-data-ml/python/docker-graphene/start-scripts/start-spark-local-pi-sgx.sh) to run trusted [Spark Pi](https://github.com/apache/spark/blob/v3.1.2/examples/src/main/python/pi.py) in BigDL PPML client container:
|
||||
|
||||
```bash
|
||||
sudo docker exec -it bigdl-ppml-client-local bash work/start-scripts/start-spark-local-pi-sgx.sh
|
||||
```
|
||||
|
||||
Check the log:
|
||||
|
||||
```bash
|
||||
sudo docker exec -it bigdl-ppml-client-local cat /ppml/trusted-big-data-ml/test-pi-sgx.log | egrep "roughly"
|
||||
```
|
||||
|
||||
The result should look something like this:
|
||||
|
||||
> Pi is roughly 3.146760
|
||||
|
||||
<br />
|
||||
docs/readthedocs/source/doc/PPML/Overview/userguide.md (new file)
|
|
@ -0,0 +1,504 @@
|
|||
## Develop your own Big Data & AI applications with BigDL PPML
|
||||
|
||||
First you need to create a `PPMLContext`, which wraps `SparkSession` and provides methods to read encrypted data file into plain-text RDD/DataFrame and write DataFrame to encrypted data file. Then you can read & write data through `PPMLContext`.
|
||||
|
||||
If you are familiar with Spark, you may find that the usage of `PPMLContext` is very similar to that of Spark.
|
||||
|
||||
### 1. Create PPMLContext
|
||||
|
||||
- create a PPMLContext with `appName`
|
||||
|
||||
This is the simplest way to create a `PPMLContext`; use it when you don't need to read or write encrypted files.
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
|
||||
val sc = PPMLContext.initPPMLContext("MyApp")
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
sc = PPMLContext("MyApp")
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
If you want to read/write encrypted files, then you need to provide more information.
|
||||
|
||||
- create a PPMLContext with `appName` & `ppmlArgs`
|
||||
|
||||
`ppmlArgs` is a Map of PPML arguments; its contents vary according to the kind of Key Management Service (KMS) you are using. A KMS is used to generate the `primaryKey` and `dataKey` that encrypt/decrypt data. We provide three types of KMS: SimpleKeyManagementService, EHSMKeyManagementService, and AzureKeyManagementService.
|
||||
|
||||
Refer to [KMS Utils](https://github.com/intel-analytics/BigDL/blob/main/ppml/services/kms-utils/docker/README.md) to use KMS to generate `primaryKey` and `dataKey`, then you are ready to create **PPMLContext** with `ppmlArgs`.
|
||||
|
||||
- For `SimpleKeyManagementService`:
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
|
||||
val ppmlArgs: Map[String, String] = Map(
|
||||
"spark.bigdl.kms.type" -> "SimpleKeyManagementService",
|
||||
"spark.bigdl.kms.simple.id" -> "your_app_id",
|
||||
"spark.bigdl.kms.simple.key" -> "your_app_key",
|
||||
"spark.bigdl.kms.key.primary" -> "/your/primary/key/path/primaryKey",
|
||||
"spark.bigdl.kms.key.data" -> "/your/data/key/path/dataKey"
|
||||
)
|
||||
|
||||
val sc = PPMLContext.initPPMLContext("MyApp", ppmlArgs)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
ppml_args = {"kms_type": "SimpleKeyManagementService",
|
||||
"simple_app_id": "your_app_id",
|
||||
"simple_app_key": "your_app_key",
|
||||
"primary_key_path": "/your/primary/key/path/primaryKey",
|
||||
"data_key_path": "/your/data/key/path/dataKey"
|
||||
}
|
||||
|
||||
sc = PPMLContext("MyApp", ppml_args)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
- For `EHSMKeyManagementService`:
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
|
||||
val ppmlArgs: Map[String, String] = Map(
|
||||
"spark.bigdl.kms.type" -> "EHSMKeyManagementService",
|
||||
"spark.bigdl.kms.ehs.ip" -> "your_server_ip",
|
||||
"spark.bigdl.kms.ehs.port" -> "your_server_port",
|
||||
"spark.bigdl.kms.ehs.id" -> "your_app_id",
|
||||
"spark.bigdl.kms.ehs.key" -> "your_app_key",
|
||||
"spark.bigdl.kms.key.primary" -> "/your/primary/key/path/primaryKey",
|
||||
"spark.bigdl.kms.key.data" -> "/your/data/key/path/dataKey"
|
||||
)
|
||||
|
||||
val sc = PPMLContext.initPPMLContext("MyApp", ppmlArgs)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
ppml_args = {"kms_type": "EHSMKeyManagementService",
|
||||
"kms_server_ip": "your_server_ip",
|
||||
"kms_server_port": "your_server_port"
|
||||
"ehsm_app_id": "your_app_id",
|
||||
"ehsm_app_key": "your_app_key",
|
||||
"primary_key_path": "/your/primary/key/path/primaryKey",
|
||||
"data_key_path": "/your/data/key/path/dataKey"
|
||||
}
|
||||
|
||||
sc = PPMLContext("MyApp", ppml_args)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
- For `AzureKeyManagementService`
|
||||
|
||||
|
||||
The parameter `clientId` is optional; you don't have to provide it.
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
|
||||
val ppmlArgs: Map[String, String] = Map(
|
||||
"spark.bigdl.kms.type" -> "AzureKeyManagementService",
|
||||
"spark.bigdl.kms.azure.vault" -> "key_vault_name",
|
||||
"spark.bigdl.kms.azure.clientId" -> "client_id",
|
||||
"spark.bigdl.kms.key.primary" -> "/your/primary/key/path/primaryKey",
|
||||
"spark.bigdl.kms.key.data" -> "/your/data/key/path/dataKey"
|
||||
)
|
||||
|
||||
val sc = PPMLContext.initPPMLContext("MyApp", ppmlArgs)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
ppml_args = {"kms_type": "AzureKeyManagementService",
|
||||
"azure_vault": "your_azure_vault",
|
||||
"azure_client_id": "your_azure_client_id",
|
||||
"primary_key_path": "/your/primary/key/path/primaryKey",
|
||||
"data_key_path": "/your/data/key/path/dataKey"
|
||||
}
|
||||
|
||||
sc = PPMLContext("MyApp", ppml_args)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
- create a PPMLContext with `sparkConf` & `appName` & `ppmlArgs`
|
||||
|
||||
If you need to set Spark configurations, you can pass a `SparkConf` with those configurations when creating a `PPMLContext`.
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
import org.apache.spark.SparkConf
|
||||
|
||||
val ppmlArgs: Map[String, String] = Map(
|
||||
"spark.bigdl.kms.type" -> "SimpleKeyManagementService",
|
||||
"spark.bigdl.kms.simple.id" -> "your_app_id",
|
||||
"spark.bigdl.kms.simple.key" -> "your_app_key",
|
||||
"spark.bigdl.kms.key.primary" -> "/your/primary/key/path/primaryKey",
|
||||
"spark.bigdl.kms.key.data" -> "/your/data/key/path/dataKey"
|
||||
)
|
||||
|
||||
val conf: SparkConf = new SparkConf().setMaster("local[4]")
|
||||
|
||||
val sc = PPMLContext.initPPMLContext(conf, "MyApp", ppmlArgs)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
from bigdl.ppml.ppml_context import *
|
||||
from pyspark import SparkConf
|
||||
|
||||
ppml_args = {"kms_type": "SimpleKeyManagementService",
|
||||
"simple_app_id": "your_app_id",
|
||||
"simple_app_key": "your_app_key",
|
||||
"primary_key_path": "/your/primary/key/path/primaryKey",
|
||||
"data_key_path": "/your/data/key/path/dataKey"
|
||||
}
|
||||
|
||||
conf = SparkConf()
|
||||
conf.setMaster("local[4]")
|
||||
|
||||
sc = PPMLContext("MyApp", ppml_args, conf)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
### 2. Read and Write Files
|
||||
|
||||
To read/write data, you should set the `CryptoMode`:
|
||||
|
||||
- `plain_text`: no encryption
|
||||
- `AES/CBC/PKCS5Padding`: for CSV, JSON and text file
|
||||
- `AES_GCM_V1`: for PARQUET only
|
||||
- `AES_GCM_CTR_V1`: for PARQUET only
|
||||
|
||||
To write data, you should set the `write` mode:
|
||||
|
||||
- `overwrite`: Overwrite existing data with the content of dataframe.
|
||||
- `append`: Append content of the dataframe to existing data or table.
|
||||
- `ignore`: Silently ignore the current write operation (without any error) if the data or table already exists.
|
||||
- `error`: Throw an exception if data or table already exists.
|
||||
- `errorifexists`: Throw an exception if data or table already exists.
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.crypto.{AES_CBC_PKCS5PADDING, PLAIN_TEXT}
|
||||
|
||||
// read data
|
||||
val df = sc.read(cryptoMode = PLAIN_TEXT)
|
||||
...
|
||||
|
||||
// write data
|
||||
sc.write(dataFrame = df, cryptoMode = AES_CBC_PKCS5PADDING)
|
||||
.mode("overwrite")
|
||||
...
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
# read data
|
||||
df = sc.read(crypto_mode = CryptoMode.PLAIN_TEXT)
|
||||
...
|
||||
|
||||
# write data
|
||||
sc.write(dataframe = df, crypto_mode = CryptoMode.AES_CBC_PKCS5PADDING)
|
||||
.mode("overwrite")
|
||||
...
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details><summary>expand to see the examples of reading/writing CSV, PARQUET, JSON and text file</summary>
|
||||
|
||||
The following examples use `sc` to represent an initialized `PPMLContext`.
|
||||
|
||||
**read/write CSV file**
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
import com.intel.analytics.bigdl.ppml.crypto.{AES_CBC_PKCS5PADDING, PLAIN_TEXT}
|
||||
|
||||
// read a plain csv file and return a DataFrame
|
||||
val plainCsvPath = "/plain/csv/path"
|
||||
val df1 = sc.read(cryptoMode = PLAIN_TEXT).option("header", "true").csv(plainCsvPath)
|
||||
|
||||
// write a DataFrame as a plain csv file
|
||||
val plainOutputPath = "/plain/output/path"
|
||||
sc.write(df1, PLAIN_TEXT)
|
||||
.mode("overwrite")
|
||||
.option("header", "true")
|
||||
.csv(plainOutputPath)
|
||||
|
||||
// read an encrypted csv file and return a DataFrame
|
||||
val encryptedCsvPath = "/encrypted/csv/path"
|
||||
val df2 = sc.read(cryptoMode = AES_CBC_PKCS5PADDING).option("header", "true").csv(encryptedCsvPath)
|
||||
|
||||
// write a DataFrame as an encrypted csv file
|
||||
val encryptedOutputPath = "/encrypted/output/path"
|
||||
sc.write(df2, AES_CBC_PKCS5PADDING)
|
||||
.mode("overwrite")
|
||||
.option("header", "true")
|
||||
.csv(encryptedOutputPath)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
# import
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
# read a plain csv file and return a DataFrame
|
||||
plain_csv_path = "/plain/csv/path"
|
||||
df1 = sc.read(CryptoMode.PLAIN_TEXT).option("header", "true").csv(plain_csv_path)
|
||||
|
||||
# write a DataFrame as a plain csv file
|
||||
plain_output_path = "/plain/output/path"
|
||||
sc.write(df1, CryptoMode.PLAIN_TEXT)
|
||||
.mode('overwrite')
|
||||
.option("header", True)
|
||||
.csv(plain_output_path)
|
||||
|
||||
# read an encrypted csv file and return a DataFrame
|
||||
encrypted_csv_path = "/encrypted/csv/path"
|
||||
df2 = sc.read(CryptoMode.AES_CBC_PKCS5PADDING).option("header", "true").csv(encrypted_csv_path)
|
||||
|
||||
# write a DataFrame as an encrypted csv file
|
||||
encrypted_output_path = "/encrypted/output/path"
|
||||
sc.write(df2, CryptoMode.AES_CBC_PKCS5PADDING)
|
||||
.mode('overwrite')
|
||||
.option("header", True)
|
||||
.csv(encrypted_output_path)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
**read/write PARQUET file**
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
import com.intel.analytics.bigdl.ppml.crypto.{AES_GCM_CTR_V1, PLAIN_TEXT}
|
||||
|
||||
// read a plain parquet file and return a DataFrame
|
||||
val plainParquetPath = "/plain/parquet/path"
|
||||
val df1 = sc.read(PLAIN_TEXT).parquet(plainParquetPath)
|
||||
|
||||
// write a DataFrame as a plain parquet file
|
||||
val plainOutputPath = "/plain/output/path"
|
||||
sc.write(df1, PLAIN_TEXT)
|
||||
.mode("overwrite")
|
||||
.parquet(plainOutputPath)
|
||||
|
||||
// read an encrypted parquet file and return a DataFrame
|
||||
val encryptedParquetPath = "/encrypted/parquet/path"
|
||||
val df2 = sc.read(AES_GCM_CTR_V1).parquet(encryptedParquetPath)
|
||||
|
||||
// write a DataFrame as an encrypted parquet file
|
||||
val encryptedOutputPath = "/encrypted/output/path"
|
||||
sc.write(df2, AES_GCM_CTR_V1)
|
||||
.mode("overwrite")
|
||||
.parquet(encryptedOutputPath)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
# import
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
# read a plain parquet file and return a DataFrame
|
||||
plain_parquet_path = "/plain/parquet/path"
|
||||
df1 = sc.read(CryptoMode.PLAIN_TEXT).parquet(plain_parquet_path)
|
||||
|
||||
# write a DataFrame as a plain parquet file
|
||||
plain_output_path = "/plain/output/path"
|
||||
sc.write(df1, CryptoMode.PLAIN_TEXT)
|
||||
.mode('overwrite')
|
||||
.parquet(plain_output_path)
|
||||
|
||||
# read a encrypted parquet file and return a DataFrame
|
||||
encrypted_parquet_path = "/encrypted/parquet/path"
|
||||
df2 = sc.read(CryptoMode.AES_GCM_CTR_V1).parquet(encrypted_parquet_path)
|
||||
|
||||
# write a DataFrame as a encrypted parquet file
|
||||
encrypted_output_path = "/encrypted/output/path"
|
||||
sc.write(df2, CryptoMode.AES_GCM_CTR_V1)
|
||||
.mode('overwrite')
|
||||
.parquet(encrypted_output_path)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
**read/write JSON file**
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
import com.intel.analytics.bigdl.ppml.crypto.{AES_CBC_PKCS5PADDING, PLAIN_TEXT}
|
||||
|
||||
// read a plain json file and return a DataFrame
|
||||
val plainJsonPath = "/plain/json/path"
|
||||
val df1 = sc.read(PLAIN_TEXT).json(plainJsonPath)
|
||||
|
||||
// write a DataFrame as a plain json file
|
||||
val plainOutputPath = "/plain/output/path"
|
||||
sc.write(df1, PLAIN_TEXT)
|
||||
.mode("overwrite")
|
||||
.json(plainOutputPath)
|
||||
|
||||
// read a encrypted json file and return a DataFrame
|
||||
val encryptedJsonPath = "/encrypted/parquet/path"
|
||||
val df2 = sc.read(AES_CBC_PKCS5PADDING).json(encryptedJsonPath)
|
||||
|
||||
// write a DataFrame as a encrypted parquet file
|
||||
val encryptedOutputPath = "/encrypted/output/path"
|
||||
sc.write(df2, AES_CBC_PKCS5PADDING)
|
||||
.mode("overwrite")
|
||||
.json(encryptedOutputPath)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
# import
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
# read a plain json file and return a DataFrame
|
||||
plain_json_path = "/plain/json/path"
|
||||
df1 = sc.read(CryptoMode.PLAIN_TEXT).json(plain_json_path)
|
||||
|
||||
# write a DataFrame as a plain json file
|
||||
plain_output_path = "/plain/output/path"
|
||||
sc.write(df1, CryptoMode.PLAIN_TEXT)
|
||||
.mode('overwrite')
|
||||
.json(plain_output_path)
|
||||
|
||||
# read a encrypted json file and return a DataFrame
|
||||
encrypted_json_path = "/encrypted/parquet/path"
|
||||
df2 = sc.read(CryptoMode.AES_CBC_PKCS5PADDING).json(encrypted_json_path)
|
||||
|
||||
# write a DataFrame as a encrypted parquet file
|
||||
encrypted_output_path = "/encrypted/output/path"
|
||||
sc.write(df2, CryptoMode.AES_CBC_PKCS5PADDING)
|
||||
.mode('overwrite')
|
||||
.json(encrypted_output_path)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
**read textfile**
|
||||
|
||||
<details open>
|
||||
<summary>scala</summary>
|
||||
|
||||
```scala
|
||||
import com.intel.analytics.bigdl.ppml.PPMLContext
|
||||
import com.intel.analytics.bigdl.ppml.crypto.{AES_CBC_PKCS5PADDING, PLAIN_TEXT}
|
||||
|
||||
// read from a plain csv file and return a RDD
|
||||
val plainCsvPath = "/plain/csv/path"
|
||||
val rdd1 = sc.textfile(plainCsvPath) // the default cryptoMode is PLAIN_TEXT
|
||||
|
||||
// read from a encrypted csv file and return a RDD
|
||||
val encryptedCsvPath = "/encrypted/csv/path"
|
||||
val rdd2 = sc.textfile(path=encryptedCsvPath, cryptoMode=AES_CBC_PKCS5PADDING)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>python</summary>
|
||||
|
||||
```python
|
||||
# import
|
||||
from bigdl.ppml.ppml_context import *
|
||||
|
||||
# read from a plain csv file and return a RDD
|
||||
plain_csv_path = "/plain/csv/path"
|
||||
rdd1 = sc.textfile(plain_csv_path) # the default crypto_mode is "plain_text"
|
||||
|
||||
# read from a encrypted csv file and return a RDD
|
||||
encrypted_csv_path = "/encrypted/csv/path"
|
||||
rdd2 = sc.textfile(path=encrypted_csv_path, crypto_mode=CryptoMode.AES_CBC_PKCS5PADDING)
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
</details>
|
||||
|
||||
More usage with `PPMLContext` Python API, please refer to [PPMLContext Python API](https://github.com/intel-analytics/BigDL/blob/main/python/ppml/src/bigdl/ppml/README.md).
|
||||

docs/readthedocs/source/doc/PPML/QuickStart/end-to-end.md (new file, 175 lines)

@@ -0,0 +1,175 @@

# PPML End-to-End Workflow Example

## E2E Architecture Overview

In this section we take SimpleQuery as an example to go through the entire BigDL PPML end-to-end workflow. SimpleQuery is a simple example that queries developers between the ages of 20 and 40 from people.csv.

<p align="center">
  <img src="https://user-images.githubusercontent.com/61072813/178393982-929548b9-1c4e-4809-a628-10fafad69628.png" alt="data lifecycle" />
</p>

<video src="https://user-images.githubusercontent.com/61072813/184758702-4b9809f9-50ac-425e-8def-0ea1c5bf1805.mp4" width="100%" controls></video>

---

## Step 0. Prepare your environment
To secure your Big Data & AI applications in the BigDL PPML manner, you should prepare your environment first, including K8s cluster setup, K8s-SGX plugin setup, key/password preparation, key management service (KMS) and attestation service (AS) setup, and BigDL PPML client container preparation. **Please follow the detailed steps in** [Prepare Environment](./docs/prepare_environment.md).


## Step 1. Encrypt and Upload Data
Encrypt the input data of your Big Data & AI applications (here we use SimpleQuery) and then upload the encrypted data to the NFS server. More details in [Encrypt Your Data](./services/kms-utils/docker/README.md#3-enroll-generate-key-encrypt-and-decrypt).

1. Generate the input data `people.csv` for the SimpleQuery application

    You can use [generate_people_csv.py](https://github.com/analytics-zoo/ppml-e2e-examples/blob/main/spark-encrypt-io/generate_people_csv.py). The usage of the script is `python generate_people_csv.py </save/path/of/people.csv> <num_lines>`.

2. Encrypt `people.csv` (a combined sketch of both steps is shown after this list)

    ```
    docker exec -i $KMSUTIL_CONTAINER_NAME bash -c "bash /home/entrypoint.sh encrypt $appid $apikey $input_file_path"
    ```
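
Putting the two steps above together, a minimal end-to-end sketch could look like the following. This is only an illustration: the row count, the file path, the container name and the `$appid`/`$apikey` values are placeholders, and it assumes `generate_people_csv.py` has been downloaded locally and that the data directory is mounted into the kms-utils container.

```bash
# generate sample data locally (path and row count are illustrative)
python generate_people_csv.py /tmp/people.csv 10000

# placeholders: replace with your kms-utils container name and enrollment results
export KMSUTIL_CONTAINER_NAME=kms-utils
export appid=your_app_id
export apikey=your_api_key
export input_file_path=/tmp/people.csv   # must be visible inside the container (e.g. via a volume mount)

# encrypt people.csv before uploading it to the NFS server (same command as above)
docker exec -i $KMSUTIL_CONTAINER_NAME bash -c "bash /home/entrypoint.sh encrypt $appid $apikey $input_file_path"
```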

## Step 2. Build Big Data & AI applications
To build your own Big Data & AI applications, refer to [develop your own Big Data & AI applications with BigDL PPML](#4-develop-your-own-big-data--ai-applications-with-bigdl-ppml). The code of SimpleQuery is [here](https://github.com/intel-analytics/BigDL/blob/main/scala/ppml/src/main/scala/com/intel/analytics/bigdl/ppml/examples/SimpleQuerySparkExample.scala); it is already built into bigdl-ppml-spark_3.1.2-2.1.0-SNAPSHOT.jar, and the jar is included in the PPML image.

## Step 3. Attestation

To enable attestation, you should have a running Attestation Service (EHSM-KMS here for example) in your environment. (You can start a KMS referring to [this link](https://github.com/intel-analytics/BigDL/tree/main/ppml/services/kms-utils/docker).) Configure your KMS app_id and app_key with `kubectl`, and then configure KMS settings in `spark-driver-template.yaml` and `spark-executor-template.yaml` in the container.
``` bash
kubectl create secret generic kms-secret --from-literal=app_id=your-kms-app-id --from-literal=app_key=your-kms-app-key
```
Configure `spark-driver-template.yaml` for example (`spark-executor-template.yaml` is similar):
``` yaml
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: spark-driver
    securityContext:
      privileged: true
    env:
      - name: ATTESTATION
        value: "true"
      - name: ATTESTATION_URL
        value: your_attestation_url
      - name: ATTESTATION_ID
        valueFrom:
          secretKeyRef:
            name: kms-secret
            key: app_id
      - name: ATTESTATION_KEY
        valueFrom:
          secretKeyRef:
            name: kms-secret
            key: app_key
  ...
```
You should see `Attestation Success!` in the logs after you [submit a PPML job](#step-4-submit-job) if the quote generated from the user report is verified successfully by the Attestation Service; otherwise you will see `Attestation Fail! Application killed!` and the job will be stopped.
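
If you want to check the attestation verdict without reading the whole log, a quick grep such as the sketch below can help. It assumes the driver pod name contains `simplequery`, as in the submit example later in this guide, and that you run it from a machine with `kubectl` access to the cluster.

```bash
# find the first matching driver pod and grep its log for the attestation result
DRIVER_POD=$(kubectl get pod | grep "simplequery.*-driver" -m 1 | cut -d " " -f1)
kubectl logs "$DRIVER_POD" | grep -E "Attestation (Success|Fail)"
```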

## Step 4. Submit Job
When the Big Data & AI application and its input data are prepared, you are ready to submit BigDL PPML jobs. You need to choose the deploy mode and the way to submit the job first.

* **There are 4 modes to submit a job**:

    1. **local mode**: run jobs locally without connecting to a cluster. It is exactly the same as using spark-submit to run your application: `$SPARK_HOME/bin/spark-submit --class "SimpleApp" --master local[4] target.jar`; the driver and executors are not protected by SGX.
        <p align="left">
        <img src="https://user-images.githubusercontent.com/61072813/174703141-63209559-05e1-4c4d-b096-6b862a9bed8a.png" width='250px' />
        </p>

    2. **local SGX mode**: run jobs locally with SGX guarded. As the picture shows, the client JVM is running in an SGX enclave so that the driver and executors can be protected.
        <p align="left">
        <img src="https://user-images.githubusercontent.com/61072813/174703165-2afc280d-6a3d-431d-9856-dd5b3659214a.png" width='250px' />
        </p>

    3. **client SGX mode**: run jobs in K8s client mode with SGX guarded. In K8s client mode, the driver is deployed locally as an external client to the cluster. With **client SGX mode**, the executors running in the K8s cluster are protected by SGX, and the driver running on the client is also protected by SGX.
        <p align="left">
        <img src="https://user-images.githubusercontent.com/61072813/174703216-70588315-7479-4b6c-9133-095104efc07d.png" width='500px' />
        </p>

    4. **cluster SGX mode**: run jobs in K8s cluster mode with SGX guarded. In K8s cluster mode, the driver is deployed on the K8s worker nodes like the executors. With **cluster SGX mode**, the driver and executors running in the K8s cluster are protected by SGX.
        <p align="left">
        <img src="https://user-images.githubusercontent.com/61072813/174703234-e45b8fe5-9c61-4d17-93ef-6b0c961a2f95.png" width='500px' />
        </p>

* **There are two options to submit PPML jobs**:
    * use [PPML CLI](./docs/submit_job.md#ppml-cli) to submit jobs manually
    * use [helm chart](./docs/submit_job.md#helm-chart) to submit jobs automatically

Here we use **k8s client mode** and **PPML CLI** to run SimpleQuery. For other modes, please see [PPML CLI Usage Examples](./docs/submit_job.md#usage-examples). Alternatively, you can also use Helm to submit jobs automatically; see the details in [Helm Chart Usage](./docs/submit_job.md#helm-chart).

<details><summary>expand to see details of submitting SimpleQuery</summary>

1. Enter the PPML container
    ```
    docker exec -it bigdl-ppml-client-k8s bash
    ```
2. Run SimpleQuery in k8s client mode
    ```
    #!/bin/bash
    export secure_password=`openssl rsautl -inkey /ppml/trusted-big-data-ml/work/password/key.txt -decrypt </ppml/trusted-big-data-ml/work/password/output.bin`
    bash bigdl-ppml-submit.sh \
        --master $RUNTIME_SPARK_MASTER \
        --deploy-mode client \
        --sgx-enabled true \
        --sgx-log-level error \
        --sgx-driver-memory 64g \
        --sgx-driver-jvm-memory 12g \
        --sgx-executor-memory 64g \
        --sgx-executor-jvm-memory 12g \
        --driver-memory 32g \
        --driver-cores 8 \
        --executor-memory 32g \
        --executor-cores 8 \
        --num-executors 2 \
        --conf spark.kubernetes.container.image=$RUNTIME_K8S_SPARK_IMAGE \
        --name simplequery \
        --verbose \
        --class com.intel.analytics.bigdl.ppml.examples.SimpleQuerySparkExample \
        --jars local:///ppml/trusted-big-data-ml/spark-encrypt-io-0.3.0-SNAPSHOT.jar \
        local:///ppml/trusted-big-data-ml/work/data/simplequery/spark-encrypt-io-0.3.0-SNAPSHOT.jar \
        --inputPath /ppml/trusted-big-data-ml/work/data/simplequery/people_encrypted \
        --outputPath /ppml/trusted-big-data-ml/work/data/simplequery/people_encrypted_output \
        --inputPartitionNum 8 \
        --outputPartitionNum 8 \
        --inputEncryptModeValue AES/CBC/PKCS5Padding \
        --outputEncryptModeValue AES/CBC/PKCS5Padding \
        --primaryKeyPath /ppml/trusted-big-data-ml/work/data/simplequery/keys/primaryKey \
        --dataKeyPath /ppml/trusted-big-data-ml/work/data/simplequery/keys/dataKey \
        --kmsType EHSMKeyManagementService \
        --kmsServerIP your_ehsm_kms_server_ip \
        --kmsServerPort your_ehsm_kms_server_port \
        --ehsmAPPID your_ehsm_kms_appid \
        --ehsmAPIKEY your_ehsm_kms_apikey
    ```

3. Check runtime status: exit the container or open a new terminal

    To check the logs of the Spark driver, run
    ```
    sudo kubectl logs $( sudo kubectl get pod | grep "simplequery.*-driver" -m 1 | cut -d " " -f1 )
    ```
    To check the logs of a Spark executor, run
    ```
    sudo kubectl logs $( sudo kubectl get pod | grep "simplequery-.*-exec" -m 1 | cut -d " " -f1 )
    ```

4. If you set up [PPML Monitoring](docs/prepare_environment.md#optional-k8s-monitioring-setup), you can check the PPML Dashboard to monitor the status at http://kubernetes_master_url:3000

    

</details>
<br />

## Step 5. Decrypt and Read Result
When the job is done, you can decrypt and read the result of the job. More details in [Decrypt Job Result](./services/kms-utils/docker/README.md#3-enroll-generate-key-encrypt-and-decrypt).

```
docker exec -i $KMSUTIL_CONTAINER_NAME bash -c "bash /home/entrypoint.sh decrypt $appid $apikey $input_path"
```
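
As a rough, optional check (not part of the official workflow), you can list and preview what the decrypt step produced. The exact file names and locations depend on the kms-utils entrypoint and on how its volumes are mounted, so the commands below only look around the directory that contains `$input_path`:

```bash
# list what the decrypt step wrote next to the encrypted job output
ls -l "$(dirname "$input_path")"

# preview the first lines of any csv files found there (adjust the pattern if needed)
find "$(dirname "$input_path")" -name "*.csv" -exec head -n 5 {} \;
```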

## Video Demo

<video src="https://user-images.githubusercontent.com/61072813/184758643-821026c3-40e0-4d4c-bcd3-8a516c55fc01.mp4" width="100%" controls></video>

@@ -10,37 +10,37 @@

1. Download and compile TPC-DS

```bash
git clone --recursive https://github.com/intel-analytics/zoo-tutorials.git
cd /path/to/zoo-tutorials
git clone https://github.com/databricks/tpcds-kit.git
cd tpcds-kit/tools
make OS=LINUX
```

2. Generate data

```bash
cd /path/to/zoo-tutorials
cd tpcds-spark/spark-sql-perf
sbt "test:runMain com.databricks.spark.sql.perf.tpcds.GenTPCDSData -d <dsdgenDir> -s <scaleFactor> -l <dataDir> -f parquet"
```

`dsdgenDir` is the path of `tpcds-kit/tools`, `scaleFactor` is the size of the data (for example, `-s 1` generates 1 GB of data), and `dataDir` is the path to store the generated data; a concrete invocation is sketched below.
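
For instance, a filled-in version of the command above might look like the sketch below; the paths are placeholders and `-s 1` is just a small scale factor for a quick test.

```bash
cd /path/to/zoo-tutorials/tpcds-spark/spark-sql-perf
# generate a 1 GB TPC-DS dataset in parquet format (paths are illustrative)
sbt "test:runMain com.databricks.spark.sql.perf.tpcds.GenTPCDSData -d /path/to/zoo-tutorials/tpcds-kit/tools -s 1 -l /path/to/tpcds-data -f parquet"
```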

### Deploy PPML TPC-DS on Kubernetes

1. Compile Kit

```bash
cd zoo-tutorials/tpcds-spark
sbt package
```

2. Create external tables

```bash
$SPARK_HOME/bin/spark-submit \
--class "createTables" \
--master <spark-master> \
--driver-memory 20G \
```

@@ -49,26 +49,26 @@ $SPARK_HOME/bin/spark-submit \

```bash
--executor-memory 20G \
--jars spark-sql-perf/target/scala-2.12/spark-sql-perf_2.12-0.5.1-SNAPSHOT.jar \
target/scala-2.12/tpcds-benchmark_2.12-0.1.jar <dataDir> <dsdgenDir> <scaleFactor>
```

3. Pull docker image

```bash
sudo docker pull intelanalytics/bigdl-ppml-trusted-big-data-ml-python-graphene:2.1.0-SNAPSHOT
```

4. Prepare SGX keys (following instructions [here](https://github.com/intel-analytics/BigDL/tree/main/ppml/trusted-big-data-ml/python/docker-graphene#11-prepare-the-keyspassworddataenclave-keypem "here")), and make sure the keys and tpcds-spark can be accessed on each K8s node.

5. Start a bigdl-ppml enabled Spark K8s client container with the configured local IP, key, tpc-ds and kuberconfig path

```bash
export ENCLAVE_KEY=/YOUR_DIR/keys/enclave-key.pem
export DATA_PATH=/YOUR_DIR/zoo-tutorials/tpcds-spark
export KEYS_PATH=/YOUR_DIR/keys
export SECURE_PASSWORD_PATH=/YOUR_DIR/password
export KUBERCONFIG_PATH=/YOUR_DIR/kuberconfig
export LOCAL_IP=$local_ip
export DOCKER_IMAGE=intelanalytics/bigdl-ppml-trusted-big-data-ml-python-graphene:2.1.0-SNAPSHOT
sudo docker run -itd \
--privileged \
--net=host \
--name=spark-local-k8s-client \
```

@@ -96,20 +96,20 @@ sudo docker run -itd \

```bash
-e SGX_LOG_LEVEL=error \
-e LOCAL_IP=$LOCAL_IP \
$DOCKER_IMAGE bash
```

6. Attach to the client container

```bash
sudo docker exec -it spark-local-k8s-client bash
```

7. Modify `spark-executor-template.yaml`: add the paths of `enclave-key`, `tpcds-spark` and `kuberconfig` on the host

```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: spark-executor
    securityContext:
```

@@ -131,21 +131,21 @@ spec:

```yaml
  - name: kubeconf
    hostPath:
      path: /path/to/kuberconfig
```

8. Execute TPC-DS queries

The optional argument `QUERY` is the query number to run. Multiple query numbers should be separated by spaces, e.g. `1 2 3`. If no query number is specified, all queries 1-99 will be executed.

```bash
secure_password=`openssl rsautl -inkey /ppml/trusted-big-data-ml/work/password/key.txt -decrypt </ppml/trusted-big-data-ml/work/password/output.bin` && \
export TF_MKL_ALLOC_MAX_BYTES=10737418240 && \
export SPARK_LOCAL_IP=$LOCAL_IP && \
export HDFS_HOST=$hdfs_host_ip && \
export HDFS_PORT=$hdfs_port && \
export TPCDS_DIR=/ppml/trusted-big-data-ml/work/tpcds-spark \
export OUTPUT_DIR=hdfs://$HDFS_HOST:$HDFS_PORT/tpc-ds/output \
export QUERY=3
/opt/jdk8/bin/java \
-cp '$TPCDS_DIR/target/scala-2.12/tpcds-benchmark_2.12-0.1.jar:/ppml/trusted-big-data-ml/work/spark-3.1.2/conf/:/ppml/trusted-big-data-ml/work/spark-3.1.2/jars/*' \
-Xmx10g \
```

@@ -209,6 +209,6 @@ export QUERY=3

```bash
--verbose \
$TPCDS_DIR/target/scala-2.12/tpcds-benchmark_2.12-0.1.jar \
$OUTPUT_DIR $QUERY
```

After the benchmark is finished, the performance result is saved as a `part-*.csv` file under the `<OUTPUT_DIR>/performance` directory.

docs/readthedocs/source/doc/PPML/index.rst (new file, 71 lines)

@@ -0,0 +1,71 @@

BigDL-PPML
=========================

Protecting privacy and confidentiality is critical for large-scale data analysis and machine learning. BigDL PPML (BigDL Privacy Preserving Machine Learning) combines various low-level hardware and software security technologies (e.g., Intel® Software Guard Extensions (Intel® SGX), Security Key Management, Remote Attestation, Data Encryption, Federated Learning, etc.) so that users can continue applying standard Big Data and AI technologies (such as Apache Spark, Apache Flink, TensorFlow, PyTorch, etc.) without sacrificing privacy.

----------------------


.. grid:: 1 2 2 2
    :gutter: 2

    .. grid-item-card::

        **Get Started**
        ^^^

        Documents in this section help you get started quickly with PPML.

        +++

        :bdg-link:`Introduction <./Overview/intro.html>` |
        :bdg-link:`Hello World Example <./Overview/quicktour.html>`

    .. grid-item-card::

        **User Guide**
        ^^^

        Provides you with in-depth information about PPML features and concepts, as well as step-by-step guides.

        +++

        :bdg-link:`User Guide <./Overview/userguide.html>` |
        :bdg-link:`Advanced Topics <./Overview/misc.html>`

    .. grid-item-card::

        **Tutorials**
        ^^^

        PPML Tutorials and Examples.

        +++

        :bdg-link:`End-to-End Example <./Overview/examples.html>` |
        :bdg-link:`More Examples <https://github.com/intel-analytics/BigDL/blob/main/ppml/docs/examples.md>`

    .. grid-item-card::

        **Videos**
        ^^^

        Videos and demos help you quickly understand the architecture and start hands-on work.

        +++

        :bdg-link:`Introduction <./Overview/intro.html#what-is-bigdl-ppml>` |
        :bdg-link:`E2E Workflow <./QuickStart/end-to-end.html#e2e-architecture-overview>` |
        :bdg-link:`E2E Demo <./QuickStart/end-to-end.html#video-demo>`


.. toctree::
    :hidden:

    BigDL-PPML Document <self>
@@ -1,3 +1,7 @@
# Clipping

--------

## ConstantGradientClipping ##

Set constant gradient clipping during the training process.

docs/readthedocs/source/doc/PythonAPI/DLlib/core_layers.md (new file, 3328 lines)

@@ -1,4 +1,5 @@
# Model Freeze

To "freeze" a model means to exclude some layers of the model from training.

```scala

docs/readthedocs/source/doc/PythonAPI/DLlib/index.rst (new file, 13 lines)

@@ -0,0 +1,13 @@

DLlib API
==================

.. toctree::
    :maxdepth: 1

    model.rst
    core_layers.md
    optim-Methods.md
    regularizers.md
    learningrate-Scheduler.md
    freeze.md
    clipping.md

@@ -1,3 +1,8 @@
# Learning Rate Scheduler

--------

## Poly ##

**Scala:**

docs/readthedocs/source/doc/PythonAPI/DLlib/model.rst (new file, 17 lines)

@@ -0,0 +1,17 @@

Model/Sequential
==================

dllib.keras.models.Model
---------------------------

.. autoclass:: bigdl.dllib.keras.models.Model
    :members:
    :undoc-members:


dllib.keras.models.Sequential
------------------------------

.. autoclass:: bigdl.dllib.keras.models.Sequential
    :members:
    :undoc-members:

@@ -1,3 +1,7 @@
# Optimizer

--------

## Adam ##

**Scala:**

@@ -1,3 +1,7 @@
# Regularizer

--------

## L1 Regularizer ##

**Scala:**

docs/readthedocs/source/doc/PythonAPI/Friesian/index.rst (new file, 7 lines)

@@ -0,0 +1,7 @@

Friesian API
==================

.. toctree::
    :maxdepth: 2

    feature.rst

@@ -1,3 +1,6 @@
Orca AutoML
============================

orca.automl.auto_estimator
---------------------------

docs/readthedocs/source/doc/PythonAPI/Orca/context.rst (new file, 15 lines)

@@ -0,0 +1,15 @@

Orca Context
============

orca.init_orca_context
-------------------------

.. automodule:: bigdl.orca.common
    :members: init_orca_context
    :undoc-members:
    :show-inheritance:

docs/readthedocs/source/doc/PythonAPI/Orca/data.rst (new file, 20 lines)

@@ -0,0 +1,20 @@

Orca Data
=========

orca.data.XShards
---------------------------

.. autoclass:: bigdl.orca.data.XShards
    :members:
    :undoc-members:
    :show-inheritance:


orca.data.pandas
---------------------------

.. automodule:: bigdl.orca.data.pandas.preprocessing
    :members:
    :undoc-members:
    :show-inheritance:

docs/readthedocs/source/doc/PythonAPI/Orca/index.rst (new file, 10 lines)

@@ -0,0 +1,10 @@

Orca API
==================

.. toctree::
    :maxdepth: 2

    context.rst
    data.rst
    orca.rst
    automl.rst

@@ -1,4 +1,4 @@
Orca Learn
==========

orca.learn.bigdl.estimator

@@ -88,12 +88,3 @@ orca.learn.openvino.estimator
    :members:
    :undoc-members:
    :show-inheritance:


AutoML
------------------------------

.. toctree::
    :maxdepth: 2

    automl.rst

@@ -44,10 +44,37 @@ output of Cluster Serving job information should be displayed, if not, go to [Pr

1. `Duplicate registration of device factory for type XLA_CPU with the same priority 50`

    This error is caused by the Flink ClassLoader. Please put the Cluster Serving related jars into `${FLINK_HOME}/lib`.

2. `servable Manager config dir not exist`

    Check if `servables.yaml` exists in the current directory. If not, download it from [github](https://github.com/intel-analytics/bigdl/blob/master/ppml/trusted-realtime-ml/scala/docker-graphene/servables.yaml) (see the sketch after this list).
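
As a quick sketch (assuming `wget` is available and the raw-content URL derived from the link above is reachable), the file can be fetched into the current directory like this:

```bash
# download servables.yaml into the directory from which the serving job is started
wget -O servables.yaml https://raw.githubusercontent.com/intel-analytics/bigdl/master/ppml/trusted-realtime-ml/scala/docker-graphene/servables.yaml
```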

### Still no result?

If you still get an empty result, raise an issue [here](https://github.com/intel-analytics/bigdl/issues) and post the output/log of your serving job.

docs/readthedocs/source/doc/Serving/index.rst (new file, 66 lines)

@@ -0,0 +1,66 @@

Cluster Serving
=========================

BigDL Cluster Serving is a lightweight, distributed, real-time serving solution that supports a wide range of deep learning models (such as TensorFlow, PyTorch, Caffe, BigDL and OpenVINO models). It provides a simple pub/sub API, so that users can easily send their inference requests to the input queue (using a simple Python API); Cluster Serving will then automatically manage the scale-out and real-time model inference across a large cluster (using distributed streaming frameworks such as Apache Spark Streaming, Apache Flink, etc.).

----------------------


.. grid:: 1 2 2 2
    :gutter: 2

    .. grid-item-card::

        **Get Started**
        ^^^

        Documents in this section help you get started quickly with Cluster Serving.

        +++

        :bdg-link:`Serving in 5 minutes <./QuickStart/serving-quickstart.html>` |
        :bdg-link:`Installation <./ProgrammingGuide/serving-installation.html>`

    .. grid-item-card::

        **Key Features Guide**
        ^^^

        Each guide in this section provides you with in-depth information, concepts and knowledge about Cluster Serving key features.

        +++

        :bdg-link:`Start Serving <./ProgrammingGuide/serving-start.html>` |
        :bdg-link:`Inference <./ProgrammingGuide/serving-inference.html>`

    .. grid-item-card::

        **Examples**
        ^^^

        Cluster Serving examples and tutorials.

        +++

        :bdg-link:`Examples <./Example/example.html>`

    .. grid-item-card::

        **MISC**
        ^^^

        Cluster Serving FAQ and contribution guide.

        +++

        :bdg-link:`FAQ <./FAQ/faq.html>` |
        :bdg-link:`Contribute <./FAQ/contribute-guide.html>`


.. toctree::
    :hidden:

    Cluster Serving Document <self>

@@ -72,31 +72,31 @@ You need to do the following preparations before starting the IDE to successfully

- Build BigDL; see [here](#build) for more instructions.
- Prepare the Spark environment by either setting `SPARK_HOME` as the environment variable or `pip install pyspark`. Note that the Spark version should match the one you build BigDL on.
- Check the jars under `BigDL/dist/lib` and set the environment variable `BIGDL_CLASSPATH`. Modify SPARKVERSION and BIGDLVERSION (Scala) as appropriate:
  ```bash
  export BIGDL_CLASSPATH=BigDL/dist/lib/bigdl-dllib-spark_SPARKVERSION-BIGDLVERSION-jar-with-dependencies.jar:BigDL/dist/lib/bigdl-orca-spark_SPARKVERSION-BIGDLVERSION-jar-with-dependencies.jar:BigDL/dist/lib/bigdl-friesian-spark_SPARKVERSION-BIGDLVERSION-jar-with-dependencies.jar
  ```
- Configure BigDL source files to the Python interpreter:

  You can easily do this after launching PyCharm by right clicking the folder `BigDL/python/dllib/src` -> __Mark Directory As__ -> __Sources Root__ (also do this for `BigDL/python/nano/src`, `BigDL/python/orca/src`, `BigDL/python/friesian/src`, `BigDL/python/chronos/src`, `BigDL/python/serving/src` if necessary).

  Alternatively, you can add BigDL source files to `PYTHONPATH`:
  ```bash
  export PYTHONPATH=BigDL/python/dllib/src:BigDL/python/nano/src:BigDL/python/orca/src:BigDL/python/friesian/src:BigDL/python/chronos/src:BigDL/python/serving/src:$PYTHONPATH
  ```

- Add `spark-bigdl.conf` to `PYTHONPATH`:
  ```bash
  export PYTHONPATH=BigDL/python/dist/conf/spark-bigdl.conf:$PYTHONPATH
  ```

- Install and add `tflibs` to `TF_LIBS_PATH`:
  ```bash
  # Install bigdl-tf and bigdl-math
  pip install bigdl-tf bigdl-math

  # Configure TF_LIBS_PATH
  export TF_LIBS_PATH=$(python -c 'import site; print(site.getsitepackages()[0])')/bigdl/share/tflibs
  ```


The above environment variables should be available when running or debugging code in the IDE. When running applications in PyCharm, you can add runtime environment variables by clicking __Run__ -> __Edit Configurations__; then in the __Run/Debug Configurations__ panel, you can add necessary environment variables to your applications.
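
If you prefer not to enter each variable in the PyCharm GUI, one common alternative is to export them in a shell and start PyCharm from that same shell so the IDE inherits the environment. This is only a sketch: it assumes a Linux shell, reuses the variables described above, and the `pycharm.sh` launcher path depends on your installation.

```bash
# export the variables described above in the current shell (adjust versions and paths)
export BIGDL_CLASSPATH=BigDL/dist/lib/bigdl-dllib-spark_SPARKVERSION-BIGDLVERSION-jar-with-dependencies.jar:BigDL/dist/lib/bigdl-orca-spark_SPARKVERSION-BIGDLVERSION-jar-with-dependencies.jar:BigDL/dist/lib/bigdl-friesian-spark_SPARKVERSION-BIGDLVERSION-jar-with-dependencies.jar
export PYTHONPATH=BigDL/python/dllib/src:BigDL/python/dist/conf/spark-bigdl.conf:$PYTHONPATH
export TF_LIBS_PATH=$(python -c 'import site; print(site.getsitepackages()[0])')/bigdl/share/tflibs

# launch PyCharm from this shell so run configurations inherit the environment
/path/to/pycharm/bin/pycharm.sh &
```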