Chronos: add quick tour page (#5217)

* add quick tour

* update some md

* update quick tour

* add some contents

* add some changes

* change link

* add new file

* fix some typo

* fix typo

* fix typo

* fix typo

* fix typo

* fix typos

* fix bug

* fix typos

* fix typos

* fix bug

* fix bug

* fix bug

* fix bug

* add information

* fix type

* add description

* add more content

* add some bug fix

* update document

* update

* update page

* fix typo

* update typos

* fix typo
This commit is contained in:
Junwei Deng 2022-08-09 12:16:38 +08:00 committed by GitHub
parent 99bef5579d
commit b2dac661db
5 changed files with 364 additions and 106 deletions

View file

@ -38,3 +38,4 @@ ConfigSpace==0.5.0
sphinx-design==0.2.0
sphinx-external-toc==0.3.0
nbsphinx==0.8.9
sphinx-design==0.2.0

View file

@ -53,6 +53,7 @@ subtrees:
entries:
- file: doc/Chronos/Overview/chronos
- file: doc/Chronos/Overview/windows_guide
- file: doc/Chronos/Overview/quick-tour
- file: doc/Chronos/Overview/deep_dive
- file: doc/Chronos/QuickStart/index
- file: doc/Chronos/Overview/chronos_known_issue

View file

@ -92,10 +92,10 @@ extensions = [
'sphinx_tabs.tabs',
'sphinx_design',
'sphinx_external_toc',
'sphinx_design',
'nbsphinx'
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

View file

@ -1,7 +1,7 @@
# Chronos User Guide
### **1. Overview**
_Chronos_ is an application framework for building large-scale time series analysis applications.
_BigDL-Chronos_ (_Chronos_ for short) is an application framework for building a fast, accurate and scalable time series analysis application.
You can use _Chronos_ to do:
@ -100,7 +100,7 @@ pip install --pre --upgrade bigdl-nano[tensorflow]
```
after you install the pytorch backend chronos.
#### OS and Python version requirement
#### **2.3 OS and Python version requirement**
```eval_rst
.. note::
@ -114,119 +114,74 @@ after you install the pytorch backend chronos.
Chronos only supports Python 3.7.2 ~ latest 3.7.x. We are validating more Python versions.
```
---
### **3. Run**
Various Python programming environments are supported to run a _Chronos_ application.
#### **3.1 Jupyter Notebook**
You can start the Jupyter notebook as you normally do using the following command and run _Chronos_ application directly in a Jupyter notebook:
```bash
jupyter notebook --notebook-dir=./ --ip=* --no-browser
```
#### **3.2 Python Script**
You can directly write _Chronos_ application in a Python file (e.g. script.py) and run in the command line as a normal Python program:
```bash
python script.py
```
### **3. Which document to see?**
```eval_rst
.. note::
**Optimization on Intel® Hardware**:
.. grid:: 2
:gutter: 1
Chronos integrates many optimized libraries and best known methods (BKMs). For the best performance, add ``bigdl-nano-init`` before your script command.
.. grid-item-card::
:class-footer: sd-bg-light
``bigdl-nano-init python script.py``
**Quick Tour**
^^^
Currently, this function is under active development and we encourage our users to add ``bigdl-nano-init`` for forecaster's training.
You may understand the basic usage of Chronos' components and learn to write the first runnable application in this quick tour page.
+++
`Quick Tour <./quick-tour.html>`_
.. grid-item-card::
:class-footer: sd-bg-light
**User Guides**
^^^
Our user guides provide you with in-depth information, concepts and knowledge about Chronos.
+++
`Data process <./data_processing_feature_engineering.html>`_ /
`Forecast <./forecasting.html>`_ /
`Detect <./anomaly_detection.html>`_ /
`Simulate <./simulation.html>`_
.. grid:: 2
:gutter: 1
.. grid-item-card::
:class-footer: sd-bg-light
**How-to-Guide**
^^^
If you run into a specific problem while using Chronos, the how-to guides are a good place to check.
+++
Work In Progress
.. grid-item-card::
:class-footer: sd-bg-light
**API Document**
^^^
API Document provides you with a detailed description of the Chronos APIs.
+++
`API Document <../../PythonAPI/Chronos/index.html>`_
```
---
### **4. Get Started**
#### **4.1 Initialization**
_Chronos_ uses [Orca](../../Orca/Overview/orca.md) to enable distributed training and AutoML capabilities. Initialize orca as below when you want to:
1. Use the distributed mode of a forecaster.
2. Use AutoML to tune your model in a distributed fashion.
3. Use `XshardsTSDataset` to process time series datasets in a distributed fashion.
Otherwise, there is no need to initialize an orca context.
View [Orca Context](../../Orca/Overview/orca-context.md) for more details. Note that argument `init_ray_on_spark` must be `True` for _Chronos_.
```python
from bigdl.orca import init_orca_context, stop_orca_context
if __name__ == "__main__":
    # run in local mode
    init_orca_context(cluster_mode="local", cores=4, init_ray_on_spark=True)
    # run on K8s cluster
    init_orca_context(cluster_mode="k8s", num_nodes=2, cores=2, init_ray_on_spark=True)
    # run on Hadoop YARN cluster
    init_orca_context(cluster_mode="yarn-client", num_nodes=2, cores=2, init_ray_on_spark=True)
    # >>> Start of Chronos Application >>>
    # ...
    # <<< End of Chronos Application <<<
    stop_orca_context()
```
#### **4.2 AutoTS Example**
This example runs a forecasting task with AutoML optimization using `AutoTSEstimator` on the New York City Taxi Dataset. To run this example, install the following: `pip install --pre --upgrade bigdl-chronos[all]`.
```python
from bigdl.orca.automl import hp
from bigdl.chronos.data.repo_dataset import get_public_dataset
from bigdl.chronos.autots import AutoTSEstimator
from bigdl.orca import init_orca_context, stop_orca_context
from sklearn.preprocessing import StandardScaler
if __name__ == "__main__":
    # initialize orca context
    init_orca_context(cluster_mode="local", cores=4, memory="8g", init_ray_on_spark=True)
    # load dataset
    tsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')
    # dataset preprocessing: fit the scaler on the training split only
    stand = StandardScaler()
    for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
        tsdata.gen_dt_feature().impute()\
              .scale(stand, fit=tsdata is tsdata_train)
    # AutoTSEstimator initialization
    autotsest = AutoTSEstimator(model="tcn",
                                future_seq_len=10)
    # AutoTSEstimator fitting
    tsppl = autotsest.fit(data=tsdata_train,
                          validation_data=tsdata_val)
    # Evaluation
    autotsest_mse = tsppl.evaluate(tsdata_test)
    # stop orca context
    stop_orca_context()
```
### **5. Details**
_Chronos_ provides flexible components for forecasting, detection, simulation and other useful functionalities. You may review the following pages to fully learn how to use Chronos to build various time series related applications.
- [Time Series Processing and Feature Engineering Overview](./data_processing_feature_engineering.html)
- [Time Series Forecasting Overview](./forecasting.html)
- [Time Series Anomaly Detection Overview](./anomaly_detection.html)
- [Generate Synthetic Sequential Data Overview](./simulation.html)
- [Useful Functionalities Overview](./useful_functionalities.html)
- [Speed up Chronos built-in/customized models](./speed_up.html)
- [Chronos API Doc](../../PythonAPI/Chronos/index.html)
### **6. Examples and Demos**
### **4. Examples and Demos**
- Quickstarts
- [Use AutoTSEstimator for Time-Series Forecasting](../QuickStart/chronos-autotsest-quickstart.html)
- [Use TSDataset and Forecaster for Time-Series Forecasting](../QuickStart/chronos-tsdataset-forecaster-quickstart.html)

View file

@ -0,0 +1,301 @@
Chronos Quick Tour
======================
Welcome to Chronos for building a fast, accurate and scalable time series analysis application🎉! Start with our quick tour to understand some critical concepts and how to use them to tackle your tasks.
.. grid:: 1 1 1 1
.. grid-item-card::
:text-align: center
:shadow: none
:class-header: sd-bg-light
:class-footer: sd-bg-light
:class-card: sd-mb-2
**Data processing**
^^^
Time series data processing includes imputing, deduplicating, resampling, scaling/unscaling, roll sampling, etc., to turn raw time series data (typically in a table) into a format that the models can understand. ``TSDataset`` and ``XShardsTSDataset`` are provided as abstractions.
+++
.. button-ref:: TSDataset/XShardsTSDataset
:color: primary
:expand:
:outline:
Get Started
.. grid:: 1 1 3 3
.. grid-item-card::
:text-align: center
:shadow: none
:class-header: sd-bg-light
:class-footer: sd-bg-light
:class-card: sd-mb-2
**Forecasting**
^^^
Time series forecasting uses historical data to predict future data. ``Forecaster`` and ``AutoTSEstimator`` are provided for built-in algorithms and distributed hyperparameter tuning.
+++
.. button-ref:: Forecaster
:color: primary
:expand:
:outline:
Get Started
.. grid-item-card::
:text-align: center
:shadow: none
:class-header: sd-bg-light
:class-footer: sd-bg-light
:class-card: sd-mb-2
**Anomaly Detection**
^^^
Time series anomaly detection finds anomalous points in a time series. ``Detector`` is provided for many built-in algorithms.
+++
.. button-ref:: Detector
:color: primary
:expand:
:outline:
Get Started
.. grid-item-card::
:text-align: center
:shadow: none
:class-header: sd-bg-light
:class-footer: sd-bg-light
:class-card: sd-mb-2
**Simulation**
^^^
Time series simulation generates synthetic time series data. ``Simulator`` is provided for many built-in algorithms.
+++
.. button-ref:: Simulator(experimental)
:color: primary
:expand:
:outline:
Get Started
TSDataset/XShardsTSDataset
--------------------------
In Chronos, we provide a ``TSDataset`` abstraction (and an ``XShardsTSDataset`` to handle large data input in a distributed fashion) to represent a time series dataset. It is responsible for preprocessing raw time series data (typically in a table) into a format that the models can understand. Many typical transformation, preprocessing and feature engineering methods can be chained on ``TSDataset`` or ``XShardsTSDataset``.
.. code-block:: python

    # !wget https://raw.githubusercontent.com/numenta/NAB/v1.0/data/realKnownCause/nyc_taxi.csv
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from bigdl.chronos.data import TSDataset

    df = pd.read_csv("nyc_taxi.csv", parse_dates=["timestamp"])
    tsdata = TSDataset.from_pandas(df,
                                   dt_col="timestamp",
                                   target_col="value")
    scaler = StandardScaler()
    tsdata.deduplicate()\
          .impute()\
          .gen_dt_feature()\
          .scale(scaler)\
          .roll(lookback=100, horizon=1)
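To make the ``lookback`` and ``horizon`` arguments more concrete, the following is a minimal pure-Python sketch of roll sampling on a plain list. The ``roll`` helper here is hypothetical and written only for illustration; it is not the Chronos implementation and ignores features, multiple ids and batching.

```python
def roll(values, lookback, horizon):
    """Slice a 1-D series into (history, future) pairs.

    Each sample uses `lookback` consecutive points as model input and the
    next `horizon` points as the target.
    """
    samples = []
    last_start = len(values) - lookback - horizon
    for start in range(last_start + 1):
        history = values[start:start + lookback]
        future = values[start + lookback:start + lookback + horizon]
        samples.append((history, future))
    return samples


series = list(range(10))
samples = roll(series, lookback=3, horizon=1)
print(len(samples))   # number of (history, future) pairs
print(samples[0])     # first window and its target
```

With 10 points, ``lookback=3`` and ``horizon=1``, the window can start at 7 different positions, so 7 samples are produced.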
.. grid:: 2
:gutter: 1
.. grid-item-card::
.. button-ref:: ./data_processing_feature_engineering
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/tsdataset
:color: primary
:expand:
:outline:
API Document
Forecaster
-----------------------
We have implemented quite a few algorithms, ranging from traditional statistics to deep learning, for time series forecasting in the ``bigdl.chronos.forecaster`` package. Users may train these forecasters on historical time series and use them to predict future time series.
To import a specific forecaster, use {algorithm name} + "Forecaster", then call ``fit`` to train the forecaster and ``predict`` to predict future data.
.. code-block:: python

    from bigdl.chronos.forecaster import TCNForecaster  # TCN is the algorithm name
    from bigdl.chronos.data.repo_dataset import get_public_dataset

    if __name__ == "__main__":
        # use nyc_taxi public dataset
        train_data, _, test_data = get_public_dataset("nyc_taxi")
        for data in [train_data, test_data]:
            # use 100 data points in history to predict 1 data point in the future
            data.roll(lookback=100, horizon=1)
        # create a forecaster
        forecaster = TCNForecaster.from_tsdataset(train_data)
        # train the forecaster
        forecaster.fit(train_data)
        # predict with the trained forecaster
        pred = forecaster.predict(test_data)
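The ``fit`` then ``predict`` pattern can also be illustrated without any Chronos dependency. The sketch below uses a hypothetical ``DriftForecaster`` class (a simple baseline that extends the last observed value by the average step seen during training). It is an assumption made for illustration and is unrelated to the algorithms shipped in ``bigdl.chronos.forecaster``.

```python
class DriftForecaster:
    """Hypothetical baseline: extend the last value by the average step learned in fit()."""

    def __init__(self):
        self.drift = None

    def fit(self, series):
        # learn the average change between consecutive points
        steps = [b - a for a, b in zip(series, series[1:])]
        self.drift = sum(steps) / len(steps)
        return self

    def predict(self, history, horizon=1):
        if self.drift is None:
            raise RuntimeError("call fit() before predict()")
        # project the learned step forward from the last observed value
        return [history[-1] + self.drift * k for k in range(1, horizon + 1)]


train = [10.0, 12.0, 14.0, 16.0]
forecaster = DriftForecaster().fit(train)
pred = forecaster.predict([20.0, 22.0], horizon=2)
print(pred)
```

The point of the sketch is the division of work: ``fit`` learns state from history, and ``predict`` only reads that state.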
AutoTSEstimator
---------------------------
For time series forecasting, we also provide an ``AutoTSEstimator`` for distributed hyperparameter tuning as an extension to ``Forecaster``. Users only need to create an ``AutoTSEstimator`` and call ``fit`` to train the estimator. A ``TSPipeline`` will be returned for users to predict future data.
.. code-block:: python

    from bigdl.orca.automl import hp
    from bigdl.chronos.data.repo_dataset import get_public_dataset
    from bigdl.chronos.autots import AutoTSEstimator
    from bigdl.orca import init_orca_context, stop_orca_context
    from sklearn.preprocessing import StandardScaler

    if __name__ == "__main__":
        # initialize orca context
        init_orca_context(cluster_mode="local", cores=4, memory="8g", init_ray_on_spark=True)
        # load dataset
        tsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')
        # dataset preprocessing: fit the scaler on the training split only
        stand = StandardScaler()
        for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
            tsdata.gen_dt_feature().impute()\
                  .scale(stand, fit=tsdata is tsdata_train)
        # AutoTSEstimator initialization
        autotsest = AutoTSEstimator(model="tcn",
                                    future_seq_len=10)
        # AutoTSEstimator fitting
        tsppl = autotsest.fit(data=tsdata_train,
                              validation_data=tsdata_val)
        # Prediction
        pred = tsppl.predict(tsdata_test)
        # stop orca context
        stop_orca_context()
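The ``fit=tsdata is tsdata_train`` argument in the preprocessing loop is easy to overlook: the scaler statistics are computed on the training split only, and the same statistics are then reused for the validation and test splits. A minimal sketch of that idea follows, using a hypothetical ``SimpleStandardScaler`` written with the standard library only (it stands in for scikit-learn's ``StandardScaler`` and is not part of Chronos).

```python
from statistics import mean, pstdev


class SimpleStandardScaler:
    """Hypothetical stand-in for a standard scaler (zero mean, unit variance)."""

    def fit(self, values):
        self.mean_ = mean(values)
        # fall back to 1.0 so a constant series does not divide by zero
        self.std_ = pstdev(values) or 1.0
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.std_ for v in values]


train, val, test = [1.0, 2.0, 3.0], [4.0], [0.0, 2.0]
scaler = SimpleStandardScaler()
scaled = {}
for name, split in [("train", train), ("val", val), ("test", test)]:
    if split is train:
        # statistics come from the training split only
        scaler.fit(split)
    scaled[name] = scaler.transform(split)

print(scaler.mean_, scaled["test"])
```

Fitting on the training split alone keeps information from the validation and test data out of the preprocessing step.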
.. grid:: 3
:gutter: 1
.. grid-item-card::
.. button-ref:: ../QuickStart/chronos-tsdataset-forecaster-quickstart
:color: primary
:expand:
:outline:
Quick Start
.. grid-item-card::
.. button-ref:: ./forecasting
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/forecasters
:color: primary
:expand:
:outline:
API Document
Detector
--------------------
We have implemented quite a few algorithms, ranging from traditional statistics to deep learning, for time series anomaly detection in the ``bigdl.chronos.detector.anomaly`` package.
To import a specific detector, use {algorithm name} + "Detector", then call ``fit`` to train the detector and ``anomaly_indexes`` to get the indexes of the anomalous data points.
.. code-block:: python

    from bigdl.chronos.detector.anomaly import DBScanDetector  # DBScan is the algorithm name
    from bigdl.chronos.data.repo_dataset import get_public_dataset

    if __name__ == "__main__":
        # use nyc_taxi public dataset
        train_data = get_public_dataset("nyc_taxi", with_split=False)
        # create a detector
        detector = DBScanDetector()
        # fit a detector
        detector.fit(train_data.to_pandas()['value'].to_numpy())
        # find the anomaly points
        anomaly_indexes = detector.anomaly_indexes()
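To show what a detector's ``fit`` and ``anomaly_indexes`` calls conceptually do, here is a small standard-library sketch. The ``ZScoreDetector`` class is hypothetical and uses a plain z-score threshold rather than DBSCAN; it only mirrors the calling pattern described above and is not a Chronos class.

```python
from statistics import mean, pstdev


class ZScoreDetector:
    """Hypothetical detector: flag points far from the mean in standard deviations."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self._indexes = None

    def fit(self, y):
        mu = mean(y)
        sigma = pstdev(y)
        if sigma == 0:
            # a constant series has no outliers under this rule
            self._indexes = []
        else:
            self._indexes = [i for i, v in enumerate(y)
                             if abs(v - mu) / sigma > self.threshold]
        return self

    def anomaly_indexes(self):
        if self._indexes is None:
            raise RuntimeError("call fit() before anomaly_indexes()")
        return self._indexes


# a flat series with a single spike at position 10
y = [10.0] * 10 + [100.0] + [10.0] * 10
detector = ZScoreDetector(threshold=3.0).fit(y)
print(detector.anomaly_indexes())
```

The returned values are positions in the input series, so they can be used directly to look up the flagged timestamps.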
.. grid:: 3
:gutter: 1
.. grid-item-card::
.. button-ref:: ../QuickStart/chronos-anomaly-detector
:color: primary
:expand:
:outline:
Quick Start
.. grid-item-card::
.. button-ref:: ./anomaly_detection
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/anomaly_detectors
:color: primary
:expand:
:outline:
API Document
Simulator(experimental)
-----------------------
Simulator is still under active development and its API is unstable.
.. grid:: 2
:gutter: 1
.. grid-item-card::
.. button-ref:: ./simulation
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/simulator
:color: primary
:expand:
:outline:
API Document