diff --git a/docs/readthedocs/source/doc/Chronos/Overview/anomaly_detection.md b/docs/readthedocs/source/doc/Chronos/Overview/anomaly_detection.md
index c021c6dc..8fc00aa2 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/anomaly_detection.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/anomaly_detection.md
@@ -4,20 +4,20 @@ Anomaly Detection detects abnormal samples in a given time series. _Chronos_ pro

View some example notebooks for [Datacenter AIOps][AIOps].

-## **1. ThresholdDetector**
+## 1. ThresholdDetector

ThresholdDetector detects anomalies based on a threshold. It can be used to detect anomalies in a given time series ([notebook][AIOps_anomaly_detect_unsupervised]), or used together with [Forecasters](#forecasting) to detect anomalies in newly arrived samples ([notebook][AIOps_anomaly_detect_unsupervised_forecast_based]).

View [ThresholdDetector API Doc](../../PythonAPI/Chronos/anomaly_detectors.html#chronos-model-anomaly-th-detector) for more details.

-## **2. AEDetector**
+## 2. AEDetector

AEDetector detects anomalies based on the reconstruction error of an autoencoder network.

View anomaly detection [notebook][AIOps_anomaly_detect_unsupervised] and [AEDetector API Doc](../../PythonAPI/Chronos/anomaly_detectors.html#chronos-model-anomaly-ae-detector) for more details.

-## **3. DBScanDetector**
+## 3. DBScanDetector

DBScanDetector uses the DBSCAN clustering algorithm for anomaly detection.

diff --git a/docs/readthedocs/source/doc/Chronos/Overview/chronos_known_issue.md b/docs/readthedocs/source/doc/Chronos/Overview/chronos_known_issue.md
index f00a4cc4..79a56bb1 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/chronos_known_issue.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/chronos_known_issue.md
@@ -1,5 +1,5 @@
# Chronos Known Issue
-## **1. Issue 1**
+## 1. Issue 1
**Problem description**

Numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject.

@@ -11,7 +11,7 @@ Numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 fro
---------------------------

-## **2. Issue 2**
+## 2. Issue 2
**Problem description**

NotImplementedError: Cannot convert a symbolic Tensor (encoder_lstm_8/strided_slice:0) to a numpy array.

@@ -22,7 +22,7 @@ NotImplementedError: Cannot convert a symbolic Tensor (encoder_lstm_8/strided_sl
---------------------------

-## **3. Issue 3**
+## 3. Issue 3
**Problem description**

StanModel object has no attribute 'fit_class', cause of pip, may be.

@@ -35,7 +35,7 @@ StanModel object has no attribute 'fit_class', cause of pip, may be.
---------------------------

-## **4. Issue 4**
+## 4. Issue 4
**Problem description**

Exception: No active RayContext. Please call init_orca_context to create a RayContext.

@@ -49,7 +49,7 @@ Exception: No active RayContext. Please call init_orca_context to create a RayCo
---------------------------

-## **5. Issue 5**
+## 5. Issue 5
**Problem description**

Sed: error while loading shared libraries: libunwind.so.8: cannot open shared object file: No such file or directory.

diff --git a/docs/readthedocs/source/doc/Chronos/Overview/data_processing_feature_engineering.md b/docs/readthedocs/source/doc/Chronos/Overview/data_processing_feature_engineering.md
index 5d21a0ca..1ca35634 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/data_processing_feature_engineering.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/data_processing_feature_engineering.md
@@ -4,7 +4,7 @@ Time series data is a special data formulation with its specific operations.

Users can create a [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) quickly from many raw data types, including pandas dataframes, parquet files, spark dataframes or xshards objects. [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) can be directly used in [`AutoTSEstimator`](../../PythonAPI/Chronos/autotsestimator.html#autotsestimator) and [forecasters](../../PythonAPI/Chronos/forecasters). It can also be converted to a pandas dataframe, numpy ndarray, pytorch dataloader or tensorflow dataset for various usages.

-## **1. Basic concepts**
+## 1. Basic concepts

A time series is a sequence of real values ordered by timestamp, while a time series dataset can be a combination of one or a large number of time series. A dataset may contain multiple time series since users may collect different time series over the same/different periods of time (e.g. an AIOps dataset may have CPU usage ratio and memory usage ratio data for two servers over a period of time; this dataset contains four time series).

@@ -22,7 +22,7 @@ All the preprocessing operations will be done on each independent time series(i.
```

-## **2. Create a TSDataset**
+## 2. Create a TSDataset

[`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) supports initializing from a pandas dataframe through [`TSDataset.from_pandas`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.from_pandas) or from a parquet file through [`TSDataset.from_parquet`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.from_parquet).

@@ -80,7 +80,7 @@ You can initialize a [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html

If you are building a prototype for your forecasting/anomaly detection task and you need to split your TSDataset into train/valid/test sets, you can use the `with_split` parameter. [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) or [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) supports splitting with ratios via `val_ratio` and `test_ratio`.

-## **3. Time series dataset preprocessing**
+## 3. Time series dataset preprocessing

[`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) supports [`impute`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.impute), [`deduplicate`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.deduplicate) and [`resample`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.resample). You can fill missing points with [`impute`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.impute) in different modes, remove completely duplicated records with [`deduplicate`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.deduplicate), and change the sampling frequency with [`resample`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.resample). [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) only supports [`impute`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.experimental.xshards_tsdataset.XShardsTSDataset.impute) for now.

A typical cascade call for preprocessing is:

@@ -99,7 +99,7 @@ A typical cascade call for preprocessing is:
tsdata.impute()
```

-## **4. Feature scaling**
+## 4. Feature scaling

Scaling all features to one distribution is important, especially when we want to train a machine learning/deep learning system. Scaling will make the training process much more stable.
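As a quick, minimal sketch of this scaling workflow (assuming `tsdata_train` and `tsdata_test` are `TSDataset` objects created as in section 2 and split via `with_split`; the scaler choice is just an example):

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
# fit the scaler on the training set only, then reuse it for the test set
tsdata_train.scale(scaler, fit=True)
tsdata_test.scale(scaler, fit=False)
```

After prediction, unscale the results on the test set so that metrics are computed on the original magnitude, as the next paragraph describes.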
Still, remember to unscale the prediction results at the end. [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) and [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) support all the scalers in sklearn through the [`scale`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.scale) and [`unscale`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.unscale) methods.

@@ -169,16 +169,16 @@ A typical call is:
unscaled_y = tsdata_test.unscale_xshards(y, key="y")
# calculate metric by unscaled_yhat and unscaled_y
```

-## **5. Feature generation**
+## 5. Feature generation

Other than the historical target data and extra features provided by users, some additional features can be generated automatically by [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html). [`gen_dt_feature`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.gen_dt_feature) helps users generate 10 datetime-related features (e.g. MONTH, WEEKDAY, ...). [`gen_global_feature`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.gen_global_feature) and [`gen_rolling_feature`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.gen_rolling_feature) are powered by tsfresh to generate aggregated features (e.g. min, max, ...) for each time series or for rolling windows, respectively.

-## **6. Sampling and exporting**
+## 6. Sampling and exporting

A time series dataset needs to be sampled and exported as a numpy ndarray/dataloader to be used in machine learning and deep learning models (e.g. forecasters, anomaly detectors, auto models, etc.).

```eval_rst
.. warning::
    You don't need to call any sampling or exporting methods introduced in this section when using ``AutoTSEstimator``.
```

-### **6.1 Roll sampling**
+### 6.1 Roll sampling

Roll sampling (or sliding window sampling) is useful when you want to train an RR-type supervised deep learning forecasting model. It works as the [diagram](#RR-forecast-image) shows.

@@ -228,7 +228,7 @@ A typical call of [`roll`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.
forecaster.fit(data)
```

-### **6.2 Pandas Exporting**
+### 6.2 Pandas Exporting

Now we support exporting to a pandas dataframe through `to_pandas()` for users to carry out their own transformations. Here is an example of using only one time series for anomaly detection.
```python
# anomaly detector on "target" col
@@ -237,7 +237,7 @@ anomaly_detector.fit(x)
```
View [TSDataset API Doc](../../PythonAPI/Chronos/tsdataset.html#) for more details.

-## **7. Built-in Dataset**
+## 7. Built-in Dataset

The built-in datasets support downloading and preprocessing a public dataset and returning it as a `TSDataset` object.

diff --git a/docs/readthedocs/source/doc/Chronos/Overview/forecasting.md b/docs/readthedocs/source/doc/Chronos/Overview/forecasting.md
index ba1f5236..7ca0ec9d 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/forecasting.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/forecasting.md
@@ -7,7 +7,7 @@ There're three ways to do forecasting:
- Use [**auto forecasting models**](#use-auto-forecasting-model) with auto hyperparameter optimization.
- Use [**standalone forecasters**](#use-standalone-forecaster-pipeline).

-#### **0. Supported Time Series Forecasting Model**
+#### 0. Supported Time Series Forecasting Model

- `Model`: Model name.
- `Style`: Forecasting model style. Detailed information will be stated in [this section](#time-series-forecasting-concepts).

@@ -41,15 +41,15 @@ There're three ways to do forecasting:

-#### **1. Time Series Forecasting Concepts**
+#### 1. Time Series Forecasting Concepts
Time series forecasting is one of the most popular tasks on time series data. **In short, forecasting aims at predicting the future by using the knowledge you can learn from the history.**

-##### **1.1 Traditional Statistical(TS) Style**
+##### 1.1 Traditional Statistical(TS) Style
Traditionally, the time series forecasting problem was formulated with rich mathematical fundamentals and statistical models. Typically, one model can only handle one time series: it fits on the whole time series up to the last observed timestamp and predicts the next few steps. Training (fit) is needed every time you change the last observed timestamp.

![](../Image/forecast-TS.png)

-##### **1.2 Regular Regression(RR) Style**
+##### 1.2 Regular Regression(RR) Style
In recent years, common deep learning architectures (e.g. RNN, CNN, Transformer, etc.) have been successfully applied to the forecasting problem. In this style, forecasting is transformed into a supervised learning regression problem, and a single model can predict several time series. Typically, a sampling process based on a sliding window is needed; some terminology is explained as follows:

- `lookback` / `past_seq_len`: the length of historical data along time. This number is tunable.

@@ -60,7 +60,7 @@ Recent years, common deep learning architectures (e.g. RNN, CNN, Transformer, et

![](../Image/forecast-RR.png)

-#### **2. Use AutoTS Pipeline**
+#### 2. Use AutoTS Pipeline
For the AutoTS Pipeline, we will leverage `AutoTSEstimator`, `TSPipeline` and preferably `TSDataset`. A typical usage of the AutoTS pipeline basically contains 3 steps.
1. Prepare a `TSDataset` or customized data creator.
2. Init an `AutoTSEstimator` and call `.fit()` on the data.

@@ -75,7 +75,7 @@ For AutoTS Pipeline, we will leverage `AutoTSEstimator`, `TSPipeline` and prefer
```
View [Quick Start](../QuickStart/chronos-autotsest-quickstart.html) for a more detailed example.

-##### **2.1 Prepare dataset**
+##### 2.1 Prepare dataset
`AutoTSEstimator` supports 2 types of data input.

You can easily prepare your data in `TSDataset` (recommended). You may refer to [here](#TSDataset) for detailed information on preparing your `TSDataset` with proper data processing and feature generation. Here is a typical `TSDataset` preparation.

@@ -98,7 +98,7 @@ from torch.utils.data import DataLoader
def training_data_creator(config):
    return DataLoader(..., batch_size=config['batch_size'])
```
-##### **2.2 Create an AutoTSEstimator**
+##### 2.2 Create an AutoTSEstimator
`AutoTSEstimator` depends on the [Distributed Hyper-parameter Tuning](../../Orca/Overview/distributed-tuning.html) supported by Project Orca. It also provides time-series-only functionalities and optimizations. Here is a typical initialization process.
```python
import bigdl.orca.automl.hp as hp
@@ -115,7 +115,7 @@ We prebuild three default search spaces for each built-in model, which you can use

`selected_features` is set to `"auto"` by default, where the `AutoTSEstimator` will find the best subset of extra features to help the forecasting task.

-##### **2.3 Fit on AutoTSEstimator**
+##### 2.3 Fit on AutoTSEstimator
Fitting on `AutoTSEstimator` is fairly easy. A `TSPipeline` will be returned once fitting is completed.
```python
ts_pipeline = auto_estimator.fit(data=tsdata_train,
@@ -124,7 +124,7 @@
                                 epochs=5)
```
For detailed information and settings, please refer to the [AutoTSEstimator API doc](../../PythonAPI/Chronos/autotsestimator.html#id1).

-##### **2.4 Development on TSPipeline**
+##### 2.4 Development on TSPipeline
You may carry out predict, evaluate, incremental training or save/load for further development.
```python
# predict with the best trial
@@ -154,7 +154,7 @@ Detailed information please refer to [TSPipeline API doc](../../PythonAPI/Chrono
    Incremental fitting on TSPipeline just updates the model weights the standard way, which does not involve AutoML.
```

-#### **3. Use Standalone Forecaster Pipeline**
+#### 3. Use Standalone Forecaster Pipeline

_Chronos_ provides a set of standalone time series forecasters without AutoML support, including deep learning models as well as traditional statistical models.

@@ -173,28 +173,28 @@ The input data can be easily obtained from `TSDataset`.
View [Quick Start](../QuickStart/chronos-tsdataset-forecaster-quickstart.md) for a more detailed example. Refer to the [API docs](../../PythonAPI/Chronos/forecasters.html) of each Forecaster for detailed usage instructions and examples.

-##### **3.1 LSTMForecaster**
+##### 3.1 LSTMForecaster

LSTMForecaster wraps a vanilla LSTM model, and is suitable for univariate time series forecasting.

View Network Traffic Prediction [notebook][network_traffic_model_forecasting] and [LSTMForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#lstmforecaster) for more details.

-##### **3.2 Seq2SeqForecaster**
+##### 3.2 Seq2SeqForecaster

Seq2SeqForecaster wraps a sequence-to-sequence model based on LSTM, and is suitable for multivariate and multi-step time series forecasting.

View [Seq2SeqForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#seq2seqforecaster) for more details.

-##### **3.3 TCNForecaster**
+##### 3.3 TCNForecaster

The Temporal Convolutional Network (TCN) is a neural network that uses a convolutional architecture rather than recurrent networks. It supports multi-step and multi-variate cases. Causal convolutions enable large-scale parallel computing, which gives TCN less inference time than RNN-based models such as LSTM.

View Network Traffic multivariate multistep Prediction [notebook][network_traffic_multivariate_multistep_tcnforecaster] and [TCNForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#tcnforecaster) for more details.

-##### **3.4 MTNetForecaster**
+##### 3.4 MTNetForecaster

```eval_rst
.. note::
@@ -209,14 +209,14 @@ MTNetForecaster wraps a MTNet model. The model architecture mostly follows the [

View Network Traffic Prediction [notebook][network_traffic_model_forecasting] and [MTNetForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#mtnetforecaster) for more details.

-##### **3.5 TCMFForecaster**
+##### 3.5 TCMFForecaster

TCMFForecaster wraps a model architecture that follows the implementation of the [DeepGLO paper](https://arxiv.org/abs/1905.03806) with slight modifications. It is especially suitable for extremely high-dimensional (up to millions of variables) multivariate time series forecasting.

View High-dimensional Electricity Data Forecasting [example][run_electricity] and [TCMFForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#tcmfforecaster) for more details.

-##### **3.6 ARIMAForecaster**
+##### 3.6 ARIMAForecaster

```eval_rst
.. note::
@@ -231,7 +231,7 @@ ARIMAForecaster wraps an ARIMA model and is suitable for univariate time series f

View [ARIMAForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#arimaforecaster) for more details.

-##### **3.7 ProphetForecaster**
+##### 3.7 ProphetForecaster

```eval_rst
.. note::
@@ -252,11 +252,11 @@ ProphetForecaster wraps the Prophet model ([site](https://github.com/facebook/pr

View Stock Prediction [notebook][stock_prediction_prophet] and [ProphetForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#prophetforecaster) for more details.

-##### **3.8 NBeatsForecaster**
+##### 3.8 NBeatsForecaster

Neural basis expansion analysis for interpretable time series forecasting ([N-BEATS](https://arxiv.org/abs/1905.10437)) is a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers. N-BEATS can solve univariate time series point forecasting problems, is interpretable, and is fast to train.

-#### **4. Use Auto forecasting model**
+#### 4. Use Auto forecasting model
Auto forecasting models are designed to be used exactly the same way as Forecasters. The only difference is that you can set the hyperparameters to hp search functions, and the `.fit()` method will search for the best hyperparameter setting.
```python
# set hyperparameters in hp search function, loss, metric...

diff --git a/docs/readthedocs/source/doc/Chronos/Overview/install.md b/docs/readthedocs/source/doc/Chronos/Overview/install.md
index 3705f2af..7de074be 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/install.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/install.md
@@ -2,7 +2,7 @@

---

-#### **OS and Python version requirement**
+#### OS and Python version requirement

```eval_rst

@@ -20,7 +20,7 @@

-#### **Install using Conda**
+#### Install using Conda

We recommend using conda to manage the Chronos python environment. For more information about Conda, refer to [here](https://docs.conda.io/en/latest/miniconda.html#).

Select your preferences in the panel below to find the proper install command. Then run the install command as shown in the example below.

diff --git a/docs/readthedocs/source/doc/Chronos/Overview/simulation.md b/docs/readthedocs/source/doc/Chronos/Overview/simulation.md
index 373a6757..03fcc3bd 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/simulation.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/simulation.md
@@ -7,7 +7,7 @@ Chronos provides simulators to generate synthetic time series data for users who

``DPGANSimulator`` is the only simulator Chronos provides at the moment; more simulators are on their way.
```

-## **1. DPGANSimulator**
+## 1. DPGANSimulator

`DPGANSimulator` adopts DoppelGANger, proposed in [Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions](http://arxiv.org/abs/1909.13403). It is a data-driven, unsupervised method based on a deep learning model with a GAN (Generative Adversarial Networks) structure. The model features a pair of separate attribute and feature generators with their corresponding discriminators. `DPGANSimulator` also supports a rich and comprehensive input (training) data format and outperforms other algorithms on many evaluation metrics.
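As a rough, hedged sketch of the training-and-sampling workflow this paragraph describes (the constructor arguments, array names and shapes below are illustrative assumptions rather than the authoritative signature; consult the DPGANSimulator API doc for the exact interface):

```python
from bigdl.chronos.simulator import DPGANSimulator

# illustrative shapes: data_feature (n_samples, max_len, feature_dim),
# data_attribute (n_samples, num_attributes), data_gen_flag (n_samples, max_len)
simulator = DPGANSimulator(L_max=32,         # assumed: max sequence length
                           sample_len=8,     # assumed: length generated per step
                           feature_dim=1,
                           num_real_attribute=3)
simulator.fit(data_feature, data_attribute, data_gen_flag, epoch=10)

# generate synthetic series once training finishes
features, attributes, gen_flags, lengths = simulator.generate(sample_num=8)
```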
```eval_rst

diff --git a/docs/readthedocs/source/doc/Chronos/Overview/speed_up.md b/docs/readthedocs/source/doc/Chronos/Overview/speed_up.md
index 52f729be..6eada0c4 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/speed_up.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/speed_up.md
@@ -10,10 +10,10 @@ We will focus on **single node acceleration for forecasting models' training and

You may refer to other pages listed above.

-### **1. Overview**
+### 1. Overview
Time series models, especially deep learning models, often suffer from slow training and unsatisfying inference speed. Chronos integrates many optimized libraries and best known methods (BKMs) for performance improvement on built-in models and customized models.

-### **2. Training Acceleration**
+### 2. Training Acceleration
Training acceleration is transparent in Chronos's API. Transparency means that Chronos users will enjoy the acceleration without changing their code (unless some expert users want to change some advanced settings).
```eval_rst
.. note::
@@ -21,7 +21,7 @@ Training Acceleration is transparent in Chronos's API. Transparentness means tha
    Chronos will automatically utilize the computation resources on the hardware. This may include multi-process training on a single node. Using this header will prevent many strange behaviors.
```

-#### **2.1 `Forecaster` Training Acceleration**
+#### 2.1 `Forecaster` Training Acceleration
Currently, transparent acceleration for `LSTMForecaster`, `Seq2SeqForecaster`, `TCNForecaster` and `NBeatsForecaster` is **automatically enabled** and tested. Chronos will set various environment variables and configure multi-process training according to the hardware parameters (e.g. number of cores, ...).

Currently, this function is under active development and **some expert users may want to change some configs or disable some acceleration tricks**. Here are some instructions.

@@ -43,7 +43,7 @@ forecaster.use_ipex = True # enable ipex during training
forecaster.use_ipex = False # disable ipex during training
```

-#### **2.2 Customized Model Training Acceleration**
+#### 2.2 Customized Model Training Acceleration
We provide an optimized pytorch-lightning Trainer, `TSTrainer`, to accelerate customized time series models defined in PyTorch. A typical use case is using `pytorch-forecasting`'s built-in models (they are defined as pytorch-lightning LightningModules) together with the Chronos `TSTrainer` to accelerate the training process.

`TSTrainer` requires very few changes to your original code. Here is a quick guide:
@@ -61,13 +61,13 @@ trainer = Trainer(...

We have examples adapted from `pytorch-forecasting`'s examples to show the significant speed-up by using `TSTrainer` in our [use-case](https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/pytorch-forecasting).

-#### **2.3 Auto Tuning Acceleration**
+#### 2.3 Auto Tuning Acceleration
We are working on the acceleration of `AutoModel` and `AutoTSEstimator`. Please unset the environment by:
```bash
source bigdl-nano-unset-env
```

-### **3. Inference Acceleration**
+### 3. Inference Acceleration
Inference has become a critical part of a time series model's performance. It may be divided into two parts:
- Throughput: how many samples can be predicted in a certain amount of time.
- Latency: how much time is used to predict 1 sample.

@@ -84,22 +84,22 @@ Typically, throughput and latency are a trade-off pair.
We have three optimizatio
``pip install neural-compressor==1.8.1``
```

-#### **3.1 `Forecaster` Inference Acceleration**
-##### **3.1.1 Default Acceleration**
+#### 3.1 `Forecaster` Inference Acceleration
+##### 3.1.1 Default Acceleration
Nothing needs to be done. Chronos has deployed acceleration for inference. **Some expert users may want to change some configs or disable some acceleration tricks.** Here are some instructions:

Users may unset the environment by:
```bash
source bigdl-nano-unset-env
```

-##### **3.1.2 ONNX Runtime**
+##### 3.1.2 ONNX Runtime
LSTM, TCN, Seq2seq and NBeats have supported ONNX in their forecasters. When users use these built-in models, they may call `predict_with_onnx`/`evaluate_with_onnx` for prediction or evaluation. They may also call `export_onnx_file` to export the ONNX model file and `build_onnx` to change the onnxruntime's settings (not necessary).
```python
f = Forecaster(...)
f.fit(...)
f.predict_with_onnx(...)
```

-##### **3.1.3 Quantization**
+##### 3.1.3 Quantization
LSTM, TCN and NBeats have supported quantization in their forecasters.
```python
# init
@@ -127,15 +127,15 @@ f.load(checkpoint_file="fp32.model"
```
Please refer to the [Forecaster API Docs](../../PythonAPI/Chronos/forecasters.html) for details.

-#### **3.2 `TSPipeline` Inference Acceleration**
+#### 3.2 `TSPipeline` Inference Acceleration
Basically the same as [`Forecaster`](#31-forecaster-inference-acceleration)
-##### **3.1.1 Default Acceleration**
+##### 3.2.1 Default Acceleration
Basically the same as [`Forecaster`](#31-forecaster-inference-acceleration)
-##### **3.1.2 ONNX Runtime**
+##### 3.2.2 ONNX Runtime
```python
tsppl.predict_with_onnx(...)
```
-##### **3.1.3 Quantization**
+##### 3.2.3 Quantization
```python
tsppl.quantize(...)
tsppl.predict/predict_with_onnx(test_data, quantize=True/False)

diff --git a/docs/readthedocs/source/doc/Chronos/Overview/useful_functionalities.md b/docs/readthedocs/source/doc/Chronos/Overview/useful_functionalities.md
index c45acfc8..64c270b9 100644
--- a/docs/readthedocs/source/doc/Chronos/Overview/useful_functionalities.md
+++ b/docs/readthedocs/source/doc/Chronos/Overview/useful_functionalities.md
@@ -1,7 +1,7 @@
# Distributed Processing

-#### **Distributed training**
+#### Distributed training
LSTM, TCN and Seq2seq users can easily train their forecasters in a distributed fashion to **handle extra-large datasets and utilize a cluster**. The functionality is powered by Project Orca.
```python
f = Forecaster(..., distributed=True)
@@ -10,7 +10,7 @@ f.predict(...)
f.to_local()  # collect the forecaster to single node
f.predict_with_onnx(...)  # onnxruntime only supports single node
```
-#### **Distributed Data processing: XShardsTSDataset**
+#### Distributed Data processing: XShardsTSDataset
```eval_rst
.. warning::
    ``XShardsTSDataset`` is still experimental.
```
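For orientation, here is a rough sketch of distributed data processing with `XShardsTSDataset` (the module path, file path and column names are assumptions for illustration; consult the XShardsTSDataset API doc for the exact interface):

```python
from bigdl.orca.data.pandas import read_csv
from bigdl.chronos.data.experimental import XShardsTSDataset

# read a set of csv files into an XShards of pandas dataframes
shards = read_csv("hdfs://path/to/data")  # hypothetical path
tsdata = XShardsTSDataset.from_xshards(shards, dt_col="datetime",
                                       target_col="value", id_col="id")
tsdata.impute()  # impute is the only preprocessing op supported for now
```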
diff --git a/docs/readthedocs/source/doc/Chronos/QuickStart/chronos-autotsest-quickstart.md b/docs/readthedocs/source/doc/Chronos/QuickStart/chronos-autotsest-quickstart.md
index 417c5280..a8a8766b 100644
--- a/docs/readthedocs/source/doc/Chronos/QuickStart/chronos-autotsest-quickstart.md
+++ b/docs/readthedocs/source/doc/Chronos/QuickStart/chronos-autotsest-quickstart.md
@@ -8,13 +8,13 @@

**In this guide we will demonstrate how to use _Chronos AutoTSEstimator_ and _Chronos TSPipeline_ to auto tune a time series forecasting task and handle the whole model development process easily.**

-### **Introduction**
+### Introduction

Chronos provides `AutoTSEstimator` as a highly integrated solution for time series forecasting tasks with hyperparameter autotuning, auto feature selection and auto preprocessing. Users can prepare a `TSDataset` (recommended, used in this notebook) or their own data creator as input data. By constructing an `AutoTSEstimator` and calling `fit` on the data, a `TSPipeline` containing the best model and pre/post data processing will be returned for further development or deployment.

`AutoTSEstimator` only supports the LSTM, TCN, and Seq2seq built-in models and 3rd-party models for now.

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment

We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/chronos.html#install) for more details.

@@ -24,7 +24,7 @@ conda activate my_env
pip install --pre --upgrade bigdl-chronos[all]
```

-### **Step 1: Init Orca Context**
+### Step 1: Init Orca Context
```python
if args.cluster_mode == "local":
    init_orca_context(cluster_mode="local", cores=4)  # run in local mode
@@ -37,7 +37,7 @@ This is the only place where you need to specify local or distributed mode. View

**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster. View the [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.

-### **Step 2: Prepare a TSDataset**
+### Step 2: Prepare a TSDataset
Prepare a `TSDataset` and call necessary operations on it.
```python
from bigdl.chronos.data import TSDataset
@@ -58,7 +58,7 @@ Please call `.gen_dt_feature()` (recommended), `.gen_rolling_feature()`, and `gen

For detailed information, please refer to the [TSDataset API doc](../../PythonAPI/Chronos/tsdataset.html) and [Time series data basic concepts](../Overview/data_processing_feature_engineering.html).

-### **Step 3: Create an AutoTSEstimator**
+### Step 3: Create an AutoTSEstimator

```python
import bigdl.orca.automl.hp as hp
@@ -74,7 +74,7 @@ We prebuild three default search spaces for each built-in model, which you can use

For detailed information, please refer to the [AutoTSEstimator API doc](../../PythonAPI/Chronos/autotsestimator.html#autotsestimator) and basic concepts [here](../Overview/forecasting.html#use-autots-pipeline).

-### **Step 4: Fit with AutoTSEstimator**
+### Step 4: Fit with AutoTSEstimator
```python
# fit with AutoTSEstimator for a returned TSPipeline
ts_pipeline = auto_estimator.fit(data=tsdata_train,  # train dataset
@@ -82,7 +82,7 @@ ts_pipeline = auto_estimator.fit(data=tsdata_train, # train dataset
                                 epochs=5)  # number of epochs to train in each trial
```
For detailed information, please refer to the [AutoTSEstimator API doc](../../PythonAPI/Chronos/autotsestimator.html#autotsestimator).

-### **Step 5: Further deployment with TSPipeline**
+### Step 5: Further deployment with TSPipeline
The `TSPipeline` will apply the same preprocessing and corresponding postprocessing operations to the test data. You may carry out predict, evaluate or save/load for further development.
```python
# predict with the best trial
@@ -106,7 +106,7 @@ loaded_ppl = TSPipeline.load(my_ppl_file_path)
```
For detailed information, please refer to the [TSPipeline API doc](../../PythonAPI/Chronos/tsdataset.html).

-### **Optional: Examine the leaderboard visualization**
+### Optional: Examine the leaderboard visualization
To view the evaluation results of the "not chosen" trials, find insights, or even improve your search space for a new autotuning task, we provide a leaderboard through tensorboard.
```python
# show a tensorboard view

diff --git a/docs/readthedocs/source/doc/Chronos/QuickStart/index.md b/docs/readthedocs/source/doc/Chronos/QuickStart/index.md
index 5c507129..7bd441bb 100644
--- a/docs/readthedocs/source/doc/Chronos/QuickStart/index.md
+++ b/docs/readthedocs/source/doc/Chronos/QuickStart/index.md
@@ -1,7 +1,5 @@
# Chronos Examples

-
-

```eval_rst
.. raw:: html

diff --git a/docs/readthedocs/source/doc/DLlib/Overview/dllib.md b/docs/readthedocs/source/doc/DLlib/Overview/dllib.md
index 6ce6269b..204ab135 100644
--- a/docs/readthedocs/source/doc/DLlib/Overview/dllib.md
+++ b/docs/readthedocs/source/doc/DLlib/Overview/dllib.md
@@ -16,7 +16,7 @@ It includes the functionalities of the [original BigDL](https://github.com/intel

This section shows a single example of how to use dllib to build a deep learning application on Spark, using Keras APIs.

-#### **LeNet Model on MNIST using Keras-Style API**
+#### LeNet Model on MNIST using Keras-Style API

This tutorial is an explanation of what is happening in the [lenet](https://github.com/intel-analytics/BigDL/tree/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/keras) example

@@ -75,7 +75,7 @@ model.fit(trainSet, nbEpoch = param.maxEpoch, validationData = validationSet)

## Python Example

-#### **Initialize NN Context**
+#### Initialize NN Context

`NNContext` is the main entry for provisioning the dllib program on the underlying cluster (such as a K8s or Hadoop cluster), or just on a single laptop.

@@ -93,7 +93,7 @@ In `init_nncontext`, the user may specify cluster mode for the dllib program:

The dllib program simply runs `init_nncontext` on the local machine, which will automatically provision the runtime Python environment and distributed execution engine on the underlying computing environment (such as a single laptop, a large K8s or Hadoop cluster, etc.).

-#### **Autograd Examples using bigdl-dllb keras Python API**
+#### Autograd Examples using bigdl-dllib keras Python API

This tutorial describes the [Autograd](https://github.com/intel-analytics/BigDL/tree/branch-2.0/python/dllib/examples/autograd) examples.

diff --git a/docs/readthedocs/source/doc/DLlib/Overview/visualization.md b/docs/readthedocs/source/doc/DLlib/Overview/visualization.md
index b9b6f268..61867302 100644
--- a/docs/readthedocs/source/doc/DLlib/Overview/visualization.md
+++ b/docs/readthedocs/source/doc/DLlib/Overview/visualization.md
@@ -1,4 +1,4 @@
-## **Visualizing training with TensorBoard**
+## Visualizing training with TensorBoard
With the summary info generated, we can then use [TensorBoard](https://pypi.python.org/pypi/tensorboard) to visualize the behaviors of the BigDL program.

* **Installing TensorBoard**

diff --git a/docs/readthedocs/source/doc/Nano/Overview/known_issues.md b/docs/readthedocs/source/doc/Nano/Overview/known_issues.md
index 1fbaf098..f8807945 100644
--- a/docs/readthedocs/source/doc/Nano/Overview/known_issues.md
+++ b/docs/readthedocs/source/doc/Nano/Overview/known_issues.md
@@ -1,8 +1,8 @@
# Nano Known Issues

-## **PyTorch Issues**
+## PyTorch Issues

-### **AttributeError: module 'distutils' has no attribute 'version'**
+### AttributeError: module 'distutils' has no attribute 'version'

This is usually because the latest setuptools is not compatible with PyTorch 1.9.

For example, if your `setuptools` is installed by conda, you can run:

conda install setuptools==58.0.4
```

-### **error while loading shared libraries: libunwind.so.8**
+### error while loading shared libraries: libunwind.so.8

You may see this error message when running `source bigdl-nano-init`
```
You can use the following command to fix this issue.

* `apt-get install libunwind8-dev`

-### **Bus error (core dumped) in multi-instance training with spawn distributed backend**
+### Bus error (core dumped) in multi-instance training with spawn distributed backend

This is usually because you did not set a large enough shared memory size in your docker container.

@@ -46,18 +46,18 @@ spec:
      name: cache-volume
```

-## **TensorFlow Issues**
+## TensorFlow Issues

-### **Nano keras multi-instance training currently does not suport tensorflow dataset.from_generators, numpy_function, py_function**
+### Nano keras multi-instance training currently does not support tensorflow dataset.from_generators, numpy_function, py_function

Nano keras multi-instance training will serialize the TensorFlow dataset object into a `graph.pb` file, which does not work with `dataset.from_generators`, `dataset.numpy_function`, `dataset.py_function` due to limitations in TensorFlow.

-### **RuntimeError: Inter op parallelism cannot be modified after initialization**
+### RuntimeError: Inter op parallelism cannot be modified after initialization

If you meet this error when importing `bigdl.nano.tf`, it could be that you have already run some TensorFlow code that sets the inter/intra op parallelism, such as `tfds.load`. You can work around this issue by importing `bigdl.nano.tf` before running TensorFlow code. See https://github.com/tensorflow/tensorflow/issues/57812 for more information.

-## **Ray Issues**
+## Ray Issues

-### **protobuf version error**
+### protobuf version error

Now `pip install ray[default]==1.11.0` will install `google-api-core==2.10.0`, which depends on `protobuf>=3.20.1`. However, nano depends on `protobuf==3.19.4`, so if we install `ray` after installing `bigdl-nano`, pip will reinstall `protobuf==4.21.5`, which causes errors.

diff --git a/docs/readthedocs/source/doc/Nano/Overview/nano.md b/docs/readthedocs/source/doc/Nano/Overview/nano.md
index f6280bde..9d6c29e7 100644
--- a/docs/readthedocs/source/doc/Nano/Overview/nano.md
+++ b/docs/readthedocs/source/doc/Nano/Overview/nano.md
@@ -5,7 +5,7 @@ BigDL-Nano is a Python package to transparently accelerate PyTorch and TensorFlo

----

-### **PyTorch Bite-sized Example**
+### PyTorch Bite-sized Example

BigDL-Nano supports both PyTorch and PyTorch Lightning models, and most optimizations require only changing a few "import" lines in your code and adding a few flags.

@@ -34,10 +34,10 @@ class MyNano(TorchNano):
MyNano(use_ipex=True, num_processes=2).train()
```

-For more details on the BigDL-Nano's PyTorch usage, please refer to the [PyTorch Training](../QuickStart/pytorch_train.md) and [PyTorch Inference](../QuickStart/pytorch_inference.md) page.
+For more details on the BigDL-Nano's PyTorch usage, please refer to the [PyTorch Training](./pytorch_train.md) and [PyTorch Inference](./pytorch_inference.md) page.

-### **TensorFlow Bite-sized Example**
+### TensorFlow Bite-sized Example

BigDL-Nano supports the `tensorflow.keras` API, and most optimizations require only changing a few "import" lines in your code and adding a few flags.

@@ -67,4 +67,4 @@ model.compile(optimizer='adam',
model.fit(x_train, y_train, epochs=5, num_processes=4)
```

-For more details on the BigDL-Nano's Tensorflow usage, please refer to the [TensorFlow Training](../QuickStart/tensorflow_train.md) and [TensorFlow Inference](../QuickStart/tensorflow_inference.md) page.
+For more details on the BigDL-Nano's TensorFlow usage, please refer to the [TensorFlow Training](./tensorflow_train.md) and [TensorFlow Inference](./tensorflow_inference.md) page.

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_onnxruntime.md b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_onnxruntime.md
index dd199949..59ed3bbd 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_onnxruntime.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_onnxruntime.md
@@ -2,7 +2,7 @@

**In this guide we will describe how to apply ONNXRuntime Acceleration on your inference pipeline with the APIs delivered by BigDL-Nano in 4 simple steps**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

```bash
@@ -18,7 +18,7 @@ Before you start with the ONNXRuntime accelerator, you need to install some ONNX packages:
```bash
pip install onnx onnxruntime
```
-### **Step 1: Load the data**
+### Step 1: Load the data
```python
import torch
from torchvision.io import read_image
@@ -45,7 +45,7 @@ val_dataset = torch.utils.data.Subset(val_dataset, indices[-val_size:])
train_dataloader = DataLoader(train_dataset, batch_size=32)
```

-### **Step 2: Prepare the Model**
+### Step 2: Prepare the Model
```python
import torch
from torchvision.models import resnet18
@@ -70,7 +70,7 @@ y_hat = model_ft(x)
y_hat.argmax(dim=1)
```

-### **Step 3: Apply ONNXRumtime Acceleration**
+### Step 3: Apply ONNXRuntime Acceleration
When you're ready, you can simply append the following part to enable your ONNXRuntime acceleration.
```python
# trace your model as an ONNXRuntime model

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_openvino.md b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_openvino.md
index 80a78ba7..7163594e 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_openvino.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_openvino.md
@@ -2,7 +2,7 @@

**In this guide we will describe how to apply OpenVINO Acceleration on your inference pipeline with the APIs delivered by BigDL-Nano in 4 simple steps**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

```bash
@@ -19,7 +19,7 @@ To use OpenVINO acceleration, you have to install the OpenVINO toolkit:
pip install openvino-dev
```

-### **Step 1: Load the data**
+### Step 1: Load the data
```python
import torch
from torchvision.io import read_image
@@ -46,7 +46,7 @@ val_dataset = torch.utils.data.Subset(val_dataset, indices[-val_size:])
train_dataloader = DataLoader(train_dataset, batch_size=32)
```

-### **Step 2: Prepare the Model**
+### Step 2: Prepare the Model
```python
import torch
from torchvision.models import resnet18
@@ -71,7 +71,7 @@ y_hat = model_ft(x)
y_hat.argmax(dim=1)
```

-### **Step 3: Apply OpenVINO Acceleration**
+### Step 3: Apply OpenVINO Acceleration
When you're ready, you can simply append the following part to enable your OpenVINO acceleration.
```python
# trace your model as an OpenVINO model

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc.md b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc.md
index 1f7c443c..ec4b695c 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc.md
@@ -2,7 +2,7 @@

**In this guide we will describe how to obtain a quantized model with the APIs delivered by BigDL-Nano in 4 simple steps**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

```bash
@@ -18,7 +18,7 @@ By default, Intel Neural Compressor is not installed with BigDL-Nano. So if you
```bash
pip install neural-compressor==1.11
```
-### **Step 1: Load the data**
+### Step 1: Load the data
```python
import torch
from torchvision.io import read_image
@@ -47,7 +47,7 @@ val_dataset = torch.utils.data.Subset(val_dataset, indices[-val_size:])
train_dataloader = DataLoader(train_dataset, batch_size=32)
```

-### **Step 2: Prepare the Model**
+### Step 2: Prepare the Model
```python
import torch
from torchvision.models import resnet18
@@ -73,7 +73,7 @@ y_hat = model_ft(x)
y_hat.argmax(dim=1)
```

-### **Step 3: Quantization using Intel Neural Compressor**
+### Step 3: Quantization using Intel Neural Compressor
Quantization is widely used to compress models to a lower precision, which not only reduces the model size but also accelerates inference. BigDL-Nano provides the `Trainer.quantize()` API for users to quickly obtain a quantized model with accuracy control by specifying a few arguments. Without an extra accelerator, `Trainer.quantize()` returns a pytorch module with the desired precision and accuracy.

You can add quantization as below:

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc_onnx.md b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc_onnx.md
index 16cb1116..1df32e84 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc_onnx.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_inc_onnx.md
@@ -2,7 +2,7 @@

**In this guide we will describe how to obtain a quantized model running inference in the ONNXRuntime engine with the APIs delivered by BigDL-Nano in 4 simple steps**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

```bash
@@ -19,7 +19,7 @@ To quantize a model using ONNXRuntime as the backend, it is required to install Intel
pip install neural-compressor==1.11
pip install onnx onnxruntime onnxruntime-extensions
```
-### **Step 1: Load the data**
+### Step 1: Load the data
```python
import torch
from torchvision.io import read_image
@@ -46,7 +46,7 @@ val_dataset = torch.utils.data.Subset(val_dataset, indices[-val_size:])
train_dataloader = DataLoader(train_dataset, batch_size=32)
```

-### **Step 2: Prepare your Model**
+### Step 2: Prepare your Model
```python
import torch
from torchvision.models import resnet18
@@ -72,7 +72,7 @@ y_hat = model_ft(x)
y_hat.argmax(dim=1)
```

-### **Step 3: Quantization with ONNXRuntime accelerator**
+### Step 3: Quantization with ONNXRuntime accelerator
With the ONNXRuntime accelerator, `Trainer.quantize()` will return a model with compressed precision but running inference in the ONNXRuntime engine. You can add quantization as below:

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_openvino.md b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_openvino.md
index e2afa8e7..b14f686c 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_openvino.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_quantization_openvino.md
@@ -2,7 +2,7 @@

**In this guide we will describe how to obtain a quantized model with the APIs delivered by BigDL-Nano in 4 simple steps**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

```bash
@@ -19,7 +19,7 @@ The POT (Post-training Optimization Tools) is provided by the OpenVINO toolkit. To us
pip install openvino-dev
```

-### **Step 1: Load the data**
+### Step 1: Load the data
```python
import torch
from torchvision.io import read_image
@@ -48,7 +48,7 @@ val_dataset = torch.utils.data.Subset(val_dataset, indices[-val_size:])
train_dataloader = DataLoader(train_dataset, batch_size=32)
```

-### **Step 2: Prepare the Model**
+### Step 2: Prepare the Model
```python
import torch
from torchvision.models import resnet18
@@ -73,7 +73,7 @@ y_hat = model_ft(x)
y_hat.argmax(dim=1)
```

-### **Step 3: Quantization using Post-training Optimization Tools**
+### Step 3: Quantization using Post-training Optimization Tools
`accelerator='openvino'` means using OpenVINO POT to do quantization. The quantization can be added as below:
```python
from torchmetrics import Accuracy

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_train_quickstart.md b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_train_quickstart.md
index 1a2d3113..7becb1c8 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_train_quickstart.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/pytorch_train_quickstart.md
@@ -2,7 +2,7 @@

**In this guide we will describe how to scale out PyTorch programs using Nano Trainer in 5 simple steps**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.
@@ -16,7 +16,7 @@ source bigdl-nano-init
pip install lightning-bolts
```

-### **Step 1: Import BigDL-Nano**
+### Step 1: Import BigDL-Nano
The PyTorch Trainer (`bigdl.nano.pytorch.Trainer`) is the place where we integrate most optimizations. It extends PyTorch Lightning's Trainer and has a few more parameters and methods specific to BigDL-Nano. The Trainer can be directly used to train a `LightningModule`.
```python
from bigdl.nano.pytorch import Trainer
```
Computer Vision tasks often need a data processing pipeline that sometimes constitutes complicated work.
```python
from bigdl.nano.pytorch.vision import transforms
```

-### **Step 2: Load the Data**
+### Step 2: Load the Data
You can define the datamodule using the standard [LightningDataModule](https://pytorch-lightning.readthedocs.io/en/latest/data/datamodule.html)
```python
from pl_bolts.datamodules import CIFAR10DataModule
@@ -45,7 +45,7 @@ cifar10_dm = CIFAR10DataModule(
    return cifar10_dm
```

-### **Step 3: Define the Model**
+### Step 3: Define the Model

You may define your model, loss and optimizer in the same way as in any standard PyTorch Lightning program.

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_embedding.md b/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_embedding.md
index a9659cc6..3e2bea78 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_embedding.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_embedding.md
@@ -1,7 +1,7 @@
# BigDL-Nano TensorFlow SparseEmbedding and SparseAdam
**In this guide we demonstrate how to use `SparseEmbedding` and `SparseAdam` to obtain stronger performance with sparse gradients.**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment

We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

@@ -15,13 +15,13 @@ source bigdl-nano-init
pip install tensorflow-datasets
```

-### **Step 1: Import BigDL-Nano**
+### Step 1: Import BigDL-Nano
The optimizations in BigDL-Nano are delivered through BigDL-Nano's `Model` and `Sequential` classes. For most cases, you can just replace your `tf.keras.Model` with `bigdl.nano.tf.keras.Model` and `tf.keras.Sequential` with `bigdl.nano.tf.keras.Sequential` to benefit from BigDL-Nano.
```python
from bigdl.nano.tf.keras import Model, Sequential
```

-### **Step 2: Load the data**
+### Step 2: Load the data
We demonstrate with imdb_reviews, a large movie review dataset.
```python
import tensorflow_datasets as tfds
@@ -35,7 +35,7 @@ import tensorflow_datasets as tfds
)
```

-### **Step 3: Parepre the Data**
+### Step 3: Prepare the Data
In particular, we remove `<br />` tags.
```python
import tensorflow as tf
@@ -82,7 +82,7 @@ val_ds = val_ds.cache().prefetch(buffer_size=10)
test_ds = test_ds.cache().prefetch(buffer_size=10)
```

-### **Step 4: Build Model**
+### Step 4: Build Model
`bigdl.nano.tf.keras.Embedding` is a slightly modified version of the `tf.keras.Embedding` layer; it only applies the regularizer to the output of the embedding layer, so that the gradients to the embeddings are sparse. `bigdl.nano.tf.optimizers.Adam` is a variant of the `Adam` optimizer that handles sparse updates more efficiently. Here we create two models, one using the normal Embedding layer and Adam optimizer, the other using `SparseEmbedding` and `SparseAdam`.
```python
@@ -119,7 +119,7 @@ model = Model(inputs, predictions)
model.compile(loss="binary_crossentropy", optimizer=SparseAdam(), metrics=["accuracy"])
```

-### **Step 5: Training**
+### Step 5: Training
```python
# Fit the model using the train and val datasets.
model.fit(train_ds, validation_data=val_ds, epochs=3)
```

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_quantization_quickstart.md b/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_quantization_quickstart.md
index f94182e0..cd052ed8 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_quantization_quickstart.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_quantization_quickstart.md
@@ -1,7 +1,7 @@
## BigDL-Nano TensorFlow Quantization Quickstart
**In this guide we will demonstrate how to apply post-training quantization on a Keras model with BigDL-Nano in 4 simple steps.**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment

We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

diff --git a/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_train_quickstart.md b/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_train_quickstart.md
index b681e236..6d40180f 100644
--- a/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_train_quickstart.md
+++ b/docs/readthedocs/source/doc/Nano/QuickStart/tensorflow_train_quickstart.md
@@ -1,7 +1,7 @@
# BigDL-Nano TensorFlow Training Quickstart
**In this guide we will describe how to accelerate TensorFlow Keras applications on training workloads using BigDL-Nano in 5 simple steps**

-### **Step 0: Prepare Environment**
+### Step 0: Prepare Environment

We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.

@@ -15,13 +15,13 @@ source bigdl-nano-init
pip install tensorflow-datasets
```

-### **Step 1: Import BigDL-Nano**
+### Step 1: Import BigDL-Nano
The optimizations in BigDL-Nano are delivered through BigDL-Nano's `Model` and `Sequential` classes. For most cases, you can just replace your `tf.keras.Model` with `bigdl.nano.tf.keras.Model` and `tf.keras.Sequential` with `bigdl.nano.tf.keras.Sequential` to benefit from BigDL-Nano.
```python
from bigdl.nano.tf.keras import Model, Sequential
```

-### **Step 2: Load the Data**
+### Step 2: Load the Data
Here we load data from tensorflow_datasets (hereafter [TFDS](https://www.tensorflow.org/datasets)). The [Stanford Dogs](http://vision.stanford.edu/aditya86/ImageNetDogs/main.html) dataset contains images of 120 breeds of dogs around the world. There are 20,580 images, out of which 12,000 are used for training and 8,580 for testing.
```python
import tensorflow_datasets as tfds
@@ -47,7 +47,7 @@ ds_train = ds_train.cache().repeat().shuffle(1000).map(preprocessing).batch(batc
ds_test = ds_test.map(preprocessing).batch(batch_size, drop_remainder=True).prefetch(AUTOTUNE)
```

-### **Step 3: Build Model**
+### Step 3: Build Model
BigDL-Nano's `Model` (`bigdl.nano.tf.keras.Model`) and `Sequential` (`bigdl.nano.tf.keras.Sequential`) classes have identical APIs with `tf.keras.Model` and `tf.keras.Sequential`. Here we initialize the model with pre-trained ImageNet weights, and we fine-tune it on the Stanford Dogs dataset.
```python
@@ -92,7 +92,7 @@ def unfreeze_model(model):
    )
```

-### **Step 4: Training**
+### Step 4: Training
```python
steps_per_epoch = ds_info.splits['train'].num_examples // batch_size
model_default = make_model()

diff --git a/docs/readthedocs/source/doc/Orca/Overview/data-parallel-processing.md b/docs/readthedocs/source/doc/Orca/Overview/data-parallel-processing.md
index cdc8c6c4..8fb138c5 100644
--- a/docs/readthedocs/source/doc/Orca/Overview/data-parallel-processing.md
+++ b/docs/readthedocs/source/doc/Orca/Overview/data-parallel-processing.md
@@ -4,7 +4,7 @@

**Orca provides efficient support for distributed data-parallel processing pipelines, a critical component for large-scale AI applications.**

-### **1. TensorFlow Dataset and PyTorch DataLoader**
+### 1. TensorFlow Dataset and PyTorch DataLoader

Orca will seamlessly parallelize the standard `tf.data.Dataset` or `torch.utils.data.DataLoader` pipelines across a large cluster in a data-parallel fashion, which can be directly used for distributed deep learning training, as shown below:

@@ -52,7 +52,7 @@ and `tf.numpy_function` are currently not supported._
2. _TensorFlow Dataset pipelines created from generators, such as `Dataset.from_generators`, are currently not supported._
3. _For TensorFlow Dataset and Pytorch DataLoader pipelines that read from files (including `tf.data.TFRecordDataset` and `tf.data.TextLineDataset`), one needs to ensure that the same file paths can be accessed on every node in the cluster._

-#### **1.1. Data Creator Function**
+#### 1.1. Data Creator Function
Alternatively, the user may also pass a *Data Creator Function* as the input to the distributed training and inference. Inside the *Data Creator Function*, the user needs to create and return a `tf.data.Dataset` or `torch.utils.data.DataLoader` object, as shown below.

TensorFlow:
@@ -84,7 +84,7 @@ def train_data_creator(config, batch_size):
    return train_loader
```

-### **2. Spark Dataframes**
+### 2. Spark Dataframes
Orca supports Spark Dataframes as the input to the distributed training, and as the input/output of the distributed inference. Consequently, the user can easily process large-scale datasets using Apache Spark, and directly apply AI models on the distributed (and possibly in-memory) Dataframes without data conversion or serialization.

```python
@@ -95,7 +95,7 @@ est.fit(data=df,
        label_cols=['label'])  # specifies which column(s) to be used as labels
```

-### **3. XShards (Distributed Data-Parallel Python Processing)**
+### 3. XShards (Distributed Data-Parallel Python Processing)

`XShards` in Orca allows the user to process large-scale datasets using *existing* Python code in a distributed and data-parallel fashion, as shown below.
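A minimal sketch of the idea (the file path and column names here are hypothetical):

```python
import bigdl.orca.data.pandas

# each shard holds a pandas DataFrame partition of the full dataset
shards = bigdl.orca.data.pandas.read_csv("hdfs://path/to/data*.csv")

def add_usage_ratio(df):
    # plain pandas code, applied to every shard in parallel
    df["ratio"] = df["used"] / df["total"]
    return df

shards = shards.transform_shard(add_usage_ratio)
```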
@@ -117,7 +117,7 @@ In essence, an `XShards` contains an automatically sharded (or partitioned) Pyth View the related [Python API doc](./data) for more details. -#### **3.1 Data-Parallel Pandas** +#### 3.1 Data-Parallel Pandas The user may use `XShards` to efficiently process large-size Pandas Dataframes in a distributed and data-parallel fashion. First, the user can read CSV, JSON or Parquet files (stored on local disk, HDFS, AWS S3, etc.) to obtain an `XShards` of Pandas Dataframe, as shown below: diff --git a/docs/readthedocs/source/doc/Orca/Overview/distributed-training-inference.md b/docs/readthedocs/source/doc/Orca/Overview/distributed-training-inference.md index ef84a44a..46bcda13 100644 --- a/docs/readthedocs/source/doc/Orca/Overview/distributed-training-inference.md +++ b/docs/readthedocs/source/doc/Orca/Overview/distributed-training-inference.md @@ -4,15 +4,15 @@ **Orca `Estimator` provides sklearn-style APIs for transparently distributed model training and inference** -### **1. Estimator** +### 1. Estimator To perform distributed training and inference, the user can first create an Orca `Estimator` from any standard (single-node) TensorFlow, Keras or PyTorch model, and then call the `Estimator.fit` or `Estimator.predict` methods (using the [data-parallel processing pipeline](./data-parallel-processing.md) as input). Under the hood, the Orca `Estimator` will replicate the model on each node in the cluster, feed the data partition (generated by the data-parallel processing pipeline) on each node to the local model replica, and synchronize model parameters using various *backend* technologies (such as *Horovod*, `tf.distribute.MirroredStrategy`, `torch.distributed`, or the parameter sync layer in [*BigDL*](https://github.com/intel-analytics/BigDL)). -### **2. TensorFlow/Keras Estimator** +### 2. TensorFlow/Keras Estimator -#### **2.1 TensorFlow 1.15 and Keras 2.3** +#### 2.1 TensorFlow 1.15 and Keras 2.3 There are two ways to create an Estimator for TensorFlow 1.15, either from a low-level computation graph or a Keras model. Examples are as follows: @@ -62,7 +62,7 @@ The `data` argument in `fit` method can be a Spark DataFrame, an *XShards* or a View the related [Python API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Orca/orca.html#module-bigdl.orca.learn.tf.estimator) for more details. -#### **2.2 TensorFlow 2.x and Keras 2.4+** +#### 2.2 TensorFlow 2.x and Keras 2.4+ **Using `ray` or *Horovod* backend** @@ -153,7 +153,7 @@ View the related [Python API doc](https://bigdl.readthedocs.io/en/latest/doc/Pyt ***For more details, view the distributed TensorFlow training/inference [page]().*** -### **3. PyTorch Estimator** +### 3. PyTorch Estimator **Using *BigDL* backend** @@ -213,7 +213,7 @@ View the related [Python API doc](https://bigdl.readthedocs.io/en/latest/doc/Pyt ***For more details, view the distributed PyTorch training/inference [page]().*** -### **4. MXNet Estimator** +### 4. MXNet Estimator The user may create an MXNet `Estimator` as follows: ```python @@ -250,7 +250,7 @@ The input to `fit` methods can be an *XShards*, or a *Data Creator Function* (th View the related [Python API doc]() for more details. -### **5. BigDL Estimator** +### 5. BigDL Estimator
The user may create a BigDL `Estimator` as follows: ```python @@ -280,7 +280,7 @@ The input to `fit` and `predict` methods can be a *Spark Dataframe*, or an *XSha View the related [Python API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Orca/orca.html#module-bigdl.orca.learn.bigdl.estimator) for more details. -### **6. OpenVINO Estimator** +### 6. OpenVINO Estimator The user may create an OpenVINO `Estimator` as follows: ```python diff --git a/docs/readthedocs/source/doc/Orca/Overview/distributed-tuning.md b/docs/readthedocs/source/doc/Orca/Overview/distributed-tuning.md index 3a4bfc0d..0c1dced0 100644 --- a/docs/readthedocs/source/doc/Orca/Overview/distributed-tuning.md +++ b/docs/readthedocs/source/doc/Orca/Overview/distributed-tuning.md @@ -6,20 +6,20 @@ -### **1. AutoEstimator** +### 1. AutoEstimator To perform distributed hyper-parameter tuning, the user can first create an Orca `AutoEstimator` from a standard TensorFlow Keras or PyTorch model, and then call `AutoEstimator.fit`. Under the hood, the Orca `AutoEstimator` generates different trials and schedules them on each node in the cluster. Each trial runs a different combination of hyper-parameters, sampled from the user-desired hyper-parameter space. HDFS is used to save the temporary results of each trial, and all the results will finally be transferred to the driver for further analysis. -### **2. Pytorch AutoEstimator** +### 2. Pytorch AutoEstimator The user can pass *Creator Function*s, including *Data Creator Function*, *Model Creator Function* and *Optimizer Creator Function*, to `AutoEstimator` for training. The *Creator Function*s should take a parameter of `config` as input and get the hyper-parameter values from `config` to enable hyper-parameter search. -#### **2.1 Data Creator Function** +#### 2.1 Data Creator Function You can define the train and validation datasets using a *Data Creator Function*. The *Data Creator Function* takes `config` as input and returns a `torch.utils.data.DataLoader` object, as shown below. ```python # "batch_size" is the hyper-parameter to be tuned. @@ -35,7 +35,7 @@ def train_loader_creator(config): ``` The input data for Pytorch `AutoEstimator` can be a *Data Creator Function* or a tuple of numpy ndarrays in the form of (x, y), where x is the training input data and y is the training target data. -#### **2.2 Model Creator Function** +#### 2.2 Model Creator Function The *Model Creator Function* also takes `config` as input and returns a `torch.nn.Module` object, as shown below. ```python @@ -57,7 +57,7 @@ def model_creator(config): return model ``` -#### **2.3 Optimizer Creator Function** +#### 2.3 Optimizer Creator Function The *Optimizer Creator Function* takes `model` and `config` as input, and returns a `torch.optim.Optimizer` object. ```python import torch @@ -67,7 +67,7 @@ def optim_creator(model, config): Note that the `optimizer` argument in the Pytorch `AutoEstimator` constructor can be an *Optimizer Creator Function* or a string, which is the name of a Pytorch optimizer. The above *Optimizer Creator Function* has the same functionality as "Adam". -#### **2.4 Create and Fit Pytorch AutoEstimator** +#### 2.4 Create and Fit Pytorch AutoEstimator The user can create a Pytorch `AutoEstimator` as shown below.
```python from bigdl.orca.automl.auto_estimator import AutoEstimator @@ -95,7 +95,7 @@ best_config = auto_est.get_best_config() # a dictionary of hyper-parameter names ``` View the related [Python API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/AutoML/automl.html#orca-automl-auto-estimator) for more details. -### **3. TensorFlow/Keras AutoEstimator** +### 3. TensorFlow/Keras AutoEstimator Users can create an `AutoEstimator` for TensorFlow Keras from a `tf.keras` model (using a *Model Creator Function*). For example: ```python @@ -132,10 +132,10 @@ best_config = auto_est.get_best_config() # a dictionary of hyper-parameter names ``` View the related [Python API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/AutoML/automl.html#orca-automl-auto-estimator) for more details. -### **4. Search Space and Search Algorithms** +### 4. Search Space and Search Algorithms For hyper-parameter optimization, the user should define the search space of various hyper-parameter values for neural network training, as well as how to search through the chosen hyper-parameter space. -#### **4.1 Basic Search Algorithms** +#### 4.1 Basic Search Algorithms For basic search algorithms like **Grid Search** and **Random Search**, we provide several sampling functions with `automl.hp`. See [API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/AutoML/automl.html#orca-automl-hp) for more details. @@ -152,7 +152,7 @@ search_space = { } ``` -#### **4.2 Advanced Search Algorithms** +#### 4.2 Advanced Search Algorithms Besides grid search and random search, the user can also choose to use some advanced hyper-parameter optimization methods, such as [Ax](https://ax.dev/), [Bayesian Optimization](https://github.com/fmfn/BayesianOptimization), [Scikit-Optimize](https://scikit-optimize.github.io), etc. We support all *Search Algorithms* in [Ray Tune](https://docs.ray.io/en/master/index.html). View the [Ray Tune Search Algorithms](https://docs.ray.io/en/master/tune/api_docs/suggestion.html) for more details. Note that you should install the dependency for your search algorithm manually. @@ -182,7 +182,7 @@ auto_estimator.fit( ``` See [API Doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/AutoML/automl.html#orca-automl-auto-estimator) for more details. -### **5. Scheduler** +### 5. Scheduler A *Scheduler* can stop/pause/tweak the hyper-parameters of running trials, making the hyper-parameter tuning process much more efficient. We support all *Schedulers* in [Ray Tune](https://docs.ray.io/en/master/index.html). See [Ray Tune Schedulers](https://docs.ray.io/en/master/tune/api_docs/schedulers.html#schedulers-ref) for more details. diff --git a/docs/readthedocs/source/doc/Orca/Overview/known_issues.md b/docs/readthedocs/source/doc/Orca/Overview/known_issues.md index 28260f09..30a50407 100644 --- a/docs/readthedocs/source/doc/Orca/Overview/known_issues.md +++ b/docs/readthedocs/source/doc/Orca/Overview/known_issues.md @@ -1,8 +1,8 @@ # Orca Known Issues -## **Estimator Issues** +## Estimator Issues -### **UnkownError: Could not start gRPC server** +### UnknownError: Could not start gRPC server This error occurs while running Orca TF2 Estimator with the spark backend, which may be because the previous pyspark tensorflow job was not cleaned up completely. You can retry later, or set the Spark config `spark.python.worker.reuse=false` in your application.
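For instance, a minimal sketch of setting this config when creating the context (assuming `init_orca_context` accepts a `conf` dictionary of Spark properties; the yarn-client case is shown in the hunk below):

```python
from bigdl.orca import init_orca_context

# Disable Python worker reuse so that a new pyspark TensorFlow job does not
# pick up leftover state from a previous one (a sketch; cores is illustrative)
sc = init_orca_context(cluster_mode="local", cores=4,
                       conf={"spark.python.worker.reuse": "false"})
```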
@@ -16,7 +16,7 @@ If you are using `init_orca_context(cluster_mode="yarn-client")`: spark-submit --conf spark.python.worker.reuse=false ``` -### **RuntimeError: Inter op parallelism cannot be modified after initialization** +### RuntimeError: Inter op parallelism cannot be modified after initialization This error occurs if you build your TensorFlow model on the driver rather than on the workers. You should build the complete model in `model_creator`, which runs on each worker node. You can refer to the following examples: @@ -43,9 +43,9 @@ This error occurs if you build your TensorFlow model on the driver rather than o ... ``` -## **OrcaContext Issues** +## OrcaContext Issues -### **Exception: Failed to read dashbord log: [Errno 2] No such file or directory: '/tmp/ray/.../dashboard.log'** +### Exception: Failed to read dashbord log: [Errno 2] No such file or directory: '/tmp/ray/.../dashboard.log' This error occurs when initializing an Orca context with `init_ray_on_spark=True`. We have not located the root cause of this problem, but it might be caused by an atypical Python environment. @@ -63,9 +63,9 @@ You could follow below steps to workaround: 2. If you really need to use Ray on Spark, please install bigdl-orca under a conda environment. For detailed information, please refer to [here](./orca.html). -## **Other Issues** +## Other Issues -### **OSError: Unable to load libhdfs: ./libhdfs.so: cannot open shared object file: No such file or directory** +### OSError: Unable to load libhdfs: ./libhdfs.so: cannot open shared object file: No such file or directory This error occurs because PyArrow fails to locate `libhdfs.so` in the default path of `$HADOOP_HOME/lib/native` when you run with YARN on Cloudera. To solve this issue, you need to set the path of `libhdfs.so` in Cloudera to the environment variable `ARROW_LIBHDFS_DIR` on the Spark driver and executors with the following steps: @@ -87,7 +87,7 @@ To solve this issue, you need to set the path of `libhdfs.so` in Cloudera to the --conf spark.yarn.appMasterEnv.ARROW_LIBHDFS_DIR=/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib64 -### **Spark Dynamic Allocation** +### Spark Dynamic Allocation By design, BigDL does not support Spark Dynamic Allocation mode, and needs to allocate fixed resources for deep learning model training. Thus, if your environment has Spark Dynamic Allocation already configured, or stipulates that Spark Dynamic Allocation must be used, you may encounter the following error: diff --git a/docs/readthedocs/source/doc/Orca/Overview/orca-context.md b/docs/readthedocs/source/doc/Orca/Overview/orca-context.md index 6b06aa12..4816fa74 100644 --- a/docs/readthedocs/source/doc/Orca/Overview/orca-context.md +++ b/docs/readthedocs/source/doc/Orca/Overview/orca-context.md @@ -5,7 +5,7 @@ `OrcaContext` is the main entry for provisioning the Orca program on the underlying cluster (such as a K8s or Hadoop cluster), or just on a single laptop. --- -### **1. Initialization** +### 1. Initialization An Orca program usually starts with the initialization of `OrcaContext` as follows: @@ -26,7 +26,7 @@ The Orca program simply runs `init_orca_context` on the local machine, which wil View the related [Python API doc]() for more details. --- -### **2. Python Dependencies** +### 2. Python Dependencies A key challenge for scaling out Python programs across a distributed cluster is how to properly install the required Python environment (libraries and dependencies) on each node in the cluster (preferably in an automatic and dynamic fashion).
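As the hunk below illustrates, the `extra_python_lib` argument of `init_orca_context` is one way to address this; a minimal sketch (reusing the file names shown in the hunk, which are illustrative):

```python
from bigdl.orca import init_orca_context

# Ship extra Python modules and packages to every node in the cluster (a sketch)
sc = init_orca_context(cluster_mode="yarn-client",
                       extra_python_lib="func1.py,func2.py,lib3.zip")
```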
@@ -41,7 +41,7 @@ init_orca_context(..., extra_python_lib="func1.py,func2.py,lib3.zip") View the user guide for [K8s](../../UserGuide/k8s.md) and [Hadoop/YARN](../../UserGuide/hadoop.md) for more details. --- -### **3. Execution Engine** +### 3. Execution Engine Under the hood, `OrcaContext` will automatically provision Apache Spark and/or Ray as the underlying execution engine for the distributed data processing and model training/inference. @@ -55,7 +55,7 @@ ray_ctx = OrcaContext.get_ray_context() ``` --- -### **4. Extra Configurations** +### 4. Extra Configurations Users can make extra configurations when using the functionalities of Project Orca via `OrcaContext`. @@ -71,7 +71,7 @@ Users can make extra configurations when using the functionalities of Project Or --- -### **5. Termination** +### 5. Termination After the Orca program finishes, the user can call `stop_orca_context` to release resources and shut down the underlying Spark and/or Ray execution engine. diff --git a/docs/readthedocs/source/doc/Orca/Overview/orca.md b/docs/readthedocs/source/doc/Orca/Overview/orca.md index 0f03929b..3ea31fb1 100644 --- a/docs/readthedocs/source/doc/Orca/Overview/orca.md +++ b/docs/readthedocs/source/doc/Orca/Overview/orca.md @@ -6,7 +6,7 @@ Most AI projects start with a Python notebook running on a single laptop; howeve --- -### **Tensorflow Bite-sized Example** +### Tensorflow Bite-sized Example This section uses TensorFlow 1.15, and you should install TensorFlow before running this example: ```bash diff --git a/docs/readthedocs/source/doc/Orca/Overview/ray.md b/docs/readthedocs/source/doc/Orca/Overview/ray.md index 7fde801c..6175e4b5 100644 --- a/docs/readthedocs/source/doc/Orca/Overview/ray.md +++ b/docs/readthedocs/source/doc/Orca/Overview/ray.md @@ -10,7 +10,7 @@ Users can seamlessly integrate Ray applications into the big data processing pip _**Note:** BigDL has been tested on Ray 1.9.2, and it is highly recommended that you use this tested version._ -### **1. Install** +### 1. Install We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the Python environment. When installing bigdl-orca with pip, you can specify the extras key `[ray]` to install the additional dependencies @@ -26,7 +26,7 @@ pip install bigdl-orca[ray] View [Python User Guide](../../UserGuide/python.html#install) and [Orca User Guide](../Overview/orca.md) for more installation instructions. --- -### **2. Initialize** +### 2. Initialize We recommend using `init_orca_context` to initiate and run RayOnSpark on the underlying cluster. The Ray cluster will be launched by specifying `init_ray_on_spark=True`. For example, to launch Spark and Ray on standard Hadoop/YARN clusters in [YARN client mode](https://spark.apache.org/docs/latest/running-on-yarn.html#launching-spark-on-yarn): @@ -61,7 +61,7 @@ OrcaContext.barrier_mode = False View [Orca Context](../Overview/orca-context.md) for more details. --- -### **3. Run** +### 3. Run - After the initialization, you can directly run Ray applications on the underlying cluster. [Ray tasks](https://docs.ray.io/en/master/walkthrough.html#remote-functions-tasks) or [actors](https://docs.ray.io/en/master/actors.html) would be launched across the cluster. The following code shows a simple example: @@ -101,7 +101,7 @@ View [Orca Context](../Overview/orca-context.md) for more details. ``` --- -### **4. Known Issue** +### 4. Known Issue
If you encounter the following error when launching Ray on the underlying cluster, especially when you are using a [Spark standalone](https://spark.apache.org/docs/latest/spark-standalone.html) cluster: ``` @@ -118,7 +118,7 @@ sc = init_orca_context(cluster_mode, init_ray_on_spark=True, env={"LANG": "C.UTF ``` --- -### **5. FAQ** +### 5. FAQ - **ValueError: Ray component worker_ports is trying to use a port number ... that is used by other components.** This error occurs because some port in the worker port list is occupied by other processes. To handle this issue, you can set the range of the worker port list by using the parameters `min-worker-port` and `max-worker-port` in `init_orca_context` as follows: diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoestimator-pytorch-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoestimator-pytorch-quickstart.md index 612d0d61..9fe8380f 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoestimator-pytorch-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoestimator-pytorch-quickstart.md @@ -8,7 +8,7 @@ **In this guide we will describe how to enable automated hyper-parameter search for PyTorch using Orca `AutoEstimator`.** -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) is needed to prepare the Python environment for running this example. Please refer to the [install guide](https://bigdl.readthedocs.io/en/latest/doc/Orca/Overview/distributed-tuning.html#install) for more details. @@ -19,7 +19,7 @@ pip install bigdl-orca[automl] pip install torch==1.8.1 torchvision==0.9.1 ``` -### **Step 1: Init Orca Context** +### Step 1: Init Orca Context ```python from bigdl.orca import init_orca_context, stop_orca_context @@ -37,7 +37,7 @@ This is the only place where you need to specify local or distributed mode. View **Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. -### **Step 2: Define the Model** +### Step 2: Define the Model You may define your model, loss and optimizer in the same way as in any standard PyTorch program. @@ -77,7 +77,7 @@ def optim_creator(model, config): return torch.optim.Adam(model.parameters(), lr=config["lr"]) ``` -### **Step 3: Define Dataset** +### Step 3: Define Dataset You can define the train and validation datasets using a *Data Creator Function* that takes `config` as input and returns a PyTorch `DataLoader`. @@ -110,7 +110,7 @@ def test_loader_creator(config): return test_loader ``` -### **Step 4: Define Search Space** +### Step 4: Define Search Space You should define a dictionary as your hyper-parameter search space. The keys are hyper-parameter names, which should be the same as those in your creators, and you can specify how you want to sample each hyper-parameter in the values of the search space. See [automl.hp](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/AutoML/automl.html#orca-automl-hp) for more details. @@ -125,7 +125,7 @@ search_space = { } ``` -### **Step 5: Automatically Fit and Search with Orca AutoEstimator** +### Step 5: Automatically Fit and Search with Orca AutoEstimator First, create an `AutoEstimator`. You can refer to the [AutoEstimator API doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/AutoML/automl.html#orca-automl-auto-estimator) for more details.
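The creation and fitting code itself is elided by this hunk; a minimal sketch (assuming `AutoEstimator.from_torch`, and reusing the creators and `search_space` defined in the steps above; the loss, log dir and name are illustrative) might look like:

```python
import torch.nn as nn
from bigdl.orca.automl.auto_estimator import AutoEstimator

# Create an AutoEstimator from the PyTorch creator functions defined above (a sketch)
auto_est = AutoEstimator.from_torch(model_creator=model_creator,
                                    optimizer=optim_creator,
                                    loss=nn.NLLLoss(),
                                    logs_dir="/tmp/orca_automl_logs",
                                    resources_per_trial={"cpu": 2},
                                    name="auto_estimator_example")

# Fit with the data creators and search space defined above; each trial samples
# a different hyper-parameter combination from search_space
auto_est.fit(data=train_loader_creator,
             validation_data=test_loader_creator,
             search_space=search_space,
             n_sampling=2,
             epochs=1,
             metric="accuracy")
```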
diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoxgboost-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoxgboost-quickstart.md index 0cc79989..48c70f11 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoxgboost-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-autoxgboost-quickstart.md @@ -9,12 +9,12 @@ **In this guide we will describe how to use Orca AutoXGBoost for automated XGBoost tuning.** Orca AutoXGBoost enables distributed automated hyper-parameter tuning for XGBoost, which includes `AutoXGBRegressor` and `AutoXGBClassifier` for the sklearn `XGBRegressor` and `XGBClassifier` respectively. See more about the [xgboost scikit-learn API](https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn). -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) is needed to prepare the Python environment for running this example. Please refer to the [install guide](https://bigdl.readthedocs.io/en/latest/doc/Orca/Overview/distributed-tuning.html#install) for more details. -### **Step 1: Init Orca Context** +### Step 1: Init Orca Context ```python from bigdl.orca import init_orca_context, stop_orca_context @@ -32,7 +32,7 @@ This is the only place where you need to specify local or distributed mode. View **Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. -### **Step 2: Define Search space** +### Step 2: Define Search Space You should define a dictionary as your hyper-parameter search space. @@ -47,7 +47,7 @@ search_space = { } ``` -### **Step 3: Automatically fit and search with Orca AutoXGBoost** +### Step 3: Automatically fit and search with Orca AutoXGBoost First, create an `AutoXGBRegressor`. @@ -70,7 +70,7 @@ auto_xgb_reg.fit(data=(X_train, y_train), metric="rmse") ``` -### **Step 4: Get best model and hyper parameters** +### Step 4: Get the best model and hyper-parameters You can get the best learned model and the best hyper-parameter set for further deployment. The best model is an sklearn `XGBRegressor` instance. diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/orca-keras-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/orca-keras-quickstart.md index 53698b07..a5b25e98 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-keras-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-keras-quickstart.md @@ -9,7 +9,7 @@ **In this guide we will describe how to scale out _Keras 2.3_ programs using Orca in 4 simple steps.** (_[TensorFlow 1.15](./orca-tf-quickstart.md) and [TensorFlow 2](./orca-tf2keras-quickstart.md) guides are also available._) -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details. @@ -24,7 +24,7 @@ pip install pandas pip install scikit-learn ``` -### **Step 1: Init Orca Context** +### Step 1: Init Orca Context ```python from bigdl.orca import init_orca_context, stop_orca_context @@ -42,7 +42,7 @@ This is the only place where you need to specify local or distributed mode. View **Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster.
View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to the TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md). -### **Step 2: Define the Model** +### Step 2: Define the Model You may define your model, loss and metrics in the same way as in any standard (single node) Keras program. @@ -66,7 +66,7 @@ model.compile(optimizer=keras.optimizers.RMSprop(), loss='sparse_categorical_crossentropy', metrics=['accuracy']) ``` -### **Step 3: Define Train Dataset** +### Step 3: Define Train Dataset You can define the dataset using standard [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset). Orca also supports [Spark DataFrame](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [Orca XShards](../Overview/data-parallel-processing.md). @@ -86,7 +86,7 @@ mnist_train = mnist_train.map(preprocess) mnist_test = mnist_test.map(preprocess) ``` -### **Step 4: Fit with Orca Estimator** +### Step 4: Fit with Orca Estimator First, create an Estimator. diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-distributed-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-distributed-quickstart.md index 3a20bfeb..278929d0 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-distributed-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-distributed-quickstart.md @@ -8,7 +8,7 @@ **In this guide we will describe how to scale out _PyTorch_ programs using the `torch.distributed` package in Orca.** -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) is needed to prepare the Python environment for running this example. Please refer to the [install guide](../../UserGuide/python.md) for more details. @@ -19,7 +19,7 @@ pip install bigdl-orca[ray] pip install torch==1.7.1 torchvision==0.8.2 ``` -### **Step 1: Init Orca Context** +### Step 1: Init Orca Context ```python from bigdl.orca import init_orca_context, stop_orca_context @@ -35,7 +35,7 @@ This is the only place where you need to specify local or distributed mode. View **Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. -### **Step 2: Define the Model** +### Step 2: Define the Model You may define your model, loss and optimizer in the same way as in any standard (single node) PyTorch program. @@ -75,7 +75,7 @@ def optim_creator(model, config): return torch.optim.Adam(model.parameters(), lr=0.001) ``` -### **Step 3: Define Train Dataset** +### Step 3: Define Train Dataset You can define the dataset using a *Data Creator Function* that returns a PyTorch `DataLoader`. Orca also supports [Orca SparkXShards](../Overview/data-parallel-processing). @@ -109,7 +109,7 @@ def test_loader_creator(config, batch_size): return test_loader ``` -### **Step 4: Fit with Orca Estimator** +### Step 4: Fit with Orca Estimator First, create an Estimator. @@ -130,7 +130,7 @@ for r in result: print(r, ":", result[r]) ``` -### **Step 5: Save and Load the Model** +### Step 5: Save and Load the Model Save the Estimator states (including model and optimizer) to the provided model path.
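The save/load calls themselves fall outside this hunk; a minimal sketch (assuming the Estimator created above is named `est`; the path is illustrative):

```python
# Save the Estimator states (model and optimizer) to the given path (a sketch)
est.save("/tmp/orca_pytorch_estimator")

# Load the saved states back before running further training or inference
est.load("/tmp/orca_pytorch_estimator")
```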
diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart.md index 0bfce47e..421fd695 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-pytorch-quickstart.md @@ -8,7 +8,7 @@ **In this guide we will describe how to scale out _PyTorch_ programs using Orca in 4 simple steps.** -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) is needed to prepare the Python environment for running this example. Please refer to the [install guide](../../UserGuide/python.md) for more details. @@ -22,7 +22,7 @@ pip install six cloudpickle pip install jep==3.9.0 ``` -### **Step 1: Init Orca Context** +### Step 1: Init Orca Context ```python from bigdl.orca import init_orca_context, stop_orca_context @@ -44,7 +44,7 @@ This is the only place where you need to specify local or distributed mode. View **Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. -### **Step 2: Define the Model** +### Step 2: Define the Model You may define your model, loss and optimizer in the same way as in any standard (single node) PyTorch program. @@ -77,7 +77,7 @@ criterion = nn.NLLLoss() adam = torch.optim.Adam(model.parameters(), 0.001) ``` -### **Step 3: Define Train Dataset** +### Step 3: Define Train Dataset You can define the dataset using the standard [Pytorch DataLoader](https://pytorch.org/docs/stable/data.html). @@ -108,7 +108,7 @@ test_loader = torch.utils.data.DataLoader( Alternatively, we can also use a [Data Creator Function](https://github.com/intel-analytics/BigDL/blob/main/docs/docs/colab-notebook/orca/quickstart/pytorch_lenet_mnist_data_creator_func.ipynb) or [Orca XShards](../Overview/data-parallel-processing) as the input data, especially when the data size is very large. -### **Step 4: Fit with Orca Estimator** +### Step 4: Fit with Orca Estimator First, create an Estimator. @@ -132,7 +132,7 @@ for r in result: print(r, ":", result[r]) ``` -### **Step 5: Save and Load the Model** +### Step 5: Save and Load the Model Save the Estimator states (including model and optimizer) to the provided model path. diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf-quickstart.md index 091d7111..067314a7 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf-quickstart.md @@ -8,7 +8,7 @@ **In this guide we will describe how to scale out _TensorFlow 1.15_ programs using Orca in 4 simple steps.** (_[Keras 2.3](./orca-keras-quickstart.md) and [TensorFlow 2](./orca-tf2keras-quickstart.md) guides are also available._) -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.
@@ -21,7 +21,7 @@ pip install tensorflow-datasets==2.0 pip install psutil ``` -### **Step 1: Init Orca Context** +### Step 1: Init Orca Context ```python from bigdl.orca import init_orca_context, stop_orca_context @@ -39,7 +39,7 @@ This is the only place where you need to specify local or distributed mode. View **Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. To use tensorflow_datasets on HDFS, you should correctly set HADOOP_HOME, HADOOP_HDFS_HOME, LD_LIBRARY_PATH, etc. For more details, please refer to the TensorFlow documentation [link](https://github.com/tensorflow/docs/blob/r1.11/site/en/deploy/hadoop.md). -### **Step 2: Define the Model** +### Step 2: Define the Model You may define your model, loss and metrics in the same way as in any standard (single node) TensorFlow program. @@ -71,7 +71,7 @@ logits = lenet(images) loss = tf.reduce_mean(tf.losses.sparse_softmax_cross_entropy(logits=logits, labels=labels)) acc = accuracy(logits, labels) ``` -### **Step 3: Define Train Dataset** +### Step 3: Define Train Dataset You can define the dataset using standard [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset). Orca also supports [Spark DataFrame](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [Orca XShards](../Overview/data-parallel-processing.md). @@ -91,7 +91,7 @@ mnist_train = mnist_train.map(preprocess) mnist_test = mnist_test.map(preprocess) ``` -### **Step 4: Fit with Orca Estimator** +### Step 4: Fit with Orca Estimator First, create an Estimator. diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf2keras-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf2keras-quickstart.md index c1fd8f13..545c83fa 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf2keras-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/orca-tf2keras-quickstart.md @@ -8,7 +8,7 @@ **In this guide we will describe how to scale out _TensorFlow 2_ programs using Orca in 4 simple steps.** (_[TensorFlow 1.15](./orca-tf-quickstart.md) and [Keras 2.3](./orca-keras-quickstart.md) guides are also available._) -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details. @@ -19,7 +19,7 @@ pip install bigdl-orca[ray] pip install tensorflow ``` -### **Step 1: Init Orca Context** +### Step 1: Init Orca Context ```python from bigdl.orca import init_orca_context, stop_orca_context @@ -35,7 +35,7 @@ This is the only place where you need to specify local or distributed mode. View **Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on a Hadoop YARN cluster. View [Hadoop User Guide](./../../UserGuide/hadoop.md) for more details. -### **Step 2: Define the Model** +### Step 2: Define the Model You can then define the Keras model in the _Creator Function_ using the standard TensorFlow 2 APIs. @@ -61,7 +61,7 @@ def model_creator(config): metrics=['accuracy']) return model ``` -### **Step 3: Define Train Dataset** +### Step 3: Define Train Dataset You can define the dataset in the _Creator Function_ using standard [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) APIs.
Orca also supports [Spark DataFrame](https://spark.apache.org/docs/latest/sql-programming-guide.html) and [Orca XShards](../Overview/data-parallel-processing.md). @@ -91,7 +91,7 @@ def val_data_creator(config, batch_size): return dataset ``` -### **Step 4: Fit with Orca Estimator** +### Step 4: Fit with Orca Estimator First, create an Estimator. @@ -118,7 +118,7 @@ est.shutdown() print(stats) ``` -### **Step 5: Save and Load the Model** +### Step 5: Save and Load the Model Orca TF2 Estimator supports two formats to save and load the entire model (**TensorFlow SavedModel and Keras H5 Format**). The recommended format is SavedModel, which is the default format when you use `estimator.save()`. diff --git a/docs/readthedocs/source/doc/Orca/QuickStart/ray-quickstart.md b/docs/readthedocs/source/doc/Orca/QuickStart/ray-quickstart.md index b67d7cdf..db5d395b 100644 --- a/docs/readthedocs/source/doc/Orca/QuickStart/ray-quickstart.md +++ b/docs/readthedocs/source/doc/Orca/QuickStart/ray-quickstart.md @@ -8,7 +8,7 @@ **In this guide, we will describe how to use RayOnSpark to directly run Ray programs on Big Data clusters in 2 simple steps.** -### **Step 0: Prepare Environment** +### Step 0: Prepare Environment We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details. @@ -18,7 +18,7 @@ conda activate bigdl pip install bigdl-orca[ray] ``` -### **Step 1: Initialize** +### Step 1: Initialize We recommend using `init_orca_context` to initiate and run BigDL on the underlying cluster. The Ray cluster will be launched automatically by specifying `init_ray_on_spark=True`. @@ -57,7 +57,7 @@ address_info = ray_ctx.address_info # The dictionary information of the ray clu redis_address = ray_ctx.redis_address # The redis address of the ray cluster. ``` -### **Step 2: Run Ray Applications** +### Step 2: Run Ray Applications After the initialization, you can directly write Ray code inline with your Spark code, and run Ray programs on the underlying existing Big Data clusters. Ray [tasks](https://docs.ray.io/en/master/walkthrough.html#remote-functions-tasks) and [actors](https://docs.ray.io/en/master/actors.html) would be launched across the cluster. diff --git a/docs/readthedocs/source/doc/UseCase/spark-dataframe.md b/docs/readthedocs/source/doc/UseCase/spark-dataframe.md index ad5cdee1..a2759c8e 100644 --- a/docs/readthedocs/source/doc/UseCase/spark-dataframe.md +++ b/docs/readthedocs/source/doc/UseCase/spark-dataframe.md @@ -10,7 +10,7 @@ The dataset used in this guide is [movielens-1M](https://grouplens.org/datasets/movielens/1m/), which contains 1 million ratings of 5 levels from 6000 users on 4000 movies. We will read the data into a Spark Dataframe and directly use the Spark Dataframe as the input to the distributed training. -### **1. Read input data into Spark DataFrame** +### 1. Read input data into Spark DataFrame First, read the input data into Spark Dataframes. @@ -23,7 +23,7 @@ df = spark.read.csv(new_rating_files, sep=':', inferSchema=True).toDF( "user", "item", "label", "timestamp") ``` -### **2. Process data using Spark Dataframe** +### 2. Process data using Spark Dataframe Next, process the data using Spark Dataframe operations. @@ -35,7 +35,7 @@ df = df.withColumn('label', df.label-1) train_data, test_data = df.randomSplit([0.8, 0.2], 100) ``` -### **3. Define NCF model** +### 3. Define NCF model
This example defines the NCF model in the _Creator Function_ using TensorFlow 2 APIs as follows. @@ -77,7 +77,7 @@ def model_creator(config): return model ``` -### **4. Fit with Orca Estimator** +### 4. Fit with Orca Estimator Finally, run distributed model training/inference on the Spark Dataframes directly. diff --git a/docs/readthedocs/source/doc/UseCase/xshards-pandas.md b/docs/readthedocs/source/doc/UseCase/xshards-pandas.md index 25256200..cbb47ccd 100644 --- a/docs/readthedocs/source/doc/UseCase/xshards-pandas.md +++ b/docs/readthedocs/source/doc/UseCase/xshards-pandas.md @@ -8,7 +8,7 @@ **In this guide we will describe how to use [XShards](../Orca/Overview/data-parallel-processing.md) to scale out Pandas data processing for distributed deep learning.** -### **1. Read input data into XShards of Pandas DataFrame** +### 1. Read input data into XShards of Pandas DataFrame First, read CSV, JSON or Parquet files into an `XShards` of Pandas Dataframe (i.e., a distributed and sharded dataset where each partition contains a Pandas Dataframe), as shown below: @@ -19,7 +19,7 @@ full_data = read_csv(new_rating_files, sep=':', header=None, dtype={0: np.int32, 1: np.int32, 2: np.int32}) ``` -### **2. Process Pandas Dataframes using XShards** +### 2. Process Pandas Dataframes using XShards Next, use XShards to efficiently process large-size Pandas Dataframes in a distributed and data-parallel fashion. You may run standard Python code on each partition in a data-parallel fashion using `XShards.transform_shard`, as shown below: @@ -43,7 +43,7 @@ def split_train_test(data): train_data, test_data = full_data.transform_shard(split_train_test).split() ``` -### **3. Define NCF model** +### 3. Define NCF model Define the NCF model using TensorFlow 1.15 APIs: @@ -94,7 +94,7 @@ class NCF(object): embedding_size=16 model = NCF(embedding_size, max_user_id, max_item_id) ``` -### **4. Fit with Orca Estimator** +### 4. Fit with Orca Estimator Finally, directly run distributed model training/inference on the XShards of Pandas DataFrames. diff --git a/docs/readthedocs/source/doc/UserGuide/colab.md b/docs/readthedocs/source/doc/UserGuide/colab.md index db5b9257..4c59b2a4 100644 --- a/docs/readthedocs/source/doc/UserGuide/colab.md +++ b/docs/readthedocs/source/doc/UserGuide/colab.md @@ -4,11 +4,11 @@ You can use BigDL without any installation by using [Google Colab](https://colab.research.google.com/). -### **1. Open a Colab Notebook** +### 1. Open a Colab Notebook BigDL includes a collection of [notebooks](./notebooks.md) that can be directly opened and run in Colab. You can click 'Run in Google Colab' to open the notebook on Colab directly. Click the "run" triangle on the left of each cell to run the notebook cell. When you run the first cell, you may face a pop-up saying 'Warning: This notebook was not authored by Google'; you should click on 'Run Anyway' to get rid of the warning. -### **2. Notebook Setup** +### 2. Notebook Setup The first few cells of the notebook contain the code necessary to set up BigDL and other libraries. diff --git a/docs/readthedocs/source/doc/UserGuide/databricks.md b/docs/readthedocs/source/doc/UserGuide/databricks.md index 6b28d24e..06b22131 100644 --- a/docs/readthedocs/source/doc/UserGuide/databricks.md +++ b/docs/readthedocs/source/doc/UserGuide/databricks.md @@ -3,7 +3,7 @@ --- You can run BigDL programs on a [Databricks](https://databricks.com/) cluster as follows. -### **1. Create a Databricks Cluster** +### 1. Create a Databricks Cluster
- Create either an [AWS Databricks](https://docs.databricks.com/getting-started/try-databricks.html) workspace or an [Azure Databricks](https://docs.microsoft.com/en-us/azure/azure-databricks/) workspace. - Create a Databricks [cluster](https://docs.databricks.com/clusters/create.html) using the UI. Choose a Databricks runtime version. This guide is tested on Runtime 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12). @@ -90,7 +90,7 @@ Use the init script from [step 2](#2-generate-initialization-script) to install Then start or restart the cluster. After starting/restarting the cluster, the libraries specified in the init script are all installed. -### **5. Run BigDL on Databricks** +### 5. Run BigDL on Databricks Open a new notebook, and call `init_orca_context` at the beginning of your code (with `cluster_mode` set to "spark-submit"). @@ -110,7 +110,7 @@ Output on Databricks: > Note that if you want to save a model to DBFS, or load a model from DBFS, the save/load path should be the **File API Format** on Databricks, which means your save/load path should start with `/dbfs`. -### **6. Other ways to install third-party libraries on Databricks if necessary** +### 6. Other ways to install third-party libraries on Databricks if necessary If you want to use other ways to install third-party libraries, check the related Databricks documentation on [libraries for AWS Databricks](https://docs.databricks.com/libraries/index.html) and [libraries for Azure Databricks](https://docs.microsoft.com/en-us/azure/databricks/libraries/). diff --git a/docs/readthedocs/source/doc/UserGuide/develop.md b/docs/readthedocs/source/doc/UserGuide/develop.md index a26fcd62..5635d81d 100644 --- a/docs/readthedocs/source/doc/UserGuide/develop.md +++ b/docs/readthedocs/source/doc/UserGuide/develop.md @@ -11,9 +11,9 @@ git clone https://github.com/intel-analytics/BigDL.git By default, `git clone` will download the development version of BigDL. If you want a release version, you can use the command `git checkout` to check out the specified version. -### **1. Python** +### 1. Python -#### **1.1 Build** +#### 1.1 Build To generate a new [whl](https://pythonwheels.com/) package for pip install, you can run the following script: @@ -64,7 +64,7 @@ pip install bigdl_serving-*.whl See [here](./python.md) for more instructions to run BigDL after pip install. -#### **1.2 IDE Setup** +#### 1.2 IDE Setup Any IDE that supports Python should be able to run BigDL. PyCharm works fine for us. You need to do the following preparations before starting the IDE to successfully run a BigDL Python program in the IDE: @@ -102,7 +102,7 @@ You need to do the following preparations before starting the IDE to successfull The above environment variables should be available when running or debugging code in the IDE. When running applications in PyCharm, you can add runtime environment variables by clicking __Run__ -> __Edit Configurations__; then in the __Run/Debug Configurations__ panel, you can add the necessary environment variables to your applications. -#### **1.3 Terminal Setup** +#### 1.3 Terminal Setup Besides setting the environment variables mentioned above manually for Linux users, we also provide a solution to set them with a script: @@ -123,9 +123,9 @@ python BigDL/python/dllib/examples/autograd/custom.py Note that this approach will only work temporarily for this terminal. -### **2. Scala** +### 2. Scala
-#### **2.1 Build** +#### 2.1 Build Maven 3 is needed to build BigDL; you can download it from the [maven website](https://maven.apache.org/download.cgi). @@ -162,7 +162,7 @@ Build with `make-dist.sh`: $ bash make-dist.sh -P spark_3.x -Djava.version=11 -Djavac.version=11 ``` -#### **2.2 IDE Setup** +#### 2.2 IDE Setup BigDL uses Maven to organize the project. You should choose an IDE that supports Maven projects and the Scala language. IntelliJ IDEA works fine for us. diff --git a/docs/readthedocs/source/doc/UserGuide/docker.md b/docs/readthedocs/source/doc/UserGuide/docker.md index 4d264d78..f07affba 100644 --- a/docs/readthedocs/source/doc/UserGuide/docker.md +++ b/docs/readthedocs/source/doc/UserGuide/docker.md @@ -2,7 +2,7 @@ --- -### **1. Pull Docker Image** +### 1. Pull Docker Image You may pull a Docker image from the [Docker Hub](https://hub.docker.com/r/intelanalytics/bigdl/tags). @@ -43,7 +43,7 @@ sudo systemctl daemon-reload sudo systemctl restart docker ``` -### **2. Launch Docker Container** +### 2. Launch Docker Container After pulling the BigDL Docker image, you can launch a BigDL Docker container: ``` @@ -70,11 +70,11 @@ The /opt/work directory contains: * BigDL is cloned from https://github.com/intel-analytics/BigDL.git, which contains apps and examples using BigDL. * opt/download-bigdl.sh is used for downloading BigDL distributions. -### **3. Run Jupyter Notebook Examples in the Container** +### 3. Run Jupyter Notebook Examples in the Container After a Docker container is launched and the user logs into the container, you can start the Jupyter Notebook service inside the container. -#### **3.1 Start the Jupyter Notebook services** +#### 3.1 Start the Jupyter Notebook services In the `/opt/work` directory, run this command line to start the Jupyter Notebook service: ``` @@ -90,7 +90,7 @@ You will see the output message like below. This means the Jupyter Notebook serv [I 07:40:39.355 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation). ``` -#### **3.2 Connect to Jupyter Notebook service from a browser** +#### 3.2 Connect to Jupyter Notebook service from a browser After the Jupyter Notebook service is successfully started, you can connect to it from a browser. @@ -100,7 +100,7 @@ As a result, you will see the Jupyter Notebook like this: ![](images/notebook1.jpg) -#### **3.3 Run BigDL Jupyter Notebooks** +#### 3.3 Run BigDL Jupyter Notebooks After connecting to the Jupyter Notebook in the browser, you can run multiple BigDL Jupyter Notebook examples. The example shown below is “dogs-vs-cats”. @@ -120,7 +120,7 @@ After connecting to the Jupyter Notebook in the browser, you can run multiple Bi ![](images/notebook5.jpg) -### **4. Shut Down Docker Container** +### 4. Shut Down Docker Container You should shut down the BigDL Docker container after using it. diff --git a/docs/readthedocs/source/doc/UserGuide/hadoop.md b/docs/readthedocs/source/doc/UserGuide/hadoop.md index 72c2c25f..aa8e4d2c 100644 --- a/docs/readthedocs/source/doc/UserGuide/hadoop.md +++ b/docs/readthedocs/source/doc/UserGuide/hadoop.md @@ -8,7 +8,7 @@ For _**Scala users**_, please see [Scala User Guide](./scala.md) for how to run For _**Python users**_, you can run BigDL programs on standard Hadoop/YARN clusters without any changes to the cluster (i.e., no need to pre-install BigDL or other Python libraries on all nodes in the cluster). -### **1. Prepare Python Environment** +### 1. Prepare Python Environment
- You need to first use [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the Python environment _**on the local machine**_ where you submit your application. Create a conda environment, and install BigDL and all the needed Python libraries in the created conda environment: @@ -54,7 +54,7 @@ For _**Python users**_, you can run BigDL programs on standard Hadoop/YARN clust Also, a CDH cluster's `HADOOP_CONF_DIR` is `/etc/hadoop/conf` by default. --- -### **2. Run on YARN with built-in function** +### 2. Run on YARN with built-in function _**This is the easiest and most recommended way to run BigDL on YARN,**_ as you don't need to care about environment preparation and Spark-related commands. In this way, you can easily switch your job between local (for test) and YARN (for production) by changing the "cluster_mode". @@ -85,7 +85,7 @@ _**This is the easiest and most recommended way to run BigDL on YARN,**_ as you ``` --- -### **3. Run on YARN with spark-submit** +### 3. Run on YARN with spark-submit Follow the steps below if you need to run BigDL with [spark-submit](https://spark.apache.org/docs/latest/running-on-yarn.html#launching-spark-on-yarn). diff --git a/docs/readthedocs/source/doc/UserGuide/k8s.md b/docs/readthedocs/source/doc/UserGuide/k8s.md index eb564f00..73815c76 100644 --- a/docs/readthedocs/source/doc/UserGuide/k8s.md +++ b/docs/readthedocs/source/doc/UserGuide/k8s.md @@ -2,7 +2,7 @@ --- -### **1. Pull `bigdl-k8s` Docker Image** +### 1. Pull `bigdl-k8s` Docker Image You may pull the prebuilt BigDL `bigdl-k8s` image from [Docker Hub](https://hub.docker.com/r/intelanalytics/bigdl-k8s/tags) as follows: @@ -32,7 +32,7 @@ sudo systemctl daemon-reload sudo systemctl restart docker ``` -### **2. Launch a Client Container** +### 2. Launch a Client Container You can submit BigDL applications from a client container that provides the required environment. @@ -102,7 +102,7 @@ The `/opt` directory contains: - spark is the spark home. - redis is the redis home. -### **3. Submit to k8s from remote** +### 3. Submit to k8s from remote Instead of launching a client container, you can also submit BigDL applications from a remote node with the following steps: @@ -118,13 +118,13 @@ Instead of lanuching a client container, you can also submit BigDL application f 2. Follow the steps in the [Python User Guide](./python.html#install) to install BigDL in a conda environment. -### **4. Run BigDL on k8s** +### 4. Run BigDL on k8s _**Note**: Please make sure `kubectl` has appropriate permission to create, list and delete pods._ You may refer to [Section 5](#known-issues) for some known issues when running BigDL on k8s. -#### **4.1 K8s client mode** +#### 4.1 K8s client mode We recommend using `init_orca_context` at the very beginning of your code (e.g. in script.py) to initiate and run BigDL on standard K8s clusters in [client mode](http://spark.apache.org/docs/latest/running-on-kubernetes.html#client-mode). @@ -140,7 +140,7 @@ Remark: You may need to specify Spark driver host and port if necessary by addin Execute `python script.py` to run your program on the k8s cluster directly. -#### **4.2 K8s cluster mode** +#### 4.2 K8s cluster mode For k8s [cluster mode](https://spark.apache.org/docs/3.1.2/running-on-kubernetes.html#cluster-mode), you can call `init_orca_context` and specify cluster_mode to be "spark-submit" in your python script (e.g. in script.py):
@@ -175,7 +175,7 @@ ${SPARK_HOME}/bin/spark-submit \ local:///path/script.py ``` -#### **4.3 Run Jupyter Notebooks** +#### 4.3 Run Jupyter Notebooks After a Docker container is launched and the user logs into the container, you can start the Jupyter Notebook service inside the container. @@ -195,7 +195,7 @@ You will see the output message like below. This means the Jupyter Notebook serv Then, refer to the [docker guide](./docker.md) to open the Jupyter Notebook service from a browser and run notebooks. -#### **4.4 Run Scala programs** +#### 4.4 Run Scala programs Use spark-submit to submit your BigDL program, e.g., run the [nnframes imageInference](../../../../../../scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/nnframes/imageInference) example (running in either local mode or cluster mode) as follows: @@ -241,11 +241,11 @@ Options: - --class: scala example class name. - --inputDir: input data path of the nnframe example. The data path is the mounted filesystem of the host. Refer to [Kubernetes Volumes](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes) for more details. -### **5 Known Issues** +### 5. Known Issues This section shows some common topics for both client mode and cluster mode. -#### **5.1 How to specify the Python environment?** +#### 5.1 How to specify the Python environment? In client mode, follow the [python user guide](./python.md) to install conda and BigDL and run your application: ```python @@ -263,7 +263,7 @@ In cluster mode, install conda, pack environment and use on both the driver and --archives local:///bigdl2.0/data/environment.tar.gz#env \ # this path should be accessible by the k8s pods ``` -#### **5.2 How to retain executor logs for debugging?** +#### 5.2 How to retain executor logs for debugging? K8s would delete the pod once the executor fails, in both client mode and cluster mode. If you want to keep the contents of the executor logs, you can set "temp-dir" to a mounted network file system (NFS) storage to redirect the log dir there. In this case, you may meet `JSONDecodeError` because multiple executors would write logs to the same physical folder and cause conflicts. The solutions are in the next section. @@ -271,11 +271,11 @@ The k8s would delete the pod once the executor failed in client mode and cluster init_orca_context(..., extra_params = {"temp-dir": "/bigdl/"}) ``` -#### **5.3 How to deal with "JSONDecodeError"?** +#### 5.3 How to deal with "JSONDecodeError"? If you set `temp-dir` to a mounted nfs storage and use multiple executors, you may meet `JSONDecodeError` since multiple executors would write to the same physical folder and cause conflicts. Not mounting `temp-dir` on shared storage is one option to avoid conflicts. But if you debug ray on k8s, you need to output logs to a shared storage; in this case, you can set num-nodes to 1. After testing, you can remove the `temp-dir` setting and run multiple executors. -#### **5.4 How to use NFS?** +#### 5.4 How to use NFS? If you want to save some files beyond the pod's lifecycle, such as logging callbacks or tensorboard callbacks, you need to set the output dir to a mounted persistent volume dir. Take NFS as a simple example. @@ -301,7 +301,7 @@ ${SPARK_HOME}/bin/spark-submit \ file:///path/script.py ``` -#### **5.5 How to deal with "RayActorError"?** +#### 5.5 How to deal with "RayActorError"? "RayActorError" may be caused by running out of Ray memory. If you meet this error, try to increase the memory for Ray.
@@ -309,15 +309,15 @@ ${SPARK_HOME}/bin/spark-submit \ init_orca_context(..., extra_executor_memory_for_ray="100g") ``` -#### **5.6 How to set proper "steps_per_epoch" and "validation steps"?** +#### 5.6 How to set proper "steps_per_epoch" and "validation steps"? The `steps_per_epoch` and `validation_steps` should equal the number of samples in the dataset divided by the batch size if you want to train on the whole dataset. The `steps_per_epoch` and `validation_steps` do not depend on `num_nodes` when the total dataset and batch size are fixed. For example, suppose you set `num_nodes` to 1 and `steps_per_epoch` to 6; if you change `num_nodes` to 3, `steps_per_epoch` should still be 6. -#### **5.7 Others** +#### 5.7 Others `spark.kubernetes.container.image.pullPolicy` needs to be specified as `Always` if you need to update your spark executor image for k8s. -### **6. Access logs and clear pods** +### 6. Access logs and clear pods When the application is running, it’s possible to stream logs on the driver pod: diff --git a/docs/readthedocs/source/doc/UserGuide/python.md b/docs/readthedocs/source/doc/UserGuide/python.md index 5eed6354..336d1541 100644 --- a/docs/readthedocs/source/doc/UserGuide/python.md +++ b/docs/readthedocs/source/doc/UserGuide/python.md @@ -3,7 +3,7 @@ --- Supported Platforms: Linux and macOS. For Windows, refer to [Windows User Guide](./win.md). -### **1. Install** +### 1. Install - We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the Python environment as follows: ```bash @@ -28,7 +28,7 @@ Supported Platforms: Linux and macOS. For Windows, Refer to [Windows User Guide] java -version # Verify the version of JDK. ``` -#### **1.1 Official Release** +#### 1.1 Official Release You can install the latest release version of BigDL (built on top of Spark 2.4.6 by default) as follows: ```bash @@ -37,7 +37,7 @@ pip install bigdl _**Note:** Installing BigDL will automatically install all the BigDL packages including `bigdl-nano`, `bigdl-dllib`, `bigdl-orca`, `bigdl-chronos`, `bigdl-friesian`, `bigdl-serving` and their dependencies if they haven't been detected in your conda environment._ -#### **1.2 Nightly Build** +#### 1.2 Nightly Build You can install the latest nightly build of BigDL as follows: @@ -60,7 +60,7 @@ You could uninstall all the packages of BigDL as follows: pip uninstall bigdl-dllib bigdl-core bigdl-tf bigdl-math bigdl-orca bigdl-chronos bigdl-friesian bigdl-nano bigdl-serving bigdl ``` -#### **1.3 BigDL on Spark 3** +#### 1.3 BigDL on Spark 3 You can install BigDL built on top of Spark 3.1.2 as follows: ```bash @@ -76,12 +76,12 @@ pip uninstall bigdl-dllib-spark3 bigdl-core bigdl-tf bigdl-math bigdl-orca-spark ``` --- -### **2. Run** +### 2. Run _**Note:** Installing BigDL from pip will automatically install `pyspark`. To avoid possible conflicts, you are highly recommended to **unset the environment variable `SPARK_HOME`** if it exists in your environment._ -#### **2.1 Interactive Shell** +#### 2.1 Interactive Shell You may test if the installation is successful using the interactive Python shell as follows: @@ -94,7 +94,7 @@ You may test if the installation is successful using the interactive Python shel sc = init_orca_context() # Initialization of BigDL on the underlying cluster.
@@ -102,7 +102,7 @@ You can start the Jupyter notebook as you normally do using the following comman
 ```
 jupyter notebook --notebook-dir=./ --ip=* --no-browser
 ```
 
-#### **2.3 Python Script**
+#### 2.3 Python Script
 
 You can directly write BigDL programs in a Python file (e.g. script.py) and run it from the command line as a normal Python program:
 
@@ -111,14 +111,14 @@
 python script.py
 ```
 ---
 
-### **3. Python Dependencies**
+### 3. Python Dependencies
 
 We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to manage your Python dependencies. Libraries installed in the current conda environment will be automatically distributed to the cluster when calling `init_orca_context`. You can also add extra dependencies as `.py`, `.zip` and `.egg` files by specifying the `extra_python_lib` argument in `init_orca_context` (see the sketch after this file's changes).
 
 For more details, please refer to [Orca Context](../Orca/Overview/orca-context.md).
 
 ---
 
-### **4. Compatibility**
+### 4. Compatibility
 
 BigDL has been tested on __Python 3.6 and 3.7__ with the following library versions:
 
@@ -155,18 +155,18 @@ Theano==1.0.4
 ```
 ---
 
-### **5. Known Issues**
+### 5. Known Issues
 
 - If you meet the following error when `pip install bigdl`:
-```
-ERROR: Could not find a version that satisfies the requirement pypandoc (from versions: none)
-ERROR: No matching distribution found for pypandoc
-Could not import pypandoc - required to package PySpark
-Traceback (most recent call last):
-  File "/root/anaconda3/lib/python3.8/site-packages/setuptools/installer.py", line 126, in fetch_build_egg
-    subprocess.check_call(cmd)
-  File "/root/anaconda3/lib/python3.8/subprocess.py", line 364, in check_call
-    raise CalledProcessError(retcode, cmd)
-subprocess.CalledProcessError: Command '['/root/anaconda3/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmprefr87ue', '--quiet', 'pypandoc']' returned non-zero exit status 1.
-```
-This is actually caused by `pip install pyspark` in your Python environment. You can fix it by running `pip install pypandoc` first and then `pip install bigdl`.
+  ```
+  ERROR: Could not find a version that satisfies the requirement pypandoc (from versions: none)
+  ERROR: No matching distribution found for pypandoc
+  Could not import pypandoc - required to package PySpark
+  Traceback (most recent call last):
+    File "/root/anaconda3/lib/python3.8/site-packages/setuptools/installer.py", line 126, in fetch_build_egg
+      subprocess.check_call(cmd)
+    File "/root/anaconda3/lib/python3.8/subprocess.py", line 364, in check_call
+      raise CalledProcessError(retcode, cmd)
+  subprocess.CalledProcessError: Command '['/root/anaconda3/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmprefr87ue', '--quiet', 'pypandoc']' returned non-zero exit status 1.
+  ```
+  This is caused by `pip install pyspark` in your Python environment. You can fix it by running `pip install pypandoc` first and then `pip install bigdl`.
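As a concrete illustration of the `extra_python_lib` mechanism described in section 3 of the Python guide above, the following sketch ships two dependency files to the cluster. The file names and the `yarn-client` mode are assumptions made for illustration only:

```python
from bigdl.orca import init_orca_context

# Hypothetical files that live next to the driver script.
sc = init_orca_context(
    cluster_mode="yarn-client",
    extra_python_lib="my_utils.py,deps.zip",  # .py, .zip and .egg files, comma-separated
)

import my_utils  # now importable on the driver and the executors
```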
diff --git a/docs/readthedocs/source/doc/UserGuide/scala.md b/docs/readthedocs/source/doc/UserGuide/scala.md
index 63a4e74d..f4c7446e 100644
--- a/docs/readthedocs/source/doc/UserGuide/scala.md
+++ b/docs/readthedocs/source/doc/UserGuide/scala.md
@@ -3,10 +3,10 @@
 ---
 Supported Platforms: Linux and macOS.
 
 _**Note:** Windows is currently not supported._
 
-### **1. Try BigDL Examples**
+### 1. Try BigDL Examples
 This section shows how to download the BigDL prebuilt packages and run the built-in examples.
 
-#### **1.1 Download and config**
+#### 1.1 Download and configure
 You can download the BigDL official releases and nightly builds from the [Release Page](../release.md). After extracting the prebuilt package, you need to set the environment variables **BIGDL_HOME** and **SPARK_HOME** as follows:
 ```bash
@@ -14,7 +14,7 @@ export SPARK_HOME=folder path where you extract the Spark package
 export BIGDL_HOME=folder path where you extract the BigDL package
 ```
 
-#### **1.2 Use Spark interactive shell**
+#### 1.2 Use Spark interactive shell
 You can try BigDL using the Spark interactive shell as follows:
 
 ```bash
@@ -60,7 +60,7 @@ scala> val seq = Sequential()
 seq.add(layer)
 ```
 
-#### **1.3 Run BigDL examples**
+#### 1.3 Run BigDL examples
 
 You can run a bigdl-dllib program, e.g., the [Language Model](https://github.com/intel-analytics/BigDL/tree/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/languagemodel), as a standard Spark program (running on either a local machine or a distributed cluster) as follows:
 
@@ -143,12 +143,12 @@ If you are to run your own program, do remember to do the initialize before call
 ```
 ---
 
-### **2. Build BigDL Applications**
+### 2. Build BigDL Applications
 
 This section shows how to build your own deep learning project with BigDL.
 
-#### **2.1 Add BigDL dependency**
-##### **2.1.1 official Release**
+#### 2.1 Add BigDL dependency
+##### 2.1.1 Official Release
 Currently, BigDL releases are hosted on Maven Central; below is an example of adding the BigDL dllib dependency to your own project:
 ```xml
@@ -167,7 +167,7 @@ SBT developers can use
 libraryDependencies += "com.intel.analytics.bigdl" % "bigdl-dllib-spark_2.4.6" % "0.14.0"
 ```
 
-##### **2.1.2 Nightly Build**
+##### 2.1.2 Nightly Build
 
 Currently, the BigDL nightly builds are hosted on [Sonatype](https://oss.sonatype.org/content/groups/public/com/intel/analytics/bigdl/).
 
@@ -194,6 +194,6 @@ resolvers += "ossrh repository" at "https://oss.sonatype.org/content/repositorie
 ```
 
-#### **2.2 Build a Scala project**
+#### 2.2 Build a Scala project
 
 To enable BigDL in your project, you should add BigDL to your project's dependencies using Maven or SBT. Here is a [simple MLP example](https://github.com/intel-analytics/BigDL/tree/branch-2.0/apps/SimpleMlp) that shows how to use BigDL to build your own deep learning project with Maven or SBT, and how to run the simple example in IDEA and with spark-submit.
diff --git a/docs/readthedocs/source/doc/UserGuide/win.md b/docs/readthedocs/source/doc/UserGuide/win.md
index edc80ec4..2782185b 100644
--- a/docs/readthedocs/source/doc/UserGuide/win.md
+++ b/docs/readthedocs/source/doc/UserGuide/win.md
@@ -60,9 +60,9 @@ pip install bigdl
 **Related Readings**
 ^^^
-    * `BigDL Installation Guide <../UserGuide/python>`_
-    * `Nano Installation Guide <../Nano/Overview/nano.html#install>`_
-    * `Chronos Installation Guide <../Chronos/Overview/chronos.html#install>`_
+    * `BigDL Installation Guide <./python.html>`_
+    * `Nano Installation Guide <../Nano/Overview/install.html>`_
+    * `Chronos Installation Guide <../Chronos/Overview/install.html>`_
 ```
 
 ### Setup Jupyter Notebook Environment
diff --git a/docs/readthedocs/source/doc/error-log-api.md b/docs/readthedocs/source/doc/error-log-api.md
index a164adf6..80fbea2a 100644
--- a/docs/readthedocs/source/doc/error-log-api.md
+++ b/docs/readthedocs/source/doc/error-log-api.md
@@ -1,7 +1,7 @@
 BigDL provides an error handling API. Please don't use `assert`, `raise`, or `throw` to fail the application. Use the error handling API instead; it will provide useful messages for debugging.
 
-## **Error handling API**
+## Error handling API
 
 **Scala**
 
@@ -69,7 +69,7 @@ invalidOperationError(condition, errMsg, fixMsg=None, cause=None)
 
 * `cause`: the exception to throw.
 
 ---
-## **Examples**
+## Examples
 
 **Scala**
 
 If you want to use:
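To show how the error handling API replaces a bare `assert` or `raise`, here is a hedged Python sketch using the `invalidOperationError` signature documented above; the import path is an assumption based on the current BigDL source layout and may differ across versions:

```python
# Assumed import path; verify against your BigDL version.
from bigdl.dllib.utils.log4Error import invalidOperationError

def scale(values, factor):
    # Fails with a debuggable message instead of a bare assert/raise.
    invalidOperationError(factor != 0,
                          errMsg="factor must be non-zero",
                          fixMsg="pass a non-zero scaling factor")
    return [v * factor for v in values]
```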