Update_document by heyang (#30)

Wang, Jian4 2024-03-25 10:06:02 +08:00 committed by GitHub
parent e2d25de17d
commit 16b2ef49c6
579 changed files with 1940 additions and 25873 deletions

README.md

@@ -1,38 +1,31 @@
<div align="center">
## IPEX-LLM
<p align="center"> <img src="https://llm-assets.readthedocs.io/en/latest/_images/bigdl_logo.jpg" height="140px"><br></p>
</div>
---
## BigDL-LLM
**`bigdl-llm`** is a library for running **LLM** (large language model) on Intel **XPU** (from *Laptop* to *GPU* to *Cloud*) using **INT4/FP4/INT8/FP8** with very low latency[^1] (for any **PyTorch** model).
**`ipex-llm`** is a library for running **LLM** (large language model) on Intel **XPU** (from *Laptop* to *GPU* to *Cloud*) using **INT4/FP4/INT8/FP8** with very low latency[^1] (for any **PyTorch** model).
> *It is built on the excellent work of [llama.cpp](https://github.com/ggerganov/llama.cpp), [bitsandbytes](https://github.com/TimDettmers/bitsandbytes), [qlora](https://github.com/artidoro/qlora), [gptq](https://github.com/IST-DASLab/gptq), [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ), [awq](https://github.com/mit-han-lab/llm-awq), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), [vLLM](https://github.com/vllm-project/vllm), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [gptq_for_llama](https://github.com/qwopqwop200/GPTQ-for-LLaMa), [chatglm.cpp](https://github.com/li-plus/chatglm.cpp), [redpajama.cpp](https://github.com/togethercomputer/redpajama.cpp), [gptneox.cpp](https://github.com/byroneverson/gptneox.cpp), [bloomz.cpp](https://github.com/NouamaneTazi/bloomz.cpp/), etc.*
### Latest update 🔥
- [2024/03] **LangChain** added support for `bigdl-llm`; see the details [here](https://python.langchain.com/docs/integrations/llms/bigdl).
- [2024/02] `bigdl-llm` now supports directly loading model from [ModelScope](python/llm/example/GPU/ModelScope-Models) ([魔搭](python/llm/example/CPU/ModelScope-Models)).
- [2024/02] `bigdl-llm` added initial **INT2** support (based on llama.cpp [IQ2](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF-IQ2) mechanism), which makes it possible to run large-size LLM (e.g., Mixtral-8x7B) on Intel GPU with 16GB VRAM.
- [2024/02] Users can now use `bigdl-llm` through [Text-Generation-WebUI](https://github.com/intel-analytics/text-generation-webui) GUI.
- [2024/02] `bigdl-llm` now supports *[Self-Speculative Decoding](https://bigdl.readthedocs.io/en/latest/doc/LLM/Inference/Self_Speculative_Decoding.html)*, which in practice brings **~30% speedup** for FP16 and BF16 inference latency on Intel [GPU](python/llm/example/GPU/Speculative-Decoding) and [CPU](python/llm/example/CPU/Speculative-Decoding) respectively.
- [2024/02] `bigdl-llm` now supports a comprehensive list of LLM finetuning on Intel GPU (including [LoRA](python/llm/example/GPU/LLM-Finetuning/LoRA), [QLoRA](python/llm/example/GPU/LLM-Finetuning/QLoRA), [DPO](python/llm/example/GPU/LLM-Finetuning/DPO), [QA-LoRA](python/llm/example/GPU/LLM-Finetuning/QA-LoRA) and [ReLoRA](python/llm/example/GPU/LLM-Finetuning/ReLora)).
- [2024/01] Using `bigdl-llm` [QLoRA](python/llm/example/GPU/LLM-Finetuning/QLoRA), we managed to finetune LLaMA2-7B in **21 minutes** and LLaMA2-70B in **3.14 hours** on 8 Intel Max 1550 GPUs for [Stanford-Alpaca](python/llm/example/GPU/LLM-Finetuning/QLoRA/alpaca-qlora) (see the blog [here](https://www.intel.com/content/www/us/en/developer/articles/technical/finetuning-llms-on-intel-gpus-using-bigdl-llm.html)).
- [2024/01] 🔔🔔🔔 ***The default `bigdl-llm` GPU Linux installation has switched from PyTorch 2.0 to PyTorch 2.1, which requires new oneAPI and GPU driver versions. (See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.)***
- [2023/12] `bigdl-llm` now supports [ReLoRA](python/llm/example/GPU/LLM-Finetuning/ReLora) (see *["ReLoRA: High-Rank Training Through Low-Rank Updates"](https://arxiv.org/abs/2307.05695)*).
- [2023/12] `bigdl-llm` now supports [Mixtral-8x7B](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) on both Intel [GPU](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) and [CPU](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mixtral).
- [2023/12] `bigdl-llm` now supports [QA-LoRA](python/llm/example/GPU/LLM-Finetuning/QA-LoRA) (see *["QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"](https://arxiv.org/abs/2309.14717)*).
- [2023/12] `bigdl-llm` now supports [FP8 and FP4 inference](python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types) on Intel ***GPU***.
- [2023/11] Initial support for directly loading [GGUF](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF), [AWQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/AWQ) and [GPTQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GPTQ) models into `bigdl-llm` is available.
- [2023/11] `bigdl-llm` now supports [vLLM continuous batching](python/llm/example/GPU/vLLM-Serving) on both Intel [GPU](python/llm/example/GPU/vLLM-Serving) and [CPU](python/llm/example/CPU/vLLM-Serving).
- [2023/10] `bigdl-llm` now supports [QLoRA finetuning](python/llm/example/GPU/LLM-Finetuning/QLoRA) on both Intel [GPU](python/llm/example/GPU/LLM-Finetuning/QLoRA) and [CPU](python/llm/example/CPU/QLoRA-FineTuning).
- [2023/10] `bigdl-llm` now supports [FastChat serving](python/llm/src/bigdl/llm/serving) on both Intel CPU and GPU.
- [2023/09] `bigdl-llm` now supports [Intel GPU](python/llm/example/GPU) (including iGPU, Arc, Flex and MAX).
- [2023/09] `bigdl-llm` [tutorial](https://github.com/intel-analytics/bigdl-llm-tutorial) is released.
- [2023/09] Over 40 models have been optimized/verified on `bigdl-llm`, including *LLaMA/LLaMA2, ChatGLM2/ChatGLM3, Mistral, Falcon, MPT, LLaVA, WizardCoder, Dolly, Whisper, Baichuan/Baichuan2, InternLM, Skywork, QWen/Qwen-VL, Aquila, MOSS,* and more; see the complete list [here](#verified-models).
### `bigdl-llm` Demos
- [2024/03] **LangChain** added support for `ipex-llm`; see the details [here](https://python.langchain.com/docs/integrations/llms/bigdl).
- [2024/02] `ipex-llm` now supports directly loading model from [ModelScope](python/llm/example/GPU/ModelScope-Models) ([魔搭](python/llm/example/CPU/ModelScope-Models)).
- [2024/02] `ipex-llm` added initial **INT2** support (based on llama.cpp [IQ2](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF-IQ2) mechanism), which makes it possible to run large-size LLM (e.g., Mixtral-8x7B) on Intel GPU with 16GB VRAM.
- [2024/02] Users can now use `ipex-llm` through [Text-Generation-WebUI](https://github.com/intel-analytics/text-generation-webui) GUI.
- [2024/02] `ipex-llm` now supports *[Self-Speculative Decoding](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Inference/Self_Speculative_Decoding.html)*, which in practice brings **~30% speedup** for FP16 and BF16 inference latency on Intel [GPU](python/llm/example/GPU/Speculative-Decoding) and [CPU](python/llm/example/CPU/Speculative-Decoding) respectively.
- [2024/02] `ipex-llm` now supports a comprehensive list of LLM finetuning on Intel GPU (including [LoRA](python/llm/example/GPU/LLM-Finetuning/LoRA), [QLoRA](python/llm/example/GPU/LLM-Finetuning/QLoRA), [DPO](python/llm/example/GPU/LLM-Finetuning/DPO), [QA-LoRA](python/llm/example/GPU/LLM-Finetuning/QA-LoRA) and [ReLoRA](python/llm/example/GPU/LLM-Finetuning/ReLora)).
- [2024/01] Using `ipex-llm` [QLoRA](python/llm/example/GPU/LLM-Finetuning/QLoRA), we managed to finetune LLaMA2-7B in **21 minutes** and LLaMA2-70B in **3.14 hours** on 8 Intel Max 1550 GPUs for [Stanford-Alpaca](python/llm/example/GPU/LLM-Finetuning/QLoRA/alpaca-qlora) (see the blog [here](https://www.intel.com/content/www/us/en/developer/articles/technical/finetuning-llms-on-intel-gpus-using-bigdl-llm.html)).
- [2024/01] 🔔🔔🔔 ***The default `ipex-llm` GPU Linux installation has switched from PyTorch 2.0 to PyTorch 2.1, which requires new oneAPI and GPU driver versions. (See the [GPU installation guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.)***
- [2023/12] `ipex-llm` now supports [ReLoRA](python/llm/example/GPU/LLM-Finetuning/ReLora) (see *["ReLoRA: High-Rank Training Through Low-Rank Updates"](https://arxiv.org/abs/2307.05695)*).
- [2023/12] `ipex-llm` now supports [Mixtral-8x7B](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) on both Intel [GPU](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) and [CPU](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mixtral).
- [2023/12] `ipex-llm` now supports [QA-LoRA](python/llm/example/GPU/LLM-Finetuning/QA-LoRA) (see *["QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"](https://arxiv.org/abs/2309.14717)*).
- [2023/12] `ipex-llm` now supports [FP8 and FP4 inference](python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types) on Intel ***GPU***.
- [2023/11] Initial support for directly loading [GGUF](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF), [AWQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/AWQ) and [GPTQ](python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations/GPTQ) models into `ipex-llm` is available.
- [2023/11] `ipex-llm` now supports [vLLM continuous batching](python/llm/example/GPU/vLLM-Serving) on both Intel [GPU](python/llm/example/GPU/vLLM-Serving) and [CPU](python/llm/example/CPU/vLLM-Serving).
- [2023/10] `ipex-llm` now supports [QLoRA finetuning](python/llm/example/GPU/LLM-Finetuning/QLoRA) on both Intel [GPU](python/llm/example/GPU/LLM-Finetuning/QLoRA) and [CPU](python/llm/example/CPU/QLoRA-FineTuning).
- [2023/10] `ipex-llm` now supports [FastChat serving](python/llm/src/ipex_llm/llm/serving) on both Intel CPU and GPU.
- [2023/09] `ipex-llm` now supports [Intel GPU](python/llm/example/GPU) (including iGPU, Arc, Flex and MAX).
- [2023/09] `ipex-llm` [tutorial](https://github.com/intel-analytics/ipex-llm-tutorial) is released.
- [2023/09] Over 40 models have been optimized/verified on `ipex-llm`, including *LLaMA/LLaMA2, ChatGLM2/ChatGLM3, Mistral, Falcon, MPT, LLaVA, WizardCoder, Dolly, Whisper, Baichuan/Baichuan2, InternLM, Skywork, QWen/Qwen-VL, Aquila, MOSS,* and more; see the complete list [here](#verified-models).
### `ipex-llm` Demos
See the ***optimized performance*** of `chatglm2-6b` and `llama-2-13b-chat` models on 12th Gen Intel Core CPU and Intel Arc GPU below.
<table width="100%">
@@ -62,11 +55,11 @@ See the ***optimized performance*** of `chatglm2-6b` and `llama-2-13b-chat` mode
</tr>
</table>
### `bigdl-llm` quickstart
### `ipex-llm` quickstart
- [Windows GPU installation](https://bigdl.readthedocs.io/en/latest/doc/LLM/Quickstart/install_windows_gpu.html)
- [Run BigDL-LLM in Text-Generation-WebUI](https://bigdl.readthedocs.io/en/latest/doc/LLM/Quickstart/webui_quickstart.html)
- [Run BigDL-LLM using Docker](docker/llm)
- [Windows GPU installation](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/install_windows_gpu.html)
- [Run IPEX-LLM in Text-Generation-WebUI](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/webui_quickstart.html)
- [Run IPEX-LLM using Docker](docker/llm)
- [CPU INT4](#cpu-int4)
- [GPU INT4](#gpu-int4)
- [More Low-Bit support](#more-low-bit-support)
@@ -74,12 +67,12 @@ See the ***optimized performance*** of `chatglm2-6b` and `llama-2-13b-chat` mode
#### CPU INT4
##### Install
You may install **`bigdl-llm`** on Intel CPU as follows:
> Note: See the [CPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_cpu.html) for more details.
You may install **`ipex-llm`** on Intel CPU as follows:
> Note: See the [CPU installation guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_cpu.html) for more details.
```bash
pip install --pre --upgrade bigdl-llm[all]
pip install --pre --upgrade ipex-llm[all]
```
> Note: `bigdl-llm` has been tested on Python 3.9, 3.10 and 3.11
> Note: `ipex-llm` has been tested on Python 3.9, 3.10 and 3.11
##### Run Model
You may apply INT4 optimizations to any Hugging Face *Transformers* models as follows.
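The unchanged example code for this step is collapsed out of the diff view; the following is a rough, hedged sketch of what INT4 loading and CPU generation looks like, assuming the post-rename module path `ipex_llm.transformers` and using placeholder model path and prompt:

```python
# Hedged sketch (not the verbatim README snippet): load a Hugging Face Transformers
# model with INT4 optimizations and generate on CPU. The `ipex_llm.transformers`
# module path, model path and prompt below are assumptions.
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = '/path/to/model/'  # placeholder
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

input_ids = tokenizer.encode("What is AI?", return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=32)
output = tokenizer.batch_decode(output_ids)
```

Apart from the import and the `load_in_4bit=True` argument, the flow is intended to mirror standard Transformers usage.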
@@ -100,13 +93,13 @@ output = tokenizer.batch_decode(output_ids)
#### GPU INT4
##### Install
You may install **`bigdl-llm`** on Intel GPU as follows:
> Note: See the [GPU installation guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.
You may install **`ipex-llm`** on Intel GPU as follows:
> Note: See the [GPU installation guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html) for more details.
```bash
# the command below installs intel_extension_for_pytorch==2.1.10+xpu by default
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install --pre --upgrade ipex-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
```
> Note: `bigdl-llm` has been tested on Python 3.9, 3.10 and 3.11
> Note: `ipex-llm` has been tested on Python 3.9, 3.10 and 3.11
##### Run Model
You may apply INT4 optimizations to any Hugging Face *Transformers* models as follows.
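The GPU snippet is likewise collapsed in this diff; a hedged sketch (same assumptions as the CPU version, plus moving the optimized model and inputs to the `xpu` device) might look like:

```python
# Hedged sketch (assumption, not the verbatim README snippet): run the INT4-optimized
# model on an Intel GPU by moving the model and inputs to the 'xpu' device.
import torch
import intel_extension_for_pytorch as ipex  # importing IPEX registers the 'xpu' device
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = '/path/to/model/'  # placeholder
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
model = model.to('xpu')
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

with torch.inference_mode():
    input_ids = tokenizer.encode("What is AI?", return_tensors="pt").to('xpu')
    output_ids = model.generate(input_ids, max_new_tokens=32)
    output = tokenizer.batch_decode(output_ids.cpu())
```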
@@ -130,7 +123,7 @@ output = tokenizer.batch_decode(output_ids.cpu())
#### More Low-Bit Support
##### Save and load
After the model is optimized using `bigdl-llm`, you may save and load the model as follows:
After the model is optimized using `ipex-llm`, you may save and load the model as follows:
```python
model.save_low_bit(model_path)
new_model = AutoModelForCausalLM.load_low_bit(model_path)
@@ -138,7 +131,7 @@ new_model = AutoModelForCausalLM.load_low_bit(model_path)
*See the complete example [here](python/llm/example/CPU/HF-Transformers-AutoModels/Save-Load).*
##### Additional data types
In addition to INT4, you may apply other low-bit optimizations (such as *INT8*, *INT5*, *NF4*, etc.) as follows:
```python
model = AutoModelForCausalLM.from_pretrained('/path/to/model/', load_in_low_bit="sym_int8")
@@ -146,470 +139,62 @@ model = AutoModelForCausalLM.from_pretrained('/path/to/model/', load_in_low_bit=
*See the complete example [here](python/llm/example/CPU/HF-Transformers-AutoModels/More-Data-Types).*
#### Verified Models
Over 40 models have been optimized/verified on `bigdl-llm`, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, Mistral, Falcon, MPT, Baichuan/Baichuan2, InternLM, QWen* and more; see the example list below.
| Model | CPU Example | GPU Example |
|------------|----------------------------------------------------------------|-----------------------------------------------------------------|
| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/vicuna) |[link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/vicuna)|
| LLaMA 2 | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2) | [link1](python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2), [link2-low GPU memory example](python/llm/example/GPU/PyTorch-Models/Model/llama2#example-2---low-memory-version-predict-tokens-using-generate-api) |
| ChatGLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm) | |
| ChatGLM2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm2) |
| ChatGLM3 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm3) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm3) |
| Mistral | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mistral) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mistral) |
| Mixtral | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mixtral) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) |
| Falcon | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/falcon) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/falcon) |
| MPT | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mpt) |
| Dolly-v1 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/dolly-v1) |
| Dolly-v2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/dolly-v2) |
| Replit Code| [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/replit) |
| RedPajama | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/redpajama) | |
| Phoenix | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phoenix) | |
| StarCoder | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/starcoder) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/starcoder) |
| Baichuan | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan) |
| Baichuan2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2) |
| InternLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm) |
| Qwen | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen) |
| Qwen1.5 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen1.5) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen1.5) |
| Qwen-VL | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen-vl) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen-vl) |
| Aquila | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila) |
| Aquila2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila2) |
| MOSS | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/moss) | |
| Whisper | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/whisper) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/whisper) |
| Phi-1_5 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phi-1_5) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/phi-1_5) |
| Flan-t5 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/flan-t5) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/flan-t5) |
| LLaVA | [link](python/llm/example/CPU/PyTorch-Models/Model/llava) | [link](python/llm/example/GPU/PyTorch-Models/Model/llava) |
| CodeLlama | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/codellama) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/codellama) |
| Skywork | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/skywork) | |
| InternLM-XComposer | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm-xcomposer) | |
| WizardCoder-Python | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/wizardcoder-python) | |
| CodeShell | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/codeshell) | |
| Fuyu | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/fuyu) | |
| Distil-Whisper | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/distil-whisper) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/distil-whisper) |
| Yi | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/yi) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/yi) |
| BlueLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/bluelm) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/bluelm) |
| Mamba | [link](python/llm/example/CPU/PyTorch-Models/Model/mamba) | [link](python/llm/example/GPU/PyTorch-Models/Model/mamba) |
| SOLAR | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/solar) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/solar) |
| Phixtral | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phixtral) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/phixtral) |
| InternLM2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm2) |
| RWKV4 | | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/rwkv4) |
| RWKV5 | | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/rwkv5) |
| Bark | [link](python/llm/example/CPU/PyTorch-Models/Model/bark) | [link](python/llm/example/GPU/PyTorch-Models/Model/bark) |
| SpeechT5 | | [link](python/llm/example/GPU/PyTorch-Models/Model/speech-t5) |
| DeepSeek-MoE | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/deepseek-moe) | |
| Ziya-Coding-34B-v1.0 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/ziya) | |
| Phi-2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phi-2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/phi-2) |
| Yuan2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/yuan2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/yuan2) |
| Gemma | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/gemma) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/gemma) |
| DeciLM-7B | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/deciLM-7b) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/deciLM-7b) |
| Deepseek | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/deepseek) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/deepseek) |
***For more details, please refer to the `bigdl-llm` [Document](https://test-bigdl-llm.readthedocs.io/en/main/doc/LLM/index.html), [Readme](python/llm), [Tutorial](https://github.com/intel-analytics/bigdl-llm-tutorial) and [API Doc](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/LLM/index.html).***
---
## Overview of the complete BigDL project
BigDL seamlessly scales your data analytics & AI applications from laptop to cloud, with the following libraries:
- [LLM](python/llm): Low-bit (INT3/INT4/INT5/INT8) large language model library for Intel CPU/GPU
- [Orca](#orca): Distributed Big Data & AI (TF & PyTorch) Pipeline on Spark and Ray
- [Nano](#nano): Transparent Acceleration of Tensorflow & PyTorch Programs on Intel CPU/GPU
- [DLlib](#dllib): “Equivalent of Spark MLlib” for Deep Learning
- [Chronos](#chronos): Scalable Time Series Analysis using AutoML
- [Friesian](#friesian): End-to-End Recommendation Systems
- [PPML](#ppml): Secure Big Data and AI (with SGX/TDX Hardware Security)
For more information, you may [read the docs](https://bigdl.readthedocs.io/).
---
## Choosing the right BigDL library
```mermaid
flowchart TD;
Feature1{{HW Secured Big Data & AI?}};
Feature1-- No -->Feature2{{Python vs. Scala/Java?}};
Feature1-- "Yes" -->ReferPPML([<em><strong>PPML</strong></em>]);
Feature2-- Python -->Feature3{{What type of application?}};
Feature2-- Scala/Java -->ReferDLlib([<em><strong>DLlib</strong></em>]);
Feature3-- "Large Language Model" -->ReferLLM([<em><strong>LLM</strong></em>]);
Feature3-- "Big Data + AI (TF/PyTorch)" -->ReferOrca([<em><strong>Orca</strong></em>]);
Feature3-- Accelerate TensorFlow / PyTorch -->ReferNano([<em><strong>Nano</strong></em>]);
Feature3-- DL for Spark MLlib -->ReferDLlib2([<em><strong>DLlib</strong></em>]);
Feature3-- High Level App Framework -->Feature4{{Domain?}};
Feature4-- Time Series -->ReferChronos([<em><strong>Chronos</strong></em>]);
Feature4-- Recommender System -->ReferFriesian([<em><strong>Friesian</strong></em>]);
click ReferLLM "https://github.com/intel-analytics/bigdl/tree/main/python/llm"
click ReferNano "https://github.com/intel-analytics/bigdl#nano"
click ReferOrca "https://github.com/intel-analytics/bigdl#orca"
click ReferDLlib "https://github.com/intel-analytics/bigdl#dllib"
click ReferDLlib2 "https://github.com/intel-analytics/bigdl#dllib"
click ReferChronos "https://github.com/intel-analytics/bigdl#chronos"
click ReferFriesian "https://github.com/intel-analytics/bigdl#friesian"
click ReferPPML "https://github.com/intel-analytics/bigdl#ppml"
classDef ReferStyle1 fill:#5099ce,stroke:#5099ce;
classDef Feature fill:#FFF,stroke:#08409c,stroke-width:1px;
class ReferLLM,ReferNano,ReferOrca,ReferDLlib,ReferDLlib2,ReferChronos,ReferFriesian,ReferPPML ReferStyle1;
class Feature1,Feature2,Feature3,Feature4,Feature5,Feature6,Feature7 Feature;
```
---
## Installing
- To install BigDL, we recommend using a [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) environment:
```bash
conda create -n my_env
conda activate my_env
pip install bigdl
```
To install the latest nightly build, use `pip install --pre --upgrade bigdl`; see the [Python](https://bigdl.readthedocs.io/en/latest/doc/UserGuide/python.html) and [Scala](https://bigdl.readthedocs.io/en/latest/doc/UserGuide/scala.html) user guides for more details.
- To install each individual library, such as Chronos, use `pip install bigdl-chronos`; see the [document website](https://bigdl.readthedocs.io/) for more details.
---
## Getting Started
### Orca
- The _Orca_ library seamlessly scales out your single-node **TensorFlow**, **PyTorch** or **OpenVINO** programs across large clusters (so as to process distributed Big Data).
<details><summary>Show Orca example</summary>
<br/>
You can build end-to-end, distributed data processing & AI programs using _Orca_ in 4 simple steps:
```python
# 1. Initialize Orca Context (to run your program on K8s, YARN or a local laptop)
from bigdl.orca import init_orca_context, OrcaContext
sc = init_orca_context(cluster_mode="k8s", cores=4, memory="10g", num_nodes=2)
# 2. Perform distributed data processing (supporting Spark DataFrames,
# TensorFlow Dataset, PyTorch DataLoader, Ray Dataset, Pandas, Pillow, etc.)
spark = OrcaContext.get_spark_session()
df = spark.read.parquet(file_path)
df = df.withColumn('label', df.label-1)
...
# 3. Build deep learning models using standard framework APIs
# (supporting TensorFlow, PyTorch, Keras, OpenVINO, etc.)
from tensorflow import keras
...
model = keras.models.Model(inputs=[user, item], outputs=predictions)
model.compile(...)
# 4. Use Orca Estimator for distributed training/inference
from bigdl.orca.learn.tf.estimator import Estimator
est = Estimator.from_keras(keras_model=model)
est.fit(data=df,
feature_cols=['user', 'item'],
label_cols=['label'],
...)
```
</details>
*See Orca [user guide](https://bigdl.readthedocs.io/en/latest/doc/Orca/Overview/orca.html), as well as [TensorFlow](https://bigdl.readthedocs.io/en/latest/doc/Orca/Howto/tf2keras-quickstart.html) and [PyTorch](https://bigdl.readthedocs.io/en/latest/doc/Orca/Howto/pytorch-quickstart.html) quickstarts, for more details.*
- In addition, you can also run standard **Ray** programs on a Spark cluster using _**RayOnSpark**_ in Orca.
<details><summary>Show RayOnSpark example</summary>
<br/>
You can not only run Ray programs on a Spark cluster, but also write Ray code inline with Spark code (so as to process in-memory Spark RDDs or DataFrames) using _RayOnSpark_ in Orca.
```python
# 1. Initialize Orca Context (to run your program on K8s, YARN or a local laptop)
from bigdl.orca import init_orca_context, OrcaContext
sc = init_orca_context(cluster_mode="yarn", cores=4, memory="10g", num_nodes=2, init_ray_on_spark=True)
# 2. Distributed data processing using Spark
spark = OrcaContext.get_spark_session()
df = spark.read.parquet(file_path).withColumn(...)
# 3. Convert Spark DataFrame to Ray Dataset
from bigdl.orca.data import spark_df_to_ray_dataset
dataset = spark_df_to_ray_dataset(df)
# 4. Use Ray to operate on Ray Datasets
import ray
@ray.remote
def consume(data) -> int:
num_batches = 0
for batch in data.iter_batches(batch_size=10):
num_batches += 1
return num_batches
print(ray.get(consume.remote(dataset)))
```
</details>
*See RayOnSpark [user guide](https://bigdl.readthedocs.io/en/latest/doc/Orca/Overview/ray.html) and [quickstart](https://bigdl.readthedocs.io/en/latest/doc/Orca/Howto/ray-quickstart.html) for more details.*
### Nano
You can transparently accelerate your TensorFlow or PyTorch programs on your laptop or server using *Nano*. With minimal code changes, *Nano* automatically applies modern CPU optimizations (e.g., SIMD, multiprocessing, low precision, etc.) to standard TensorFlow and PyTorch code, with up to 10x speedup.
<details><summary>Show Nano inference example</summary>
<br/>
You can automatically optimize a trained PyTorch model for inference or deployment using _Nano_:
```python
model = ResNet18()
model.load_state_dict(...)
train_dataloader = ...
val_dataloader = ...
def accuracy(pred, target):
...
from bigdl.nano.pytorch import InferenceOptimizer
optimizer = InferenceOptimizer()
optimizer.optimize(model,
training_data=train_dataloader,
validation_data=val_dataloader,
metric=accuracy)
new_model, config = optimizer.get_best_model()
optimizer.summary()
```
The output of `optimizer.summary()` will be something like:
```
-------------------------------- ---------------------- -------------- ----------------------
| method | status | latency(ms) | metric value |
-------------------------------- ---------------------- -------------- ----------------------
| original | successful | 45.145 | 0.975 |
| bf16 | successful | 27.549 | 0.975 |
| static_int8 | successful | 11.339 | 0.975 |
| jit_fp32_ipex | successful | 40.618 | 0.975* |
| jit_fp32_ipex_channels_last | successful | 19.247 | 0.975* |
| jit_bf16_ipex | successful | 10.149 | 0.975 |
| jit_bf16_ipex_channels_last | successful | 9.782 | 0.975 |
| openvino_fp32 | successful | 22.721 | 0.975* |
| openvino_int8 | successful | 5.846 | 0.962 |
| onnxruntime_fp32 | successful | 20.838 | 0.975* |
| onnxruntime_int8_qlinear | successful | 7.123 | 0.981 |
-------------------------------- ---------------------- -------------- ----------------------
* means we assume the metric value of the traced model does not change, so we don't recompute metric value to save time.
Optimization cost 60.8s in total.
```
</details>
<details><summary>Show Nano Training example</summary>
<br/>
You may easily accelerate PyTorch training (e.g., IPEX, BF16, Multi-Instance Training, etc.) using Nano:
```python
import torch
from bigdl.nano.pytorch import TorchNano

# Define your training loop inside `TorchNano.train`
class Trainer(TorchNano):
    def train(self):
        model = ResNet18()
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
        loss_func = ...  # e.g. torch.nn.CrossEntropyLoss()
        train_loader = ...
        val_loader = ...
        num_epochs = ...
        # call `setup` to prepare the model, optimizer(s) and dataloader(s) for accelerated training
        model, optimizer, (train_loader, val_loader) = self.setup(model, optimizer,
                                                                  train_loader, val_loader)
        for epoch in range(num_epochs):
            model.train()
            for data, target in train_loader:
                optimizer.zero_grad()
                output = model(data)
                # replace loss.backward() with self.backward(loss)
                loss = loss_func(output, target)
                self.backward(loss)
                optimizer.step()

# Accelerated training (IPEX, BF16 and Multi-Instance Training)
Trainer(use_ipex=True, precision='bf16', num_processes=2).train()
```
</details>
*See Nano [user guide](https://bigdl.readthedocs.io/en/latest/doc/Nano/Overview/nano.html) and [tutorial](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial) for more details.*
### DLlib
With _DLlib_, you can write distributed deep learning applications as standard (**Scala** or **Python**) Spark programs, using the same **Spark DataFrames** and **ML Pipeline** APIs.
<details><summary>Show DLlib Scala example</summary>
<br/>
You can build distributed deep learning applications for Spark using *DLlib* Scala APIs in 3 simple steps:
```scala
// 1. Call `initNNContext` at the beginning of the code:
import com.intel.analytics.bigdl.dllib.NNContext
val sc = NNContext.initNNContext()
// 2. Define the deep learning model using Keras-style API in DLlib:
import com.intel.analytics.bigdl.dllib.keras.layers._
import com.intel.analytics.bigdl.dllib.keras.Model
val input = Input[Float](inputShape = Shape(10))
val dense = Dense[Float](12).inputs(input)
val output = Activation[Float]("softmax").inputs(dense)
val model = Model(input, output)
// 3. Use `NNEstimator` to train/predict/evaluate the model using Spark DataFrame and ML pipeline APIs
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.MinMaxScaler
import org.apache.spark.ml.Pipeline
import com.intel.analytics.bigdl.dllib.nnframes.NNEstimator
import com.intel.analytics.bigdl.dllib.nn.CrossEntropyCriterion
import com.intel.analytics.bigdl.dllib.optim.Adam
val spark = SparkSession.builder().getOrCreate()
val trainDF = spark.read.parquet("train_data")
val validationDF = spark.read.parquet("val_data")
val scaler = new MinMaxScaler().setInputCol("in").setOutputCol("value")
val estimator = NNEstimator(model, CrossEntropyCriterion())
.setBatchSize(128).setOptimMethod(new Adam()).setMaxEpoch(5)
val pipeline = new Pipeline().setStages(Array(scaler, estimator))
val pipelineModel = pipeline.fit(trainDF)
val predictions = pipelineModel.transform(validationDF)
```
</details>
<details><summary>Show DLlib Python example</summary>
<br/>
You can build distributed deep learning applications for Spark using *DLlib* Python APIs in 3 simple steps:
```python
# 1. Call `init_nncontext` at the beginning of the code:
from bigdl.dllib.nncontext import init_nncontext
sc = init_nncontext()
# 2. Define the deep learning model using Keras-style API in DLlib:
from bigdl.dllib.keras.layers import Input, Dense, Activation
from bigdl.dllib.keras.models import Model
input = Input(shape=(10,))
dense = Dense(12)(input)
output = Activation("softmax")(dense)
model = Model(input, output)
# 3. Use `NNEstimator` to train/predict/evaluate the model using Spark DataFrame and ML pipeline APIs
from pyspark.sql import SparkSession
from pyspark.ml.feature import MinMaxScaler
from pyspark.ml import Pipeline
from bigdl.dllib.nnframes import NNEstimator
from bigdl.dllib.nn.criterion import CrossEntropyCriterion
from bigdl.dllib.optim.optimizer import Adam
spark = SparkSession.builder.getOrCreate()
train_df = spark.read.parquet("train_data")
validation_df = spark.read.parquet("val_data")
scaler = MinMaxScaler().setInputCol("in").setOutputCol("value")
estimator = NNEstimator(model, CrossEntropyCriterion())\
.setBatchSize(128)\
.setOptimMethod(Adam())\
.setMaxEpoch(5)
pipeline = Pipeline(stages=[scaler, estimator])
pipelineModel = pipeline.fit(train_df)
predictions = pipelineModel.transform(validation_df)
```
</details>
*See DLlib [NNFrames](https://bigdl.readthedocs.io/en/latest/doc/DLlib/Overview/nnframes.html) and [Keras API](https://bigdl.readthedocs.io/en/latest/doc/DLlib/Overview/keras-api.html) user guides for more details.*
### Chronos
The *Chronos* library makes it easy to build fast, accurate and scalable **time series analysis** applications (with AutoML).
<details><summary>Show Chronos example</summary>
<br/>
You can train a time series forecaster using _Chronos_ in 3 simple steps:
```python
from bigdl.chronos.forecaster import TCNForecaster
from bigdl.chronos.data.repo_dataset import get_public_dataset
# 1. Process time series data using `TSDataset`
tsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')
for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
    tsdata.roll(lookback=100, horizon=1)
# 2. Create a `TCNForecaster` (automatically configured based on the training data)
forecaster = TCNForecaster.from_tsdataset(tsdata_train)
# 3. Train the forecaster and use it for prediction
forecaster.fit(tsdata_train)
pred = forecaster.predict(tsdata_test)
```
To apply AutoML, use `AutoTSEstimator` instead of normal forecasters.
```python
# Create and fit an `AutoTSEstimator`
from bigdl.chronos.autots import AutoTSEstimator
autotsest = AutoTSEstimator(model="tcn", future_seq_len=10)
tsppl = autotsest.fit(data=tsdata_train, validation_data=tsdata_val)
pred = tsppl.predict(tsdata_test)
```
</details>
*See Chronos [user guide](https://bigdl.readthedocs.io/en/latest/doc/Chronos/index.html) and [quick start](https://bigdl.readthedocs.io/en/latest/doc/Chronos/QuickStart/chronos-autotsest-quickstart.html) for more details.*
### Friesian
The *Friesian* library makes it easy to build end-to-end, large-scale **recommendation systems** (including *offline* feature transformation and training, *near-line* feature and model updates, and *online* serving pipelines).
*See Friesian [readme](https://github.com/intel-analytics/BigDL/blob/main/python/friesian/README.md) for more details.*
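No code example accompanies Friesian in this README; purely as an illustrative sketch (the `FeatureTable` method names and paths below are assumptions, not verified against the current Friesian API), offline feature preprocessing might look roughly like this:

```python
# Illustrative sketch only: offline feature preprocessing with Friesian's FeatureTable.
# The method names (read_parquet, fillna, gen_string_idx, encode_string, write_parquet)
# and paths are assumptions; see the Friesian readme for the actual API.
from bigdl.orca import init_orca_context
from bigdl.friesian.feature import FeatureTable

sc = init_orca_context(cluster_mode="local", cores=4, memory="8g")

tbl = FeatureTable.read_parquet("/path/to/interactions")      # placeholder input path
tbl = tbl.fillna(0, ["item_category"])                        # fill missing categorical ids
string_idx = tbl.gen_string_idx(["user_id", "item_id"], freq_limit=5)
tbl = tbl.encode_string(["user_id", "item_id"], string_idx)   # string ids to integer ids
tbl.write_parquet("/path/to/preprocessed")                    # placeholder output path
```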
### PPML
*BigDL PPML* provides a **hardware (Intel SGX) protected** *Trusted Cluster Environment* for running distributed Big Data & AI applications (in a secure fashion on private or public cloud).
*See PPML [user guide](https://bigdl.readthedocs.io/en/latest/doc/PPML/Overview/ppml.html) and [tutorial](https://github.com/intel-analytics/BigDL/blob/main/ppml/README.md) for more details.*
## Getting Support
- [Mailing List](mailto:bigdl-user-group+subscribe@googlegroups.com)
- [User Group](https://groups.google.com/forum/#!forum/bigdl-user-group)
- [GitHub Issues](https://github.com/intel-analytics/BigDL/issues)
---
## Citation
If you've found BigDL useful for your project, you may cite our papers as follows:
- *[BigDL 2.0](https://arxiv.org/abs/2204.01715): Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster*
```
@INPROCEEDINGS{9880257,
title={BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster},
author={Dai, Jason Jinquan and Ding, Ding and Shi, Dongjie and Huang, Shengsheng and Wang, Jiao and Qiu, Xin and Huang, Kai and Song, Guoqiong and Wang, Yang and Gong, Qiyuan and Song, Jiaming and Yu, Shan and Zheng, Le and Chen, Yina and Deng, Junwei and Song, Ge},
booktitle={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022},
pages={21407-21414},
doi={10.1109/CVPR52688.2022.02076}
}
```
[^1]: Performance varies by use, configuration and other factors. `bigdl-llm` may not optimize to the same degree for non-Intel products. Learn more at www.Intel.com/PerformanceIndex.
- *[BigDL](https://arxiv.org/abs/1804.05839): A Distributed Deep Learning Framework for Big Data*
```
@INPROCEEDINGS{10.1145/3357223.3362707,
title = {BigDL: A Distributed Deep Learning Framework for Big Data},
author = {Dai, Jason Jinquan and Wang, Yiheng and Qiu, Xin and Ding, Ding and Zhang, Yao and Wang, Yanzhang and Jia, Xianyan and Zhang, Cherry Li and Wan, Yan and Li, Zhichao and Wang, Jiao and Huang, Shengsheng and Wu, Zhongyuan and Wang, Yang and Yang, Yuhao and She, Bowen and Shi, Dongjie and Lu, Qi and Huang, Kai and Song, Guoqiong},
booktitle = {Proceedings of the ACM Symposium on Cloud Computing (SoCC)},
year = {2019},
pages = {50-60},
doi = {10.1145/3357223.3362707}
}
```
Over 40 models have been optimized/verified on `ipex-llm`, including *LLaMA/LLaMA2, ChatGLM/ChatGLM2, Mistral, Falcon, MPT, Baichuan/Baichuan2, InternLM, QWen* and more; see the example list below.
| Model | CPU Example | GPU Example |
| ---------------------------------------- | ---------------------------------------- | ---------------------------------------- |
| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/vicuna) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/vicuna) |
| LLaMA 2 | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2) | [link1](python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2), [link2-low GPU memory example](python/llm/example/GPU/PyTorch-Models/Model/llama2#example-2---low-memory-version-predict-tokens-using-generate-api) |
| ChatGLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm) | |
| ChatGLM2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm2) |
| ChatGLM3 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm3) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm3) |
| Mistral | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mistral) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mistral) |
| Mixtral | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mixtral) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mixtral) |
| Falcon | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/falcon) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/falcon) |
| MPT | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/mpt) |
| Dolly-v1 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/dolly-v1) |
| Dolly-v2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/dolly-v2) |
| Replit Code | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/replit) |
| RedPajama | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/redpajama) | |
| Phoenix | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phoenix) | |
| StarCoder | [link1](python/llm/example/CPU/Native-Models), [link2](python/llm/example/CPU/HF-Transformers-AutoModels/Model/starcoder) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/starcoder) |
| Baichuan | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan) |
| Baichuan2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2) |
| InternLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm) |
| Qwen | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen) |
| Qwen1.5 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen1.5) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen1.5) |
| Qwen-VL | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen-vl) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen-vl) |
| Aquila | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila) |
| Aquila2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila2) |
| MOSS | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/moss) | |
| Whisper | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/whisper) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/whisper) |
| Phi-1_5 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phi-1_5) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/phi-1_5) |
| Flan-t5 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/flan-t5) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/flan-t5) |
| LLaVA | [link](python/llm/example/CPU/PyTorch-Models/Model/llava) | [link](python/llm/example/GPU/PyTorch-Models/Model/llava) |
| CodeLlama | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/codellama) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/codellama) |
| Skywork | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/skywork) | |
| InternLM-XComposer | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm-xcomposer) | |
| WizardCoder-Python | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/wizardcoder-python) | |
| CodeShell | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/codeshell) | |
| Fuyu | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/fuyu) | |
| Distil-Whisper | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/distil-whisper) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/distil-whisper) |
| Yi | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/yi) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/yi) |
| BlueLM | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/bluelm) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/bluelm) |
| Mamba | [link](python/llm/example/CPU/PyTorch-Models/Model/mamba) | [link](python/llm/example/GPU/PyTorch-Models/Model/mamba) |
| SOLAR | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/solar) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/solar) |
| Phixtral | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phixtral) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/phixtral) |
| InternLM2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm2) |
| RWKV4 | | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/rwkv4) |
| RWKV5 | | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/rwkv5) |
| Bark | [link](python/llm/example/CPU/PyTorch-Models/Model/bark) | [link](python/llm/example/GPU/PyTorch-Models/Model/bark) |
| SpeechT5 | | [link](python/llm/example/GPU/PyTorch-Models/Model/speech-t5) |
| DeepSeek-MoE | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/deepseek-moe) | |
| Ziya-Coding-34B-v1.0 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/ziya) | |
| Phi-2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/phi-2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/phi-2) |
| Yuan2 | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/yuan2) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/yuan2) |
| Gemma | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/gemma) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/gemma) |
| DeciLM-7B | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/deciLM-7b) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/deciLM-7b) |
| Deepseek | [link](python/llm/example/CPU/HF-Transformers-AutoModels/Model/deepseek) | [link](python/llm/example/GPU/HF-Transformers-AutoModels/Model/deepseek) |
***For more details, please refer to the `ipex-llm` [Document](https://test-ipex-llm.readthedocs.io/en/main/doc/LLM/index.html), [Readme](python/llm), [Tutorial](https://github.com/intel-analytics/ipex-llm-tutorial) and [API Doc](https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/index.html).***


@@ -1,49 +0,0 @@
Blogs
---
**2023**
- [Large-scale Offline Book Recommendation with BigDL at Dangdang.com](https://www.intel.com/content/www/us/en/developer/articles/technical/dangdang-offline-recommendation-service-with-bigdl.html)
**2022**
- [Optimized Large-Scale Item Search with Intel BigDL at Yahoo! JAPAN Shopping](https://www.intel.com/content/www/us/en/developer/articles/technical/offline-item-search-with-bigdl-at-yahoo-japan.html)
- [Tencent Trusted Computing Solution on SGX with Intel BigDL PPML](https://www.intel.com/content/www/us/en/developer/articles/technical/tencent-trusted-computing-solution-with-bigdl-ppml.html)
- [BigDL Privacy Preserving Machine Learning with Occlum OSS on Azure Confidential Computing](https://techcommunity.microsoft.com/t5/azure-confidential-computing/bigdl-privacy-preserving-machine-learning-with-occlum-oss-on/ba-p/3658667)
- ["AI at Scale" in Mastercard with BigDL](https://www.intel.com/content/www/us/en/developer/articles/technical/ai-at-scale-in-mastercard-with-bigdl.html)
- [BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster](https://arxiv.org/abs/2204.01715)
- [Project Bose: A smart way to enable sustainable 5G networks in Capgemini](https://www.capgemini.com/insights/expert-perspectives/project-bose-a-smart-way-to-enable-sustainable-5g-networks/)
- [Intelligent Power Prediction Solution in Goldwind](https://www.intel.com/content/www/us/en/customer-spotlight/stories/goldwind-customer-story.html)
- [5G Core Network Power Saving using BigDL Chronos Framework in China Unicom](https://www.intel.cn/content/www/cn/zh/customer-spotlight/cases/china-unicom-bigdl-chronos-framework-5gc.html) (in Chinese)
**2021**
- [From Ray to Chronos: Build end-to-end AI use cases using BigDL on top of Ray](https://www.anyscale.com/blog/from-ray-to-chronos-build-end-to-end-ai-use-cases-using-bigdl-on-top-of-ray)
- [Scalable AutoXGBoost Using Analytics Zoo AutoML](https://medium.com/intel-analytics-software/scalable-autoxgboost-using-analytics-zoo-automl-30d576cb138a)
- [Intelligent 5G L2 MAC Scheduler: Powered by Capgemini NetAnticipate 5G on Intel Architecture](https://networkbuilders.intel.com/solutionslibrary/intelligent-5g-l2-mac-scheduler-powered-by-capgemini-netanticipate-5g-on-intel-architecture)
- [Better Together: Privacy-Preserving Machine Learning Powered by Intel SGX and Intel DL Boost](https://www.intel.com/content/www/us/en/artificial-intelligence/posts/alibaba-privacy-preserving-machine-learning.html)
**2020**
- [SK Telecom, Intel Build AI Pipeline to Improve Network Quality](https://networkbuilders.intel.com/solutionslibrary/sk-telecom-intel-build-ai-pipeline-to-improve-network-quality)
- [Build End-to-End AI Pipelines Using Ray and Apache Spark](https://medium.com/distributed-computing-with-ray/build-end-to-end-ai-pipeline-using-ray-and-apache-spark-23f70f36115e)
- [Tencent Cloud Leverages Analytics Zoo to Improve Performance of TI-ONE ML Platform](https://www.intel.com/content/www/us/en/developer/articles/technical/tencent-cloud-leverages-analytics-zoo-to-improve-performance-of-ti-one-ml-platform.html)
- [Context-Aware Fast Food Recommendation at Burger King with RayOnSpark](https://medium.com/riselab/context-aware-fast-food-recommendation-at-burger-king-with-rayonspark-2e7a6009dd2d)
- [Seamlessly Scaling AI for Distributed Big Data](https://medium.com/swlh/seamlessly-scaling-ai-for-distributed-big-data-5b589ead2434)
- [Distributed Inference Made Easy with Analytics Zoo Cluster Serving](https://www.intel.com/content/www/us/en/developer/articles/technical/distributed-inference-made-easy-with-analytics-zoo-cluster-serving.html)
**2019**
- [BigDL: A Distributed Deep-Learning Framework for Big Data](https://arxiv.org/abs/1804.05839)
- [Scalable AutoML for Time-Series Prediction Using Ray and BigDL & Analytics Zoo](https://medium.com/riselab/scalable-automl-for-time-series-prediction-using-ray-and-analytics-zoo-b79a6fd08139)
- [RayOnSpark: Run Emerging AI Applications on Big Data Clusters with Ray and BigDL & Analytics Zoo](https://medium.com/riselab/rayonspark-running-emerging-ai-applications-on-big-data-clusters-with-ray-and-analytics-zoo-923e0136ed6a)
- [Real-time Product Recommendations for Office Depot Using Apache Spark and Analytics Zoo on AWS](https://www.intel.com/content/www/us/en/developer/articles/technical/real-time-product-recommendations-for-office-depot-using-apache-spark-and-analytics-zoo-on.html)
- [Machine Learning Pipelines for High Energy Physics Using Apache Spark with BigDL and Analytics Zoo](https://db-blog.web.cern.ch/blog/luca-canali/machine-learning-pipelines-high-energy-physics-using-apache-spark-bigdl)
- [Deep Learning with Analytic Zoo Optimizes Mastercard Recommender AI Service](https://www.intel.com/content/www/us/en/developer/articles/technical/deep-learning-with-analytic-zoo-optimizes-mastercard-recommender-ai-service.html)
- [Using Intel Analytics Zoo to Inject AI into Customer Service Platform (Part II)](https://www.infoq.com/articles/analytics-zoo-qa-module/)
- [Talroo Uses Analytics Zoo and AWS to Leverage Deep Learning for Job Recommendations](https://www.intel.com/content/www/us/en/developer/articles/technical/talroo-uses-analytics-zoo-and-aws-to-leverage-deep-learning-for-job-recommendations.html)
**2018**
- [Analytics Zoo: Unified Analytics + AI Platform for Distributed Tensorflow, and BigDL on Apache Spark](https://www.infoq.com/articles/analytics-zoo/)
- [Industrial Inspection Platform in Midea and KUKA: Using Distributed TensorFlow on Analytics Zoo](https://www.intel.com/content/www/us/en/developer/articles/technical/industrial-inspection-platform-in-midea-and-kuka-using-distributed-tensorflow-on-analytics.html)
- [Use Analytics Zoo to Inject AI Into Customer Service Platforms on Microsoft Azure](https://www.intel.com/content/www/us/en/developer/articles/technical/use-analytics-zoo-to-inject-ai-into-customer-service-platforms-on-microsoft-azure-part-1.html)
- [LSTM-Based Time Series Anomaly Detection Using Analytics Zoo for Apache Spark and BigDL at Baosight](https://www.intel.com/content/www/us/en/developer/articles/technical/lstm-based-time-series-anomaly-detection-using-analytics-zoo-for-apache-spark-and-bigdl.html)
**2017**
- [Accelerating Deep-Learning Training with BigDL and Drizzle on Apache Spark](https://rise.cs.berkeley.edu/blog/accelerating-deep-learning-training-with-bigdl-and-drizzle-on-apache-spark)
- [Using BigDL to Build Image Similarity-Based House Recommendations](https://www.intel.com/content/www/us/en/developer/articles/technical/using-bigdl-to-build-image-similarity-based-house-recommendations.html)
- [Building Large-Scale Image Feature Extraction with BigDL at JD.com](https://www.intel.com/content/www/us/en/developer/articles/technical/building-large-scale-image-feature-extraction-with-bigdl-at-jdcom.html)


@@ -1,2 +0,0 @@
Real-World Application
=========================


@ -1,93 +0,0 @@
# Powered By
---
* __Alibaba__
<br>• [Alibaba Cloud and Intel synergize BigDL PPML and Alibaba Cloud Data Trust to protect E2E privacy of AI and big data](https://www.intel.com/content/www/us/en/customer-spotlight/stories/alibaba-cloud-ppml-customer-story.html)
<br>• [Better Together: Alibaba Cloud Realtime Compute and Distributed AI Inference](https://www.intel.cn/content/dam/www/central-libraries/cn/zh/documents/better-together-alibaba-cloud-realtime-compute-and-distibuted-ai-inference.pdf) (in Chinese)
<br>• [Better Together: Privacy-Preserving Machine Learning](https://www.intel.com/content/www/us/en/artificial-intelligence/posts/alibaba-privacy-preserving-machine-learning.html)
* __AsiaInfo__
<br>• [AsiaInfo Technology Leverages Hardware and Software Products and Technologies to Create New Intelligent Energy Saving Solutions for 5G Cloud Based Base Station Products](https://www.intel.cn/content/www/cn/zh/communications/asiainfo-create-intelligent-energy-saving-solution.html) (in Chinese)
<br>• [Network AI Applications using BigDL and oneAPI toolkit on Intel Xeon](https://www.intel.cn/content/www/cn/zh/customer-spotlight/cases/asiainfo-taps-intelligent-network-applications.html)
* __Baosight__
<br>• [LSTM-Based Time Series Anomaly Detection Using Analytics Zoo for Apache Spark and BigDL at Baosight](https://www.intel.com/content/www/us/en/developer/articles/technical/lstm-based-time-series-anomaly-detection-using-analytics-zoo-for-apache-spark-and-bigdl.html)
* __BBVA__
<br>• [A Graph Convolutional Network Implementation](https://emartinezs44.medium.com/graph-convolutions-networks-ad8295b3ce57)
* __Burger King__
<br>• [Context-Aware Fast Food Recommendation at Burger King with RayOnSpark](https://medium.com/riselab/context-aware-fast-food-recommendation-at-burger-king-with-rayonspark-2e7a6009dd2d)
<br>• [How Intel and Burger King built an order recommendation system that preserves customer privacy](https://venturebeat.com/2021/04/06/how-intel-and-burger-king-built-an-order-recommendation-system-that-preserves-customer-privacy/)
<br>• [Burger King: Context-Aware Recommendations (video)](https://www.intel.com/content/www/us/en/customer-spotlight/stories/burger-king-ai-customer-story.html)
* __Capgemini__
<br>• [Project Bose: A smart way to enable sustainable 5G networks in Capgemini](https://www.capgemini.com/insights/expert-perspectives/project-bose-a-smart-way-to-enable-sustainable-5g-networks/)
<br>• [Intelligent 5G L2 MAC Scheduler: Powered by Capgemini NetAnticipate 5G on Intel Architecture](https://networkbuilders.intel.com/solutionslibrary/intelligent-5g-l2-mac-scheduler-powered-by-capgemini-netanticipate-5g-on-intel-architecture)
* __China Unicom__
<br>• [China Unicom Data Center Energy Saving and Emissions Reduction with Intel Intelligent Energy Management](https://www.intel.com/content/www/us/en/content-details/768821/china-unicom-data-center-energy-saving-and-emissions-reduction-with-intel-intelligent-energy-management.html)
<br>• [Cloud Data Center Power Saving using BigDL Chronos in China Unicom](https://www.intel.cn/content/www/cn/zh/customer-spotlight/cases/china-unicom-bigdl-chronos-framework-5gc.html)
* __CERN__
<br>• [Deep Learning Pipelines for High Energy Physics using Apache Spark with Distributed Keras on Analytics Zoo](https://databricks.com/session_eu19/deep-learning-pipelines-for-high-energy-physics-using-apache-spark-with-distributed-keras-on-analytics-zoo)
<br>• [Topology classification at CERN's Large Hadron Collider using Analytics Zoo](https://db-blog.web.cern.ch/blog/luca-canali/machine-learning-pipelines-high-energy-physics-using-apache-spark-bigdl)
<br>• [Deep Learning on Apache Spark at CERN's Large Hadron Collider with Intel Technologies](https://databricks.com/session/deep-learning-on-apache-spark-at-cerns-large-hadron-collider-with-intel-technologies)
* __China Telecom__
<br>• [Face Recognition Application and Practice Based on Intel Analytics Zoo: Part 1](https://mp.weixin.qq.com/s/FEiXoTDi-yy04PJ2Mlfl4A) (in Chinese)
<br>• [Face Recognition Application and Practice Based on Intel Analytics Zoo: Part 2](https://mp.weixin.qq.com/s/VIyWRORTAVAAsC4v6Fi0xw) (in Chinese)
* __Cray__
<br>• [A deep learning approach for precipitation nowcasting with RNN using Analytics Zoo in Cray](https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/69413)
* __Dangdang__
<br>• [Large-scale Offline Book Recommendation with BigDL at Dangdang.com](https://www.intel.com/content/www/us/en/developer/articles/technical/dangdang-offline-recommendation-service-with-bigdl.html)
* __Dell EMC__
<br>• [AI-assisted Radiology Using Distributed Deep Learning on Apache Spark and Analytics Zoo](https://www.dellemc.com/resources/en-us/asset/white-papers/solutions/h17686_hornet_wp.pdf)
<br>• [Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest X-rays](https://databricks.com/session/using-deep-learning-on-apache-spark-to-diagnose-thoracic-pathology-from-chest-x-rays)
* __GoldWind__
<br>• [Goldwind SE: Intelligent Power Prediction Solution](https://www.intel.com/content/www/us/en/customer-spotlight/stories/goldwind-customer-story.html)
<br>• [Intel big data analysis + AI platform helps GoldWind to build a new energy intelligent power prediction solution](https://www.intel.cn/content/www/cn/zh/analytics/artificial-intelligence/create-power-forecasting-solutions.html)
* __Inspur__
<br>• [Inspurs Big Data Intelligent Computing AIO Solution Based on Intel Architecture](https://dpgresources.intel.com/asset-library/inspur-insight-big-data-platform-solution-icx-prc/)
<br>• [Inspur E2E Smart Transportation CV application](https://jason-dai.github.io/cvpr2021/slides/Inspur%20E2E%20Smart%20Transportation%20CV%20application%20-CVPR21.pdf)
<br>• [Inspur End-to-End Smart Computing Solution with Intel Analytics Zoo](https://dpgresources.intel.com/asset-library/inspur-end-to-end-smart-computing-solution-with-intel-analytics-zoo/)
* __JD__
<br>• [Object Detection and Image Feature Extraction at JD.com](https://software.intel.com/en-us/articles/building-large-scale-image-feature-extraction-with-bigdl-at-jdcom)
* __MasterCard__
<br>• ["AI at Scale" in Mastercard with BigDL](https://www.intel.com/content/www/us/en/developer/articles/technical/ai-at-scale-in-mastercard-with-bigdl0.html)
<br>• [Deep Learning with Analytic Zoo Optimizes Mastercard Recommender AI Service](https://www.intel.com/content/www/us/en/developer/articles/technical/deep-learning-with-analytic-zoo-optimizes-mastercard-recommender-ai-service.html)
* __Microsoft Azure__
<br>• [Use Analytics Zoo to Inject AI Into Customer Service Platforms on Microsoft Azure: Part 1](https://www.intel.com/content/www/us/en/developer/articles/technical/use-analytics-zoo-to-inject-ai-into-customer-service-platforms-on-microsoft-azure-part-1.html)
<br>• [Use Analytics Zoo to Inject AI Into Customer Service Platforms on Microsoft Azure: Part 2](https://www.infoq.com/articles/analytics-zoo-qa-module/?from=timeline&isappinstalled=0)
* __Midea__
<br>• [Industrial Inspection Platform in Midea and KUKA: Using Distributed TensorFlow on Analytics Zoo](https://www.intel.com/content/www/us/en/developer/articles/technical/industrial-inspection-platform-in-midea-and-kuka-using-distributed-tensorflow-on-analytics.html)
<br>• [Ability to add "eyes" and "brains" to smart manufacturing](https://www.intel.cn/content/www/cn/zh/analytics/artificial-intelligence/midea-case-study.html) (in Chinese)
* __MLSListings__
<br>• [Image Similarity-Based House Recommendations and Search](https://www.intel.com/content/www/us/en/developer/articles/technical/using-bigdl-to-build-image-similarity-based-house-recommendations.html)
* __NeuSoft/BMW__
<br>• [Neusoft RealSight APM partners with Intel to create an application performance management platform with active defense capabilities](https://platform.neusoft.com/2020/01/17/xw-intel.html) (in Chinese)
* __NeuSoft/Mazda__
<br>• [JD, Neusoft and Intel Jointly Building Intelligent and Connected Vehicle Cloud for HaiMa(former Hainan Mazda)](https://www.neusoft.com/Products/Platforms/2472/4735110231.html)
<br>• [JD, Neusoft and Intel Jointly Building Intelligent and Connected Vehicle Cloud for Hainan-Mazda](https://platform.neusoft.com/2020/06/11/jjfa-haimaqiche.html) (in Chinese)
* __Office Depot__
<br>• [Real-time Product Recommendations for Office Depot Using Apache Spark and Analytics Zoo on AWS](https://www.intel.com/content/www/us/en/developer/articles/technical/real-time-product-recommendations-for-office-depot-using-apache-spark-and-analytics-zoo-on.html)
<br>• [Office Depot product recommender using Analytics Zoo on AWS](https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/73079)
* __SK Telecom__
<br>• [Reference Architecture for Confidential Computing on SKT 5G MEC](https://networkbuilders.intel.com/solutionslibrary/reference-architecture-for-confidential-computing-on-skt-5g-mec)
<br>• [SK Telecom, Intel Build AI Pipeline to Improve Network Quality](https://networkbuilders.intel.com/solutionslibrary/sk-telecom-intel-build-ai-pipeline-to-improve-network-quality)
<br>• [Vectorized Deep Learning Acceleration from Preprocessing to Inference and Training on Apache Spark in SK Telecom](https://databricks.com/session_na20/vectorized-deep-learning-acceleration-from-preprocessing-to-inference-and-training-on-apache-spark-in-sk-telecom)
<br>• [Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction with Geospatial Visualization](https://databricks.com/session_eu19/apache-spark-ai-use-case-in-telco-network-quality-analysis-and-prediction-with-geospatial-visualization)
* __Talroo__
<br>• [Uses Analytics Zoo and AWS to Leverage Deep Learning for Job Recommendations](https://www.intel.com/content/www/us/en/developer/articles/technical/talroo-uses-analytics-zoo-and-aws-to-leverage-deep-learning-for-job-recommendations.html)
<br>• [Job recommendations leveraging deep learning using Analytics Zoo on Apache Spark and BigDL](https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/69113)
* __Telefonica__
<br>• [Running Analytics Zoo jobs on Telefónica Open Clouds MRS Service](https://medium.com/@fernando.delaiglesia/running-analytics-zoo-jobs-on-telef%C3%B3nica-open-clouds-mrs-service-2e64bc823c50)
* __Tencent__
<br>• [Tencent Trusted Computing Solution on SGX with Intel BigDL PPML](https://www.intel.com/content/www/us/en/developer/articles/technical/tencent-trusted-computing-solution-with-bigdl-ppml.html)
<br>• [Analytics Zoo helps Tencent Cloud improve the performance of its intelligent titanium machine learning platform](https://www.intel.cn/content/www/cn/zh/service-providers/analytics-zoo-helps-tencent-cloud-improve-ti-ml-platform-performance.html)
<br>• [Tencent Cloud Leverages Analytics Zoo to Improve Performance of TI-ONE ML Platform](https://software.intel.com/content/www/us/en/develop/articles/tencent-cloud-leverages-analytics-zoo-to-improve-performance-of-ti-one-ml-platform.html)
<br>• [Enhance Tencent's TUSI Identity Practice with Intel Analytics Zoo](https://mp.weixin.qq.com/s?__biz=MzAwNzc5NzM5Mw==&mid=2651030944&idx=1&sn=d6e06c6e14a7355971953a501689b232&chksm=808f8a5eb7f80348fc8e88c4c9e415341bf43ef6bdf3fd4f3001da89e2c9ba7fa2ed5deeb09a&mpshare=1&scene=1&srcid=0412WxM3eWdsLLoO2TYJGWbS&pass_ticket=E6l%2FfOZNKjhr05lsU7inAVCi7mAy5LFEehvEJOS2ZGdHg6%2FH%2BeBQisHA9sfXDOoy#rd) (in Chinese)
* __UC Berkeley RISELab__
<br>• [RayOnSpark: Running Emerging AI Applications on Big Data Clusters with Ray and Analytics Zoo](https://medium.com/riselab/rayonspark-running-emerging-ai-applications-on-big-data-clusters-with-ray-and-analytics-zoo-923e0136ed6a)
<br>• [Scalable AutoML for Time Series Prediction Using Ray and Analytics Zoo](https://medium.com/riselab/scalable-automl-for-time-series-prediction-using-ray-and-analytics-zoo-b79a6fd08139)
* __UnionPay__
<br>• [Technical Verification of SGX and BigDL Based Privacy Computing for Multi Source Financial Big Data](https://www.intel.cn/content/www/cn/zh/now/data-centric/sgx-bigdl-financial-big-data.html) (in Chinese)
* __World Bank__
<br>• [Using Crowdsourced Images to Create Image Recognition Models with Analytics Zoo using BigDL](https://databricks.com/session/using-crowdsourced-images-to-create-image-recognition-models-with-bigdl)
* __Yahoo! JAPAN__
<br>• [Optimized Large-Scale Item Search with Intel BigDL at Yahoo! JAPAN Shopping](https://www.intel.com/content/www/us/en/developer/articles/technical/offline-item-search-with-bigdl-at-yahoo-japan.html)
* __Yunda__
<br>• [Intelligent transformation brings "quality change" to the express delivery industry](https://www.intel.cn/content/www/cn/zh/analytics/artificial-intelligence/yunda-brings-quality-change-to-the-express-delivery-industry.html) (in Chinese)

View file

@ -1,99 +0,0 @@
# Presentations
---
**Tutorial:**
- Seamlessly Scaling out Big Data AI on Ray and Apache Spark, [CVPR 2021](https://cvpr2021.thecvf.com/program) [tutorial](https://jason-dai.github.io/cvpr2021/), June 2021 ([slides](https://jason-dai.github.io/cvpr2021/slides/End-to-End%20Big%20Data%20AI%20Pipeline%20using%20Analytics%20Zoo%20-%20CVPR21.pdf))
- Automated Machine Learning Workflow for Distributed Big Data Using Analytics Zoo, [CVPR 2020](https://cvpr2020.thecvf.com/program/tutorials) [tutorial](https://jason-dai.github.io/cvpr2020/), June 2020 ([slides](https://jason-dai.github.io/cvpr2020/slides/AIonBigData_cvpr20.pdf))
- Building Deep Learning Applications for Big Data, [AAAI 2019]( https://aaai.org/Conferences/AAAI-19/aaai19tutorials/#sp2) [tutorial](https://jason-dai.github.io/aaai2019/), January 2019 ([slides](https://jason-dai.github.io/aaai2019/slides/AI%20on%20Big%20Data%20(Jason%20Dai).pdf))
- Analytics Zoo: Distributed TensorFlow and Keras on Apache Spark, [AI conference](https://conferences.oreilly.com/artificial-intelligence/ai-ca-2019/public/schedule/detail/77069), Sep 2019, San Jose ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Tutorial%20Analytics%20ZOO.pdf))
- Building Deep Learning Applications on Big Data Platforms, [CVPR 2018](https://cvpr2018.thecvf.com/) [tutorial](https://jason-dai.github.io/cvpr2018/), June 2018 ([slides](https://jason-dai.github.io/cvpr2018/slides/BigData_DL_Jason-CVPR.pdf))
**Talks:**
- BigDL 2.0: Seamlessly scaling end-to-end AI pipelines, [Ray Summit 2022](https://www.anyscale.com/ray-summit-2022/agenda/sessions/174), August 2022 ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/BigDL-2.0-Seamlessly-scaling-end-to-end-AI-pipelines.pdf))
- Exploration on Confidential Computing for Big Data & AI, [oneAPI DevSummit for AI 2022](https://www.oneapi.io/event-sessions/exploration-on-confidential-computing-for-big-data-ai-ai-2022/), July 2022 ([slides](https://simplecore.intel.com/oneapi-io/wp-content/uploads/sites/98/Qiyuan-Gong-and-Chunyang-Hui-Exploration-on-Confidential-Computing-for-Big-Data-AI.pdf))
- Privacy Preserving Machine Learning and Big Data Analytics Using Apache Spark, [Data + AI Summit 2022](https://www.databricks.com/dataaisummit/session/privacy-preserving-machine-learning-and-big-data-analytics-using-apache-spark), June 2022 ([slides](https://microsites.databricks.com/sites/default/files/2022-07/Privacy-Preserving-Machine-Learning-and-Big-Data-Analytics-Using-Apache-Spark.pdf))
- E2E Smart Transportation CV application in Inspur (using Insight Data-Intelligence platform), [CVPR 2021](https://jason-dai.github.io/cvpr2021/), July 2021 ([slides](https://jason-dai.github.io/cvpr2021/slides/Inspur%20E2E%20Smart%20Transportation%20CV%20application%20-CVPR21.pdf))
- Mobile Order Click-Through Rate (CTR) Recommendation with Ray on Apache Spark at Burger King, [Ray Summit 2021](https://www.anyscale.com/events/2021/06/22/mobile-order-click-through-rate-ctr-recommendation-with-ray-on-apache-spark-at-burger-king), June 2021 ([slides](https://files.speakerdeck.com/presentations/1870110b5adf4bfc8f0c76255a417f09/Kai_Huang_and_Luyang_Wang.pdf))
- Deep Reinforcement Learning Recommenders using RayOnSpark, *Data + AI Summit 2021*, May 2021 ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/210527DeepReinforcementLearningRecommendersUsingRayOnSpark2.pdf))
- Cluster Serving: Deep Learning Model Serving for Big Data, *Data + AI Summit 2021*, May 2021 ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/210526Cluster-Serving.pdf))
- Offer Recommendation System with Apache Spark at Burger King, [Data + AI Summit 2021](https://databricks.com/session_na21/offer-recommendation-system-with-apache-spark-at-burger-king), May 2021 ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/20210526Offer%20Recommendation.pdf))
- Context-aware Fast Food Recommendation with Ray on Apache Spark at Burger King, [Data + AI Summit Europe 2020](https://databricks.com/session_eu20/context-aware-fast-food-recommendation-with-ray-on-apache-spark-at-burger-king), November 2020 ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/1118%20Context-aware%20Fast%20Food%20Recommendation%20with%20Ray%20on%20Apache%20Spark%20at%20Burger%20King.pdf))
- Cluster Serving: Distributed Model Inference using Apache Flink in Analytics Zoo, [Flink Forward 2020](https://www.flink-forward.org/global-2020/conference-program#cluster-serving--distributed-model-inference-using-apache-flink-in-analytics-zoo), October 2020 ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/1020%20Cluster%20Serving%20Distributed%20Model%20Inference%20using%20Apache%20Flink%20in%20Analytics%20Zoo%20.pdf))
- Project Zouwu: Scalable AutoML for Telco Time Series Analysis using Ray and Analytics Zoo, [Ray Summit Connect 2020](https://anyscale.com/blog/videos-and-slides-for-the-fourth-ray-summit-connect-august-12-2020/), August 2020 ([slides](https://anyscale.com/wp-content/uploads/2020/08/Ding-Ding-Connect-slides.pdf))
- Cluster Serving: Distributed Model Inference using Big Data Streaming in Analytics Zoo, [OpML 2020](https://www.usenix.org/conference/opml20/presentation/song), July 2020 ([slides](https://www.usenix.org/sites/default/files/conference/protected-files/opml20_talks_43_slides_song.pdf))
- Scalable AutoML for Time Series Forecasting using Ray, [OpML 2020](https://www.usenix.org/conference/opml20/presentation/huang), July 2020 ([slides](https://www.usenix.org/sites/default/files/conference/protected-files/opml20_talks_84_slides_huang.pdf))
- Scalable AutoML for Time Series Forecasting using Ray, [Spark + AI Summit 2020](https://databricks.com/session_na20/scalable-automl-for-time-series-forecasting-using-ray), June 2020 ([slides](https://www.slideshare.net/databricks/scalable-automl-for-time-series-forecasting-using-ray))
- Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark, [Spark + AI Summit 2020](https://databricks.com/session_na20/running-emerging-ai-applications-on-big-data-platforms-with-ray-on-apache-spark), June 2020 ([slides](https://www.slideshare.net/databricks/running-emerging-ai-applications-on-big-data-platforms-with-ray-on-apache-spark))
- Vectorized Deep Learning Acceleration from Preprocessing to Inference and Training on Apache Spark in SK Telecom, [Spark + AI Summit 2020](https://databricks.com/session_na20/vectorized-deep-learning-acceleration-from-preprocessing-to-inference-and-training-on-apache-spark-in-sk-telecom), June 2020 ([slides](https://www.slideshare.net/databricks/vectorized-deep-learning-acceleration-from-preprocessing-to-inference-and-training-on-apache-spark-in-sk-telecom?from_action=save))
- Architecture and practice of big data analysis and deep learning model inference using Analytics Zoo on Flink, [Flink Forward Asia 2019](https://developer.aliyun.com/special/ffa2019-conference?spm=a2c6h.13239638.0.0.21f27955PCNMUB#), Nov 2019, Beijing ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Architecture%20and%20practice%20of%20big%20data%20analysis%20and%20deep%20learning%20model%20inference%20using%20Analytics%20Zoo%20on%20Flink(FFA2019)%20.pdf))
- Data analysis + AI platform technology and case studies, [AICon BJ 2019](https://aicon.infoq.cn/2019/beijing/), Nov 2019, Beijing ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/AICON%20AZ%20Cluster%20Serving%20Beijing%20Qiyuan_v5.pdf))
- Architectural practices for building a unified big data AI application with Analytics-Zoo, [QCon SH 2019](https://qcon.infoq.cn/2019/shanghai/presentation/1921), Oct 2019, Shanghai ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Architectural%20practices%20for%20building%20a%20unified%20big%20data%20AI%20application%20with%20Analytics-Zoo.pdf))
- Building AI to play the FIFA video game using distributed TensorFlow, [TensorFlow World](https://conferences.oreilly.com/tensorflow/tf-ca/public/schedule/detail/78309), Oct 2019, Santa Clara ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Building%20AI%20to%20play%20the%20FIFA%20video%20game%20using%20distributed%20TensorFlow.pdf))
- Deep Learning Pipelines for High Energy Physics using Apache Spark with Distributed Keras on Analytics Zoo, [Spark+AI Summit](https://databricks.com/session_eu19/deep-learning-pipelines-for-high-energy-physics-using-apache-spark-with-distributed-keras-on-analytics-zoo), Oct 2019, Amsterdam ([slides](https://www.slideshare.net/databricks/deep-learning-pipelines-for-high-energy-physics-using-apache-spark-with-distributed-keras-on-analytics-zoo))
- Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction with Geospatial Visualization, [Spark+AI Summit](https://databricks.com/session_eu19/apache-spark-ai-use-case-in-telco-network-quality-analysis-and-prediction-with-geospatial-visualization), Oct 2019, Amsterdam ([slides](https://www.slideshare.net/databricks/apache-spark-ai-use-case-in-telco-network-quality-analysis-and-prediction-with-geospatial-visualization))
- LSTM-based time series anomaly detection using Analytics Zoo for Spark and BigDL, [Strata Data conference](https://conferences.oreilly.com/strata/strata-eu/public/schedule/detail/74077), May 2019, London ([slides](https://cdn.oreillystatic.com/en/assets/1/event/292/LSTM-based%20time%20series%20anomaly%20detection%20using%20Analytics%20Zoo%20for%20Spark%20and%20BigDL%20Presentation.pptx))
- Game Playing Using AI on Apache Spark, [Spark+AI Summit](https://databricks.com/session/game-playing-using-ai-on-apache-spark), April 2019, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/game-playing-using-ai-on-apache-spark.pdf))
- Using Deep Learning on Apache Spark to Diagnose Thoracic Pathology from Chest X-rays in DELL EMC, [Spark+AI Summit](https://databricks.com/session/using-deep-learning-on-apache-spark-to-diagnose-thoracic-pathology-from-chest-x-rays), April 2019, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Using%20Deep%20Learning%20on%20Apache%20Spark%20to%20diagnose%20thoracic%20pathology%20from%20.._.pdf))
- Leveraging NLP and Deep Learning for Document Recommendation in the Cloud, [Spark+AI Summit](https://databricks.com/session/leveraging-nlp-and-deep-learning-for-document-recommendations-in-the-cloud), April 2019, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Leveraging%20NLP%20and%20Deep%20Learning%20for%20Document%20Recommendation%20in%20the%20Cloud.pdf))
- Analytics Zoo: Distributed Tensorflow, Keras and BigDL in production on Apache Spark, [Strata Data conference](https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/72802), March 2019, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Analytics%20Zoo-Distributed%20Tensorflow%2C%20Keras%20and%20BigDL%20in%20production%20on%20Apache%20Spark.pdf))
- User-based real-time product recommendations leveraging deep learning using Analytics Zoo on Apache Spark in Office Depot, [Strata Data conference](https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/73079), March 2019, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/User-based%20real-time%20product%20recommendations%20leveraging%20deep%20learning%20using%20Analytics%20Zoo%20on%20Apache%20Spark%20and%20BigDL%20Presentation.pdf))
- Analytics Zoo: Unifying Big Data Analytics and AI for Apache Spark, [Shanghai Apache Spark + AI meetup](https://www.meetup.com/Shanghai-Apache-Spark-AI-Meetup/events/255788956/), Nov 2018, Shanghai ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Analytics%20Zoo-Unifying%20Big%20Data%20Analytics%20and%20AI%20for%20Apache%20Spark.pdf))
- Use Intel Analytics Zoo to build an intelligent QA Bot for Microsoft Azure, [Shanghai Apache Spark + AI meetup](https://www.meetup.com/Shanghai-Apache-Spark-AI-Meetup/events/255788956/), Nov 2018, Shanghai ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Use%20Intel%20Analytics%20Zoo%20to%20build%20an%20intelligent%20QA%20Bot%20for%20Microsoft%20Azure.pdf))
- A deep learning approach for precipitation nowcasting with RNN using Analytics Zoo in Cray, [Strata Data conference](https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/69413), Sep 2018, New York ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/A%20deep%20learning%20approach%20for%20precipitation%20nowcasting%20with%20RNN%20using%20Analytics%20Zoo%20on%20BigDL.pdf))
- Job recommendations leveraging deep learning using Analytics Zoo on Apache Spark in Talroo, [Strata Data conference](https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/69113), Sep 2018, New York ([slides](https://cdn.oreillystatic.com/en/assets/1/event/278/Job%20recommendations%20leveraging%20deep%20learning%20using%20Analytics%20Zoo%20on%20Apache%20Spark%20and%20BigDL%20Presentation.pdf))
- Accelerating Deep Learning Training with BigDL and Drizzle on Apache Spark, [Spark + AI Summit](https://databricks.com/session/accelerating-deep-learning-training-with-bigdl-and-drizzle-on-apache-spark), June 2018, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Accelerating%20deep%20learning%20on%20apache%20spark%20Using%20BigDL%20with%20coarse-grained%20scheduling.pdf))
- Using Crowdsourced Images to Create Image Recognition Models with Analytics Zoo in World Bank, [Spark + AI Summit](https://databricks.com/session/using-crowdsourced-images-to-create-image-recognition-models-with-bigdl), June 2018, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Using%20Crowdsourced%20Images%20to%20Create%20Image%20Recognition%20Models%20with%20Analytics%20Zoo%20using%20BigDL.pdf))
- Building Deep Reinforcement Learning Applications on Apache Spark with Analytics Zoo using BigDL, [Spark + AI Summit](https://databricks.com/session/building-deep-reinforcement-learning-applications-on-apache-spark-using-bigdl), June 2018, San Francisco ([slides](https://github.com/analytics-zoo/analytics-zoo.github.io/blob/master/presentations/Building%20Deep%20Reinforcement%20Learning%20Applications%20on%20Apache%20Spark%20with%20Analytics%20Zoo%20using%20BigDL.pdf))
- Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience at Scale, [Spark + AI Summit](https://databricks.com/session/using-bigdl-on-apache-spark-to-improve-the-mls-real-estate-search-experience-at-scale), June 2018, San Francisco
- Analytics Zoo: Building Analytics and AI Pipeline for Apache Spark and BigDL, [Spark + AI Summit](https://databricks.com/session/analytics-zoo-building-analytics-and-ai-pipeline-for-apache-spark-and-bigdl), June 2018, San Francisco
- Using Siamese CNNs for removing duplicate entries from real estate listing databases, [Strata Data conference](https://conferences.oreilly.com/strata/strata-eu-2018/public/schedule/detail/65518), May 2018, London ([slides](https://cdn.oreillystatic.com/en/assets/1/event/267/Using%20Siamese%20CNNs%20for%20removing%20duplicate%20entries%20from%20real%20estate%20listing%20databases%20Presentation.pdf))
- Classifying images on Spark in World Bank, [AI conference](https://conferences.oreilly.com/artificial-intelligence/ai-ny-2018/public/schedule/detail/64939), May 2018, New York ([slides](https://cdn.oreillystatic.com/en/assets/1/event/280/Classifying%20images%20in%20Spark%20Presentation.pdf))
- Improving user-merchant propensity modeling using neural collaborative filtering and wide and deep models on Spark BigDL in Mastercard, [Strata Data conference](https://conferences.oreilly.com/strata/strata-ca-2018/public/schedule/detail/63897), March 2018, San Jose ([slides](https://cdn.oreillystatic.com/en/assets/1/event/269/Improving%20user-merchant%20propensity%20modeling%20using%20neural%20collaborative%20filtering%20and%20wide%20and%20deep%20models%20on%20Spark%20BigDL%20at%20scale%20Presentation.pdf))
- Accelerating deep learning on Apache Spark using BigDL with coarse-grained scheduling, [Strata Data conference](https://conferences.oreilly.com/strata/strata-ca-2018/public/schedule/detail/63960), March 2018, San Jose ([slides](https://cdn.oreillystatic.com/en/assets/1/event/269/Accelerating%20deep%20learning%20on%20Apache%20Spark%20using%20BigDL%20with%20coarse-grained%20scheduling%20Presentation.pptx))
- Automatic 3D MRI knee damage classification with 3D CNN using BigDL on Spark in UCSF, [Strata Data conference](https://conferences.oreilly.com/strata/strata-ca-2018/public/schedule/detail/64023), March 2018, San Jose ([slides](https://cdn.oreillystatic.com/en/assets/1/event/269/Automatic%203D%20MRI%20knee%20damage%20classification%20with%203D%20CNN%20using%20BigDL%20on%20Spark%20Presentation.pdf))

View file

@ -1,139 +0,0 @@
# Use Chronos in Container (docker)
This page helps users build and use a docker image in which the Chronos nightly build is deployed.
## Download image from Docker Hub
We provide a docker image with the Chronos nightly build deployed on [Docker Hub](https://hub.docker.com/r/intelanalytics/bigdl-chronos/tags). You can download it directly by running:
```bash
docker pull intelanalytics/bigdl-chronos:latest
```
## Build an image (Optional)
**If you have already downloaded the docker image, you can skip this part and go on to [Use Chronos](#use-chronos).**
First, clone the `BigDL` repo to your local machine.
```bash
git clone https://github.com/intel-analytics/BigDL.git
```
Then `cd` to the root directory of `BigDL` and copy the Dockerfile into it.
```bash
cd BigDL
cp docker/chronos-nightly/Dockerfile ./Dockerfile
```
When building the image, you can specify build args to install Chronos with the dependencies you need.
The build args are similar to the install options in the [Chronos documentation](https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/install.html).
```
model: which model or framework you want.
value: pytorch
tensorflow
prophet
arima
ml (default, for machine learning models).
auto_tuning: whether to enable auto tuning.
value: y (for yes)
n (default, for no).
hardware: run chronos on a single machine or a cluster.
value: single (default)
cluster
inference: whether to install dependencies for inference optimization (e.g. onnx, openvino, ...).
value: y (for yes)
n (default, for no)
extra_dep: whether to install some extra dependencies.
value: y (for yes)
n (default, for no)
if specified to y, the following dependencies will be installed:
tsfresh, pyarrow, prometheus_pandas, xgboost, jupyter, matplotlib
```
If you want to build the image with the default options, you can simply use the following command:
```bash
sudo docker build -t intelanalytics/bigdl-chronos:latest . # You may choose any NAME:TAG you want.
```
You can also build with other options by specifying the build args:
```bash
sudo docker build \
--build-arg model=pytorch \
--build-arg auto_tuning=y \
--build-arg hardware=single \
--build-arg inference=n \
--build-arg extra_dep=n \
-t intelanalytics/bigdl-chronos:latest . # You may choose any NAME:TAG you want.
```
(Optional) If you need a proxy, you can add two additional build args to specify it:
```bash
# typically, you need a proxy for building since there will be some downloading.
sudo docker build \
--build-arg http_proxy=http://<your_proxy_ip>:<your_proxy_port> \ #optional
--build-arg https_proxy=http://<your_proxy_ip>:<your_proxy_port> \ #optional
-t intelanalytics/bigdl-chronos:latest . # You may choose any NAME:TAG you want.
```
Depending on your network speed, the build will take **15-30 mins**.
**Tip:** Errors such as `failed: Connection timed out.` are usually caused by a poor network connection. In that case, please build with a proxy.
## Run the image
```bash
sudo docker run -it --rm --net=host intelanalytics/bigdl-chronos:latest bash
```
## Use Chronos
A conda environment is created for you automatically. `bigdl-chronos` and the necessary dependencies (based on the build args used when building the image) are installed inside this environment.
```bash
(chronos) root@icx-5:/opt/work#
```
```eval_rst
.. important::
Considering the image size, we build the docker image with the default args and upload it to Docker Hub. If you use it directly, only ``bigdl-chronos`` is installed inside this environment. There are two methods to install the other necessary dependencies according to your own needs:
1. Make sure the network is available and run the install command following `Install using Conda <https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/install.html#install-using-conda>`_ , such as ``pip install --pre --upgrade bigdl-chronos[pytorch]``.
2. Make sure the network is available and run ``bash /opt/install-python-env.sh`` with build args. The values are introduced in `Build an image <#build-an-image-optional>`_.
.. code-block:: python
# bash /opt/install-python-env.sh ${model} ${auto_tuning} ${hardware} ${inference} ${extra_dep}
# For example, if you want to install bigdl-chronos[pytorch,inference]
bash /opt/install-python-env.sh pytorch n single y n
```
## Run unittest examples on Jupyter Notebook for a quick use
> Note: To use Jupyter Notebook, you need to set the build arg `extra_dep` to `y`.
If you want a quick start with Chronos, you can run the unittest examples in Jupyter Notebook on a single-node server.
```bash
(chronos) root@icx-5:/opt/work# cd /opt/work/colab-notebook #Unittest examples are here.
```
```bash
(chronos) root@icx-5:/opt/work/colab-notebook# jupyter notebook --notebook-dir=./ --ip=* --allow-root #Start the Jupyter Notebook services.
```
After the Jupyter Notebook service is successfully started, you can connect to the Jupyter Notebook service from a browser.
1. Get the IP address of the container (sample commands are shown after this list)
2. Launch a browser, and connect to the Jupyter Notebook service with the URL:
</br>`https://container-ip-address:port-number/?token=your-token`
</br>As a result, you will see the Jupyter Notebook opened.
3. Open one of these `.ipynb` files, run through the example and learn how to use Chronos to predict time series.
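For steps 1 and 2, a minimal sketch of the commands involved is shown below (the container ID comes from `sudo docker ps`, and the exact values on your machine will differ):
```bash
# List running containers to find the container ID.
sudo docker ps
# For a bridge-networked container, read its IP address with docker inspect;
# with --net=host (as used in "Run the image" above), use the host's own IP instead.
sudo docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' ef133bd732d1
hostname -I
# The login token is printed in the "jupyter notebook" startup output; copy it into the URL.
```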
## Shut down docker container
You should shut down the BigDL Docker container after using it.
1. First, use `ctrl+p+q` to quit the container when you are still in it.
2. Then, you can list all the active Docker containers by command line:
```bash
sudo docker ps
```
You will see your docker containers:
```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ef133bd732d1 intelanalytics/bigdl-chronos:latest "bash" 2 hours ago Up 2 hours happy_babbage
```
3. Shut down the corresponding docker container by its ID:
```bash
sudo docker rm -f ef133bd732d1
```

View file

@ -1,48 +0,0 @@
# Choose proper forecasting model
How do you choose a forecasting model among the many built-in models in Chronos (or build one yourself)? That's a common question when users want to build their first forecasting model. Different forecasting models are more suitable for different data and different metrics (accuracy or performance).
The flowchart below is designed to guide users on which forecasting model to try on their own data. Click on the blocks in the chart to see their documentation/examples; a minimal code sketch of the flowchart's recommended starting point follows the chart.
```eval_rst
.. note::
The following flowchart may take some time to load.
```
```eval_rst
.. mermaid::
flowchart TD
StartPoint[I want to build a forecasting model]
StartPoint-- always start from --> TCN[TCNForecaster]
TCN -- performance is not satisfying --> TCN_OPT[Make sure optimizations are deployed]
TCN_OPT -- further performance improvement is needed --> SER[Performance-aware Hyperparameter Optimization]
SER -- only 1 step to be predicted --> LSTMForecaster
SER -- only 1 var to be predicted --> NBeatsForecaster
LSTMForecaster -- does not work --> CUS[customized model]
NBeatsForecaster -- does not work --> CUS[customized model]
TCN -- accuracy is not satisfying --> Tune[Hyperparameter Optimization]
Tune -- only 1 step to be predicted --> LSTMForecaster2[LSTMForecaster]
LSTMForecaster2 -- does not work --> AutoformerForecaster
Tune -- more than 1 step to be predicted --> AutoformerForecaster
AutoformerForecaster -- does not work --> Seq2SeqForecaster
Seq2SeqForecaster -- does not work --> CUS[customized model]
click TCN "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/forecasting.html#tcnforecaster"
click LSTMForecaster "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/forecasting.html#lstmforecaster"
click LSTMForecaster2 "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/forecasting.html#lstmforecaster"
click NBeatsForecaster "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/forecasting.html#nbeatsforecaster"
click Seq2SeqForecaster "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/forecasting.html#seq2seqforecaster"
click AutoformerForecaster "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/forecasting.html#AutoformerForecaster"
click TCN_OPT "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/speed_up.html"
click SER "https://github.com/intel-analytics/BigDL/blob/main/python/chronos/example/hpo/muti_objective_hpo_with_builtin_latency_tutorial.ipynb"
click Tune "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Howto/how_to_tune_forecaster_model.html"
click CUS "https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/speed_up.html"
classDef Model fill:#FFF,stroke:#0f29ba,stroke-width:1px;
class TCN,LSTMForecaster,NBeatsForecaster,LSTMForecaster2,AutoformerForecaster,Seq2SeqForecaster Model;
```
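As the flowchart suggests, TCNForecaster is the usual starting point. The sketch below is a minimal, illustrative example; the hyperparameter values and the random dummy data are assumptions for illustration only, and it assumes `bigdl-chronos[pytorch]` is installed.
```python
import numpy as np
from bigdl.chronos.forecaster import TCNForecaster

# Dummy data in Chronos' (num_samples, time_steps, features) layout.
x = np.random.randn(100, 48, 1).astype(np.float32)   # lookback window of 48 steps
y = np.random.randn(100, 5, 1).astype(np.float32)    # horizon of 5 steps

forecaster = TCNForecaster(past_seq_len=48,
                           future_seq_len=5,
                           input_feature_num=1,
                           output_feature_num=1)
forecaster.fit((x, y), epochs=1)
pred = forecaster.predict(x)   # shape: (100, 5, 1)
```
If the accuracy or performance of this first model is not satisfying, follow the corresponding branch of the flowchart above.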

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how-to-create-forecaster.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_evaluate_a_forecaster.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_export_data_processing_pipeline_to_torchscript.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_export_onnx_files.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_export_openvino_files.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_export_torchscript_files.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_generate_confidence_interval_for_prediction.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_optimize_a_forecaster.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_preprocess_my_data.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_process_data_in_production_environment.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_save_and_load_forecaster.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_speedup_inference_of_forecaster_through_ONNXRuntime.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_speedup_inference_of_forecaster_through_OpenVINO.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_train_forecaster_on_one_node.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_tune_forecaster_model.ipynb"
}

View file

@ -1,174 +0,0 @@
# Use Chronos benchmark tool
This page demonstrates how to use the Chronos benchmark tool to benchmark forecasting performance on your own platform.
## Basic Usage
The benchmark tool is installed automatically when `bigdl-chronos` is installed. To get performance information (currently for forecasting only) on your own machine, run the benchmark tool with the default options using the following command:
```bash
benchmark-chronos -l 96 -o 720
```
```eval_rst
.. note::
**Required Options**:
``-l/--lookback`` and ``-o/--horizon`` are required options for the Chronos benchmark tool. Use ``-l/--lookback`` to specify the history time steps and ``-o/--horizon`` to specify the output time steps. For more details, please refer to `here <https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/forecasting.html#regular-regression-rr-style>`_.
```
By default, the tool loads the `tsinghua_electricity` dataset and trains a `TCNForecaster` with the given lookback and horizon under the `PyTorch` framework. As it starts, it prints information about the hardware, environment variables and benchmark parameters. When benchmarking completes, it reports the average throughput during training. Users may be able to improve forecasting performance by following the suggested changes to Nano environment variables.
Besides the default usage, more execution parameters can be set to obtain additional benchmark results. Read on to learn more about the configuration options available in the Chronos benchmark tool.
## Configuration Options
The benchmark tool provides various options for configuring execution parameters. Some key configuration options are introduced in this part and a list of all options is given in [**Advanced Options**](#advanced-options).
### Model
The tool provides several built-in time series forecasting models, including TCN, LSTM, Seq2Seq, NBeats and Autoformer. To specify which model to use, run the benchmark tool with `-m/--model`. If not specified, TCN is used as the default.
```bash
benchmark-chronos -m lstm -l 96 -o 720
```
### Stage
For a given model, the training and inference stages are usually of most interest. By setting the `-s/--stage` parameter, users can measure throughput during training (`-s train`), accuracy after training (`-s accuracy`), throughput during inference (`-s throughput`) and latency of inference (`-s latency`). If not specified, train is used as the default.
```bash
benchmark-chronos -s latency -l 96 -o 720
```
```eval_rst
.. note::
**More About Accuracy Results**:
After setting ``-s accuracy``, the tool loads the dataset and splits it into train, validation and test sets with a ratio of 7:1:2. The validation loss is monitored during training epochs, and the checkpoint of the epoch with the smallest loss is loaded after training. The trained forecaster is then evaluated with the metrics specified by ``--metrics``.
```
### Dataset
Several built-in datasets can be chosen, including nyc_taxi and tsinghua_electricity. If you have a poor Internet connection and find it hard to download a dataset, run the benchmark tool with `-d synthetic_dataset` to use a synthetic dataset. The default is tsinghua_electricity if the `-d/--dataset` parameter is not specified.
```bash
benchmark-chronos -d nyc_taxi -l 96 -o 720
```
```eval_rst
.. note::
**Download tsinghua_electricity Dataset**:
The tsinghua_electricity dataset does not support automatic downloading. Users can download it manually from `here <https://github.com/thuml/Autoformer#get-started>`_ and place it under the path "~/.chronos/dataset/" (a sketch of this step follows the note).
```
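A hedged sketch of this manual step (the downloaded file name below is a placeholder, not the actual name):
```bash
mkdir -p ~/.chronos/dataset/
cp /path/to/downloaded/electricity-data ~/.chronos/dataset/
```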
### Framework
PyTorch and TensorFlow are both supported and can be specified by setting `-f torch` or `-f tensorflow`. The default framework is pytorch.
```bash
benchmark-chronos -f tensorflow -l 96 -o 720
```
```eval_rst
.. note::
NBeats and Autoformer do not support the TensorFlow backend yet.
```
### Core number
By default, the benchmark tool runs on all physical cores. Users can explicitly specify the number of cores through the `-c/--cores` parameter.
```bash
benchmark-chronos -c 4 -l 96 -o 720
```
### Lookback
Forecasting aims at predicting the future using knowledge from the past. The required option `-l/--lookback` corresponds to the length of the historical input window.
```bash
benchmark-chronos -l 96 -o 720
```
### Horizon
Forecasting aims at predicting the future using knowledge from the past. The required option `-o/--horizon` corresponds to the number of future time steps to predict.
```bash
benchmark-chronos -l 96 -o 720
```
## Advanced Options
When `-s/--stage accuracy` is set, users can further specify evaluation metrics through `--metrics`, which defaults to mse and mae.
```bash
benchmark-chronos --stage accuracy --metrics mse rmse -l 96 -o 720
```
To improve model accuracy, the tool provides a normalization trick to alleviate distribution shift. Once `--normalization` is enabled, the normalization trick is applied to the forecaster.
```bash
benchmark-chronos --stage accuracy --normalization -l 96 -o 720
```
```eval_rst
.. note::
Only TCNForecaster supports normalization trick now.
```
In addition, the number of processes and epochs can be set by `--training_processes` and `--training_epochs`. Users can also tune the batch size during training and inference through `--training_batchsize` and `--inference_batchsize` respectively.
```bash
benchmark-chronos --training_processes 2 --training_epochs 3 --training_batchsize 32 --inference_batchsize 128 -l 96 -o 720
```
To speed up inference, accelerators like ONNXRuntime and OpenVINO are usually used. To benchmark inference performance with or without an accelerator, run the tool with `--inference_framework` set to plain PyTorch (`--inference_framework torch`), ONNXRuntime (`--inference_framework onnx`), OpenVINO (`--inference_framework openvino`) or JIT (`--inference_framework jit`).
```bash
benchmark-chronos --inference_framework onnx -l 96 -o 720
```
When the benchmark tool is run with `--ipex` enabled, intel-extension-for-pytorch is used as the accelerator for the trainer.
If you want to use a quantized model for prediction, run the benchmark tool with `--quantize` enabled; the quantization framework can be specified by `--quantize_type`. The parameter `--quantize_type` needs to be set to pytorch_ipex when you want to use pytorch_ipex for quantization. Otherwise, the default quantization type is selected according to `--inference_framework`: pytorch_fx for PyTorch, onnxrt_qlinearops for ONNXRuntime, and openvino for OpenVINO.
```bash
benchmark-chronos --ipex --quantize --quantize_type pytorch_ipex -l 96 -o 720
```
Moreover, if you want to benchmark the inference performance of an already trained model, run the benchmark tool with `--ckpt` to specify the model's checkpoint path, as shown in the example below. By default, the model used for inference is trained first according to the input parameters.
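A hedged example (the checkpoint path and option values are illustrative only):
```bash
# Benchmark inference latency of a previously trained TCN checkpoint with ONNXRuntime.
benchmark-chronos -s latency --ckpt checkpoints/tcn --inference_framework onnx -l 96 -o 720
```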
Running the benchmark tool with `-h/--help` yields the following usage message, which contains all configuration options:
```bash
benchmark-chronos -h
```
```eval_rst
.. code-block:: python
usage: benchmark-chronos [-h] [-m] [-s] [-d] [-f] [-c] -l lookback -o horizon
[--training_processes] [--training_batchsize]
[--training_epochs] [--inference_batchsize]
[--quantize] [--inference_framework [...]] [--ipex]
[--quantize_type] [--ckpt] [--metrics [...]]
[--normalization]
Benchmarking Parameters
optional arguments:
-h, --help show this help message and exit
-m, --model model name, choose from
tcn/lstm/seq2seq/nbeats/autoformer, default to "tcn".
-s, --stage stage name, choose from
train/latency/throughput/accuracy, default to "train".
-d, --dataset dataset name, choose from
nyc_taxi/tsinghua_electricity/synthetic_dataset,
default to "tsinghua_electricity".
-f, --framework framework name, choose from torch/tensorflow, default
to "torch".
-c, --cores core number, default to all physical cores.
-l lookback, --lookback lookback
required, the history time steps (i.e. lookback).
-o horizon, --horizon horizon
required, the output time steps (i.e. horizon).
--training_processes
number of processes when training, default to 1.
--training_batchsize
batch size when training, default to 32.
--training_epochs number of epochs when training, default to 1.
--inference_batchsize
batch size when infering, default to 1.
--quantize if use the quantized model to predict, default to
False.
--inference_framework [ ...]
predict without/with accelerator, choose from
torch/onnx/openvino/jit, default to "torch" (i.e. predict
without accelerator).
--ipex if use ipex as accelerator for trainer, default to
False.
--quantize_type quantize framework, choose from
pytorch_fx/pytorch_ipex/onnxrt_qlinearops/openvino,
default to "pytorch_fx".
--ckpt checkpoint path of a trained model, e.g.
"checkpoints/tcn", default to "checkpoints/tcn".
--metrics [ ...] evaluation metrics of a trained model, e.g.
"mse"/"mae", default to "mse, mae".
--normalization if to use normalization trick to alleviate
distribution shift.
```

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_use_built-in_datasets.ipynb"
}

View file

@ -1,3 +0,0 @@
{
"path": "../../../../../../python/chronos/colab-notebook/howto/how_to_use_forecaster_to_predict_future_data.ipynb"
}

View file

@ -1,52 +0,0 @@
Chronos How-to Guides
=========================
How-to guides are bite-sized, executable examples that users can consult when they run into a specific topic during usage.
Installation
-------------------------
* `Install Chronos on Windows <windows_guide.html>`__
* `Use Chronos in container (docker) <docker_guide_single_node.html>`__
Data Processing
-------------------------
* `Preprocess my data <how_to_preprocess_my_data.html>`__
* `Built-in dataset <how_to_use_built-in_datasets.html>`__
Forecasting
-------------------------
Develop a forecaster
~~~~~~~~~~~~~~~~~~~~~~~~~
* `Choose a forecaster algorithm <how_to_choose_forecasting_alg.html>`__
* `Create a forecaster <how_to_create_forecaster.html>`__
* `Train forecaster on single node <how_to_train_forecaster_on_one_node.html>`__
* `Tune forecaster on single node <how_to_tune_forecaster_model.html>`__
* `Evaluate a forecaster <how_to_evaluate_a_forecaster.html>`__
* `Use forecaster to predict future data <how_to_use_forecaster_to_predict_future_data.html>`__
* `Generate confidence interval for prediction <how_to_generate_confidence_interval_for_prediction.html>`__
Speed up a forecaster
~~~~~~~~~~~~~~~~~~~~~~~~~
* `Speed up inference of forecaster through ONNXRuntime <how_to_speedup_inference_of_forecaster_through_ONNXRuntime.html>`__
* `Speed up inference of forecaster through OpenVINO <how_to_speedup_inference_of_forecaster_through_OpenVINO.html>`__
* `Optimize a forecaster by searching the best accelerate method <how_to_optimize_a_forecaster.html>`__
Persist a forecaster
~~~~~~~~~~~~~~~~~~~~~~~~~
* `Save and load a forecaster <how_to_save_and_load_forecaster.html>`__
* `Export the ONNX model files to disk <how_to_export_onnx_files.html>`__
* `Export the OpenVINO model files to disk <how_to_export_openvino_files.html>`__
* `Export the TorchScript model files to disk <how_to_export_torchscript_files.html>`__
Benchmark a forecaster
~~~~~~~~~~~~~~~~~~~~~~~~~
* `Use Chronos benchmark tool <how_to_use_benchmark_tool.html>`__
Deploy a forecaster
~~~~~~~~~~~~~~~~~~~~~~~~~
* `A whole workflow in production environment after my forecaster is developed <how_to_process_data_in_production_environment.html>`__
* `Export data processing pipeline to torchscript for further deployment without Python environment <how_to_export_data_processing_pipeline_to_torchscript.html>`__

View file

@ -1,91 +0,0 @@
# Install Chronos on Windows
There are two ways to install Chronos on Windows: using WSL2 or installing on native Windows. With WSL2, all the features of Chronos are available, while on native Windows there are currently some limitations.
## Install using WSL2
### Step 1: Install WSL2
Follow [BigDL Windows User guide](../../UserGuide/win.md) to install WSL2.
### Step 2: Install Chronos
Follow the [Chronos Installation guide](../Overview/chronos.md#install) to install Chronos.
## Install on native Windows
### Step 1: Install conda
We recommend using conda to manage the Chronos Python environment. For more information on installing conda on Windows, you can refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
When conda is successfully installed, open the Anaconda Powershell Prompt, then you can create a conda environment using the following command:
```
# create a conda environment for chronos
conda create -n my_env python=3.7 setuptools=58.0.4 # you could change my_env to any name you want
```
### Step 2: Install Chronos from PyPI
You can simply install Chronos from PyPI using the following command:
```
# activate your conda environment
conda activate my_env
# install Chronos nightly build version (2.1.0 stable release is not supported on native Windows)
pip install --pre --upgrade bigdl-chronos[pytorch]
```
You can use the [install panel](https://bigdl.readthedocs.io/en/latest/doc/Chronos/Overview/install.html#install-using-conda) to select the proper install options based on your need, but there are some limitations now:
- `bigdl-chronos[distributed]` is not supported.
- `intel_extension_for_pytorch (ipex)` is unavailable for Windows now, so the related feature is not supported.
### Known Issues on Native Windows
#### Fail to Install Neural-compressor via pip
**Problem description**
Installing neural-compressor via pip may get stuck when installing pycocotools.
**Solution**
Install pycocotools using conda:
`conda install pycocotools -c esri`
Then neural-compressor can be successfully installed using pip; we recommend installing neural-compressor 1.13.1 or higher:
`pip install neural-compressor==1.13.1`
#### RuntimeError during Quantization
**Problem description**
Calling `forecaster.quantize()` without specifying the `metric` parameter (e.g. `forecaster.quantize(train_data)`) will raise a runtime error. This may happen when the neural-compressor version is lower than `1.13.1`:
> [ERROR] Unexpected exception AssertionError('please use start() before end()') happened during tuning.
>
> RuntimeError: Found no quantized model satisfying accuracy criterion.
**Solution**
Upgrade neural-compressor to 1.13.1 or higher.
`pip install neural-compressor==1.13.1`
#### RuntimeError during forecaster.fit
**Problem description**
`ProphetForecaster.fit` and `ProphetModel.fit_eval` may raise a runtime error on native Windows:
> RuntimeError: Error during optimization!
>
> [ERROR] Chain [1] error: terminated by signal 3221225657
According to our tests, this issue only arises on some machines or environments; you can check it by running `ProphetForecaster.fit` and `ProphetModel.fit_eval` on your own machine or environment.
There is a similar [issue](https://github.com/facebook/prophet/issues/2227) in the prophet repo; we will stay tuned for its progress.

(Six binary image files, ranging from 24 KiB to 244 KiB, and three SVG diagram files of roughly 3 KiB, 2.7 KiB and 5.9 KiB were also deleted in this commit; their contents are not shown here.)

View file

@ -1,87 +0,0 @@
# Artificial Intelligence for IT operations (AIOps)
Chronos provides a template(i.e., `ConfigGenerator`) as an easy-to-use builder for an AIOps decision system with the usage of `Trigger`.
## How does it work
An AIOps application typically relies on a decision system with one or multiple AI models. Generally, this AI system needs to be trained on some training data and saved to a self-defined checkpoint. When using the AI system, we first initialize it from the previously trained checkpoint and then inform it of the current status to get the suggested configuration.
![](../Image/aiops-workflow.png)
Sometimes the AI system needs to be informed of some **timely** information (e.g., some events in a log or some monitoring data every second). Chronos also defines triggers for this kind of usage.
## Define ConfigGenerator
### Start from a trivial ConfigGenerator
Chronos provides `bigdl.chronos.aiops.ConfigGenerator` as a template for users to define their own AIOps AI system. The following is a "hello-world" case.
```python
class MyConfigGenerator(ConfigGenerator):
    def __init__(self):
        super().__init__()
        self.best_config = [3.0, 1.6]

    def genConfig(self):
        return self.best_config
```
This self-defined `MyConfigGenerator` always generates a fixed best config without considering the current status. It could serve as a starting point or a smoke-test ConfigGenerator for your system; the whole system does not even need to be trained.
### Add AI Model to ConfigGenerator
Any model can be used in `ConfigGenerator`; to name a few, sklearn, PyTorch or TensorFlow models are all valid. The following is a typical flow for adding your model.
```python
class MyConfigGenerator(ConfigGenerator):
    def __init__(self, path):
        super().__init__()
        self.model = load_model_from_checkpoint(path)

    def genConfig(self, current_status):
        return self.model(current_status)

    @staticmethod
    def train(train_data, path):
        train_model_and_save_checkpoint(train_data, path)
```
- In `MyConfigGenerator.train`, users will define the way to train their model and save to a specific path.
- In `MyConfigGenerator.__init__`, users will define the way to load the trained checkpoint.
- In `MyConfigGenerator.genConfig`, users will define the way to use the loaded model to do the prediction and get the suggested config.
Please refer to [ConfigGenerator API doc](../../PythonAPI/Chronos/aiops.html) for detailed information.
#### Use Chronos Forecaster/Anomaly detector
Chronos also provides some out-of-box forecasters and anomaly detectors for time series data for users to build their AIOps use-case easier.
Please refer to [Forecaster User Guide](./forecasting.html) and [Anomaly Detector User Guide](./anomaly_detection.html) for detailed information.
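As a concrete illustration, here is a minimal sketch (not part of the original guide) of plugging a Chronos forecaster into a `ConfigGenerator`; the decision rule in `genConfig`, the hyperparameters, and the use of `save`/`load` with a plain checkpoint path are illustrative assumptions:
```python
from bigdl.chronos.aiops import ConfigGenerator
from bigdl.chronos.forecaster import TCNForecaster

class ForecastingConfigGenerator(ConfigGenerator):
    def __init__(self, path):
        super().__init__()
        # rebuild the forecaster with the same hyperparameters used in train()
        self.forecaster = TCNForecaster(past_seq_len=100, future_seq_len=1,
                                        input_feature_num=1, output_feature_num=1)
        self.forecaster.load(path)

    def genConfig(self, recent_metrics):
        # recent_metrics: numpy array shaped (1, lookback, input_feature_num)
        pred = self.forecaster.predict(recent_metrics)
        return {"replicas": int(pred.max()) + 1}  # toy decision rule

    @staticmethod
    def train(train_data, path):
        forecaster = TCNForecaster(past_seq_len=100, future_seq_len=1,
                                   input_feature_num=1, output_feature_num=1)
        forecaster.fit(train_data)  # train_data: (x, y) numpy tuple
        forecaster.save(path)
```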
### Use trigger in ConfigGenerator
Sometimes the AI system needs to be informed of some **timely** information (e.g., some events in a log or some monitoring data every second). Chronos defines triggers for this kind of usage. The following is a trivial case to help users understand what a `Trigger` can do.
```python
class MyConfigGenerator(ConfigGenerator):
    def __init__(self):
        self.sweetpoint = 1
        super().__init__()

    def genConfig(self):
        return self.sweetpoint

    @triggerbyclock(2)
    def update_sweetpoint(self):
        self.sweetpoint += 1
```
In this case, once `MyConfigGenerator` is initialized, `update_sweetpoint` will be called every 2 seconds, so users get an evolving ConfigGenerator.
```python
import time

mycg = MyConfigGenerator()
time.sleep(2)
assert mycg.genConfig() == 2
time.sleep(2)
assert mycg.genConfig() == 3
```
This trivial case may seem useless, but with a dedicated `update_sweetpoint`, such as one that fetches the CPU utilization every second, users can bring useful information into their ConfigGenerator and make better decisions with simple programming.
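A minimal sketch of that idea (assuming `psutil` is installed and that `triggerbyclock` can be imported from `bigdl.chronos.aiops` alongside `ConfigGenerator`; the attribute names and the 1-second interval are illustrative):
```python
import psutil
from bigdl.chronos.aiops import ConfigGenerator, triggerbyclock  # import path assumed

class CPUAwareConfigGenerator(ConfigGenerator):
    def __init__(self):
        self.cpu_util = 0.0
        super().__init__()

    def genConfig(self):
        # a toy rule: scale out when the sampled CPU utilization is high
        return {"replicas": 2 if self.cpu_util > 80 else 1}

    @triggerbyclock(1)
    def update_cpu_util(self):
        self.cpu_util = psutil.cpu_percent()
```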
Please refer to [Trigger API doc](../../PythonAPI/Chronos/aiops.html) for detailed information.

View file

@ -1,34 +0,0 @@
# Anomaly Detection
Anomaly Detection detects abnormal samples in a given time series. _Chronos_ provides a set of unsupervised anomaly detectors.
View some example notebooks for [Datacenter AIOps][AIOps].
## 1. ThresholdDetector
ThresholdDetector detects anomalies based on a threshold. It can be used to detect anomalies on a given time series ([notebook][AIOps_anomaly_detect_unsupervised]), or used together with [Forecasters](#forecasting) to detect anomalies on newly arriving samples ([notebook][AIOps_anomaly_detect_unsupervised_forecast_based]).
View [ThresholdDetector API Doc](../../PythonAPI/Chronos/anomaly_detectors.html#chronos-model-anomaly-th-detector) for more details.
## 2. AEDetector
AEDetector detects anomalies based on the reconstruction error of an autoencoder network.
View anomaly detection [notebook][AIOps_anomaly_detect_unsupervised] and [AEDetector API Doc](../../PythonAPI/Chronos/anomaly_detectors.html#chronos-model-anomaly-ae-detector) for more details.
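A minimal usage sketch (the `roll_len` argument is an assumption; the `fit`/`anomaly_indexes` pattern follows the other detectors, and the exact signature is in the API doc):
```python
import numpy as np
from bigdl.chronos.detector.anomaly import AEDetector

y = np.sin(np.arange(2000) / 10.0)   # a univariate series
y[1000] = 10.0                       # inject an anomaly

detector = AEDetector(roll_len=24)   # window length for the autoencoder (assumed argument)
detector.fit(y)                      # train on the (mostly normal) series
anomaly_indexes = detector.anomaly_indexes()   # indexes of detected anomalies
```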
## 3. DBScanDetector
DBScanDetector uses the DBSCAN clustering algorithm for anomaly detection.
```eval_rst
.. note::
    Users may install ``scikit-learn-intelex`` to accelerate this detector. Chronos will detect whether ``scikit-learn-intelex`` is installed and decide whether to use it. For more details please refer to: https://intel.github.io/scikit-learn-intelex/installation.html
```
View anomaly detection [notebook][AIOps_anomaly_detect_unsupervised] and [DBScanDetector API Doc](../../PythonAPI/Chronos/anomaly_detectors.html#chronos-model-anomaly-dbscan-detector) for more details.
[AIOps]:<https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/AIOps>
[AIOps_anomaly_detect_unsupervised]:<https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/AIOps/AIOps_anomaly_detect_unsupervised.ipynb>
[AIOps_anomaly_detect_unsupervised_forecast_based]:<https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/AIOps/AIOps_anomaly_detect_unsupervised_forecast_based.ipynb>

View file

@ -1,71 +0,0 @@
# Chronos Known Issue
## Version Compatibility Issues
### Numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
**Problem description**
This appears to be a NumPy binary-compatibility issue. We do not recommend solving it by downgrading NumPy to 1.19.x; when no other issues exist, the solution below is preferred.
**Solution**
* `pip uninstall -y pycocotools`
* `pip install pycocotools --no-cache-dir --no-binary :all:`
* `conda install -c conda-forge pycocotools`
---------------------------
### Cannot convert a symbolic Tensor (encoder_lstm_8/strided_slice:0) to a numpy array
**Problem description**
This is a compatibility issue between TensorFlow and NumPy 1.20.x.
**Solution**
* `pip install numpy==1.19.5`
---------------------------
### StanModel object has no attribute 'fit_class'
**Problem description**
We recommend reinstalling prophet using conda or miniconda.
**Solution**
* `pip uninstall -y pystan prophet`
* `conda install -c conda-forge prophet=1.0.1`
---------------------------
## Dependency Issues
### RuntimeError: No active RayContext
**Problem description**
Exception: No active RayContext. Please call init_orca_context to create a RayContext.
> ray_ctx = RayContext.get()<br>
> ray_ctx = RayContext.get(initialize=False)
**Solution**
* Make sure all operations are before `stop_orca_context`.
* No other `RayContext` exists before `init_orca_context`.
---------------------------
### error while loading shared libraries: libunwind.so.8: cannot open shared object file: No such file or directory.
**Problem description**
A dependency is missing from your environment; this only happens when you run `source bigdl-nano-init`.
**Solution**
* `apt-get install libunwind8-dev`
---------------------------

View file

@ -1,276 +0,0 @@
# Data Processing and Feature Engineering
Time series data is a special data formulation with its own specific operations. _Chronos_ provides [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) as a time series dataset abstraction for data processing (e.g. impute, deduplicate, resample, scale/unscale, roll sampling) and auto feature engineering (e.g. datetime feature, aggregation feature). Chronos also provides [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) with the same (or a similar) API for distributed and parallelized data preprocessing on large data.
Users can create a [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) quickly from many raw data types, including pandas dataframe, parquet files, spark dataframe or xshards objects. [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) can be directly used in [`AutoTSEstimator`](../../PythonAPI/Chronos/autotsestimator.html#autotsestimator) and [forecasters](../../PythonAPI/Chronos/forecasters). It can also be converted to pandas dataframe, numpy ndarray, pytorch dataloaders or tensorflow dataset for various usage.
## 1. Basic concepts
A time series can be interpreted as a sequence of real values ordered by timestamp, while a time series dataset can be a combination of one or many time series. A dataset may contain multiple time series since users may collect different time series over the same or different periods of time (e.g. an AIOps dataset may have CPU usage ratio and memory usage ratio data for two servers over a period of time; this dataset contains four time series).
In [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) and [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset), we provide **2** possible dimensions to construct a high dimension time series dataset (i.e. **feature dimension** and **id dimension**).
* feature dimension: Time series along this dimension might be independent or related. Though they may be related, they are assumed to have **different patterns and distributions** and to be collected over the **same period of time**. For example, the CPU usage ratio and memory usage ratio for the same server over a period of time.
* id dimension: Time series along this dimension are assumed to have the **same patterns and distributions** and might be collected over the **same or different periods of time**. For example, the CPU usage ratio for two servers over a period of time.
All the preprocessing operations will be done on each independent time series (i.e. on both the feature dimension and the id dimension), while feature scaling will only be carried out on the feature dimension.
```eval_rst
.. note::
    ``XShardsTSDataset`` performs the data processing in parallel (based on Spark) to support large datasets, while the parallelization is only performed on the "id dimension". This means, in the previous example, ``XShardsTSDataset`` will only utilize multiple workers to process data for different servers at the same time. If a dataset only has one id, ``XShardsTSDataset`` will be even slower than ``TSDataset`` because of the overhead.
```
## 2. Create a TSDataset
[`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) supports initializing from a pandas dataframe through [`TSDataset.from_pandas`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.from_pandas), from a parquet file through [`TSDataset.from_parquet`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.from_parquet) or from Prometheus data through [`TSDataset.from_prometheus`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.from_prometheus).
[`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) supports initializing from an [xshards object](https://bigdl.readthedocs.io/en/latest/doc/Orca/Overview/data-parallel-processing.html#xshards-distributed-data-parallel-python-processing) through [`XShardsTSDataset.from_xshards`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.experimental.xshards_tsdataset.XShardsTSDataset.from_xshards) or from a Spark Dataframe through [`XShardsTSDataset.from_sparkdf`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.experimental.xshards_tsdataset.XShardsTSDataset.from_sparkdf).
A typical valid time series dataframe `df` is shown below.
You can initialize an [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) or a [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) simply by:
```eval_rst
.. tabs::

    .. tab:: TSDataset

        .. code-block:: python

            # Server id    Datetime          CPU usage    Mem usage
            # 0            08:39 2021/7/9    93           24
            # 0            08:40 2021/7/9    91           24
            # 0            08:41 2021/7/9    93           25
            # 0            ...               ...          ...
            # 1            08:39 2021/7/9    73           79
            # 1            08:40 2021/7/9    72           80
            # 1            08:41 2021/7/9    79           80
            # 1            ...               ...          ...
            from bigdl.chronos.data import TSDataset

            tsdata = TSDataset.from_pandas(df,
                                           dt_col="Datetime",
                                           id_col="Server id",
                                           target_col=["CPU usage",
                                                       "Mem usage"])

    .. tab:: XShardsTSDataset

        .. code-block:: python

            # Here is a df example:
            # id    datetime      value    "extra feature 1"    "extra feature 2"
            # 00    2019-01-01    1.9      1                    2
            # 01    2019-01-01    2.3      0                    9
            # 00    2019-01-02    2.4      3                    4
            # 01    2019-01-02    2.6      0                    2
            from bigdl.orca.data.pandas import read_csv
            from bigdl.chronos.data.experimental import XShardsTSDataset

            shards = read_csv(csv_path)
            tsdataset = XShardsTSDataset.from_xshards(shards, dt_col="datetime",
                                                      target_col="value", id_col="id",
                                                      extra_feature_col=["extra feature 1",
                                                                         "extra feature 2"])
```
`target_col` is a list of all elements along feature dimension, while `id_col` is the identifier that distinguishes the id dimension. `dt_col` is the datetime column. For `extra_feature_col`(not shown in this case), you should list those features that you will use as input features but not as target features (e.g. you will **not** perform forecasting or anomaly detection task on this col).
If you are building a prototype for your forecasting/anomaly detection task and you need to split your TSDataset into train/valid/test sets, you can use the `with_split` parameter. [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) and [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) support splitting with ratios through `val_ratio` and `test_ratio`.
If you are deploying your model in a production environment, you can use the `deploy_mode` parameter and set it to `True` when calling `TSDataset.from_pandas`, `TSDataset.from_parquet` or `TSDataset.from_prometheus`, which will reduce data processing latency and set the necessary parameters for data processing and feature engineering. A minimal split example is shown below.
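For example, a minimal sketch of such a split (the column names and ratios below are illustrative):
```python
from bigdl.chronos.data import TSDataset

# split the raw dataframe into train/valid/test TSDataset objects in one call
tsdata_train, tsdata_val, tsdata_test = \
    TSDataset.from_pandas(df, dt_col="Datetime", id_col="Server id",
                          target_col=["CPU usage", "Mem usage"],
                          with_split=True, val_ratio=0.1, test_ratio=0.1)
# for production deployment, deploy_mode=True may additionally be passed to reduce latency
```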
## 3. Time series dataset preprocessing
[`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) supports [`impute`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.impute), [`deduplicate`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.deduplicate) and [`resample`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.resample). You may fill the missing point by [`impute`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.impute) in different modes. You may remove the records that are totally the same by [`deduplicate`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.deduplicate). You may change the sample frequency by [`resample`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.resample). [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) only supports [`impute`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.experimental.xshards_tsdataset.XShardsTSDataset.impute) for now.
A typical cascade call for preprocessing is:
```eval_rst
.. tabs::

    .. tab:: TSDataset

        .. code-block:: python

            tsdata.deduplicate().resample(interval="2s").impute()

    .. tab:: XShardsTSDataset

        .. code-block:: python

            tsdata.impute()
```
## 4. Feature scaling
Scaling all features to one distribution is important, especially when we want to train a machine learning/deep learning system. Scaling will make the training process much more stable. Still, we should always remember to unscale the prediction result at the end.
[`TSDataset`](../../PythonAPI/Chronos/tsdataset.html) and [`XShardsTSDataset`](../../PythonAPI/Chronos/tsdataset.html#xshardstsdataset) support all the scalers in sklearn through [`scale`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.scale) and [`unscale`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.unscale) method.
Since a scaler should only be fitted on the training data, a typical call for scaling operations is:
```eval_rst
.. tabs::

    .. tab:: TSDataset

        .. code-block:: python

            from sklearn.preprocessing import StandardScaler

            scaler = StandardScaler()
            # scale
            for tsdata in [tsdata_train, tsdata_valid, tsdata_test]:
                tsdata.scale(scaler, fit=tsdata is tsdata_train)
            # unscale
            for tsdata in [tsdata_train, tsdata_valid, tsdata_test]:
                tsdata.unscale()

    .. tab:: XShardsTSDataset

        .. code-block:: python

            from sklearn.preprocessing import StandardScaler

            # scale, with one scaler per id
            scaler = {"id1": StandardScaler(), "id2": StandardScaler()}
            for tsdata in [tsdata_train, tsdata_valid, tsdata_test]:
                tsdata.scale(scaler, fit=tsdata is tsdata_train)
            # unscale
            for tsdata in [tsdata_train, tsdata_valid, tsdata_test]:
                tsdata.unscale()
```
[`unscale_numpy`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.unscale_numpy) in TSDataset or [`unscale_xshards`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.experimental.xshards_tsdataset.XShardsTSDataset.unscale_xshards) in XShardsTSDataset is specially designed for forecasters. Users may unscale the output of a forecaster by this operation.
A typical call is:
```eval_rst
.. tabs::

    .. tab:: TSDataset

        .. code-block:: python

            x, y = tsdata_test.scale(scaler)\
                              .roll(lookback=..., horizon=...)\
                              .to_numpy()
            yhat = forecaster.predict(x)
            unscaled_yhat = tsdata_test.unscale_numpy(yhat)
            unscaled_y = tsdata_test.unscale_numpy(y)
            # calculate metric by unscaled_yhat and unscaled_y

    .. tab:: XShardsTSDataset

        .. code-block:: python

            x, y = tsdata_test.scale(scaler)\
                              .roll(lookback=..., horizon=...)\
                              .to_xshards()
            yhat = forecaster.predict(x)
            unscaled_yhat = tsdata_test.unscale_xshards(yhat)
            unscaled_y = tsdata_test.unscale_xshards(y, key="y")
            # calculate metric by unscaled_yhat and unscaled_y
```
## 5. Feature generation
Other than historical target data and other extra features provided by users, some additional features can be generated automatically by [`TSDataset`](../../PythonAPI/Chronos/tsdataset.html). [`gen_dt_feature`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.gen_dt_feature) helps users generate 10 datetime-related features (e.g. MONTH, WEEKDAY, ...). [`gen_global_feature`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.gen_global_feature) and [`gen_rolling_feature`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.gen_rolling_feature) are powered by tsfresh to generate aggregated features (e.g. min, max, ...) for each time series or rolling window respectively.
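For instance, a minimal sketch of datetime feature generation (the `one_hot_features` argument also appears in the built-in dataset example later in this guide; the tsfresh-based calls are omitted here since their arguments depend on your setup):
```python
# generate datetime features (e.g. MONTH, WEEKDAY), one-hot encoding the HOUR feature
tsdata.gen_dt_feature(one_hot_features=['HOUR'])

# the generated columns become extra input features for forecasting
print(tsdata.to_pandas().columns)
```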
## 6. Sampling and exporting
A time series dataset needs to be sampled and exported as a numpy ndarray/dataloader to be used in machine learning and deep learning models (e.g. forecasters, anomaly detectors, auto models, etc.).
```eval_rst
.. warning::
You don't need to call any sampling or exporting methods introduced in this section when using ``AutoTSEstimator``.
```
### 6.1 Roll sampling
Roll sampling (or sliding window sampling) is useful when you want to train an RR-type supervised deep learning forecasting model. It works as the [diagram](#RR-forecast-image) shows.
Please refer to the API doc [`roll`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.roll) for detailed behavior. Users can simply export the sampling result as numpy ndarray by [`to_numpy`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.to_numpy), pytorch dataloader [`to_torch_data_loader`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.to_torch_data_loader), tensorflow dataset by [`to_tf_dataset`](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.to_tf_dataset) or xshards object by [`to_xshards`](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.experimental.xshards_tsdataset.XShardsTSDataset.to_xshards).
```eval_rst
.. note::
    **Difference between** ``roll`` **and** ``to_torch_data_loader``:

    ``.roll(...)`` performs the rolling before RR forecasters/auto models training, while ``.to_torch_data_loader(...)`` performs the rolling during the training.

    It is fine to use either of them when you have a relatively small dataset (less than 1G). ``.to_torch_data_loader(...)`` is recommended when you have a large dataset (larger than 1G) to save memory usage.
```
```eval_rst
.. note::
    **Roll sampling format**:

    As described in the RR-style forecasting concept, the sampling result will have the following shape requirement.

    | x: (sample_num, lookback, input_feature_num)
    | y: (sample_num, horizon, output_feature_num)

    Please follow the same shapes if you use a customized data creator.
```
A typical call of [`roll`](../../PythonAPI/Chronos/tsdataset.html#bigdl.chronos.data.tsdataset.TSDataset.roll) is as follows:
```eval_rst
.. tabs::

    .. tab:: TSDataset

        .. code-block:: python

            # forecaster
            x, y = tsdata.roll(lookback=..., horizon=...).to_numpy()
            forecaster.fit((x, y))

    .. tab:: XShardsTSDataset

        .. code-block:: python

            # forecaster
            data = tsdata.roll(lookback=..., horizon=...).to_xshards()
            forecaster.fit(data)
```
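As mentioned in the note above, ``to_torch_data_loader`` rolls during training to save memory on large datasets. A minimal sketch (the `batch_size`/`lookback`/`horizon` arguments are assumptions based on the note above; check the API doc for the exact signature):
```python
# roll lazily inside the dataloader instead of materializing all samples in memory
train_loader = tsdata_train.to_torch_data_loader(batch_size=32,
                                                 lookback=100,
                                                 horizon=1)
forecaster.fit(train_loader)
```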
### 6.2 Pandas Exporting
Now we support pandas dataframe exporting through `to_pandas()` for users to carry out their own transformation. Here is an example of using only one time series for anomaly detection.
```python
# anomaly detector on "target" col
x = tsdata.to_pandas()["target"].to_numpy()
anomaly_detector.fit(x)
```
View [TSDataset API Doc](../../PythonAPI/Chronos/tsdataset.html#) for more details.
## 7. Built-in Dataset
The built-in dataset utility supports downloading and preprocessing public datasets and returning them as `TSDataset` objects.
|Dataset name|Task|Time Series Length|Number of Instances|Feature Number|Information Page|Download Link|
|---|---|---|---|---|---|---|
|network_traffic|forecasting|8760|1|2|[network_traffic](http://mawi.wide.ad.jp/~agurim/about.html)|[network_traffic](http://mawi.wide.ad.jp/~agurim/dataset/)|
|nyc_taxi|forecasting|10320|1|1|[nyc_taxi](https://github.com/numenta/NAB/blob/master/data/README.md)|[nyc_taxi](https://raw.githubusercontent.com/numenta/NAB/v1.0/data/realKnownCause/nyc_taxi.csv)|
|fsi|forecasting|1259|1|1|[fsi](https://github.com/CNuge/kaggle-code/tree/master/stock_data)|[fsi](https://github.com/CNuge/kaggle-code/raw/master/stock_data/individual_stocks_5yr.zip)|
|AIOps|anomaly_detect|61570|1|1|[AIOps](https://github.com/alibaba/clusterdata)|[AIOps](http://clusterdata2018pubcn.oss-cn-beijing.aliyuncs.com/machine_usage.tar.gz)|
|uci_electricity|forecasting|140256|370|1|[uci_electricity](https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014)|[uci_electricity](https://archive.ics.uci.edu/ml/machine-learning-databases/00321/LD2011_2014.txt.zip)|
|tsinghua_electricity|forecasting|26304|321|1|[tsinghua_electricity](https://cloud.tsinghua.edu.cn/d/e1ccfff39ad541908bae/?p=%2Felectricity&mode=list)|[tsinghua_electricity](https://cloud.tsinghua.edu.cn/d/e1ccfff39ad541908bae/?p=%2Felectricity&mode=list)|
Specify the `name`, and the raw data file will be saved in the specified `path` (defaults to ~/.chronos/dataset). `redownload` can help you re-download the files you need.
When `with_split` is set to True, the dataset will be divided according to the specified `val_ratio` and `test_ratio`, and three `TSDataset` objects will be returned. `with_split` defaults to True, and `val_ratio` and `test_ratio` both default to **0.1**. If you need only one `TSDataset`, just set `with_split` to False.
For more details about `TSDataset`, please refer to [here](../../PythonAPI/Chronos/tsdataset.html).
```python
# load built-in dataset
from bigdl.chronos.data import get_public_dataset
from sklearn.preprocessing import StandardScaler

tsdata_train, tsdata_val, \
    tsdata_test = get_public_dataset(name='nyc_taxi',
                                     with_split=True,
                                     val_ratio=0.1,
                                     test_ratio=0.1)
# carry out additional customized preprocessing on the dataset.
stand = StandardScaler()
for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
    tsdata.gen_dt_feature(one_hot_features=['HOUR'])\
          .impute()\
          .scale(stand, fit=tsdata is tsdata_train)
```

View file

@ -1,10 +0,0 @@
Chronos Deep Dive
=========
* `Time Series Processing and Feature Engineering <data_processing_feature_engineering.html>`__ introduces how to load a built-in/customized dataset and carry out transformation and feature engineering on it.
* `Time Series Forecasting <forecasting.html>`__ introduces how to build a time series forecasting application.
* `Time Series Anomaly Detection <anomaly_detection.html>`__ introduces how to build an anomaly detection application.
* `Generate Synthetic Sequential Data <simulation.html>`__ introduces how to build a series data generation application.
* `Artificial Intelligence for IT operations (AIOps)`__ introduces how to build an AI system for AIOps use-cases.
* `Speed up Chronos built-in/customized models <speed_up.html>`__ introduces how to speed up Chronos built-in models and customized time-series models.
* `Useful Functionalities <useful_functionalities.html>`__ introduces some functionalities provided by Chronos that can help you improve accuracy/performance or scale the application to larger data.

View file

@ -1,287 +0,0 @@
# Time Series Forecasting
_Chronos_ provides both deep learning/machine learning models and traditional statistical models for forecasting.
There are three ways to do forecasting:
- Use highly integrated [**AutoTS pipeline**](#use-autots-pipeline) with auto feature generation, data pre/post-processing, hyperparameter optimization.
- Use [**auto forecasting models**](#use-auto-forecasting-model) with auto hyperparameter optimization.
- Use [**standalone forecasters**](#use-standalone-forecaster-pipeline).
Besides, _Chronos_ also provides a **benchmark tool** to benchmark forecasting performance. For more information, please refer to [Use Chronos benchmark tool](https://bigdl.readthedocs.io/en/latest/doc/Chronos/Howto/how_to_use_benchmark_tool.html).
#### 0. Supported Time Series Forecasting Model
- `Model`: Model name.
- `Style`: Forecasting model style. Detailed information will be stated in [this section](#time-series-forecasting-concepts).
- `Multi-Variate`: Predict more than one variable at the same time?
- `Multi-Step`: Predict more than one data point in the future?
- `Exogenous Variables`: Take other variables (which you don't need to predict) into consideration?
- `Distributed`: Scale the model to a cluster and take data from distributed file system?
- `ONNX`: Export and use `OnnxRuntime` to do the inference.
- `Quantization`: Export and use quantized int8 model to do the inference.
- `Auto Models`: AutoModel API support.
- `AutoTS`: AutoTS API support.
- `Backend`: The DL framework we use to implement this model.
<span id="supported_forecasting_model"></span>
| Model | Style | Multi-Variate | Multi-Step | Exogenous Variables | Distributed | ONNX | Quantization | Auto Models | AutoTS | Backend |
| ----------------- | ----- | ------------- | ---------- | ------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
| LSTM | RR | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | pytorch/tf2 |
| Seq2Seq | RR | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | pytorch/tf2 |
| TCN | RR | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | pytorch/tf2 |
| Autoformer | RR | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | pytorch |
| NBeats | RR | ❌ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | pytorch |
| MTNet | RR | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✳️\*\* | tf2 |
| TCMF | TS | ✅ | ✅ | ✅ | ✳️\* | ❌ | ❌ | ❌ | ❌ | pytorch |
| Prophet | TS | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | prophet |
| ARIMA | TS | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | pmdarima |
| Customized\*\*\* | RR | Customized | Customized | Customized | ❌ | ✅ | ❌ | ❌ | ✅ | pytorch |
\* TCMF only partially supports distributed training.<br>
\*\* Auto tuning of MTNet is only supported in our deprecated AutoTS API.<br>
\*\*\* Customized model is only supported in `AutoTSEstimator` with pytorch as backend.
#### 1. Time Series Forecasting Concepts
Time series forecasting is one of the most popular tasks on time series data. **In short, forecasting aims at predicting the future by using the knowledge you can learn from the history.**
##### 1.1 Traditional Statistical(TS) Style
Traditionally, the time series forecasting problem was formulated with rich mathematical foundations and statistical models. Typically, one model can only handle one time series; it is fit on the whole time series up to the last observed timestamp and predicts the next few steps. Training (fitting) is needed every time you change the last observed timestamp.
![](../Image/forecast-TS.png)
##### 1.2 Regular Regression(RR) Style
In recent years, common deep learning architectures (e.g. RNN, CNN, Transformer, etc.) have been successfully applied to the forecasting problem. Forecasting is transformed into a supervised learning regression problem in this style. A single model can predict several time series. Typically, a sampling process based on a sliding window is needed; some terminology is explained as follows:
- `lookback` / `past_seq_len`: the length of historical data along time. This number is tunable.
- `horizon` / `future_seq_len`: the length of predicted data along time. This number depends on the task definition. If this value is larger than 1, then the forecasting task is *Multi-Step*.
- `input_feature_num`: The number of variables the model can observe. This number is tunable since we can select a subset of extra features to use.
- `output_feature_num`: The number of variables the model predicts. This number depends on the task definition. If this value is larger than 1, then the forecasting task is *Multi-Variate*.
<span id="RR-forecast-image"></span>
![](../Image/forecast-RR.png)
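To make these shapes concrete, here is a tiny NumPy sketch (not from the original guide) of how sliding-window sampling turns one univariate series into `(x, y)` pairs with the shapes described above:
```python
import numpy as np

series = np.arange(10, dtype="float32").reshape(-1, 1)   # (time_len, input_feature_num=1)
lookback, horizon = 4, 2

sample_num = len(series) - lookback - horizon + 1
x = np.stack([series[i:i + lookback] for i in range(sample_num)])
y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(sample_num)])

print(x.shape)  # (5, 4, 1) -> (sample_num, lookback, input_feature_num)
print(y.shape)  # (5, 2, 1) -> (sample_num, horizon, output_feature_num)
```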
#### 2. Use AutoTS Pipeline
For AutoTS Pipeline, we will leverage `AutoTSEstimator`, `TSPipeline` and preferably `TSDataset`. A typical usage of AutoTS pipeline basically contains 3 steps.
1. Prepare a `TSDataset` or customized data creator.
2. Init an `AutoTSEstimator` and call `.fit()` on the data.
3. Use the returned `TSPipeline` for further development.
```eval_rst
.. warning::
``AutoTSTrainer`` workflow has been deprecated, no feature updates or performance improvement will be carried out. Users of ``AutoTSTrainer`` may refer to `Chronos API doc <https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Chronos/autots.html>`_.
```
```eval_rst
.. note::
``AutoTSEstimator`` currently only support pytorch backend.
```
View [Quick Start](../QuickStart/chronos-autotsest-quickstart.html) for a more detailed example.
##### 2.1 Prepare dataset
`AutoTSEstimator` supports two types of data input.
You can easily prepare your data in a `TSDataset` (recommended). You may refer to [here](#TSDataset) for detailed information on how to prepare your `TSDataset` with proper data processing and feature generation. Here is a typical `TSDataset` preparation.
```python
from bigdl.chronos.data import TSDataset
from sklearn.preprocessing import StandardScaler

tsdata_train, tsdata_val, tsdata_test\
    = TSDataset.from_pandas(df, dt_col="timestamp", target_col="value", with_split=True, val_ratio=0.1, test_ratio=0.1)

standard_scaler = StandardScaler()
for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
    tsdata.gen_dt_feature()\
          .impute(mode="last")\
          .scale(standard_scaler, fit=(tsdata is tsdata_train))
```
You can also create your own data creator. The data creator takes a dictionary `config` and returns a pytorch dataloader. Users may define their own customized keys and add them to the search space. "batch_size" is the only fixed key.
```python
from torch.utils.data import DataLoader

def training_data_creator(config):
    return DataLoader(..., batch_size=config['batch_size'])
```
##### 2.2 Create an AutoTSEstimator
`AutoTSEstimator` depends on the [Distributed Hyper-parameter Tuning](../../Orca/Overview/distributed-tuning.html) supported by Project Orca. It also provides time-series-specific functionalities and optimizations. Here is a typical initialization process.
```python
import bigdl.orca.automl.hp as hp
from bigdl.chronos.autots import AutoTSEstimator

auto_estimator = AutoTSEstimator(model='lstm',
                                 search_space='normal',
                                 past_seq_len=hp.randint(1, 10),
                                 future_seq_len=1,
                                 selected_features="auto")
```
We provide three prebuilt default search spaces for each built-in model, which you can use by setting `search_space` to "minimal", "normal", or "large", or you can define your own search space in a dictionary. The larger the search space, the better accuracy you will get, but the more time it will cost.
`past_seq_len` can be set as an hp sample function; the proper range is highly related to your data. A range between 0.5 and 2 cycles is reasonable. You may set it to `"auto"`; then a cycle length will be detected automatically and this parameter will be set to a random search between 0.5 and 2 cycle lengths.
`selected_features` is set to `"auto"` by default, where the `AutoTSEstimator` will find the best subset of extra features to help the forecasting task.
##### 2.3 Fit on AutoTSEstimator
Fitting on `AutoTSEstimator` is fairly easy. A `TSPipeline` will be returned once fitting is completed.
```python
ts_pipeline = auto_estimator.fit(data=tsdata_train,
                                 validation_data=tsdata_val,
                                 batch_size=hp.randint(32, 64),
                                 epochs=5)
```
Detailed information and settings please refer to [AutoTSEstimator API doc](../../PythonAPI/Chronos/autotsestimator.html#id1).
##### 2.4 Development on TSPipeline
You may carry out predict, evaluate, incremental training or save/load for further development.
```python
# predict with the best trial
y_pred = ts_pipeline.predict(tsdata_test)
# evaluate the result pipeline
mse, smape = ts_pipeline.evaluate(tsdata_test, metrics=["mse", "smape"])
print("Evaluate: the mean square error is", mse)
print("Evaluate: the smape value is", smape)
# save the pipeline
my_ppl_file_path = "/tmp/saved_pipeline"
ts_pipeline.save(my_ppl_file_path)
# restore the pipeline for further deployment
from bigdl.chronos.autots import TSPipeline
loaded_ppl = TSPipeline.load(my_ppl_file_path)
```
Detailed information please refer to [TSPipeline API doc](../../PythonAPI/Chronos/autotsestimator.html#tspipeline).
```eval_rst
.. note::
``init_orca_context`` is not needed if you just use the trained TSPipeline for inference, evaluation or incremental fitting.
```
```eval_rst
.. note::
    Incremental fitting on TSPipeline just updates the model weights in the standard way, which does not involve AutoML.
```
#### 3. Use Standalone Forecaster Pipeline
_Chronos_ provides a set of standalone time series forecasters without AutoML support, including deep learning models as well as traditional statistical models.
View some example notebooks for [Network Traffic Prediction][network_traffic].
The common process of using a Forecaster looks like below.
```python
# set fixed hyperparameters, loss, metric...
f = Forecaster(...)
# input data, batch size, epoch...
f.fit(...)
# input test data x, batch size...
f.predict(...)
```
The input data can easily be obtained from `TSDataset`.
View [Quick Start](../QuickStart/chronos-tsdataset-forecaster-quickstart.md) for a more detailed example. Refer to [API docs](../../PythonAPI/Chronos/forecasters.html) of each Forecaster for detailed usage instructions and examples.
<span id="LSTMForecaster"></span>
##### 3.1 LSTMForecaster
LSTMForecaster wraps a vanilla LSTM model, and is suitable for univariate time series forecasting.
View Network Traffic Prediction [notebook][network_traffic_model_forecasting] and [LSTMForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#lstmforecaster) for more details.
<span id="Seq2SeqForecaster"></span>
##### 3.2 Seq2SeqForecaster
Seq2SeqForecaster wraps a sequence-to-sequence model based on LSTM, and is suitable for multivariate and multi-step time series forecasting.
View [Seq2SeqForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#seq2seqforecaster) for more details.
<span id="TCNForecaster"></span>
##### 3.3 TCNForecaster
Temporal Convolutional Network (TCN) is a neural network that uses a convolutional architecture rather than recurrent networks. It supports multi-step and multi-variate cases. Causal convolutions enable large-scale parallel computing, which gives TCN less inference time than RNN-based models such as LSTM.
View Network Traffic multivariate multistep Prediction [notebook][network_traffic_multivariate_multistep_tcnforecaster] and [TCNForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#tcnforecaster) for more details.
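For reference, here is a minimal sketch mirroring the quick-tour style usage (the lookback/horizon values below are illustrative):
```python
from bigdl.chronos.forecaster import TCNForecaster

# tsdata_train / tsdata_test are TSDataset objects
for data in [tsdata_train, tsdata_test]:
    data.roll(lookback=100, horizon=1)

forecaster = TCNForecaster.from_tsdataset(tsdata_train)  # infer model shapes from the data
forecaster.fit(tsdata_train)
pred = forecaster.predict(tsdata_test)
```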
<span id="MTNetForecaster"></span>
##### 3.4 MTNetForecaster
```eval_rst
.. note::
**Additional Dependencies**:
You need to install ``bigdl-nano[tensorflow]`` to enable this built-in model.
``pip install bigdl-nano[tensorflow]``
```
MTNetForecaster wraps a MTNet model. The model architecture mostly follows the [MTNet paper](https://arxiv.org/abs/1809.02105) with slight modifications, and is suitable for multivariate time series forecasting.
View Network Traffic Prediction [notebook][network_traffic_model_forecasting] and [MTNetForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#mtnetforecaster) for more details.
<span id="TCMFForecaster"></span>
##### 3.5 TCMFForecaster
TCMFForecaster wraps a model architecture that follows the implementation of the [DeepGLO paper](https://arxiv.org/abs/1905.03806) with slight modifications. It is especially suitable for extremely high dimensional (up to millions of series) multivariate time series forecasting.
View High-dimensional Electricity Data Forecasting [example][run_electricity] and [TCMFForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#tcmfforecaster) for more details.
<span id="ARIMAForecaster"></span>
##### 3.6 ARIMAForecaster
```eval_rst
.. note::
**Additional Dependencies**:
You need to install ``pmdarima`` to enable this built-in model.
``pip install pmdarima==1.8.5``
```
ARIMAForecaster wraps an ARIMA model and is suitable for univariate time series forecasting. It works best with data that show evidence of non-stationarity in the mean; an initial differencing step (corresponding to the "I", integrated, part of the model) can be applied one or more times to eliminate the non-stationarity of the mean.
View [ARIMAForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#arimaforecaster) for more details.
<span id="ProphetForecaster"></span>
##### 3.7 ProphetForecaster
```eval_rst
.. note::
**Additional Dependencies**:
You need to install `prophet` to enable this built-in model.
``pip install prophet==1.1.0``
```
```eval_rst
.. note::
    **Acceleration Note**:

    Intel® Distribution for Python may improve the speed of Prophet's training and inference. You may install it by referring to https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html.
```
ProphetForecaster wraps the Prophet model ([site](https://github.com/facebook/prophet)), an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It is suitable for univariate time series forecasting. It works best with time series that have strong seasonal effects and several seasons of historical data, is robust to missing data and shifts in the trend, and typically handles outliers well.
View Stock Prediction [notebook][stock_prediction_prophet] and [ProphetForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#prophetforecaster) for more details.
<span id="NBeatsForecaster"></span>
##### 3.8 NBeatsForecaster
Neural basis expansion analysis for interpretable time series forecasting ([N-BEATS](https://arxiv.org/abs/1905.10437)) is a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers. N-BEATS can solve univariate time series point forecasting problems, is interpretable, and is fast to train.
View [NBeatsForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#nbeatsforecaster) for more details.
<span id="AutoformerForecaster"></span>
##### 3.9 AutoformerForecaster
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting ([Autoformer](https://arxiv.org/abs/2106.13008)) is a Transformer-based neural network that reaches SOTA results on many datasets.
View [AutoformerForecaster API Doc](../../PythonAPI/Chronos/forecasters.html#autoformerforecaster) for more details.
#### 4. Use Auto forecasting model
Auto forecasting models are designed to be used exactly the same way as Forecasters. The only difference is that you can set hp search functions as the hyperparameters, and the `.fit()` method will search for the best hyperparameter setting.
```python
# set hyperparameters in hp search function, loss, metric...
auto_model = AutoModel(...)
# input data, batch size, epoch...
auto_model.fit(...)
# input test data x, batch size...
auto_model.predict(...)
```
The input data can easily be obtained from `TSDataset`. Users can refer to the detailed [API doc](../../PythonAPI/Chronos/automodels.html). A minimal sketch is shown below.
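For example, a rough sketch of feeding `TSDataset` samples into an auto model; `AutoModel` stands in for a concrete auto model class, and the exact `fit` arguments may differ from the real API, so treat the names below as assumptions:
```python
import bigdl.orca.automl.hp as hp

# export rolled samples from TSDataset objects
x, y = tsdata_train.roll(lookback=100, horizon=1).to_numpy()
x_val, y_val = tsdata_val.roll(lookback=100, horizon=1).to_numpy()

# hyperparameters given as hp search functions will be tuned by .fit()
auto_model = AutoModel(..., hidden_dim=hp.choice([16, 32, 64]))  # AutoModel is a placeholder
auto_model.fit(data=(x, y),
               validation_data=(x_val, y_val),
               batch_size=hp.randint(32, 64),
               epochs=5)
```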
[network_traffic]:<https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/network_traffic>
[network_traffic_model_forecasting]:<https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_model_forecasting.ipynb>
[network_traffic_multivariate_multistep_tcnforecaster]:<https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_multivariate_multistep_tcnforecaster.ipynb>
[run_electricity]:<https://github.com/intel-analytics/BigDL/blob/main/python/chronos/example/tcmf/run_electricity.py>
[stock_prediction_prophet]:<https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/fsi/stock_prediction_prophet.ipynb>

View file

@ -1,151 +0,0 @@
# Chronos Installation
---
#### OS and Python version requirement
```eval_rst
.. note::
    **Supported OS**:

    Chronos is thoroughly tested on Ubuntu (16.04/18.04/20.04), and should work fine on CentOS. If you are a Windows user, there are 2 ways to use Chronos:

    1. You could use Chronos on a Windows laptop with WSL2 (you may refer to `here <https://docs.microsoft.com/en-us/windows/wsl/setup/environment>`_) or just install an Ubuntu virtual machine.

    2. You could use Chronos on native Windows, but some features are unavailable in this case; the limitations are shown below.
```
```eval_rst
.. note::
**Supported Python Version**:
Chronos supports all installation options on Python 3.7 ~ 3.9. For details about different installation options, refer to `here <#install-using-conda>`_.
```
#### Install using Conda
We recommend using conda to manage the Chronos python environment. For more information about Conda, refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
Select your preferences in the panel below to find the proper install command. Then run the install command as the example shown below.
```eval_rst
.. raw:: html
<link rel="stylesheet" type="text/css" href="../../../_static/css/installation_panel.css" />
<div class="installation-panel-wrapper">
<table class="installation-panel-table">
<tbody>
<tr>
<td colspan="1">Functionality</td>
<td colspan="2"><button id="Forecasting">Forecasting</button></td>
<td colspan="2"><button id="Anomaly" class="fitting-cell">Anomaly Detection</button></td>
<td colspan="2"><button id="Simulation">Simulation</button></td>
</tr>
<tr id="model">
<td colspan="1">Model</td>
<td colspan="2"><button id="Deep_learning_models">Deep learning</button></td>
<td colspan="2"><button id="Prophet">Prophet</button></td>
<td colspan="2"><button id="ARIMA">ARIMA</button></td>
</tr>
<tr>
<td colspan="1">DL framework</td>
<td colspan="3"><button id="pytorch"
title="Use PyTorch as deep learning models' backend. Most of the model support and works better under PyTorch.">PyTorch (Recommended)</button>
</td>
<td colspan="3"><button id="tensorflow"
title="Use Tensorflow as deep learning models' backend.">TensorFlow</button></td>
</tr>
<tr>
<td colspan="1">OS</td>
<td colspan="3"><button id="linux" title="Ubuntu/CentOS is recommended">Linux</button></td>
<td colspan="3"><button id="win" title="WSL is needed for Windows users">Windows</button></td>
</tr>
<tr>
<td colspan="1">Auto Tuning</td>
<td colspan="3" title="I don't need any hyperparameter auto tuning feature."><button
id="automlno">No</button></td>
<td colspan="3" title="I need chronos to help me tune the hyperparameters."><button
id="automlyes">Yes</button></td>
</tr>
<tr>
<td colspan="1">Inference Opt</td>
<td colspan="3" title="No need for low-latency inference models"><button id="inferenceno">No</button></td>
<td colspan="3" title="Get low-latency inference models with onnx\openvino\inc"><button id="inferenceyes">Yes</button></td>
</tr>
<tr>
<td colspan="1">Hardware</td>
<td colspan="3"><button id="singlenode" title="For users use laptop/single node server.">Single
node</button></td>
<td colspan="3"><button id="cluster" title="For users use K8S/Yarn Cluster.">Cluster</button></td>
</tr>
<tr>
<td colspan="1">Package</td>
<td colspan="3"><button id="pypi" title="For users use pip to install chronos.">Pip</button></td>
<td colspan="3"><button id="docker" title="For users use docker image.">Docker</button></td>
</tr>
<tr>
<td colspan="1">Version</td>
<td colspan="3"><button id="nightly"
title="For users would like to try chronos's latest feature">Nightly</button></td>
<td colspan="3"><button id="stable"
title="For users would like to deploy chronos in their production">Stable</button></td>
</tr>
<tr>
<td colspan="1">Install CMD</td>
<td colspan="6" id="cmd">NA</td>
</tr>
</tbody>
</table>
</div>
<script src="../../../_static/js/chronos_installation_panel.js"></script>
```
<br/>
```bash
# create a conda environment for chronos
conda create -n my_env python=3.8 setuptools=58.0.4
conda activate my_env
# select your preference in above panel to find the proper command to replace the below command, e.g.
pip install --pre --upgrade bigdl-chronos[pytorch]
# init bigdl-nano to enable local accelerations
source bigdl-nano-init # accelerate the conda env
```
##### Install Chronos on native Windows
Chronos can be simply installed using pip on native Windows; you can use the same command as on Linux to install it. Unfortunately, some features are unavailable for now:
1. `bigdl-chronos[distributed]` is not supported.
2. `intel_extension_for_pytorch (ipex)` is unavailable on Windows for now, so the related features are not supported.
For some known issues when installing and using Chronos on native Windows, you could refer to [windows_guide](https://bigdl.readthedocs.io/en/latest/doc/Chronos/Howto/windows_guide.html).
##### Install Chronos along with specific Tensorflow
Currently, the default TensorFlow version for Chronos is 2.7, but Chronos is also validated on TensorFlow 2.8-2.12. If you want to use a specific TensorFlow version, please follow the table below to find the extra install command to run after installing Chronos.
| TF version | Install CMD |
| ---------------- | --------------------------------------------------------------------------- |
| **2.8** | pip install tensorflow==2.8.0 intel-tensorflow==2.8.0 |
| **2.9** | pip install tensorflow==2.9.0 intel-tensorflow==2.9.1 |
| **2.10** | pip install tensorflow==2.10.0 intel-tensorflow==2.10.0 |
| **2.11** | pip install tensorflow==2.11.0 intel-tensorflow==2.11.0 |
| **2.12** | pip install tensorflow==2.12.0 intel-tensorflow==2.12.0 protobuf==3.20.3 |

View file

@ -1,289 +0,0 @@
Chronos Quick Tour
=================================
Welcome to Chronos for building a fast, accurate and scalable time series analysis application🎉! Start with our quick tour to understand some critical concepts and how to use them to tackle your tasks.
.. grid:: 1 1 1 1
.. grid-item-card::
:text-align: center
**Data processing**
^^^
Time series data processing includes imputing, deduplicating, resampling, scaling/unscaling, roll sampling, etc., to process raw time series data (typically in a table) into a format that is understandable to the models. ``TSDataset`` and ``XShardsTSDataset`` are provided as abstractions.
+++
.. button-ref:: TSDataset/XShardsTSDataset
:color: primary
:expand:
:outline:
Get Started
.. grid:: 1 3 3 3
:gutter: 2
.. grid-item-card::
:text-align: center
:class-card: sd-mb-2
**Forecasting**
^^^
Time series forecasting uses history data to predict future data. ``Forecaster`` and ``AutoTSEstimator`` are provided for built-in algorithms and distributed hyperparameter tuning.
+++
.. button-ref:: Forecaster
:color: primary
:expand:
:outline:
Get Started
.. grid-item-card::
:text-align: center
:class-card: sd-mb-2
**Anomaly Detection**
^^^
Time series anomaly detection finds the anomaly point in time series. ``Detector`` is provided for many built-in algorithms.
+++
.. button-ref:: Detector
:color: primary
:expand:
:outline:
Get Started
.. grid-item-card::
:text-align: center
:class-card: sd-mb-2
**Simulation**
^^^
Time series simulation generates synthetic time series data. ``Simulator`` is provided for many built-in algorithms.
+++
.. button-ref:: Simulator(experimental)
:color: primary
:expand:
:outline:
Get Started
TSDataset/XShardsTSDataset
--------------------------
In Chronos, we provide a ``TSDataset`` (and an ``XShardsTSDataset`` to handle large data input in a distributed fashion) abstraction to represent a time series dataset. It is responsible for preprocessing raw time series data (typically in a table) into a format that is understandable to the models. Many typical transformation, preprocessing and feature engineering methods can be called in a cascaded fashion on ``TSDataset`` or ``XShardsTSDataset``.
.. code-block:: python

    # !wget https://raw.githubusercontent.com/numenta/NAB/v1.0/data/realKnownCause/nyc_taxi.csv
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from bigdl.chronos.data import TSDataset

    df = pd.read_csv("nyc_taxi.csv", parse_dates=["timestamp"])
    tsdata = TSDataset.from_pandas(df,
                                   dt_col="timestamp",
                                   target_col="value")
    scaler = StandardScaler()
    tsdata.deduplicate()\
          .impute()\
          .gen_dt_feature()\
          .scale(scaler)\
          .roll(lookback=100, horizon=1)
.. grid:: 2
:gutter: 2
.. grid-item-card::
.. button-ref:: ./data_processing_feature_engineering
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/tsdataset
:color: primary
:expand:
:outline:
API Document
Forecaster
-----------------------
We have implemented quite a few algorithms, ranging from traditional statistics to deep learning, for time series forecasting in the ``bigdl.chronos.forecaster`` package. Users may train these forecasters on historical time series and use them to predict future time series.
To import a specific forecaster, you may use {algorithm name} + "Forecaster", and call ``fit`` to train the forecaster and ``predict`` to predict future data.
.. code-block:: python

    from bigdl.chronos.forecaster import TCNForecaster # TCN is algorithm name
    from bigdl.chronos.data import get_public_dataset

    if __name__ == "__main__":
        # use nyc_taxi public dataset
        train_data, _, test_data = get_public_dataset("nyc_taxi")
        for data in [train_data, test_data]:
            # use 100 data points in history to predict 1 data point in future
            data.roll(lookback=100, horizon=1)

        # create a forecaster
        forecaster = TCNForecaster.from_tsdataset(train_data)

        # train the forecaster
        forecaster.fit(train_data)

        # predict with the trained forecaster
        pred = forecaster.predict(test_data)
AutoTSEstimator
---------------------------
For time series forecasting, we also provide an ``AutoTSEstimator`` for distributed hyperparameter tuning as an extension to ``Forecaster``. Users only need to create an ``AutoTSEstimator`` and call ``fit`` to train the estimator. A ``TSPipeline`` will be returned for users to predict future data.
.. code-block:: python

    from bigdl.orca.automl import hp
    from bigdl.chronos.data import get_public_dataset
    from bigdl.chronos.autots import AutoTSEstimator
    from bigdl.orca import init_orca_context, stop_orca_context
    from sklearn.preprocessing import StandardScaler

    if __name__ == "__main__":
        # initialize orca context
        init_orca_context(cluster_mode="local", cores=4, memory="8g", init_ray_on_spark=True)

        # load dataset
        tsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')

        # dataset preprocessing
        stand = StandardScaler()
        for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
            tsdata.gen_dt_feature().impute()\
                  .scale(stand, fit=tsdata is tsdata_train)

        # AutoTSEstimator initialization
        autotsest = AutoTSEstimator(model="tcn",
                                    future_seq_len=10)

        # AutoTSEstimator fitting
        tsppl = autotsest.fit(data=tsdata_train,
                              validation_data=tsdata_val)

        # Prediction
        pred = tsppl.predict(tsdata_test)

        # stop orca context
        stop_orca_context()
.. grid:: 3
:gutter: 2
.. grid-item-card::
.. button-ref:: ../QuickStart/chronos-tsdataset-forecaster-quickstart
:color: primary
:expand:
:outline:
Quick Start
.. grid-item-card::
.. button-ref:: ./forecasting
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/forecasters
:color: primary
:expand:
:outline:
API Document
Detector
--------------------
We have implemented quite a few algorithms, ranging from traditional statistics to deep learning, for time series anomaly detection in the ``bigdl.chronos.detector.anomaly`` package.
To import a specific detector, you may use {algorithm name} + "Detector", and call ``fit`` to train the detector and ``anomaly_indexes`` to get the indexes of anomalous data points.
.. code-block:: python
from bigdl.chronos.detector.anomaly import DBScanDetector # DBScan is algorithm name
from bigdl.chronos.data import get_public_dataset
if __name__ == "__main__":
# use nyc_taxi public dataset
train_data = get_public_dataset("nyc_taxi", with_split=False)
# create a detector
detector = DBScanDetector()
# fit a detector
detector.fit(train_data.to_pandas()['value'].to_numpy())
# find the anomaly points
anomaly_indexes = detector.anomaly_indexes()
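The returned indexes refer to positions in the fitted array, so they can be mapped back to timestamps through the original dataframe. A short sketch continuing the example above:

.. code-block:: python

    # map anomaly indexes back to the original records
    df = train_data.to_pandas()
    anomalies = df.iloc[anomaly_indexes]
    print(anomalies[["timestamp", "value"]])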
.. grid:: 3
:gutter: 2
.. grid-item-card::
.. button-ref:: ../QuickStart/chronos-anomaly-detector
:color: primary
:expand:
:outline:
Quick Start
.. grid-item-card::
.. button-ref:: ./anomaly_detection
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/anomaly_detectors
:color: primary
:expand:
:outline:
API Document
Simulator (experimental)
------------------------
Simulator is still under active development and its API is unstable.
.. grid:: 2
:gutter: 2
.. grid-item-card::
.. button-ref:: ./simulation
:color: primary
:expand:
:outline:
Tutorial
.. grid-item-card::
.. button-ref:: ../../PythonAPI/Chronos/simulator
:color: primary
:expand:
:outline:
API Document

@ -1,18 +0,0 @@
# Synthetic Data Generation
Chronos provides simulators to generate synthetic time series data for users who want to overcome limited data access in a deep learning/machine learning project, or who simply want to generate some synthetic data to play with.
```eval_rst
.. note::
``DPGANSimulator`` is the only simulator Chronos provides at the moment; more simulators are on their way.
```
## 1. DPGANSimulator
`DPGANSimulator` adopts DoppelGANger, proposed in [Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions](http://arxiv.org/abs/1909.13403). The method is a data-driven, unsupervised approach based on a deep learning model with a GAN (Generative Adversarial Networks) structure. The model features a pair of separate attribute and feature generators with their corresponding discriminators. `DPGANSimulator` also supports a rich and comprehensive input (training) data format and outperforms other algorithms on many evaluation metrics.
```eval_rst
.. note::
We reimplemented this model in PyTorch (the original implementation was based on TF1) for better performance (both speed and memory).
```
Users may refer to detailed [API doc](../../PythonAPI/Chronos/simulator.html#module-bigdl.chronos.simulator.doppelganger_simulator).
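A shape-only sketch of how the simulator is typically used: construct it, fit it on real time series data, then generate synthetic samples. The constructor and method arguments below are illustrative placeholders (and `train_feature`/`train_attribute` are numpy arrays assumed to be prepared by the user); please check the API doc above for the exact signatures.

```python
from bigdl.chronos.simulator import DPGANSimulator

# placeholder arguments -- see the API doc for the real constructor signature
simulator = DPGANSimulator(L_max=24, sample_len=24)
# fit on real data prepared by the user (argument names are placeholders)
simulator.fit(data_feature=train_feature, data_attribute=train_attribute)
# draw synthetic samples (argument name is a placeholder)
synthetic_samples = simulator.generate(sample_num=100)
```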

@ -1,143 +0,0 @@
# Accelerated Training and Inference
Chronos provides transparent acceleration for Chronos built-in models and customized time-series models. In this deep-dive page, we will introduce how to enable/disable them.
We will focus on **single node acceleration for forecasting models' training and inferencing** in this page. Other topics, such as:
- Distributed time series data processing - [XShardsTSDataset (based on Spark, powered by `bigdl.orca.data`)](./useful_functionalities.html#xshardstsdataset)
- Distributed training on a cluster - [Distributed training (based on Ray/Spark/Horovod, powered by `bigdl.orca.learn`)](./useful_functionalities.html#distributed-training)
- Non-forecasting models / non-deep-learning models - [Prophet with Intel Python](./forecasting.html#prophetforecaster), [DBScan Detector with Intel sklearn](./anomaly_detection.html#dbscandetector), [DPGANSimulator PyTorch implementation](./simulation.html#dpgansimulator).
You may refer to other pages listed above.
### 1. Overview
Time series models, especially deep learning models, often suffer from slow training and unsatisfying inference speed. Chronos integrates many optimized libraries and best known methods (BKMs) to improve the performance of both built-in and customized models.
### 2. Training Acceleration
Training acceleration is transparent in Chronos's API. Transparency means that Chronos users enjoy the acceleration without changing their code (unless some expert users want to adjust advanced settings).
```eval_rst
.. note::
**Write your script under** ``if __name__=="__main__":``:
Chronos will automatically utilize the computation resources on the hardware. This may include multi-process training on a single node. Using this guard prevents many strange behaviors.
```
#### 2.1 `Forecaster` Training Acceleration
Currently, transparent acceleration for `LSTMForecaster`, `Seq2SeqForecaster`, `TCNForecaster` and `NBeatsForecaster` is **automatically enabled** and tested. Chronos will set various environment variables and configure multi-processing training according to the hardware parameters (e.g., number of cores).
Currently, this function is under active development and **some expert users may want to change the configuration or disable some acceleration tricks**. Here are some instructions.
Users may unset the environment by:
```bash
source bigdl-nano-unset-env
```
Users may set the number of processes to use in training by:
```python
print(forecaster.num_processes) # num_processes is automatically optimized by Chronos
forecaster.num_processes = 1 # disable multi-processing training
forecaster.num_processes = 10 # You may set it to any number you want
```
Users may set the IPEX (Intel Extension for PyTorch) availability in training by:
```python
print(forecaster.use_ipex) # use_ipex is automatically optimized by Chronos
forecaster.use_ipex = True # enable ipex during training
forecaster.use_ipex = False # disable ipex during training
```
#### 2.2 Customized Model Training Acceleration
We provide an optimized pytorch-lightning Trainer, `TSTrainer`, to accelerate customized time series models defined in pytorch. A typical use case is using `pytorch-forecasting`'s built-in models (which are defined as pytorch-lightning LightningModules) together with the Chronos `TSTrainer` to accelerate the training process.
`TSTrainer` requires very few code changes to your original code. Here is a quick guide:
```python
# from pytorch_lightning import Trainer
from bigdl.chronos.pytorch import TSTrainer as Trainer
trainer = Trainer(...
# set number of processes for training
num_processes=8,
# disable GPU training, TSTrainer currently only available for CPU
gpus=0,
...)
```
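After constructing the trainer, training proceeds in the usual pytorch-lightning style. A minimal sketch, assuming `model` is a `LightningModule` (e.g. a `pytorch-forecasting` model) and `train_loader` is a `DataLoader` prepared by the user:

```python
# standard pytorch-lightning style training with the drop-in TSTrainer
trainer.fit(model, train_loader)
```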
We have examples adapted from `pytorch-forecasting`'s examples to show the significant speed-up by using `TSTrainer` in our [use-case](https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/pytorch-forecasting).
#### 2.3 Auto Tuning Acceleration
We are working on the acceleration of `AutoModel` and `AutoTSEstimator`. Please unset the environment by:
```bash
source bigdl-nano-unset-env
```
### 3. Inference Acceleration
Inference has become a critical part of a time series model's performance. It may be divided into two parts:
- Throughput: how many samples can be predicted in a certain amount of time.
- Latency: how much time is used to predict 1 sample.
Typically, throughput and latency are a trade-off pair. We have three optimization options for inferencing in Chronos.
- **Default**: Generally useful for both throughput and latency.
- **ONNX Runtime**: Users may export their trained (with or without auto tuning) model to an ONNX file and deploy it on other services. Chronos also provides internal onnxruntime inference support for users who pursue low latency and higher throughput during inference on a single node.
- **Quantization**: Quantization refers to processes that enable lower-precision inference. In Chronos, post-training quantization is supported, relying on [Intel® Neural Compressor](https://intel.github.io/neural-compressor/README.html).
```eval_rst
.. note::
**Additional Dependencies**:
You need to install ``neural-compressor`` to enable quantization related methods.
``pip install neural-compressor==1.8.1``
```
#### 3.1 `Forecaster` Inference Acceleration
##### 3.1.1 Default Acceleration
Nothing needs to be done. Chronos has deployed acceleration for inferencing by default. **Some expert users may want to change the configuration or disable some acceleration tricks.** Here are some instructions:
Users may unset the environment by:
```bash
source bigdl-nano-unset-env
```
##### 3.1.2 ONNX Runtime
LSTM, TCN, Seq2seq and NBeats have supported ONNX in their forecasters. When users use these built-in models, they may call `predict_with_onnx`/`evaluate_with_onnx` for prediction or evaluation. They may also call `export_onnx_file` to export the onnx model file and `build_onnx` to change the onnxruntime's settings (not necessary).
```python
f = Forecaster(...)
f.fit(...)
f.predict_with_onnx(...)
```
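If the ONNX model needs to be deployed outside Chronos, the same forecaster can export an onnx model file, and `build_onnx` can adjust the onnxruntime session before inference. A brief sketch (the directory name and thread number are illustrative):

```python
# export the onnx model file for external deployment (directory name is illustrative)
f.export_onnx_file(dirname="forecaster_onnx")
# optionally change the onnxruntime setting, e.g. the number of threads (not necessary)
f.build_onnx(thread_num=1)
```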
##### 3.1.3 Quantization
LSTM, TCN and NBeats have supported quantization in their forecasters.
```python
# init
f = Forecaster(...)
# train the forecaster
f.fit(train_data, ...)
# quantize the forecaster
f.quantize(train_data, ..., framework=...)
# predict with int8 model with better inference throughput
f.predict/predict_with_onnx(test_data, quantize=True)
# predict with fp32
f.predict/predict_with_onnx(test_data, quantize=False)
# save both the fp32 model and the quantized int8 model
f.save(checkpoint_file="fp32.model",
       quantize_checkpoint_file="int8.model")
# load
f.load(checkpoint_file="fp32.model",
       quantize_checkpoint_file="int8.model")
```
Please refer to [Forecaster API Docs](../../PythonAPI/Chronos/forecasters.html) for details.
#### 3.2 `TSPipeline` Inference Acceleration
Basically the same as [`Forecaster`](#31-forecaster-inference-acceleration).
##### 3.2.1 Default Acceleration
Basically the same as [`Forecaster`](#31-forecaster-inference-acceleration).
##### 3.2.2 ONNX Runtime
```python
tsppl.predict_with_onnx(...)
```
##### 3.2.3 Quantization
```python
tsppl.quantize(...)
tsppl.predict/predict_with_onnx(test_data, quantize=True/False)
```
Please refer to [TSPipeline API doc](../../PythonAPI/Chronos/autotsestimator.html#tspipeline) for details.

@ -1,33 +0,0 @@
# Distributed Processing
#### Distributed training
LSTM, TCN and Seq2seq users can easily train their forecasters in a distributed fashion to **handle extra large datasets and utilize a cluster**. The functionality is powered by Project Orca.
```python
f = Forecaster(..., distributed=True)
f.fit(...)
f.predict(...)
f.to_local() # collect the forecaster to single node
f.predict_with_onnx(...) # onnxruntime only supports single node
```
#### Distributed Data processing: XShardsTSDataset
```eval_rst
.. warning::
``XShardsTSDataset`` is still experimental.
```
`TSDataset` is a single-threaded library with reasonable speed on large datasets (~10 GB). When you handle an extra large dataset or have limited memory on a single node, `XShardsTSDataset` can be used to provide the exact same functionality and usage as `TSDataset` in a distributed fashion.
```python
# a fully distributed forecaster pipeline
from bigdl.orca.data.pandas import read_csv
from bigdl.chronos.data.experimental import XShardsTSDataset
shards = read_csv("hdfs://...")
tsdata, _, test_tsdata = XShardsTSDataset.from_xshards(...)
tsdata_xshards = tsdata.roll(...).to_xshards()
test_tsdata_xshards = test_tsdata.roll(...).to_xshards()
f = Forecaster(..., distributed=True)
f.fit(tsdata_xshards, ...)
f.predict(test_tsdata_xshards, ...)
```

@ -1,49 +0,0 @@
# AutoML Visualization
AutoML visualization provides two kinds of visualization. You may use them while fitting on auto models or AutoTS pipeline.
* During the searching process, the visualizations of each trial are shown and updated every 30 seconds. (Monitor view)
* After the searching process, a leaderboard of each trial's configs and metrics is shown. (Leaderboard view)
**Note**: AutoML visualization is based on tensorboard and tensorboardx. They should be installed properly before the training starts.
<span id="monitor_view">**Monitor view**</span>
Before training, start the tensorboard server through
```bash
tensorboard --logdir=<logs_dir>/<name>
```
`logs_dir` is the log directory you set for your predictor (e.g. `AutoTSEstimator`, `AutoTCN`, etc.). `name` is the name parameter you set for your predictor.
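As a sketch, the two values are typically set when constructing the predictor; the parameter names below follow `AutoTSEstimator`, and the values are arbitrary examples:

```python
from bigdl.chronos.autots import AutoTSEstimator

# logs_dir and name decide where the tensorboard logs are written
auto_estimator = AutoTSEstimator(model="lstm",
                                 logs_dir="/tmp/autots_logs",
                                 name="nyc_taxi_experiment")
# then: tensorboard --logdir=/tmp/autots_logs/nyc_taxi_experiment
```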
The data in SCALARS tag will be updated every 30 seconds for users to see the training progress.
![](../Image/automl_monitor.png)
After training, start the tensorboard server through
```bash
tensorboard --logdir=<logs_dir>/<name>_leaderboard/
```
where `logs_dir` and `name` are the same as stated in [Monitor view](#monitor_view).
A dashboard of each trial's configs and metrics is shown in the SCALARS tag.
![](../Image/automl_scalars.png)
A leaderboard of each trial's configs and metrics is shown in the HPARAMS tag.
![](../Image/automl_hparams.png)
**Use visualization in Jupyter Notebook**
You can enable a tensorboard view in a Jupyter notebook with the following code.
```python
%load_ext tensorboard
# for scalar view
%tensorboard --logdir <logs_dir>/<name>/
# for leaderboard view
%tensorboard --logdir <logs_dir>/<name>_leaderboard/
```

@ -1,50 +0,0 @@
# Detect Anomaly Point in Real Time Traffic Data
---
![](../../../../image/colab_logo_32px.png)[Run in Google Colab][chronos_minn_traffic_anomaly_detector_colab] &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub][chronos_minn_traffic_anomaly_detector]
---
**In this guide we will demonstrate how to use _Chronos Anomaly Detector_ for time series anomaly detection in 3 simple steps.**
### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/chronos.html#install) for more details.
```bash
conda create -n my_env python=3.7 # "my_env" is conda environment name, you can use any name you like.
conda activate my_env
pip install bigdl-chronos
```
## Step 1: Prepare dataset
For demonstration, we use the publicly available real time traffic data from the Twin Cities Metro area in Minnesota, collected by the Minnesota Department of Transportation. The detailed information can be found [here](https://github.com/numenta/NAB/blob/master/data/realTraffic/speed_7578.csv).
Now we need to do data cleaning and preprocessing on the raw data. Note that this part could vary for different datasets.
For this dataset, the pre-processing contains 2 parts: <br>
1. Change the time interval from irregular to 5 minutes.<br>
2. Check missing values and handle missing data.
```python
import pandas as pd
from bigdl.chronos.data import TSDataset

# load the raw traffic data (file name from the dataset link above; adjust the path as needed)
df = pd.read_csv("speed_7578.csv", parse_dates=["timestamp"])
tsdata = TSDataset.from_pandas(df, dt_col="timestamp", target_col="value")
df = tsdata.resample("5min")\
           .impute(mode="linear")\
           .to_pandas()
```
## Step 2: Use Chronos Anomaly Detector
Chronos provides many anomaly detectors for anomaly detection; here we use DBScan as an example. More anomaly detectors can be found [here](../../PythonAPI/Chronos/anomaly_detectors.html).
```python
from bigdl.chronos.detector.anomaly import DBScanDetector
ad = DBScanDetector(eps=0.3, min_samples=6)
ad.fit(df['value'].to_numpy())
anomaly_indexes = ad.anomaly_indexes()
```
[chronos_minn_traffic_anomaly_detector_colab]: <https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_minn_traffic_anomaly_detector.ipynb>
[chronos_minn_traffic_anomaly_detector]: <https://github.com/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_minn_traffic_anomaly_detector.ipynb>

@ -1,119 +0,0 @@
# Tune a Forecasting Task Automatically
---
![](../../../../image/colab_logo_32px.png)[Run in Google Colab][chronos_autots_nyc_taxi_colab] &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub][chronos_autots_nyc_taxi]
---
**In this guide we will demonstrate how to use _Chronos AutoTSEstimator_ and _Chronos TSPipeline_ to auto tune a time series forecasting task and handle the whole model development process easily.**
### Introduction
Chronos provides `AutoTSEstimator` as a highly integrated solution for time series forecasting tasks with hyperparameter autotuning, auto feature selection and auto preprocessing. Users can prepare a `TSDataset` (recommended, used in this notebook) or their own data creator as input data. By constructing an `AutoTSEstimator` and calling `fit` on the data, a `TSPipeline` containing the best model and pre/post data processing will be returned for further development or deployment.
`AutoTSEstimator` only supports LSTM, TCN, and Seq2seq built-in models and 3rd-party models for now.
### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/chronos.html#install) for more details.
```bash
conda create -n my_env python=3.7
conda activate my_env
pip install --pre --upgrade bigdl-chronos[all]
```
### Step 1: Init Orca Context
```python
from bigdl.orca import init_orca_context

# cluster_mode is parsed from the script's command-line arguments
if args.cluster_mode == "local":
init_orca_context(cluster_mode="local", cores=4) # run in local mode
elif args.cluster_mode == "k8s":
init_orca_context(cluster_mode="k8s", num_nodes=2, cores=2) # run on K8s cluster
elif args.cluster_mode == "yarn":
init_orca_context(cluster_mode="yarn-client", num_nodes=2, cores=2) # run on Hadoop YARN cluster
```
This is the only place where you need to specify local or distributed mode. View [Orca Context](../../Orca/Overview/orca-context.md) for more details.
**Note:** You should `export HADOOP_CONF_DIR=/path/to/hadoop/conf/dir` when running on Hadoop YARN cluster. View [Hadoop User Guide](../../UserGuide/hadoop.md) for more details.
### Step 2: Prepare a TSDataset
Prepare a `TSDataset` and call necessary operations on it.
```python
import pandas as pd
from bigdl.chronos.data import TSDataset
from sklearn.preprocessing import StandardScaler

# load the nyc_taxi data into a pandas dataframe (the csv path is illustrative)
df = pd.read_csv("nyc_taxi.csv", parse_dates=["timestamp"])
tsdata_train, tsdata_val, tsdata_test\
= TSDataset.from_pandas(df, dt_col="timestamp", target_col="value", with_split=True, val_ratio=0.1, test_ratio=0.1)
standard_scaler = StandardScaler()
for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
tsdata.gen_dt_feature()\
.impute(mode="last")\
.scale(standard_scaler, fit=(tsdata is tsdata_train))
```
There is no need to call `.roll()` or `.to_torch_data_loader()` in this step, which is the largest difference between the usage of `AutoTSEstimator` and _Chronos Forecaster_. `AutoTSEstimator` will do that automatically and tune the parameters as well.
Please call `.gen_dt_feature()` (recommended), `.gen_rolling_feature()`, and `.gen_global_feature()` to generate all candidate features to be selected by `AutoTSEstimator`, in addition to your own extra input features.
For detailed information, please refer to [TSDataset API doc](../../PythonAPI/Chronos/tsdataset.html) and [Time series data basic concepts](../Overview/data_processing_feature_engineering.html).
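As a sketch, the optional rolling and global feature generation might look like the following (the `window_size` value is an arbitrary example; see the TSDataset API doc for the full argument list):

```python
# optional extra candidate features for AutoTSEstimator to select from
for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
    tsdata.gen_rolling_feature(window_size=24)
    tsdata.gen_global_feature()
```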
### Step 3: Create an AutoTSEstimator
```python
import bigdl.orca.automl.hp as hp
from bigdl.chronos.autots import AutoTSEstimator
auto_estimator = AutoTSEstimator(model='lstm', # the model name used for training
search_space='normal', # a default hyper parameter search space
past_seq_len=hp.randint(1, 10), # hp sampling function of past_seq_len for auto-tuning
)
```
We prebuild three default search spaces for each built-in model, which you can use by setting `search_space` to "minimal", "normal", or "large", or you can define your own search space in a dictionary. The larger the search space, the better accuracy you will get, but the more time it will cost.
`past_seq_len` can be set as a hp sampling function; the proper range is highly related to your data. A range between 0.5 and 3 cycles is reasonable.
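For instance, the nyc_taxi data used in this guide is sampled every 30 minutes, so one daily cycle is 48 points; a search range of roughly half a cycle to three cycles could then be written as:

```python
import bigdl.orca.automl.hp as hp

# nyc_taxi has a 30-minute interval: one daily cycle = 48 points,
# so search past_seq_len between ~0.5 and 3 cycles
past_seq_len = hp.randint(24, 144)
```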
For detailed information, please refer to [AutoTSEstimator API doc](../../PythonAPI/Chronos/autotsestimator.html#autotsestimator) and basic concepts [here](../Overview/forecasting.html#use-autots-pipeline).
### Step 4: Fit with AutoTSEstimator
```python
# fit with AutoTSEstimator for a returned TSPipeline
ts_pipeline = auto_estimator.fit(data=tsdata_train, # train dataset
validation_data=tsdata_val, # validation dataset
epochs=5) # number of epochs to train in each trial
```
For detailed information, please refer to [AutoTSEstimator API doc](../../PythonAPI/Chronos/autotsestimator.html#autotsestimator).
### Step 5: Further deployment with TSPipeline
The `TSPipeline` will apply the same preprocessing and corresponding postprocessing operations on the test data. You may carry out predict, evaluate or save/load for further development.
```python
# predict with the best trial
y_pred = ts_pipeline.predict(tsdata_test)
```
```python
# evaluate the result pipeline
mse, smape = ts_pipeline.evaluate(tsdata_test, metrics=["mse", "smape"])
print("Evaluate: the mean square error is", mse)
print("Evaluate: the smape value is", smape)
```
```python
# save the pipeline
my_ppl_file_path = "/tmp/saved_pipeline"
ts_pipeline.save(my_ppl_file_path)
# restore the pipeline for further deployment
from bigdl.chronos.autots import TSPipeline
loaded_ppl = TSPipeline.load(my_ppl_file_path)
```
For detailed information, please refer to [TSPipeline API doc](../../PythonAPI/Chronos/autotsestimator.html#tspipeline).
### Optional: Examine the leaderboard visualization
To view the evaluation results of the "not chosen" trials, find some insight, or even improve your search space for a new autotuning task, we provide a leaderboard through tensorboard.
```python
# show a tensorboard view
%load_ext tensorboard
%tensorboard --logdir /tmp/autots_estimator/autots_estimator_leaderboard/
```
For detailed information, please refer to [Visualization](../Overview/useful_functionalities.html#automl-visualization).
[chronos_autots_nyc_taxi_colab]: <https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_autots_nyc_taxi.ipynb>
[chronos_autots_nyc_taxi]: <https://github.com/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_autots_nyc_taxi.ipynb>

@ -1,92 +0,0 @@
# Predict Number of Taxi Passengers with Chronos Forecaster
---
![](../../../../image/colab_logo_32px.png)[Run in Google Colab][chronos_nyc_taxi_tsdataset_forecaster_colab] &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub][chronos_nyc_taxi_tsdataset_forecaster]
---
**In this guide we will demonstrate how to use _Chronos TSDataset_ and _Chronos Forecaster_ for time series processing and forecasting in 4 simple steps.**
### Step 0: Prepare Environment
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/chronos.html#install) for more details.
```bash
conda create -n my_env python=3.7 # "my_env" is conda environment name, you can use any name you like.
conda activate my_env
pip install bigdl-chronos[all]
```
### Step 1: Data transformation and feature engineering using Chronos TSDataset
[TSDataset](../Overview/data_processing_feature_engineering.html) is our abstraction of a time series dataset for data transformation and feature engineering. Here we use it to preprocess the data.
Initialize the train, validation and test TSDataset from a raw pandas dataframe.
```python
import pandas as pd
from bigdl.chronos.data import TSDataset
from sklearn.preprocessing import StandardScaler

# load the nyc_taxi data into a pandas dataframe (the csv path is illustrative)
df = pd.read_csv("nyc_taxi.csv", parse_dates=["timestamp"])
tsdata_train, tsdata_valid, tsdata_test = TSDataset.from_pandas(df, dt_col="timestamp", target_col="value",
with_split=True, val_ratio=0.1, test_ratio=0.1)
```
Preprocess the datasets. Here we perform:
- deduplicate: remove those identical data records
- impute: fill the missing values
- gen_dt_feature: generate feature from datetime (e.g. month, day...)
- scale: scale each feature to standard distribution.
- roll: sample the data with sliding window.
- For the forecasting task, we will look back at 3 hours of historical data (6 records) and predict the value of the next 30 minutes (1 record).
We perform the same transformation processes on train, valid and test set.
```python
lookback, horizon = 6, 1
scaler = StandardScaler()
for tsdata in [tsdata_train, tsdata_valid, tsdata_test]:
tsdata.deduplicate().impute().gen_dt_feature()\
.scale(scaler, fit=(tsdata is tsdata_train))\
.roll(lookback=lookback, horizon=horizon)
```
### Step 2: Time series forecasting using Chronos Forecaster
After preprocessing the datasets, we can use [Chronos Forecaster](../Overview/forecasting.html#use-standalone-forecaster-pipeline) to handle the forecasting tasks.
Transform the TSDataset to sampled numpy ndarrays and feed them to the forecaster.
```python
from bigdl.chronos.forecaster import TCNForecaster

x, y = tsdata_train.to_numpy()
x_val, y_val = tsdata_valid.to_numpy()
# x.shape = (num of sample, lookback, num of input feature)
# y.shape = (num of sample, horizon, num of output feature)
forecaster = TCNForecaster(past_seq_len=lookback, # number of steps to look back
future_seq_len=horizon, # number of steps to predict
input_feature_num=x.shape[-1], # number of feature to use
output_feature_num=y.shape[-1]) # number of feature to predict
res = forecaster.fit(data=(x, y), epochs=3)
```
### Step 3: Further deployment with fitted forecaster
Use the fitted forecaster to predict the test data.
```python
x_test, y_test = tsdata_test.to_numpy()
pred = forecaster.predict(x_test)
pred_unscale, groundtruth_unscale = tsdata_test.unscale_numpy(pred), tsdata_test.unscale_numpy(y_test)
```
Save & restore the forecaster.
```python
forecaster.save("nyc_taxi.fxt")
forecaster.restore("nyc_taxi.fxt")
```
[chronos_nyc_taxi_tsdataset_forecaster_colab]:<https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_nyc_taxi_tsdataset_forecaster.ipynb>
[chronos_nyc_taxi_tsdataset_forecaster]:<https://github.com/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_nyc_taxi_tsdataset_forecaster.ipynb>

@ -1,372 +0,0 @@
# Chronos Examples
```eval_rst
.. raw:: html
<link rel="stylesheet" type="text/css" href="../../../_static/css/chronos_tutorial.css" />
<div id="tutorial">
<h3 style="text-align:left">Filter:</h3>
<p>Please <span style="font-weight:bold;">check</span> the checkboxes or <span style="font-weight:bold;">click</span> tag buttons to show the related examples. Reclicking or unchecking will hide the corresponding examples. If nothing is checked or clicked, all the examples will be displayed.</p>
<div class="border">
<div class="choiceline">
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="forecast" id="forecast">forecast </div>
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="anomaly_detection" id="anomaly_detection">anomaly detection</div>
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="simulation" id="simulation">simulation</div>
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="hyperparameter_tuning" id="hyperparameter_tuning">AutoML</div>
</div>
<div class="choiceline">
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="onnxruntime" id="onnxruntime">onnxruntime </div>
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="quantization" id="quantization">quantization</div>
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="distributed" id="distributed">distributed</div>
<div class="choicebox"><input type="checkbox" class="checkboxes" name="choice" value="customized_model" id="customized_model">customized model</div>
</div>
<div class="hiddenline">
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="TCNForecaster" id="TCNForecaster">TCNForecaster</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="AutoTSEstimator" id="AutoTSEstimator">AutoTSEstimator</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="DBScanDetector" id="DBScanDetector">DBScanDetector</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="LSTMForecaster" id="LSTMForecaster">LSTMForecaster</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="AutoProphet" id="AutoProphet">AutoProphet</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="MTNetForecaster" id="MTNetForecaster">MTNetForecaster</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="DeepAR" id="DeepAR">DeepAR</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="AutoLSTM" id="AutoLSTM">AutoLSTM</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="Seq2SeqForecaster" id="Seq2SeqForecaster">Seq2SeqForecaster</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="DPGANSimulator" id="DPGANSimulator">DPGANSimulator</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="TCMFForecaster" id="TCMFForecaster">TCMFForecaster</div>
<div class="choicebox"><input type="checkbox" class="forecasters" name="forecasters" value="TFT_model" id="TFT_model">TFT_model</div>
</div>
</div>
</br>
<div class="showingForecaster">Currently showing forecaster:&nbsp;<i>All Forecasters</i>&nbsp;&nbsp;(<span style="font-weight:bold;">Reclick</span> the tag of these forecasters to undo.)</div>
</br>
<details id="ChronosForecaster">
<summary>
<a href="./chronos-tsdataset-forecaster-quickstart.html">Predict Number of Taxi Passengers with Chronos Forecaster</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/colab_logo_32px.png"><a href="https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_nyc_taxi_tsdataset_forecaster.ipynb">Run in Google Colab</a>
&nbsp;
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_nyc_taxi_tsdataset_forecaster.ipynb">View source on GitHub</a>
<p>In this guide we will demonstrate how to use <span>Chronos TSDataset</span> and <span>Chronos Forecaster</span> for time series processing and to predict the number of taxi passengers.</p>
</details>
<hr>
<details id="TuneaForecasting">
<summary>
<a href="./chronos-autotsest-quickstart.html">Tune a Forecasting Task Automatically</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="hyperparameter_tuning">AutoML</button>&nbsp;
<button value="AutoTSEstimator" class="roundbutton">AutoTSEstimator</button>
</p>
</summary>
<img src="../../../_images/colab_logo_32px.png"><a href="https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_autots_nyc_taxi.ipynb">Run in Google Colab</a>
&nbsp;
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_autots_nyc_taxi.ipynb">View source on GitHub</a>
<p>In this guide we will demonstrate how to use <span>Chronos AutoTSEstimator</span> and <span>Chronos TSPipeline</span> to auto tune a time series forecasting task and handle the whole model development process easily.</p>
</details>
<hr>
<details id="DetectAnomaly">
<summary>
<a href="./chronos-anomaly-detector.html">Detect Anomaly Point in Real Time Traffic Data</a>
<p>Tag:
<button value="anomaly_detection">anomaly detection</button>&nbsp;
<button value="DBScanDetector" class="roundbutton">DBScanDetector</button>
</p>
</summary>
<img src="../../../_images/colab_logo_32px.png"><a href="https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_minn_traffic_anomaly_detector.ipynb">Run in Google Colab</a>
&nbsp;
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/colab-notebook/chronos_minn_traffic_anomaly_detector.ipynb">View source on GitHub</a>
<p>In this guide we will demonstrate how to use <span>Chronos Anomaly Detector</span> for anomaly detection on real time traffic data from the Twin Cities Metro area in Minnesota.</p>
</details>
<hr>
<details id="AutoTS">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_autots_customized_model.ipynb">Tune a Customized Time Series Forecasting Model with AutoTSEstimator.</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="hyperparameter_tuning">AutoML</button>&nbsp;
<button value="customized_model">customized model</button>&nbsp;
<button value="AutoTSEstimator" class="roundbutton">AutoTSEstimator</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_autots_customized_model.ipynb">View source on GitHub</a>
<p>In this notebook, we demonstrate a reference use case where we use the network traffic KPI(s) in the past to predict traffic KPI(s) in the future. We demonstrate how to use <span>AutoTSEstimator</span> to adjust the parameters of a customized model.</p>
</details>
<hr>
<details id="AutoWIDE">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_autots_forecasting.ipynb">Auto Tune the Prediction of Network Traffic at the Transit Link of WIDE</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="hyperparameter_tuning">AutoML</button>&nbsp;
<button value="AutoTSEstimator" class="roundbutton">AutoTSEstimator</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_autots_forecasting.ipynb">View source on GitHub</a>
<p>In this notebook, we demonstrate a reference use case where we use the network traffic KPI(s) in the past to predict traffic KPI(s) in the future. We demonstrate how to use <span>AutoTS</span> in project <span><a href="https://github.com/intel-analytics/bigdl/tree/main/python/chronos/src/bigdl/chronos">Chronos</a></span> to do time series forecasting in an automated and distributed way.</p>
</details>
<hr>
<details id="MultvarWIDE">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_model_forecasting.ipynb">Multivariate Forecasting of Network Traffic at the Transit Link of WIDE</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="LSTMForecaster" class="roundbutton">LSTMForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_model_forecasting.ipynb">View source on GitHub</a>
<p>In this notebook, we demonstrate a reference use case where we use the network traffic KPI(s) in the past to predict traffic KPI(s) in the future. We demonstrate how to do univariate forecasting (predict only 1 series), and multivariate forecasting (predicts more than 1 series at the same time) using Project <span><a href="https://github.com/intel-analytics/bigdl/tree/main/python/chronos/src/bigdl/chronos">Chronos</a></span>.</p>
</details>
<hr>
<details id="MultstepWIDE">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_multivariate_multistep_tcnforecaster.ipynb">Multistep Forecasting of Network Traffic at the Transit Link of WIDE</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/network_traffic/network_traffic_multivariate_multistep_tcnforecaster.ipynb">View source on GitHub</a>
<p>In this notebook, we demonstrate a reference use case where we use the network traffic KPI(s) in the past to predict traffic KPI(s) in the future. We demonstrate how to do multivariate multistep forecasting using Project <span><a href="https://github.com/intel-analytics/bigdl/tree/main/python/chronos/src/bigdl/chronos">Chronos</a></span>.</p>
</details>
<hr>
<details id="LSTMF">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/fsi/stock_prediction.ipynb">Stock Price Prediction with LSTMForecaster</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="LSTMForecaster" class="roundbutton">LSTMForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/fsi/stock_prediction.ipynb">View source on GitHub</a>
<p>In this notebook, we demonstrate a reference use case where we use historical stock price data to predict the future price. The dataset we use is the daily stock price of S&P500 stocks during 2013-2018 (data source). We demonstrate how to do univariate forecasting using the past 80% of the total days' MMM price to predict the future 20% days' daily price.</p>
<p>Reference: <span><a href="https://github.com/jwkanggist/tf-keras-stock-pred">https://github.com/jwkanggist/tf-keras-stock-pred</a></span></p>
</details>
<hr>
<details id="AutoPr">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/fsi/stock_prediction_prophet.ipynb">Stock Price Prediction with ProphetForecaster and AutoProphet</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="hyperparameter_tuning">AutoML</button>&nbsp;
<button value="AutoProphet" class="roundbutton">AutoProphet</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/fsi/stock_prediction_prophet.ipynb">View source on GitHub</a>
<p>In this notebook, we demonstrate a reference use case where we use historical stock price data to predict the future price using the ProphetForecaster and AutoProphet. The dataset we use is the daily stock price of S&P500 stocks during 2013-2018 <span><a href="https://www.kaggle.com/camnugent/sandp500/">data source</a></span>.</p>
<p>Reference: <span><a href="https://facebook.github.io/prophet">https://facebook.github.io/prophet</a></span>, <span><a href="https://github.com/jwkanggist/tf-keras-stock-pred">https://github.com/jwkanggist/tf-keras-stock-pred</a></span></p>
</details>
<hr>
<details id="Unsupervised">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/AIOps/AIOps_anomaly_detect_unsupervised.ipynb">Unsupervised Anomaly Detection for CPU Usage</a>
<p>Tag:
<button value="anomaly_detection">anomaly detection</button>&nbsp;
<button value="DBScanDetector" class="roundbutton">DBScanDetector</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/AIOps/AIOps_anomaly_detect_unsupervised.ipynb">View source on GitHub</a>
<p>We demonstrate how to perform anomaly detection based on Chronos's built-in <span><a href="../../PythonAPI/Chronos/anomaly_detectors.html#dbscandetector">DBScanDetector</a></span>, <span><a href="../../PythonAPI/Chronos/anomaly_detectors.html#aedetector">AEDetector</a></span> and <span><a href="../../PythonAPI/Chronos/anomaly_detectors.html#thresholddetector">ThresholdDetector</a></span>.</p>
</details>
<hr>
<details id="AnomalyDetection">
<summary>
<a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/AIOps/AIOps_anomaly_detect_unsupervised_forecast_based.ipynb">Anomaly Detection for CPU Usage Based on Forecasters</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="anomaly_detection">anomaly detection</button>&nbsp;
<button value="MTNetForecaster" class="roundbutton">MTNetForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/blob/main/python/chronos/use-case/AIOps/AIOps_anomaly_detect_unsupervised_forecast_based.ipynb">View source on GitHub</a>
<p>We demonstrate how to leverage Chronos's built-in models, i.e. MTNet, to do time series forecasting, and then perform anomaly detection on the predicted values with <span><a href="../../PythonAPI/Chronos/anomaly_detectors.html#thresholddetector">ThresholdDetector</a></span>.</p>
</details>
<hr>
<details id="DeepARmodel">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/pytorch-forecasting/DeepAR">Help pytorch-forecasting improve the training speed of DeepAR model</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="customized_model">customized model</button>&nbsp;
<button value="DeepAR" class="roundbutton">DeepAR</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/pytorch-forecasting/DeepAR">View source on GitHub</a>
<p>Chronos can help a 3rd party time series lib to improve the performance (both training and inferencing) and accuracy. This use-case shows Chronos can easily help pytorch-forecasting speed up the training of DeepAR model.</p>
</details>
<hr>
<details id="TFTmodel">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/pytorch-forecasting/TFT">Help pytorch-forecasting improve the training speed of TFT model</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="customized_model">customized model</button>&nbsp;
<button value="TFT_model" class="roundbutton">TFT Model</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/use-case/pytorch-forecasting/TFT">View source on GitHub</a>
<p>Chronos can help a 3rd party time series lib to improve the performance (both training and inferencing) and accuracy. This use-case shows Chronos can easily help pytorch-forecasting speed up the training of TFT model.</p>
</details>
<hr>
<details id="hyperparameter">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/hpo/muti_objective_hpo_with_builtin_latency_tutorial.ipynb">Tune a Time Series Forecasting Model with multi-objective hyperparameter optimization.</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="hyperparameter_tuning">AutoML</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/hpo/muti_objective_hpo_with_builtin_latency_tutorial.ipynb">View source on GitHub</a>
<p>In this notebook, we demonstrate how to use <span>multi-objective hyperparameter optimization with built-in latency metric</span> in project <span><a href="https://github.com/intel-analytics/bigdl/tree/main/python/chronos/src/bigdl/chronos">Chronos</a></span> to do time series forecasting and achieve a good tradeoff between performance and latency.</p>
</details>
<hr>
<details id="taxiDataset">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/auto_model">Auto tuning prophet on nyc taxi dataset</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="hyperparameter_tuning">AutoML</button>&nbsp;
<button value="AutoLSTM" class="roundbutton">AutoLSTM</button>&nbsp;
<button value="AutoProphet" class="roundbutton">AutoProphet</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/auto_model">View source on GitHub</a>
<p>This example collection will demonstrate how Chronos auto models (i.e. autolstm & autoprophet) perform automatic time series forecasting on the nyc_taxi dataset. The auto models will search for the best hyperparameters automatically.</p>
</details>
<hr>
<details id="distributedFashion">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/distributed">Use Chronos forecasters in a distributed fashion</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="distributed">distributed</button>&nbsp;
<button value="Seq2SeqForecaster" class="roundbutton">Seq2SeqForecaster</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/distributed">View source on GitHub</a>
<p>Users can easily train their forecasters in a distributed fashion to handle extra large dataset and speed up the process (training and data processing) by utilizing a cluster or pseudo-distribution on a single node. The functionality is powered by Project Orca.</p>
</details>
<hr>
<details id="ONNX">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/onnx">Use ONNXRuntime to speed up forecasters' inferencing</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="onnxruntime">onnxruntime</button>&nbsp;
<button value="hyperparameter_tuning">AutoML</button>&nbsp;
<button value="AutoTSEstimator" class="roundbutton">AutoTSEstimator</button>&nbsp;
<button value="Seq2SeqForecaster" class="roundbutton">Seq2SeqForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/onnx">View source on GitHub</a>
<p>This example will demonstrate how to use ONNX to speed up the inferencing (prediction/evaluation) on forecasters and AutoTSEstimator. In this example, ONNX speeds up the inferencing by ~4X.</p>
</details>
<hr>
<details id="Quantize">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/quantization">Quantize Chronos forecasters to speed up inference</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="quantization">quantization</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/quantization">View source on GitHub</a>
<p>Users can easily quantize their forecasters to low precision and speed up the inference process (both throughput and latency) on a single node. The functionality is powered by Project Nano.</p>
</details>
<hr>
<details id="SimualateTimeSeriesData">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/simulator">Simulate time series data with a similar pattern as example data</a>
<p>Tag:
<button value="simulation">simulation</button>&nbsp;
<button value="DPGANSimulator" class="roundbutton">DPGANSimulator</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/simulator">View source on GitHub</a>
<p>This example shows how to generate synthetic data with similar distribution as training data with the fast and easy DPGANSimulator API provided by Chronos.</p>
</details>
<hr>
<details id="TCMF">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/tcmf">High dimension time series forecasting with Chronos TCMFForecaster</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="distributed">distributed</button>&nbsp;
<button value="TCMFForecaster" class="roundbutton">TCMFForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/tcmf">View source on GitHub</a>
<p>This example demonstrates how to use BigDL Chronos TCMFForecaster to run distributed training and inference for high dimension time series forecasting task.</p>
</details>
<hr>
<details id="PenalizeUnderestimation">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/loss/penalize_underestimation.ipynb">Penalize underestimation with LinexLoss</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/loss/penalize_underestimation.ipynb">View source on GitHub</a>
<p>This example demonstrates how to use TCNForecaster to penalize underestimation based on a built-in loss function LinexLoss.</p>
</details>
<hr>
<details id="GPUtrainingCPUacceleration">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/inference-acceleration">Accelerate the inference speed of a model trained on another platform</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="customized_model">customized model</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/inference-acceleration">View source on GitHub</a>
<p>In this example, we train the model on GPU and accelerate it by using onnxruntime on CPU.</p>
</details>
<hr>
<details id="ServeForecaster">
<summary>
<a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/serving">Serve Chronos forecaster and predict through TorchServe</a>
<p>Tag:
<button value="forecast">forecast</button>&nbsp;
<button value="TCNForecaster" class="roundbutton">TCNForecaster</button>
</p>
</summary>
<img src="../../../_images/GitHub-Mark-32px.png"><a href="https://github.com/intel-analytics/BigDL/tree/main/python/chronos/example/serving">View source on GitHub</a>
<p>In this example, we show how to serve Chronos forecaster and predict through TorchServe.</p>
</details>
<hr>
</div>
<script src="../../../_static/js/chronos_tutorial.js"></script>
```

@ -1,89 +0,0 @@
BigDL-Chronos
========================
**BigDL-Chronos** (**Chronos** for short) is an application framework for building a fast, accurate and scalable time series analysis application.
You can use **Chronos** for:
.. grid:: 1 3 3 3
.. grid-item::
.. image:: ./Image/forecasting.svg
:alt: Forecasting example diagram
**Forecasting:** Predict the future using historical data.
.. grid-item::
.. image:: ./Image/anomaly_detection.svg
:alt: Anomaly Detection example diagram
**Anomaly Detection:** Discover unexpected items in data.
.. grid-item::
.. image:: ./Image/simulation.svg
:alt: Simulation example diagram
**Simulation:** Generate data similar to historical data.
-------
.. grid:: 1 2 2 2
:gutter: 2
.. grid-item-card::
**Get Started**
^^^
You may learn the basic usage of Chronos' components and write your first runnable application in this quick tour page.
+++
:bdg-link:`Chronos in 5 minutes <./Overview/quick-tour.html>` |
:bdg-link:`Installation <./Overview/install.html>`
.. grid-item-card::
**Key Features Guide**
^^^
Our user guides provide you with in-depth information, concepts and knowledge about Chronos.
+++
:bdg-link:`Data <./Overview/data_processing_feature_engineering.html>` |
:bdg-link:`Forecast <./Overview/forecasting.html>` |
:bdg-link:`Detect <./Overview/anomaly_detection.html>` |
:bdg-link:`Simulate <./Overview/simulation.html>`
.. grid-item-card::
**How-to-Guide** / **Tutorials**
^^^
If you meet specific problems during usage, the how-to guides are a good place to check.
Examples provide short, high-quality use cases that users can emulate in their own work.
+++
:bdg-link:`How-to-Guide <./Howto/index.html>` | :bdg-link:`Example <./QuickStart/index.html>`
.. grid-item-card::
**API Document**
^^^
API Document provides you with a detailed description of the Chronos APIs.
+++
:bdg-link:`API Document <../PythonAPI/Chronos/index.html>`
.. toctree::
:hidden:
BigDL-Chronos Document <self>

@ -1,139 +0,0 @@
# DLlib in 5 minutes
## Overview
DLlib is a distributed deep learning library for Apache Spark; with DLlib, users can write their deep learning applications as standard Spark programs (using either Scala or Python APIs).
It includes the functionalities of the [original BigDL](https://github.com/intel-analytics/BigDL/tree/branch-0.14) project, and provides the following high-level APIs for distributed deep learning on Spark:
* [Keras-like API](keras-api.md)
* [Spark ML pipeline support](nnframes.md)
---
## Scala Example
This section shows a single example of how to use DLlib to build a deep learning application on Spark, using the Keras-style API.
#### LeNet Model on MNIST using Keras-Style API
This tutorial is an explanation of what is happening in the [lenet](https://github.com/intel-analytics/BigDL/tree/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/keras) example.
A bigdl-dllib program starts with initialization as follows.
````scala
val conf = Engine.createSparkConf()
.setAppName("Train Lenet on MNIST")
.set("spark.task.maxFailures", "1")
val sc = new SparkContext(conf)
Engine.init
````
After the initialization, we need to:
1. Load train and validation data by _**creating the [```DataSet```](https://github.com/intel-analytics/BigDL/blob/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/feature/dataset/DataSet.scala)**_ (e.g., ````SampleToGreyImg````, ````GreyImgNormalizer```` and ````GreyImgToBatch````):
````scala
val trainSet = (if (sc.isDefined) {
DataSet.array(load(trainData, trainLabel), sc.get, param.nodeNumber)
} else {
DataSet.array(load(trainData, trainLabel))
}) -> SampleToGreyImg(28, 28) -> GreyImgNormalizer(trainMean, trainStd) -> GreyImgToBatch(
param.batchSize)
val validationSet = DataSet.array(load(validationData, validationLabel), sc) ->
BytesToGreyImg(28, 28) -> GreyImgNormalizer(testMean, testStd) -> GreyImgToBatch(
param.batchSize)
````
2. We then define the LeNet model using the Keras-style API
````scala
val input = Input(inputShape = Shape(28, 28, 1))
val reshape = Reshape(Array(1, 28, 28)).inputs(input)
val conv1 = Convolution2D(6, 5, 5, activation = "tanh").setName("conv1_5x5").inputs(reshape)
val pool1 = MaxPooling2D().inputs(conv1)
val conv2 = Convolution2D(12, 5, 5, activation = "tanh").setName("conv2_5x5").inputs(pool1)
val pool2 = MaxPooling2D().inputs(conv2)
val flatten = Flatten().inputs(pool2)
val fc1 = Dense(100, activation = "tanh").setName("fc1").inputs(flatten)
val fc2 = Dense(classNum, activation = "softmax").setName("fc2").inputs(fc1)
Model(input, fc2)
````
3. After that, we configure the learning process. Set the ````optimization method```` and the ````Criterion```` (which, given input and target, computes gradient per given loss function):
````scala
model.compile(optimizer = optimMethod,
loss = ClassNLLCriterion[Float](logProbAsInput = false),
metrics = Array(new Top1Accuracy[Float](), new Top5Accuracy[Float](), new Loss[Float]))
````
Finally we _**train the model**_ by calling ````model.fit````:
````scala
model.fit(trainSet, nbEpoch = param.maxEpoch, validationData = validationSet)
````
---
## Python Example
#### Initialize NN Context
`NNContext` is the main entry point for provisioning the dllib program on the underlying cluster (such as a K8s or Hadoop cluster), or just on a single laptop.
A dllib program usually starts with the initialization of `NNContext` as follows:
```python
from bigdl.dllib.nncontext import *
init_nncontext()
```
In `init_nncontext`, the user may specify cluster mode for the dllib program:
- *Cluster mode*: "local", "yarn-client", "yarn-cluster", "k8s-client", "standalone" and "spark-submit". Defaults to "local".
The dllib program simply runs `init_nncontext` on the local machine, which will automatically provision the runtime Python environment and distributed execution engine on the underlying computing environment (such as a single laptop, a large K8s or Hadoop cluster, etc.).
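A minimal sketch, assuming the `cluster_mode` argument described above:

```python
from bigdl.dllib.nncontext import *

# run locally (the default mode); cluster_mode values follow the list above
sc = init_nncontext(cluster_mode="local")
# for example, to run on a Hadoop YARN cluster instead:
# sc = init_nncontext(cluster_mode="yarn-client")
```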
#### Autograd Examples using the bigdl-dllib Keras Python API
This tutorial describes the [Autograd](https://github.com/intel-analytics/BigDL/tree/branch-2.0/python/dllib/examples/autograd) examples.
The example first does the initialization using `init_nncontext()`:
```python
sc = init_nncontext()
```
It then generates the input data X_ and Y_:
```python
data_len = 1000
X_ = np.random.uniform(0, 1, (1000, 2))
Y_ = ((2 * X_).sum(1) + 0.4).reshape([data_len, 1])
```
It then defines the custom loss:
```python
def mean_absolute_error(y_true, y_pred):
result = mean(abs(y_true - y_pred), axis=1)
return result
```
After that, the example creates the model as follows and sets the criterion to the custom loss:
```python
a = Input(shape=(2,))
b = Dense(1)(a)
c = Lambda(function=add_one_func)(b)
model = Model(input=a, output=c)
model.compile(optimizer=SGD(learningrate=1e-2),
loss=mean_absolute_error)
```
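The `add_one_func` passed to `Lambda` above is not shown in this excerpt; a plausible sketch, together with the autograd import that `mean` and `abs` rely on (module path assumed), could look like:
```python
# assumed import path for the autograd ops (mean, abs, ...) used in the custom loss
from bigdl.dllib.keras.autograd import *

# hypothetical definition of the Lambda function used above: element-wise add one
def add_one_func(x):
    return x + 1.0
```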
Finally the example trains the model by calling `model.fit`:
```python
model.fit(x=X_,
y=Y_,
batch_size=32,
nb_epoch=int(options.nb_epoch),
distributed=False)
```
@ -1,6 +0,0 @@
DLLib Key Features
================================
* `Keras-like API <keras-api.html>`_
* `Spark ML Pipeline Support <nnframes.html>`_
* `Visualization <visualization.html>`_
@ -1,41 +0,0 @@
# Installation
## Scala
Refer to [BigDl Install guide for Scala](../../UserGuide/scala.md).
## Python
### Install a Stable Release
Run the commands below to install _bigdl-dllib_:
```bash
conda create -n my_env python=3.7
conda activate my_env
pip install bigdl-dllib
```
### Install Nightly build version
You can install the latest nightly build of bigdl-dllib as follows:
```bash
pip install --pre --upgrade bigdl-dllib
```
### Verify your install
You may verify if the installation is successful using the interactive Python shell as follows:
* Type `python` in the command line to start a REPL.
* Try to run the example code below to verify the installation:
```python
from bigdl.dllib.utils.nncontext import *
sc = init_nncontext() # Initialization of bigdl-dllib on the underlying cluster.
```
@ -1,187 +0,0 @@
# Keras-Like API
## 1. Introduction
[DLlib](dllib.md) provides __Keras-like API__ based on [__Keras 1.2.2__](https://faroit.github.io/keras-docs/1.2.2/) for distributed deep learning on Apache Spark. Users can easily use the Keras-like API to create a neural network model, and train, evaluate or tune it in a distributed fashion on Spark.
To define a model in Scala using the Keras-like API, one just needs to import the following packages:
```scala
import com.intel.analytics.bigdl.dllib.keras.layers._
import com.intel.analytics.bigdl.dllib.keras.models._
import com.intel.analytics.bigdl.dllib.utils.Shape
```
One of the highlighted features with regard to the new API is __shape inference__. Users only need to specify the input shape (a `Shape` object __excluding__ batch dimension, for example, `inputShape=Shape(3, 4)` for 3D input) for the first layer of a model and for the remaining layers, the input dimension will be automatically inferred.
---
## 2. LeNet Example
Here we use the Keras-like API to define a LeNet CNN model and train it on the MNIST dataset:
```scala
import com.intel.analytics.bigdl.numeric.NumericFloat
import com.intel.analytics.bigdl.dllib.keras.layers._
import com.intel.analytics.bigdl.dllib.keras.models._
import com.intel.analytics.bigdl.dllib.utils.Shape
val model = Sequential()
model.add(Reshape(Array(1, 28, 28), inputShape = Shape(28, 28, 1)))
model.add(Convolution2D(6, 5, 5, activation = "tanh").setName("conv1_5x5"))
model.add(MaxPooling2D())
model.add(Convolution2D(12, 5, 5, activation = "tanh").setName("conv2_5x5"))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(100, activation = "tanh").setName("fc1"))
model.add(Dense(10, activation = "softmax").setName("fc2"))
model.getInputShape().toSingle().toArray // Array(-1, 28, 28, 1)
model.getOutputShape().toSingle().toArray // Array(-1, 10)
```
---
## 3. Shape
Input and output shapes of a model in the Keras-like API are described by the `Shape` object in Scala, which can be classified into `SingleShape` and `MultiShape`.
`SingleShape` is just a list of Int indicating shape dimensions while `MultiShape` is essentially a list of `Shape`.
Example code to create a shape:
```scala
// create a SingleShape
val shape1 = Shape(3, 4)
// create a MultiShape consisting of two SingleShape
val shape2 = Shape(List(Shape(1, 2, 3), Shape(4, 5, 6)))
```
You can use method `toSingle()` to cast a `Shape` to a `SingleShape`. Similarly, use `toMulti()` to cast a `Shape` to a `MultiShape`.
---
## 4. Define a model
You can define a model either using [Sequential API](#sequential-api) or [Functional API](#functional-api). Remember to specify the input shape for the first layer.
After creating a model, you can call the following __methods__:
```scala
getInputShape()
```
```scala
getOutputShape()
```
* Return the input or output shape of a model, which is a [`Shape`](#2-shape) object. For `SingleShape`, the first entry is `-1` representing the batch dimension. For a model with multiple inputs or outputs, it will return a `MultiShape`.
```scala
setName(name)
```
* Set the name of the model.
---
## 5. Sequential API
The model is described as a linear stack of layers in the Sequential API. Layers can be added into the `Sequential` container one by one and the order of the layers in the model will be the same as the insertion order.
To create a sequential container:
```scala
Sequential()
```
Example code to create a sequential model:
```scala
import com.intel.analytics.bigdl.dllib.keras.layers.{Dense, Activation}
import com.intel.analytics.bigdl.dllib.keras.models.Sequential
import com.intel.analytics.bigdl.dllib.utils.Shape
val model = Sequential[Float]()
model.add(Dense[Float](32, inputShape = Shape(128)))
model.add(Activation[Float]("relu"))
```
---
## 6. Functional API
The model is described as a graph in the Functional API. It is more convenient than the Sequential API when defining some complex model (for example, a model with multiple outputs).
To create an input node:
```scala
Input(inputShape = null, name = null)
```
Parameters:
* `inputShape`: A [`Shape`](#shape) object indicating the shape of the input node, not including batch.
* `name`: String to set the name of the input node. If not specified, its name defaults to a generated string.
To create a graph container:
```scala
Model(input, output)
```
Parameters:
* `input`: An input node or an array of input nodes.
* `output`: An output node or an array of output nodes.
To merge a list of input __nodes__ (__NOT__ layers), following some merge mode in the Functional API:
```scala
import com.intel.analytics.bigdl.dllib.keras.layers.Merge.merge
merge(inputs, mode = "sum", concatAxis = -1) // This will return an output NODE.
```
Parameters:
* `inputs`: A list of node instances. Must be more than one node.
* `mode`: Merge mode. String, must be one of: 'sum', 'mul', 'concat', 'ave', 'cos', 'dot', 'max'. Default is 'sum'.
* `concatAxis`: Int, axis to use when concatenating nodes. Only specify this when merge mode is 'concat'. Default is -1, meaning the last axis of the input.
Example code to create a graph model:
```scala
import com.intel.analytics.bigdl.dllib.keras.layers.{Dense, Input}
import com.intel.analytics.bigdl.dllib.keras.layers.Merge.merge
import com.intel.analytics.bigdl.dllib.keras.models.Model
import com.intel.analytics.bigdl.dllib.utils.Shape
// instantiate input nodes
val input1 = Input[Float](inputShape = Shape(8))
val input2 = Input[Float](inputShape = Shape(6))
// call inputs() with an input node and get an output node
val dense1 = Dense[Float](10).inputs(input1)
val dense2 = Dense[Float](10).inputs(input2)
// merge two nodes following some merge mode
val output = merge(inputs = List(dense1, dense2), mode = "sum")
// create a graph container
val model = Model[Float](Array(input1, input2), output)
```
## 7. Persistence
This section describes how to save and load a model defined with the Keras-like API.
### 7.1 save
To save a Keras model, you call the method `saveModel(path)`.
**Scala:**
```scala
import com.intel.analytics.bigdl.dllib.keras.layers.{Dense, Activation}
import com.intel.analytics.bigdl.dllib.keras.models.Sequential
val model = Sequential[Float]()
model.add(Dense[Float](32, inputShape = Shape(128)))
model.add(Activation[Float]("relu"))
model.saveModel("/tmp/seq.model")
```
**Python:**
```python
from bigdl.dllib.keras.models import Sequential
from bigdl.dllib.keras.layers import Dense
model = Sequential()
model.add(Dense(32, input_shape=(128, )))
model.saveModel("/tmp/seq.model")
```
### 7.2 load
To load a saved Keras model, you call the method `load_model(path)`.
**Scala:**
```scala
import com.intel.analytics.bigdl.dllib.keras.Models
val model = Models.loadModel[Float]("/tmp/seq.model")
```
**Python:**
```python
from bigdl.dllib.keras.models import *
model = load_model("/tmp/seq.model")
```
@ -1,441 +0,0 @@
# Spark ML Pipeline Support
## 1. NNFrames Overview
`NNFrames` in [DLlib](dllib.md) provides Spark DataFrame and ML Pipeline support of distributed deep learning on Apache Spark. It includes both Python and Scala interfaces, and is compatible with both Spark 2.x and Spark 3.x.
**Examples**
The examples are included in the DLlib source code.
- image classification: model inference using pre-trained Inception v1 model. (See [Python version](https://github.com/intel-analytics/BigDL/tree/branch-2.0/python/dllib/examples/nnframes/imageInference))
- image classification: transfer learning from pre-trained Inception v1 model. (See [Python version](https://github.com/intel-analytics/BigDL/tree/branch-2.0/python/dllib/examples/nnframes/imageTransferLearning))
## 2. Primary APIs
- **NNEstimator and NNModel**
BigDL DLLib provides `NNEstimator` for model training with Spark DataFrame, which provides a high-level API for training a BigDL Model with the Apache Spark [Estimator](https://spark.apache.org/docs/2.1.1/ml-pipeline.html#estimators) and [Transformer](https://spark.apache.org/docs/2.1.1/ml-pipeline.html#transformers) pattern, so users can conveniently fit BigDL DLLib into an ML pipeline. The fit result of `NNEstimator` is an NNModel, which is a Spark ML Transformer.
- **NNClassifier and NNClassifierModel**
`NNClassifier` and `NNClassifierModel` extend `NNEstimator` and `NNModel` and focus on classification tasks, where both the label column and the prediction column are of Double type.
- **NNImageReader**
NNImageReader loads image into Spark DataFrame.
---
### 2.1 NNEstimator
**Scala:**
```scala
val estimator = NNEstimator(model, criterion)
```
**Python:**
```python
estimator = NNEstimator(model, criterion)
```
`NNEstimator` extends `org.apache.spark.ml.Estimator` and supports training a BigDL model with Spark DataFrame data. It can be integrated into a standard Spark ML Pipeline
to allow users to combine the components of BigDL and Spark MLlib.
`NNEstimator` supports different feature and label data types through `Preprocessing`. During fit (training), NNEstimator will extract the feature and label data from the input DataFrame and use the `Preprocessing` to convert the data for the model, typically converting the feature and label to Tensors or converting the (feature, Option[Label]) tuple to a BigDL `Sample`.
Each `Preprocessing` conducts a data conversion step in the preprocessing phase; multiple `Preprocessing` steps can be combined into a `ChainedPreprocessing`. Some pre-defined
`Preprocessing` for popular data types like Image, Array or Vector are provided in the package `com.intel.analytics.bigdl.dllib.feature`, while users can also develop customized `Preprocessing`.
NNEstimator and NNClassifier also support setting the caching level for the training data. Options are "DRAM", "PMEM" or "DISK_AND_DRAM". If DISK_AND_DRAM(numSlice) is used, only 1/numSlice of the data will be loaded into memory during training. By default, DRAM mode is used and all data are cached in memory.
By default, `SeqToTensor` is used to convert an array or Vector to a 1-dimension Tensor. Using the `Preprocessing` allows `NNEstimator` to cache only the raw data and decrease the memory consumption during feature conversion and training; it also enables the model to digest extra data types that DataFrame does not currently support.
More concrete examples are available in the package `com.intel.analytics.bigdl.dllib.examples.nnframes`.
`NNEstimator` can be created with various parameters for different scenarios.
- `NNEstimator(model, criterion)`
Takes only the model and criterion and uses `SeqToTensor` as the feature and label `Preprocessing`. `NNEstimator` will extract the data from the feature and label columns (only Scalar, Array[_] or Vector data types are supported) and convert each feature/label to a 1-dimension Tensor. The tensors will be combined into a BigDL `Sample` and sent to the model for training.
- `NNEstimator(model, criterion, featureSize: Array[Int], labelSize: Array[Int])`
Takes model, criterion, featureSize(Array of Int) and labelSize(Array of Int). `NNEstimator` will extract the data from feature and label columns (only Scalar, Array[_] or Vector data type are supported) and convert each feature/label to Tensor according to the specified Tensor size.
- `NNEstimator(model, criterion, featureSize: Array[Array[Int]], labelSize: Array[Int])`
This is the interface for multi-input model. It takes model, criterion, featureSize(Array of Int Array) and labelSize(Array of Int). `NNEstimator` will extract the data from feature and label columns (only Scalar, Array[_] or Vector data type are supported) and convert each feature/label to Tensor according to the specified Tensor size.
- `NNEstimator(model, criterion, featurePreprocessing: Preprocessing[F, Tensor[T]],
labelPreprocessing: Preprocessing[F, Tensor[T]])`
Takes model, criterion, featurePreprocessing and labelPreprocessing. `NNEstimator` will extract the data from feature and label columns and convert each feature/label to Tensor with the featurePreprocessing and labelPreprocessing. This constructor provides more flexibility in supporting extra data types.
Meanwhile, for advanced use cases (e.g. model with multiple input tensor), `NNEstimator` supports: `setSamplePreprocessing(value: Preprocessing[(Any, Option[Any]), Sample[T]])` to directly compose Sample according to user-specified Preprocessing.
**Scala Example:**
```scala
import com.intel.analytics.bigdl.dllib.nn._
import com.intel.analytics.bigdl.dllib.nnframes.NNEstimator
import com.intel.analytics.bigdl.dllib.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val model = Sequential().add(Linear(2, 2))
val criterion = MSECriterion()
val estimator = NNEstimator(model, criterion)
.setLearningRate(0.2)
.setMaxEpoch(40)
val data = sc.parallelize(Seq(
(Array(2.0, 1.0), Array(1.0, 2.0)),
(Array(1.0, 2.0), Array(2.0, 1.0)),
(Array(2.0, 1.0), Array(1.0, 2.0)),
(Array(1.0, 2.0), Array(2.0, 1.0))))
val df = sqlContext.createDataFrame(data).toDF("features", "label")
val nnModel = estimator.fit(df)
nnModel.transform(df).show(false)
```
**Python Example:**
```python
from bigdl.dllib.nn.layer import *
from bigdl.dllib.nn.criterion import *
from bigdl.dllib.utils.common import *
from bigdl.dllib.nnframes.nn_classifier import *
from bigdl.dllib.feature.common import *
from pyspark.sql.types import *
data = sc.parallelize([
((2.0, 1.0), (1.0, 2.0)),
((1.0, 2.0), (2.0, 1.0)),
((2.0, 1.0), (1.0, 2.0)),
((1.0, 2.0), (2.0, 1.0))])
schema = StructType([
StructField("features", ArrayType(DoubleType(), False), False),
StructField("label", ArrayType(DoubleType(), False), False)])
df = sqlContext.createDataFrame(data, schema)
model = Sequential().add(Linear(2, 2))
criterion = MSECriterion()
estimator = NNEstimator(model, criterion, SeqToTensor([2]), ArrayToTensor([2]))\
    .setBatchSize(4).setLearningRate(0.2).setMaxEpoch(40)
nnModel = estimator.fit(df)
res = nnModel.transform(df)
```
***Example with multi-inputs Model.***
This example trains a model with 3 inputs. Users can use VectorAssembler from Spark MLlib to combine different fields. With the specified sizes for each model input, NNEstimator and NNClassifier will split the input feature data and send tensors to the corresponding inputs.
```python
from bigdl.dllib.utils.common import *
from bigdl.dllib.nnframes.nn_classifier import *
from bigdl.dllib.feature.common import *
from bigdl.dllib.keras.objectives import SparseCategoricalCrossEntropy
from bigdl.dllib.keras.optimizers import Adam
from bigdl.dllib.keras.layers import *
from bigdl.dllib.nncontext import *
from pyspark.ml.linalg import Vectors
from pyspark.ml.feature import VectorAssembler
from pyspark.sql import SparkSession
sparkConf = init_spark_conf().setAppName("testNNEstimator").setMaster('local[1]')
sc = init_nncontext(sparkConf)
spark = SparkSession\
.builder\
.getOrCreate()
df = spark.createDataFrame(
[(1, 35, 109.0, Vectors.dense([2.0, 5.0, 0.5, 0.5]), 0.0),
(2, 58, 2998.0, Vectors.dense([4.0, 10.0, 0.5, 0.5]), 1.0),
(3, 18, 123.0, Vectors.dense([3.0, 15.0, 0.5, 0.5]), 0.0)],
["user", "age", "income", "history", "label"])
assembler = VectorAssembler(
inputCols=["user", "age", "income", "history"],
outputCol="features")
df = assembler.transform(df)
x1 = Input(shape=(1,))
x2 = Input(shape=(2,))
x3 = Input(shape=(2, 2,))
user_embedding = Embedding(5, 10)(x1)
flatten = Flatten()(user_embedding)
dense1 = Dense(2)(x2)
gru = LSTM(4, input_shape=(2, 2))(x3)
merged = merge([flatten, dense1, gru], mode="concat")
zy = Dense(2)(merged)
zmodel = Model([x1, x2, x3], zy)
criterion = SparseCategoricalCrossEntropy()
classifier = NNEstimator(zmodel, criterion, [[1], [2], [2, 2]]) \
.setOptimMethod(Adam()) \
.setLearningRate(0.1)\
.setBatchSize(2) \
.setMaxEpoch(10)
nnClassifierModel = classifier.fit(df)
print(nnClassifierModel.getBatchSize())
res = nnClassifierModel.transform(df).collect()
```
---
### 2.2 NNModel
**Scala:**
```scala
val nnModel = NNModel(bigDLModel)
```
**Python:**
```python
nn_model = NNModel(bigDLModel)
```
`NNModel` extends Spark's ML
[Transformer](https://spark.apache.org/docs/2.1.1/ml-pipeline.html#transformers). User can invoke `fit` in `NNEstimator` to get a `NNModel`, or directly compose a `NNModel` from BigDLModel. It enables users to wrap a pre-trained BigDL Model into a NNModel, and use it as a transformer in your Spark ML pipeline to predict the results for `DataFrame (DataSet)`.
`NNModel` can be created with various parameters for different scenarios.
- `NNModel(model)`
Takes only model and use `SeqToTensor` as feature Preprocessing. `NNModel` will extract the data from feature column (only Scalar, Array[_] or Vector data type are supported) and convert each feature to 1-dimension Tensor. The tensors will be sent to model for inference.
- `NNModel(model, featureSize: Array[Int])`
Takes model and featureSize(Array of Int). `NNModel` will extract the data from feature column (only Scalar, Array[_] or Vector data type are supported) and convert each feature to Tensor according to the specified Tensor size. User can also set featureSize as Array[Array[Int]] for multi-inputs model.
- `NNModel(model, featurePreprocessing: Preprocessing[F, Tensor[T]])`
Takes model and featurePreprocessing. `NNModel` will extract the data from feature column and convert each feature to Tensor with the featurePreprocessing. This constructor provides more flexibility in supporting extra data types.
Meanwhile, for advanced use cases (e.g. model with multiple input tensor), `NNModel` supports: `setSamplePreprocessing(value: Preprocessing[Any, Sample[T]])`to directly compose Sample according to user-specified Preprocessing.
We can get model from `NNModel` by:
**Scala:**
```scala
val model = nnModel.getModel()
```
**Python:**
```python
model = nn_model.getModel()
```
---
### 2.3 NNClassifier
**Scala:**
```scala
val classifier = NNClassifier(model, criterion)
```
**Python:**
```python
classifier = NNClassifier(model, criterion)
```
`NNClassifier` is a specialized `NNEstimator` that simplifies the data format for classification tasks where the label space is discrete. It only supports label column of
DoubleType, and the fitted `NNClassifierModel` will have the prediction column of DoubleType.
* `model` BigDL module to be optimized in the fit() method
* `criterion` the criterion used to compute the loss and the gradient
`NNClassifier` can be created with various parameters for different scenarios.
- `NNClassifier(model, criterion)`
Takes only the model and criterion and uses `SeqToTensor` as the feature and label Preprocessing. `NNClassifier` will extract the data from the feature and label columns (only Scalar, Array[_] or Vector data types are supported) and convert each feature/label to a 1-dimension Tensor. The tensors will be combined into BigDL samples and sent to the model for training.
- `NNClassifier(model, criterion, featureSize: Array[Int])`
Takes model, criterion, featureSize(Array of Int). `NNClassifier` will extract the data from feature and label columns and convert each feature to Tensor according to the specified Tensor size. `ScalarToTensor` is used to convert the label column. User can also set featureSize as Array[Array[Int]] for multi-inputs model.
- `NNClassifier(model, criterion, featurePreprocessing: Preprocessing[F, Tensor[T]])`
Takes model, criterion and featurePreprocessing. `NNClassifier` will extract the data from feature and label columns and convert each feature to Tensor with the featurePreprocessing. This constructor provides more flexibility in supporting extra data types.
Meanwhile, for advanced use cases (e.g. model with multiple input tensor), `NNClassifier` supports `setSamplePreprocessing(value: Preprocessing[(Any, Option[Any]), Sample[T]])` to directly compose Sample with user-specified Preprocessing.
**Scala example:**
```scala
import com.intel.analytics.bigdl.dllib.nn._
import com.intel.analytics.bigdl.dllib.nnframes.NNClassifier
import com.intel.analytics.bigdl.dllib.tensor.TensorNumericMath.TensorNumeric.NumericFloat
val model = Sequential().add(Linear(2, 2))
val criterion = MSECriterion()
val estimator = NNClassifier(model, criterion)
.setLearningRate(0.2)
.setMaxEpoch(40)
val data = sc.parallelize(Seq(
(Array(0.0, 1.0), 1.0),
(Array(1.0, 0.0), 2.0),
(Array(0.0, 1.0), 1.0),
(Array(1.0, 0.0), 2.0)))
val df = sqlContext.createDataFrame(data).toDF("features", "label")
val dlModel = estimator.fit(df)
dlModel.transform(df).show(false)
```
**Python Example:**
```python
from bigdl.dllib.nn.layer import *
from bigdl.dllib.nn.criterion import *
from bigdl.dllib.utils.common import *
from bigdl.dllib.nnframes.nn_classifier import *
from pyspark.sql.types import *
#Logistic Regression with BigDL layers and NNClassifier
model = Sequential().add(Linear(2, 2)).add(LogSoftMax())
criterion = ClassNLLCriterion()
estimator = NNClassifier(model, criterion, [2]).setBatchSize(4).setMaxEpoch(10)
data = sc.parallelize([
((0.0, 1.0), 1.0),
((1.0, 0.0), 2.0),
((0.0, 1.0), 1.0),
((1.0, 0.0), 2.0)])
schema = StructType([
StructField("features", ArrayType(DoubleType(), False), False),
StructField("label", DoubleType(), False)])
df = spark.createDataFrame(data, schema)
dlModel = estimator.fit(df)
res = dlModel.transform(df).collect()
```
### 2.4 NNClassifierModel ##
**Scala:**
```scala
val nnClassifierModel = NNClassifierModel(model, featureSize)
```
**Python:**
```python
nn_classifier_model = NNClassifierModel(model)
```
NNClassifierModel is a specialized `NNModel` for classification tasks. Both label and prediction column will have the datatype of Double.
`NNClassifierModel` can be created with various parameters for different scenarios.
- `NNClassifierModel(model)`
Takes only model and use `SeqToTensor` as feature Preprocessing. `NNClassifierModel` will extract the data from feature column (only Scalar, Array[_] or Vector data type are supported) and convert each feature to 1-dimension Tensor. The tensors will be sent to model for inference.
- `NNClassifierModel(model, featureSize: Array[Int])`
Takes model and featureSize(Array of Int). `NNClassifierModel` will extract the data from feature column (only Scalar, Array[_] or Vector data type are supported) and convert each feature to Tensor according to the specified Tensor size. User can also set featureSize as Array[Array[Int]] for multi-inputs model.
- `NNClassifierModel(model, featurePreprocessing: Preprocessing[F, Tensor[T]])`
Takes model and featurePreprocessing. `NNClassifierModel` will extract the data from feature column and convert each feature to Tensor with the featurePreprocessing. This constructor provides more flexibility in supporting extra data types.
Meanwhile, for advanced use cases (e.g. model with multiple input tensor), `NNClassifierModel` supports: `setSamplePreprocessing(value: Preprocessing[Any, Sample[T]])`to directly compose Sample according to user-specified Preprocessing.
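As a minimal illustration (a sketch only, where `model` stands for a pre-trained BigDL classification model and `df` for a DataFrame with a "features" column):
```python
from bigdl.dllib.nnframes.nn_classifier import *

# wrap a pre-trained BigDL model as a Spark ML Transformer for classification inference
nn_classifier_model = NNClassifierModel(model)
prediction_df = nn_classifier_model.transform(df)
prediction_df.show()
```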
---
### 2.5 Hyperparameter Setting
Before training starts, you can set the optimization algorithm, batch size, number of epochs and learning rate to meet your goal; otherwise `NNEstimator`/`NNClassifier` will use the default values.
Continuing the code above, NNEstimator and NNClassifier can be configured in the same way.
**Scala:**
```scala
// for estimator
estimator.setBatchSize(4).setMaxEpoch(10).setLearningRate(0.01).setOptimMethod(new Adam())
//for classifier
classifier.setBatchSize(4).setMaxEpoch(10).setLearningRate(0.01).setOptimMethod(new Adam())
```
**Python:**
```python
# for estimator
estimator.setBatchSize(4).setMaxEpoch(10).setLearningRate(0.01).setOptimMethod(Adam())
# for classifier
classifier.setBatchSize(4).setMaxEpoch(10).setLearningRate(0.01).setOptimMethod(Adam())
```
### 2.6 Training
NNEstimator/NNClassifier support training with Spark's [DataFrame/DataSet](https://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-dataframes).
Suppose `df` is the training data; simply call the `fit` method and let BigDL DLLib train the model for you.
**Scala:**
```scala
//get a NNClassifierModel
val nnClassifierModel = classifier.fit(df)
```
**Python:**
```python
# get a NNClassifierModel
nnClassifierModel = classifier.fit(df)
```
Users may also set a validation DataFrame and the validation frequency through the `setValidation` method. A train summary and a validation summary can also be configured to log the training process for visualization in TensorBoard.
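A rough sketch of these setters in Python (the import paths and trigger/summary class names below are assumptions; check the API docs for the exact signatures):
```python
# assumed imports for the trigger, metric and summary helpers
from bigdl.dllib.optim.optimizer import EveryEpoch, Top1Accuracy, TrainSummary, ValidationSummary

# validate on val_df at the end of every epoch with batch size 4
estimator.setValidation(EveryEpoch(), val_df, [Top1Accuracy()], 4)
# write train/validation summaries that can be visualized in TensorBoard
estimator.setTrainSummary(TrainSummary("/tmp/bigdl_summaries", "demo"))
estimator.setValidationSummary(ValidationSummary("/tmp/bigdl_summaries", "demo"))
```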
### 2.7 Prediction
Since `NNModel`/`NNClassifierModel` inherits from Spark's `Transformer` abstract class, simply call `transform` method on `NNModel`/`NNClassifierModel` to make prediction.
**Scala:**
```scala
nnModel.transform(df).show(false)
```
**Python:**
```python
nnModel.transform(df).show(truncate=False)
```
For the complete examples of NNFrames, please refer to:
[Scala examples](https://github.com/intel-analytics/BigDL/tree/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/nnframes)
[Python examples](https://github.com/intel-analytics/BigDL/tree/branch-2.0/python/dllib/examples/nnframes)
### 2.8 NNImageReader
`NNImageReader` is the primary DataFrame-based image loading interface, defining API to read images into DataFrame.
Scala:
```scala
val imageDF = NNImageReader.readImages(imageDirectory, sc)
```
Python:
```python
image_frame = NNImageReader.readImages(image_path, self.sc)
```
The output DataFrame contains a single column named "image". The schema of the "image" column can be accessed from `com.intel.analytics.bigdl.dllib.nnframes.DLImageSchema.byteSchema`. Each record in the "image" column represents one image record, in the format of Row(origin, height, width, num of channels, mode, data), where origin contains the URI for the image file, and `data` holds the original file bytes for the image file. `mode` represents the OpenCV-compatible type: CV_8UC3, CV_8UC1 in most cases.
```scala
val byteSchema = StructType(
StructField("origin", StringType, true) ::
StructField("height", IntegerType, false) ::
StructField("width", IntegerType, false) ::
StructField("nChannels", IntegerType, false) ::
// OpenCV-compatible type: CV_8UC3, CV_32FC3 in most cases
StructField("mode", IntegerType, false) ::
// Bytes in OpenCV-compatible order: row-wise BGR in most cases
StructField("data", BinaryType, false) :: Nil)
```
After loading the image, user can compose the preprocess steps with the `Preprocessing` defined in `com.intel.analytics.bigdl.dllib.feature.image`.
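Putting this together in Python, a hedged sketch of loading images and composing the preprocessing (import paths are assumptions; the `transforms` helpers are the same ones used in the getting-started guide below):
```python
# import paths below are assumptions for illustration; adjust to your DLlib version
from bigdl.dllib.nnframes import NNImageReader
from bigdl.dllib.feature.image import *

# load images into a DataFrame with a single "image" column
image_df = NNImageReader.readImages("cats_dogs/", sc)
# compose preprocessing steps to apply to the images before training or inference
transformers = transforms.Compose([ImageResize(50, 50), ImageMirror()])
```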
@ -1,40 +0,0 @@
## Visualizing training with TensorBoard
With the summary info generated, we can then use [TensorBoard](https://pypi.python.org/pypi/tensorboard) to visualize the behaviors of the BigDL program.
* **Installing TensorBoard**
Prerequisites:
1. Python version: 2.7, 3.4, 3.5, or 3.6
2. Pip version >= 9.0.1
To install TensorBoard using Python 2, you may run the command:
```bash
pip install tensorboard==1.0.0a4
```
To install TensorBoard using Python 3, you may run the command:
```bash
pip3 install tensorboard==1.0.0a4
```
Please refer to [this page](https://github.com/intel-analytics/BigDL/tree/master/spark/dl/src/main/scala/com/intel/analytics/bigdl/visualization#known-issues) for possible issues when installing TensorBoard.
* **Launching TensorBoard**
You can launch TensorBoard using the command below:
```bash
tensorboard --logdir=/tmp/bigdl_summaries
```
After that, navigate to the TensorBoard dashboard using a browser. You can find the URL in the console output after TensorBoard is successfully launched; by default the URL is http://your_node:6006
* **Visualizations in TensorBoard**
Within the TensorBoard dashboard, you will be able to read the visualizations of each run, including the “Loss” and “Throughput” curves under the SCALARS tab (as illustrated below):
![](../Image/tensorboard-scalar.png)
And “weights”, “bias”, “gradientWeights” and “gradientBias” under the DISTRIBUTIONS and HISTOGRAMS tabs (as illustrated below):
![](../Image/tensorboard-histo1.png)
![](../Image/tensorboard-histo2.png)
---
@ -1,70 +0,0 @@
# DLlib Quickstarts
---
![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/branch-2.0/python/dllib/colab-notebook/dllib_keras_api.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/dllib/colab-notebook/dllib_keras_api.ipynb)
---
**In this guide we will demonstrate how to use _DLlib keras style api_ and _DLlib NNClassifier_ for classification.**
### **Step 0: Prepare Environment**
We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../Overview/chronos.html#install) for more details.
```bash
conda create -n my_env python=3.7 # "my_env" is conda environment name, you can use any name you like.
conda activate my_env
pip install bigdl-dllib
```
### Step 1: Data loading and processing using Spark DataFrame
```python
df = spark.read.csv(path, sep=',', inferSchema=True).toDF("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age", "class")
```
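The snippet above assumes a Spark entry point and a local copy of the dataset; a minimal sketch of that setup (file name as used in the DLlib examples):
```python
from bigdl.dllib.nncontext import *
from pyspark.sql import SQLContext

# initialize dllib and create a Spark SQL entry point
sc = init_nncontext()
spark = SQLContext(sc)
path = "pima-indians-diabetes.data.csv"
```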
### Step 2: Process the data and split it into train and test sets
We process the data using the Spark API and split it into train and test sets.
```python
vecAssembler = VectorAssembler(outputCol="features")
vecAssembler.setInputCols(["num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age"])
train_df = vecAssembler.transform(df)
changedTypedf = train_df.withColumn("label", train_df["class"].cast(DoubleType())+lit(1))\
.select("features", "label")
(trainingDF, validationDF) = changedTypedf.randomSplit([0.9, 0.1])
```
### Step 3: Define the classification model using the DLlib Keras-style API
```python
x1 = Input(shape=(8,))
dense1 = Dense(12, activation='relu')(x1)
dense2 = Dense(8, activation='relu')(dense1)
dense3 = Dense(2)(dense2)
model = Model(x1, dense3)
```
### Step 4: Create NNClassifier and Fit NNClassifier
```python
classifier = NNClassifier(model, CrossEntropyCriterion(), [8]) \
.setOptimMethod(Adam()) \
.setBatchSize(32) \
.setMaxEpoch(150)
nnModel = classifier.fit(trainingDF)
```
### Step 5: Evaluate the trained model
```python
predictionDF = nnModel.transform(validationDF).cache()
predictionDF.sample(False, 0.1).show()
evaluator = MulticlassClassificationEvaluator(
labelCol="label", predictionCol="prediction", metricName="accuracy")
accuracy = evaluator.evaluate(predictionDF)
```
@ -1,9 +0,0 @@
# DLlib Tutorial
- [**Python Quickstart Notebook**](./python-getting-started.html)
> ![](../../../../image/colab_logo_32px.png)[Run in Google Colab](https://colab.research.google.com/github/intel-analytics/BigDL/blob/branch-2.0/python/dllib/colab-notebook/dllib_keras_api.ipynb) &nbsp;![](../../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/dllib/colab-notebook/dllib_keras_api.ipynb)
In this guide we will demonstrate how to use _DLlib keras style api_ and _DLlib NNClassifier_ for classification.
@ -1,218 +0,0 @@
# DLLib Python Getting Start Guide
## 1. Code initialization
```nncontext``` is the main entry for provisioning the dllib program on the underlying cluster (such as K8s or Hadoop cluster), or just on a single laptop.
It is recommended to initialize `nncontext` at the beginning of your program:
```
from bigdl.dllib.nncontext import *
sc = init_nncontext()
```
For more information about ```nncontext```, please refer to [nncontext](../Overview/dllib.md#initialize-nn-context)
## 2. Distributed Data Loading
#### Using Spark Dataframe APIs
DLlib supports Spark Dataframes as the input to the distributed training, and as
the input/output of the distributed inference. Consequently, the user can easily
process large-scale datasets using Apache Spark, and directly apply AI models on
the distributed (and possibly in-memory) Dataframes without data conversion or serialization.
We create a Spark session so we can use the Spark API to load and process the data:
```
spark = SQLContext(sc)
```
1. We can use the Spark API to load the data into a Spark DataFrame, e.g. read a csv file into a Spark DataFrame:
```
path = "pima-indians-diabetes.data.csv"
spark.read.csv(path)
```
If the feature column for the model is a Spark ML Vector, please assemble the related columns into a Vector and pass it to the model, e.g.:
```
from pyspark.ml.feature import VectorAssembler
from pyspark.sql.functions import col, lit
from pyspark.sql.types import DoubleType
vecAssembler = VectorAssembler(outputCol="features")
vecAssembler.setInputCols(["num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age"])
assemble_df = vecAssembler.transform(df)
df = assemble_df.withColumn("label", col("class").cast(DoubleType()) + lit(1))
```
2. If the training data is images, we can use the DLLib API to load the images into a Spark DataFrame, e.g.:
```
imgPath = "cats_dogs/"
imageDF = NNImageReader.readImages(imgPath, sc)
```
It will load the images and generate feature tensors automatically. We also need to generate the labels ourselves, e.g.:
```
labelDF = imageDF.withColumn("name", getName(col("image"))) \
.withColumn("label", getLabel(col('name')))
```
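The `getName` and `getLabel` UDFs above are not defined in this excerpt; one possible sketch (hypothetical helpers, with labels mirroring the cats/dogs convention used in the Scala guide) is:
```python
# hypothetical UDFs for illustration: derive the file name from the image "origin" URI,
# then map cat images to label 1.0 and everything else (dogs) to 2.0
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType, DoubleType

getName = udf(lambda image: image[0].split("/")[-1], StringType())   # image[0] is the origin field
getLabel = udf(lambda name: 1.0 if "cat" in name else 2.0, DoubleType())
```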
Then split the Spark DataFrame into a training part and a validation part:
```
(trainingDF, validationDF) = labelDF.randomSplit([0.9, 0.1])
```
## 3. Model Definition
#### Using Keras-like APIs
To define a model, you can use the [Keras Style API](../Overview/keras-api.md).
```
x1 = Input(shape=[8])
dense1 = Dense(12, activation="relu")(x1)
dense2 = Dense(8, activation="relu")(dense1)
dense3 = Dense(2)(dense2)
dmodel = Model(input=x1, output=dense3)
```
After creating the model, you will have to decide which loss function to use in training.
Now you can use `compile` function of the model to set the loss function, optimization method.
```
dmodel.compile(optimizer = "adam", loss = "sparse_categorical_crossentropy")
```
Now the model is built and ready to train.
## 4. Distributed Model Training
Now you can call `fit` to begin the training; please set the label columns. Model evaluation can be performed periodically during training.
1. If the dataframe is generated using Spark APIs, you also need to set the feature columns, e.g.:
```
model.fit(df, feature_cols=["features"], label_cols=["label"], batch_size=4, nb_epoch=1)
```
Note: Above model accepts single input(column `features`) and single output(column `label`).
If your model accepts multiple inputs(eg. column `f1`, `f2`, `f3`), please set the features as below:
```
model.fit(df, feature_cols=["f1", "f2"], label_cols=["label"], batch_size=4, nb_epoch=1)
```
Similarly, if the model accepts multiple outputs(eg. column `label1`, `label2`), please set the label columns as below:
```
model.fit(df, feature_cols=["features"], label_cols=["l1", "l2"], batch_size=4, nb_epoch=1)
```
2. If the dataframe is generated using the DLLib `NNImageReader`, we don't need to set `feature_cols`; we can set `transform` to configure how the images are processed before training, e.g.:
```
from bigdl.dllib.feature.image import transforms
transformers = transforms.Compose([ImageResize(50, 50), ImageMirror()])
model.fit(image_df, label_cols=["label"], batch_size=1, nb_epoch=1, transform=transformers)
```
For more details about how to use DLLib keras api to train image data, you may want to refer [ImageClassification](https://github.com/intel-analytics/BigDL/tree/main/python/dllib/examples/keras/image_classification.py)
## 5. Model saving and loading
When training is finished, you may need to save the final model for later use.
BigDL allows you to save your BigDL model on local filesystem, HDFS, or Amazon s3.
- **save**
```
modelPath = "/tmp/demo/keras.model"
dmodel.saveModel(modelPath)
```
- **load**
```
loadModel = Model.loadModel(modelPath)
preDF = loadModel.predict(df, feature_cols=["features"], prediction_col="predict")
```
You may want to refer [Save/Load](../Overview/keras-api.html#save)
## 6. Distributed evaluation and inference
After training finishes, you can then use the trained model for prediction or evaluation.
- **inference**
1. For dataframe generated by Spark API, please set `feature_cols` and `prediction_col`
```
dmodel.predict(df, feature_cols=["features"], prediction_col="predict")
```
2. For dataframe generated by `NNImageReader`, please set `prediction_col` and you can set `transform` if needed
```
model.predict(df, prediction_col="predict", transform=transformers)
```
- **evaluation**
Similarly, for a dataframe generated by the Spark API, the code is as below:
```
dmodel.evaluate(df, batch_size=4, feature_cols=["features"], label_cols=["label"])
```
For dataframe generated by `NNImageReader`:
```
model.evaluate(image_df, batch_size=1, label_cols=["label"], transform=transformers)
```
## 7. Checkpointing and resuming training
You can configure periodically taking snapshots of the model.
```
cpPath = "/tmp/demo/cp"
dmodel.set_checkpoint(cpPath)
```
You can also set ```over_write``` to ```True``` to enable overwriting any existing snapshot files.
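For example (assuming `over_write` is accepted as a keyword argument of `set_checkpoint`):
```python
# overwrite existing snapshot files in the checkpoint directory
dmodel.set_checkpoint(cpPath, over_write=True)
```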
After training stops, you can resume from any saved point. Choose one of the model snapshots to resume (saved in the checkpoint path; see Checkpointing for details). Use `Model.loadModel` to load the model snapshot into a model object.
```
loadModel = Model.loadModel(path)
```
## 8. Monitor your training
- **Tensorboard**
BigDL provides a convenient way to monitor/visualize your training progress. It writes the statistics collected during training/validation. Saved summary can be viewed via TensorBoard.
In order to take effect, it needs to be called before fit.
```
dmodel.set_tensorboard("./", "dllib_demo")
```
For more details, please refer to [visualization](../Overview/visualization.md).
## 9. Transfer learning and finetuning
- **freeze and trainable**
BigDL DLLib supports excluding some layers of the model from training.
```
dmodel.freeze(layer_names)
```
Layers that match the given names will be frozen. If a layer is frozen, its parameters (weight/bias, if they exist) are not updated during training.
BigDL DLLib also supports unFreeze operations. The parameters of the layers that match the given names will be trained (updated) during training:
```
dmodel.unFreeze(layer_names)
```
For more information, you may refer to [freeze](../../PythonAPI/DLlib/freeze.md).
## 10. Hyperparameter tuning
- **optimizer**
DLLib supports a list of optimization methods.
For more details, please refer to [optimization](../../PythonAPI/DLlib/optim-Methods.md).
- **learning rate scheduler**
DLLib supports a list of learning rate schedulers.
For more details, please refer to [lr_scheduler](../../PythonAPI/DLlib/learningrate-Scheduler.md).
- **batch size**
DLLib supports setting the batch size for training and prediction. You can adjust the batch size to tune the model's accuracy.
- **regularizer**
DLLib supports a list of regularizers.
For more details, please refer to [regularizer](../../PythonAPI/DLlib/regularizers.md).
- **clipping**
DLLib supports gradient clipping operations.
For more details, please refer to [gradient_clip](../../PythonAPI/DLlib/clipping.md).
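A small sketch tying a few of these knobs to the APIs already shown in this guide (optimizer object, loss and batch size); schedulers, regularizers and clipping have their own APIs documented in the links above:
```python
from bigdl.dllib.keras.optimizers import Adam

# pick the optimization method and loss, then tune the batch size in fit()
dmodel.compile(optimizer=Adam(), loss="sparse_categorical_crossentropy")
dmodel.fit(df, feature_cols=["features"], label_cols=["label"], batch_size=8, nb_epoch=2)
```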
## 11. Running program
```
python your_app_code.py
```
@ -1,303 +0,0 @@
# DLLib Scala Getting Start Guide
## 1. Creating dev environment
#### Scala project (maven & sbt)
- **Maven**
To use BigDL DLLib to build your own deep learning application, you can use Maven to create your project and add bigdl-dllib as a dependency. Please add the code below to your pom.xml to add BigDL DLLib as your dependency:
```
<dependency>
<groupId>com.intel.analytics.bigdl</groupId>
<artifactId>bigdl-dllib-spark_2.4.6</artifactId>
<version>0.14.0</version>
</dependency>
```
- **SBT**
```
libraryDependencies += "com.intel.analytics.bigdl" % "bigdl-dllib-spark_2.4.6" % "0.14.0"
```
For more information about how to add BigDL dependency, please refer [scala docs](../../UserGuide/scala.md#build-a-scala-project)
#### IDE (IntelliJ)
Open up IntelliJ and click File => Open.
Navigate to your project. If you have added BigDL DLLib as a dependency in your pom.xml,
the IDE will automatically download it from Maven and you will be able to run your application.
For more details about how to setup IDE for BigDL project, please refer [IDE Setup Guide](../../UserGuide/develop.html#id2)
## 2. Code initialization
```NNContext``` is the main entry for provisioning the dllib program on the underlying cluster (such as K8s or Hadoop cluster), or just on a single laptop.
It is recommended to initialize `NNContext` at the beginning of your program:
```
import com.intel.analytics.bigdl.dllib.NNContext
import com.intel.analytics.bigdl.dllib.keras.Model
import com.intel.analytics.bigdl.dllib.keras.models.Models
import com.intel.analytics.bigdl.dllib.keras.optimizers.Adam
import com.intel.analytics.bigdl.dllib.nn.ClassNLLCriterion
import com.intel.analytics.bigdl.dllib.utils.Shape
import com.intel.analytics.bigdl.dllib.keras.layers._
import com.intel.analytics.bigdl.numeric.NumericFloat
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.DoubleType
val sc = NNContext.initNNContext("dllib_demo")
```
For more information about ```NNContext```, please refer to [NNContext](../Overview/dllib.md#initialize-nn-context)
## 3. Distributed Data Loading
#### Using Spark Dataframe APIs
DLlib supports Spark Dataframes as the input to the distributed training, and as
the input/output of the distributed inference. Consequently, the user can easily
process large-scale datasets using Apache Spark, and directly apply AI models on
the distributed (and possibly in-memory) Dataframes without data conversion or serialization.
We create a Spark session so we can use the Spark API to load and process the data:
```
val spark = new SQLContext(sc)
```
1. We can use the Spark API to load the data into a Spark DataFrame, e.g. read a csv file into a Spark DataFrame:
```
val path = "pima-indians-diabetes.data.csv"
val df = spark.read.options(Map("inferSchema"->"true","delimiter"->",")).csv(path)
.toDF("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age", "class")
```
If the feature column for the model is a Spark ML Vector, please assemble the related columns into a Vector and pass it to the model, e.g.:
```
val assembler = new VectorAssembler()
.setInputCols(Array("num_times_pregrant", "plasma_glucose", "blood_pressure", "skin_fold_thickness", "2-hour_insulin", "body_mass_index", "diabetes_pedigree_function", "age"))
.setOutputCol("features")
val assembleredDF = assembler.transform(df)
val df2 = assembleredDF.withColumn("label",col("class").cast(DoubleType) + lit(1))
```
2. If the training data is images, we can use the DLLib API to load the images into a Spark DataFrame, e.g.:
```
val createLabel = udf { row: Row =>
if (new Path(row.getString(0)).getName.contains("cat")) 1 else 2
}
val imagePath = "cats_dogs/"
val imgDF = NNImageReader.readImages(imagePath, sc)
```
It will load the images and generate feature tensors automatically. We also need to generate the labels ourselves, e.g.:
```
val df = imgDF.withColumn("label", createLabel(col("image")))
```
Then split the Spark DataFrame into a training part and a validation part:
```
val Array(trainDF, valDF) = df.randomSplit(Array(0.8, 0.2))
```
## 4. Model Definition
#### Using Keras-like APIs
To define a model, you can use the [Keras Style API](../Overview/keras-api.md).
```
val x1 = Input(Shape(8))
val dense1 = Dense(12, activation="relu").inputs(x1)
val dense2 = Dense(8, activation="relu").inputs(dense1)
val dense3 = Dense(2).inputs(dense2)
val dmodel = Model(x1, dense3)
```
After creating the model, you will have to decide which loss function to use in training.
Now you can use `compile` function of the model to set the loss function, optimization method.
```
dmodel.compile(optimizer = new Adam(), loss = ClassNLLCriterion())
```
Now the model is built and ready to train.
## 5. Distributed Model Training
Now you can call `fit` to begin the training; please set the label columns. Model evaluation can be performed periodically during training.
1. If the dataframe is generated using Spark APIs, you also need to set the feature columns, e.g.:
```
model.fit(x=trainDF, batchSize=4, nbEpoch = 2,
featureCols = Array("feature1"), labelCols = Array("label"), valX=valDF)
```
Note: Above model accepts single input(column `feature1`) and single output(column `label`).
If your model accepts multiple inputs(eg. column `f1`, `f2`, `f3`), please set the features as below:
```
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
featureCols = Array("f1", "f2", "f3"), labelCols = Array("label"))
```
Similarly, if the model accepts multiple outputs(eg. column `label1`, `label2`), please set the label columns as below:
```
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
featureCols = Array("f1", "f2", "f3"), labelCols = Array("label1", "label2"))
```
2. If the dataframe is generated using the DLLib `NNImageReader`, we don't need to set `featureCols`; we can set `transform` to configure how the images are processed before training, e.g.:
```
val transformers = transforms.Compose(Array(ImageResize(50, 50),
ImageMirror()))
model.fit(x=dataframe, batchSize=4, nbEpoch = 2,
labelCols = Array("label"), transform = transformers)
```
For more details about how to use DLLib keras api to train image data, you may want to refer [ImageClassification](https://github.com/intel-analytics/BigDL/blob/main/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/keras/ImageClassification.scala)
## 6. Model saving and loading
When training is finished, you may need to save the final model for later use.
BigDL allows you to save your BigDL model on local filesystem, HDFS, or Amazon s3.
- **save**
```
val modelPath = "/tmp/demo/keras.model"
dmodel.saveModel(modelPath)
```
- **load**
```
val loadModel = Models.loadModel(modelPath)
val preDF2 = loadModel.predict(valDF, featureCols = Array("features"), predictionCol = "predict")
```
You may want to refer [Save/Load](../Overview/keras-api.html#save)
## 7. Distributed evaluation and inference
After training finishes, you can then use the trained model for prediction or evaluation.
- **inference**
1. For dataframe generated by Spark API, please set `featureCols`
```
dmodel.predict(trainDF, featureCols = Array("features"), predictionCol = "predict")
```
2. For dataframe generated by `NNImageReader`, no need to set `featureCols` and you can set `transform` if needed
```
model.predict(imgDF, predictionCol = "predict", transform = transformers)
```
- **evaluation**
Similarly, for a dataframe generated by the Spark API, the code is as below:
```
dmodel.evaluate(trainDF, batchSize = 4, featureCols = Array("features"),
labelCols = Array("label"))
```
For dataframe generated by `NNImageReader`:
```
model.evaluate(imgDF, batchSize = 1, labelCols = Array("label"), transform = transformers)
```
## 8. Checkpointing and resuming training
You can configure periodically taking snapshots of the model.
```
val cpPath = "/tmp/demo/cp"
dmodel.setCheckpoint(cpPath, overWrite=false)
```
You can also set ```overWrite``` to ```true``` to enable overwriting any existing snapshot files
After training stops, you can resume from any saved point. Choose one of the model snapshots to resume (saved in the checkpoint path; see Checkpointing for details). Use `Models.loadModel` to load the model snapshot into a model object.
```
val loadModel = Models.loadModel(path)
```
## 9. Monitor your training
- **Tensorboard**
BigDL provides a convenient way to monitor/visualize your training progress. It writes the statistics collected during training/validation. Saved summary can be viewed via TensorBoard.
In order to take effect, it needs to be called before fit.
```
dmodel.setTensorBoard("./", "dllib_demo")
```
For more details, please refer to [visualization](../Overview/visualization.md).
## 10. Transfer learning and finetuning
- **freeze and trainable**
BigDL DLLib supports excluding some layers of the model from training.
```
dmodel.freeze(layer_names)
```
Layers that match the given names will be frozen. If a layer is frozen, its parameters (weight/bias, if they exist) are not updated during training.
BigDL DLLib also supports unFreeze operations. The parameters of the layers that match the given names will be trained (updated) during training:
```
dmodel.unFreeze(layer_names)
```
For more information, you may refer to [freeze](../../PythonAPI/DLlib/freeze.md).
## 11. Hyperparameter tuning
- **optimizer**
DLLib supports a list of optimization methods.
For more details, please refer to [optimization](../../PythonAPI/DLlib/optim-Methods.md).
- **learning rate scheduler**
DLLib supports a list of learning rate schedulers.
For more details, please refer to [lr_scheduler](../../PythonAPI/DLlib/learningrate-Scheduler.md).
- **batch size**
DLLib supports setting the batch size for training and prediction. You can adjust the batch size to tune the model's accuracy.
- **regularizer**
DLLib supports a list of regularizers.
For more details, please refer to [regularizer](../../PythonAPI/DLlib/regularizers.md).
- **clipping**
DLLib supports gradient clipping operations.
For more details, please refer to [gradient_clip](../../PythonAPI/DLlib/clipping.md).
## 12. Running program
You can run a bigdl-dllib program as a standard Spark program (running on either a local machine or a distributed cluster) as follows:
```
# Spark local mode
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
--master local[2] \
--class class_name \
jar_path
# Spark standalone mode
## ${SPARK_HOME}/sbin/start-master.sh
## check master URL from http://localhost:8080
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
--master spark://... \
--executor-cores cores_per_executor \
--total-executor-cores total_cores_for_the_job \
--class class_name \
jar_path
# Spark yarn client mode
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
--master yarn \
--deploy-mode client \
--executor-cores cores_per_executor \
--num-executors executors_number \
--class class_name \
jar_path
# Spark yarn cluster mode
${BIGDL_HOME}/bin/spark-submit-with-dllib.sh \
--master yarn \
--deploy-mode cluster \
--executor-cores cores_per_executor \
--num-executors executors_number \
--class class_name
jar_path
```
For more details about how to run a BigDL Scala application, please refer to the [Scala UserGuide](../../UserGuide/scala.md).
@ -1,62 +0,0 @@
BigDL-DLlib
=========================
**BigDL-DLlib** (or **DLlib** for short) is a distributed deep learning library for Apache Spark; with DLlib, users can write their deep learning applications as standard Spark programs (using either Scala or Python APIs).
-------
.. grid:: 1 2 2 2
:gutter: 2
.. grid-item-card::
**Get Started**
^^^
Documents in these sections help you get started quickly with DLLib.
+++
:bdg-link:`DLlib in 5 minutes <./Overview/dllib.html>` |
:bdg-link:`Installation <./Overview/install.html>`
.. grid-item-card::
**Key Features Guide**
^^^
Each guide in this section provides you with in-depth information, concepts and knowledge about DLLib key features.
+++
:bdg-link:`Keras-Like API <./Overview/keras-api.html>` |
:bdg-link:`Spark ML Pipeline <./Overview/nnframes.html>`
.. grid-item-card::
**Examples**
^^^
DLLib Examples and Tutorials.
+++
:bdg-link:`Tutorials <./QuickStart/index.html>`
.. grid-item-card::
**API Document**
^^^
API Document provides detailed description of DLLib APIs.
+++
:bdg-link:`API Document <../PythonAPI/DLlib/index.html>`
.. toctree::
:hidden:
BigDL-DLlib Document <self>
@ -1,70 +0,0 @@
### Use Cases
- **Train a DeepFM model using recsys data**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/deep_fm)
---------------------------
- **Run DeepRec with BigDL**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/deeprec)
---------------------------
- **Train DIEN using the Amazon Book Reviews dataset**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/dien)
---------------------------
- **Preprocess the Criteo dataset for DLRM Model**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/dlrm)
---------------------------
- **Train a LightGBM model using the Twitter dataset**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/lightGBM)
---------------------------
- **Running Friesian listwise example**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/listwise_ranking)
---------------------------
- **Multi-task Recommendation with BigDL**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/multi_task)
---------------------------
- **Train an NCF model on MovieLens**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/ncf)
---------------------------
- **Offline Recall with Faiss on Spark**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/recall)
---------------------------
- **Recommend items using Friesian-Serving Framework**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/serving)
---------------------------
- **Train a two tower model using recsys data**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/two_tower)
---------------------------
- **Preprocess the Criteo dataset for WideAndDeep Model**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/wnd)
---------------------------
- **Train an XGBoost model using Twitter dataset**
>![](../../../image/GitHub-Mark-32px.png)[View source on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/friesian/example/xgb)
@ -1,66 +0,0 @@
BigDL-Friesian
=========================
BigDL Friesian is an application framework for building optimized large-scale recommender solutions. The recommending workflows built on top of Friesian can seamlessly scale out to distributed big data clusters in the production environment.
Friesian provides end-to-end support for three typical stages in a modern recommendation system:
- Offline stage: distributed feature engineering and model training.
- Nearline stage: Feature and model updates.
- Online stage: Recall and ranking.
-------
.. grid:: 1 2 2 2
:gutter: 2
.. grid-item-card::
**Get Started**
^^^
Documents in these sections help you get started quickly with Friesian.
+++
:bdg-link:`Introduction <./intro.html>`
.. grid-item-card::
**Key Features Guide**
^^^
Each guide in this section provides you with in-depth information, concepts and knowledge about Friesian key features.
+++
:bdg-link:`Serving <./serving.html>`
.. grid-item-card::
**Use Cases**
^^^
Use Cases and Examples.
+++
:bdg-link:`Use Cases <./examples.html>`
.. grid-item-card::
**API Document**
^^^
API Document provides detailed descriptions of Friesian APIs.
+++
:bdg-link:`API Document <../PythonAPI/Friesian/index.html>`
.. toctree::
:hidden:
BigDL-Friesian Document <self>

View file

@ -1,17 +0,0 @@
Friesian Introduction
==========================
BigDL Friesian is an application framework for building optimized large-scale recommender solutions. The recommending workflows built on top of Friesian can seamlessly scale out to distributed big data clusters in the production environment.
Friesian provides end-to-end support for three typical stages in a modern recommendation system:
- Offline stage: distributed feature engineering and model training.
- Nearline stage: Feature and model updates.
- Online stage: Recall and ranking.
The overall architecture of Friesian is shown in the following diagram:
.. image:: ../../../image/friesian_architecture.png

View file

@ -1,600 +0,0 @@
## Serving Recommendation Framework
### Architecture of the serving pipelines
The diagram below shows the components of the Friesian serving system, which typically consists of three stages:
- Offline: Preprocess the data to obtain user/item DNN features and user/item embedding features, then use the embedding features and the embedding model to generate embedding vectors.
- Nearline: Retrieve user/item profiles and keep them in the Key-Value store. Retrieve item embedding vectors and build the Faiss index. Update the profiles from time to time.
- Online: Trigger the recommendation process whenever a user request arrives. The recall service generates candidates from millions of items based on embeddings, and the deep learning model ranks the candidates to produce the final recommendation results.
![](../../../image/friesian_architecture.png)
### Services and APIs
The Friesian serving system consists of 4 types of services:
- Ranking Service: performs model inference and returns the results.
- `rpc doPredict(Content) returns (Prediction) {}`
- Input: The `encodedStr` is a Base64 string encoded from a BigDL [Activity](https://github.com/intel-analytics/BigDL/blob/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/nn/abstractnn/Activity.scala) serialized byte array.
```proto
message Content {
  string encodedStr = 1;
}
```
- Output: The `predictStr` is a Base64 string encoded from a BigDL [Activity](https://github.com/intel-analytics/BigDL/blob/branch-2.0/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/nn/abstractnn/Activity.scala) (the inference result) serialized byte array.
```proto
message Prediction {
  string predictStr = 1;
}
```
- Feature Service: searches user embeddings, user features or item features in Redis, and returns the features.
- `rpc getUserFeatures(IDs) returns (Features) {}` and `rpc getItemFeatures(IDs) returns (Features) {}`
- Input: The user/item id list for searching.
```proto
message IDs {
  repeated int32 ID = 1;
}
```
- Output: `colNames` is a string list of the column names. `b64Feature` is a list of Base64 strings; each string is encoded from a Java-serialized array of objects. `ID` is a list of ids corresponding to `b64Feature`.
```proto
message Features {
  repeated string colNames = 1;
  repeated string b64Feature = 2;
  repeated int32 ID = 3;
}
```
- Recall Service: searches item candidates in the built Faiss index and returns the candidate id list.
- `rpc searchCandidates(Query) returns (Candidates) {}`
- Input: `userID` is the id of the user for whom to search similar item candidates. `k` is the number of candidates.
```proto
message Query {
  int32 userID = 1;
  int32 k = 2;
}
```
- Output: `candidate` is the list of ids of item candidates.
```proto
message Candidates {
  repeated int32 candidate = 1;
}
```
- Recommender Service: gets candidates from the recall service, calls the feature service to get the user's and the candidate items' features, then sorts the inference results from the ranking service and returns the top `recommendNum` items.
- `rpc getRecommendIDs(RecommendRequest) returns (RecommendIDProbs) {}`
- Input: `ID` is a list of user ids to recommend for. `recommendNum` is the number of items to recommend. `candidateNum` is the number of generated candidates to run inference on in the ranking service.
```proto
message RecommendRequest {
  int32 recommendNum = 1;
  int32 candidateNum = 2;
  repeated int32 ID = 3;
}
```
- Output: `IDProbList` is a list of results corresponding to the user `ID` list in the input. Each `IDProbs` consists of `ID` and `prob`: `ID` is the list of item ids, and `prob` contains the corresponding probabilities.
```proto
message RecommendIDProbs {
  repeated IDProbs IDProbList = 1;
}
message IDProbs {
  repeated int32 ID = 1;
  repeated float prob = 2;
}
```
### Quick Start
You can run the Friesian Serving Recommendation Framework using the official Docker images.
Follow the steps below to run the WnD (Wide & Deep) demo.
1. Pull the docker image from Docker Hub
```bash
docker pull intelanalytics/friesian-grpc:0.0.2
```
2. Run and enter the docker container
```bash
docker run -itd --name friesian --net=host intelanalytics/friesian-grpc:0.0.2
docker exec -it friesian bash
```
3. Add vec_feature_user_prediction.parquet, vec_feature_item_prediction.parquet, the WnD model,
wnd_item.parquet and wnd_user.parquet (you can check [the schema of the parquet files](#schema-of-the-parquet-files))
4. Start ranking service
```bash
export OMP_NUM_THREADS=1
java -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.ranking.RankingServer -c config_ranking.yaml > logs/inf.log 2>&1 &
```
5. Start feature service for recommender service
```bash
./redis-5.0.5/src/redis-server &
java -Dspark.master=local[*] -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.feature.FeatureServer -c config_feature.yaml > logs/feature.log 2>&1 &
```
6. Start feature service for recall service
```bash
java -Dspark.master=local[*] -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.feature.FeatureServer -c config_feature_vec.yaml > logs/fea_recall.log 2>&1 &
```
7. Start recall service
```bash
java -Dspark.master=local[*] -Dspark.driver.maxResultSize=2G -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recall.RecallServer -c config_recall.yaml > logs/vec.log 2>&1 &
```
8. Start recommender service
```bash
java -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recommender.RecommenderServer -c config_recommender.yaml > logs/rec.log 2>&1 &
```
9. Check if the services are running
```bash
ps aux|grep friesian
```
You should see 5 processes starting with 'java'
10. Run client to test
```bash
java -Dspark.master=local[*] -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recommender.RecommenderMultiThreadClient -target localhost:8980 -dataDir wnd_user.parquet -k 50 -clientNum 4 -testNum 2
```
11. Close services
```bash
ps aux|grep friesian (find the service pid)
kill xxx (pid of the service which should be closed)
```
### Schema of the parquet files
#### The schema of the user and item embedding files
The embedding parquet files should contain at least 2 columns: the id column and the prediction column.
The id column should be IntegerType, and the column name should be specified in the config files.
The prediction column should be DenseVector type; you can convert your existing embedding vectors using PySpark:
```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf, col
from pyspark.ml.linalg import VectorUDT, DenseVector

spark = SparkSession.builder \
    .master("local[*]") \
    .config("spark.driver.memory", "2g") \
    .getOrCreate()

df = spark.read.parquet("data_path")

def trans_densevector(data):
    return DenseVector(data)

vector_udf = udf(lambda x: trans_densevector(x), VectorUDT())
# suppose the embedding column (ArrayType(FloatType, true)) is the existing user/item embedding
df = df.withColumn("prediction", vector_udf(col("embedding")))
df.write.parquet("output_file_path", mode="overwrite")
```
#### The schema of the recommendation model feature files
The feature parquet files should contain at least 2 columns: the id column and one or more feature columns.
The feature columns can be int, float, double, long, or an array of int, float, double, or long.
Here is an example of the WideAndDeep model features.
```bash
+-------------+--------+--------+----------+--------------------------------+---------------------------------+------------+-----------+---------+----------------------+-----------------------------+
|present_media|language|tweet_id|tweet_type|engaged_with_user_follower_count|engaged_with_user_following_count|len_hashtags|len_domains|len_links|present_media_language|engaged_with_user_is_verified|
+-------------+--------+--------+----------+--------------------------------+---------------------------------+------------+-----------+---------+----------------------+-----------------------------+
| 9| 43| 924| 2| 6| 3| 0.0| 0.1| 0.1| 45| 1|
| 0| 6| 4741724| 2| 3| 3| 0.0| 0.0| 0.0| 527| 0|
+-------------+--------+--------+----------+--------------------------------+---------------------------------+------------+-----------+---------+----------------------+-----------------------------+
```
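As an illustration only (the column names and values below are hypothetical, loosely following the WideAndDeep example above), a minimal PySpark sketch for producing such a feature parquet file could look like this:
```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType, FloatType, ArrayType

spark = SparkSession.builder.master("local[*]").getOrCreate()

# Hypothetical feature rows: an integer id column plus int/float/array feature columns
schema = StructType([
    StructField("tweet_id", IntegerType(), False),
    StructField("language", IntegerType(), True),
    StructField("len_hashtags", FloatType(), True),
    StructField("present_domains", ArrayType(IntegerType()), True),
])
rows = [(924, 43, 0.0, [1, 5]), (4741724, 6, 0.1, [2])]
df = spark.createDataFrame(rows, schema)

# Write the feature parquet that the feature service will load (path is just an example)
df.write.parquet("wnd_item.parquet", mode="overwrite")
```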
### The data schema in Redis
The user features, item features and user embedding vectors are saved in Redis.
The data saved in Redis is a key-value set.
#### Key in Redis
The key in Redis consists of 3 parts: key prefix, data type, and data id.
- Key prefix is `redisKeyPrefix` specified in the feature service config file.
- Data type is one of `user` or `item`.
- Data id is the value of `userIDColumn` or `itemIDColumn`.
Here is an example of key: `2tower_user:29`
#### Value in Redis
A row in the input parquet file will be converted to a Java array of objects, then serialized into a byte array, and encoded into a Base64 string.
#### Data schema entry
Every key prefix and data type combination has its own data schema entry that stores the corresponding column names. The key of the schema entry is `keyPrefix + dataType`, such as `2tower_user`. The value of the schema entry is a string of column names separated by `,`, such as `enaging_user_follower_count,enaging_user_following_count,enaging_user_is_verified`.
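As a rough sketch (assuming the schema entries and feature rows are stored as plain string keys as described above, and using the `redis` Python package; the prefix, id and address below are just examples), you can inspect the stored data like this:
```python
import base64
import redis

# Connect to the Redis instance used by the feature service (address is an example)
r = redis.Redis(host="localhost", port=6379)

key_prefix = "2tower_"   # redisKeyPrefix in the feature service config
data_type = "user"       # "user" or "item"
user_id = 29             # value of userIDColumn

# Schema entry: comma-separated column names for this prefix + data type
print(r.get(key_prefix + data_type))

# Feature row: a Base64 string encoded from a Java-serialized object array;
# it is normally decoded by the Java-side services, so here we only check its size
raw = r.get(f"{key_prefix}{data_type}:{user_id}")
if raw is not None:
    print(len(base64.b64decode(raw)), "bytes of Java-serialized feature data")
```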
### Config for different service
You can pass important configuration to each service using `-c config.yaml`:
```bash
java -Dspark.master=local[*] -Dspark.driver.maxResultSize=2G -cp bigdl-friesian-serving-spark_2.4.6-0.14.0-SNAPSHOT.jar com.intel.analytics.bigdl.friesian.serving.recall.RecallServer -c config_recall.yaml
```
#### Ranking Service Config
Config with example:
```yaml
# Default: 8980, which port to create the server
servicePort: 8083
# Default: 0, open a port for prometheus monitoring tool, if set, user can check the
# performance using prometheus
monitorPort: 1234
# model path must be provided
modelPath: /home/yina/Documents/model/recys2021/wnd_813/recsys_wnd
# default: null, savedmodel input list if the model is tf savedmodel. If not provided, the inputs
# of the savedmodel will be arranged in alphabetical order
savedModelInputs: serving_default_input_1:0, serving_default_input_2:0, serving_default_input_3:0, serving_default_input_4:0, serving_default_input_5:0, serving_default_input_6:0, serving_default_input_7:0, serving_default_input_8:0, serving_default_input_9:0, serving_default_input_10:0, serving_default_input_11:0, serving_default_input_12:0, serving_default_input_13:0
# default: 1, number of models used in inference service
modelParallelism: 4
```
#### Feature Service Config
Config with example:
1. Load data into Redis and search data from Redis:
```yaml
### Basic setting
# Default: 8980, which port to create the server
servicePort: 8082
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
# performance using prometheus
monitorPort: 1235
# 'kv' or 'inference' default: kv
serviceType: kv
# default: false, if need to load initial data to redis, set true
loadInitialData: true
# default: "", prefix for redis key
redisKeyPrefix:
# default: 0, item slot type on the Redis cluster. 0 means use the default 16384 slots, 1 means all keys are saved to the same slot, 2 means the last character of the id is used as the hash tag.
redisClusterItemSlotType: 2
# default: null, if loadInitialData=true, initialUserDataPath or initialItemDataPath must be
# provided. Only support parquet file
initialUserDataPath: /home/yina/Documents/data/recsys/preprocess_output/wnd_user.parquet
initialItemDataPath: /home/yina/Documents/data/recsys/preprocess_output/wnd_exp1/wnd_item.parquet
# default: null, if loadInitialData=true and initialUserDataPath != null, userIDColumn and
# userFeatureColumns must be provided
userIDColumn: enaging_user_id
userFeatureColumns: enaging_user_follower_count,enaging_user_following_count
# default: null, if loadInitialData=true and initialItemDataPath != null, userIDColumn and
# userFeatureColumns must be provided
itemIDColumn: tweet_id
itemFeatureColumns: present_media, language, tweet_id, hashtags, present_links, present_domains, tweet_type, engaged_with_user_follower_count,engaged_with_user_following_count, len_hashtags, len_domains, len_links, present_media_language, tweet_id_engaged_with_user_id
# default: null, user model path or item model path must be provided if serviceType
# contains 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
# be ignored
# userModelPath:
# default: null, user model path or item model path must be provided if serviceType
# contains 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
# be ignored
# itemModelPath:
# default: 1, number of models used for inference
# modelParallelism:
### Redis Configuration
# default: localhost:6379
# redisUrl:
# default: 256, JedisPoolMaxTotal
# redisPoolMaxTotal:
```
2. Load user features into Redis. Get features from Redis, then use the model at `userModelPath` to run
inference and get the user embedding:
```yaml
### Basic setting
# Default: 8980, which port to create the server
servicePort: 8085
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
# performance using prometheus
monitorPort: 1236
# 'kv' or 'inference' default: kv
serviceType: kv, inference
# default: false, if need to load initial data to redis, set true
loadInitialData: true
# default: ""
redisKeyPrefix: 2tower_
# default: 0, item slot type on the Redis cluster. 0 means use the default 16384 slots, 1 means all keys are saved to the same slot, 2 means the last character of the id is used as the hash tag.
redisClusterItemSlotType: 2
# default: null, if loadInitialData=true, initialDataPath must be provided. Only support parquet
# file
initialUserDataPath: /home/yina/Documents/data/recsys/preprocess_output/guoqiong/vec_feature_user.parquet
# initialItemDataPath:
# default: null, if loadInitialData=true and initialUserDataPath != null, userIDColumn and
# userFeatureColumns must be provided
#userIDColumn: user
userIDColumn: enaging_user_id
userFeatureColumns: user
# default: null, if loadInitialData=true and initialItemDataPath != null, userIDColumn and
# userFeatureColumns must be provided
# itemIDColumn:
# itemFeatureColumns:
# default: null, user model path or item model path must be provided if serviceType
# includes 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
# be ignored
userModelPath: /home/yina/Documents/model/recys2021/2tower/guoqiong/user-model
# default: null, user model path or item model path must be provided if serviceType
# contains 'inference'. If serviceType=kv, usermodelPath, itemModelPath and modelParallelism will
# be ignored
# itemModelPath:
# default: 1, number of models used for inference
# modelParallelism:
### Redis Configuration
# default: localhost:6379
# redisUrl:
# default: 256, JedisPoolMaxTotal
# redisPoolMaxTotal:
```
#### Recall Service Config
Config with example:
1. Load initial item vectors from vec_feature_item.parquet and use the item model to build the Faiss index:
```yaml
# Default: 8980, which port to create the server
servicePort: 8084
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
# performance using prometheus
monitorPort: 1238
# default: 128, the dimensionality of the embedding vectors
indexDim: 50
# default: false, if load saved index, set true
# loadSavedIndex: true
# default: false, if true, the built index will be saved to indexPath. Ignored when
# loadSavedIndex=true
saveBuiltIndex: true
# default: null, path to saved index path, must be provided if loadSavedIndex=true
indexPath: ./2tower_item_full.idx
# default: false
getFeatureFromFeatureService: true
# default: localhost:8980, feature service target
featureServiceURL: localhost:8085
itemIDColumn: tweet_id
itemFeatureColumns: item
# default: null, user model path must be provided if getFeatureFromFeatureService=false
# userModelPath:
# default: null, item model path must be provided if loadSavedIndex=false and initialDataPath is
# not orca predict result
itemModelPath: /home/yina/Documents/model/recys2021/2tower/guoqiong/item-model
# default: null, Only support parquet file
initialDataPath: /home/yina/Documents/data/recsys/preprocess_output/guoqiong/vec_feature_item.parquet
# default: 1, number of models used in inference service
modelParallelism: 1
```
2. Load an existing Faiss index:
```yaml
# Default: 8980, which port to create the server
servicePort: 8084
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
# performance using prometheus
monitorPort: 1238
# default: 128, the dimensionality of the embedding vectors
# indexDim:
# default: false, if load saved index, set true
loadSavedIndex: true
# default: null, path to saved index path, must be provided if loadSavedIndex=true
indexPath: ./2tower_item_full.idx
# default: false
getFeatureFromFeatureService: true
# default: localhost:8980, feature service target
featureServiceURL: localhost:8085
# itemIDColumn:
# itemFeatureColumns:
# default: null, user model path must be provided if getFeatureFromFeatureService=false
# userModelPath:
# default: null, item model path must be provided if loadSavedIndex=false and initialDataPath is
# not orca predict result
# itemModelPath:
# default: null, Only support parquet file
# initialDataPath:
# default: 1, number of models used in inference service
# modelParallelism:
```
#### Recommender Service Config
Config with example:
```yaml
# Default: 8980, which port to create the server
servicePort: 8980
# Default: null, open a port for prometheus monitoring tool, if set, user can check the
# performance using prometheus
monitorPort: 1237
# default: null, must be provided, item column name
itemIDColumn: tweet_id
# default: null, must be provided, column names for inference, order related.
inferenceColumns: present_media_language, present_media, tweet_type, language, hashtags, present_links, present_domains, tweet_id_engaged_with_user_id, engaged_with_user_follower_count, engaged_with_user_following_count, enaging_user_follower_count, enaging_user_following_count, len_hashtags, len_domains, len_links
# default: 0, if set, ranking service request will be divided
inferenceBatch: 0
# default: localhost:8980, recall service target
recallServiceURL: localhost:8084
# default: localhost:8980, feature service target
featureServiceURL: localhost:8082
# default: localhost:8980, inference service target
rankingServiceURL: localhost:8083
```
### Run Java Client
#### Generate proto java files
You should initialize a Maven project and use the proto files in the [friesian gRPC project](https://github.com/analytics-zoo/friesian/tree/recsys-grpc/src/main/proto).
Make sure to add the following extensions and plugins in your pom.xml, and replace
*protocExecutable* with your own protoc executable path.
```xml
<build>
    <extensions>
        <extension>
            <groupId>kr.motd.maven</groupId>
            <artifactId>os-maven-plugin</artifactId>
            <version>1.6.2</version>
        </extension>
    </extensions>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.8.0</version>
            <configuration>
                <source>8</source>
                <target>8</target>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.xolstice.maven.plugins</groupId>
            <artifactId>protobuf-maven-plugin</artifactId>
            <version>0.6.1</version>
            <configuration>
                <protocArtifact>com.google.protobuf:protoc:3.12.0:exe:${os.detected.classifier}</protocArtifact>
                <pluginId>grpc-java</pluginId>
                <pluginArtifact>io.grpc:protoc-gen-grpc-java:1.37.0:exe:${os.detected.classifier}</pluginArtifact>
                <protocExecutable>/home/yina/Documents/protoc/bin/protoc</protocExecutable>
            </configuration>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>compile-custom</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
```
Then you can generate the gRPC files with
```bash
mvn clean install
```
#### Call recommend service function using blocking stub
You can check the [Recommend service client example](https://github.com/analytics-zoo/friesian/blob/recsys-grpc/src/main/java/grpc/recommend/RecommendClient.java) on GitHub:
```java
import com.intel.analytics.bigdl.friesian.serving.grpc.generated.recommender.RecommenderGrpc;
import com.intel.analytics.bigdl.friesian.serving.grpc.generated.recommender.RecommenderProto.*;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.StatusRuntimeException;
import java.util.logging.Logger;

public class RecommendClient {
    // java.util.logging is used here to keep the example self-contained
    private static final Logger logger = Logger.getLogger(RecommendClient.class.getName());

    public static void main(String[] args) {
        // Target address of the recommender service; adjust to your deployment
        String targetURL = "localhost:8980";
        // Create a channel
        ManagedChannel channel = ManagedChannelBuilder.forTarget(targetURL).usePlaintext().build();
        // Init a recommend service blocking stub
        RecommenderGrpc.RecommenderBlockingStub blockingStub = RecommenderGrpc.newBlockingStub(channel);
        // Construct a request
        int[] userIds = new int[]{1};
        int candidateNum = 50;
        int recommendNum = 10;
        RecommendRequest.Builder request = RecommendRequest.newBuilder();
        for (int id : userIds) {
            request.addID(id);
        }
        request.setCandidateNum(candidateNum);
        request.setRecommendNum(recommendNum);
        RecommendIDProbs recommendIDProbs = null;
        try {
            recommendIDProbs = blockingStub.getRecommendIDs(request.build());
            logger.info(recommendIDProbs.getIDProbListList().toString());
        } catch (StatusRuntimeException e) {
            logger.warning("RPC failed: " + e.getStatus().toString());
        }
    }
}
```
### Run Python Client
Install the Python packages listed below (you may encounter a [pyspark error](https://stackoverflow.com/questions/58700384/how-to-fix-typeerror-an-integer-is-required-got-type-bytes-error-when-tryin) if you have Python >= 3.8 installed; try downgrading to Python <= 3.7 and try again).
```bash
pip install jupyter notebook==6.1.4 grpcio grpcio-tools pandas fastparquet pyarrow
```
After the servers have started successfully, you can proceed as follows.
#### Generate proto python files
Generate the files with
```bash
python -m grpc_tools.protoc -I../../protos --python_out=<path_to_output_folder> --grpc_python_out=<path_to_output_folder> <path_to_friesian>/src/main/proto/*.proto
```
#### Call recommend service function using blocking stub
You can check the [Recommend service client example](https://github.com/analytics-zoo/friesian/blob/recsys-grpc/Serving/WideDeep/recommend_client.ipynb) on GitHub:
```python
import grpc
# recommender_pb2 and recommender_pb2_grpc are generated by grpc_tools.protoc as shown above
import recommender_pb2
import recommender_pb2_grpc

# create a channel
channel = grpc.insecure_channel('localhost:8980')
# create a recommend service stub
stub = recommender_pb2_grpc.RecommenderStub(channel)
request = recommender_pb2.RecommendRequest(recommendNum=10, candidateNum=50, ID=[36407])
results = stub.getRecommendIDs(request)
print(results.IDProbList)
```
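The recall and feature services can be called in the same way with their generated stubs. The sketch below is illustrative only: the module and stub names (`recall_pb2`, `recall_pb2_grpc.RecallStub`, `feature_pb2`, `feature_pb2_grpc.FeatureServiceStub`) and the ports are assumptions based on the proto definitions and sample configs above, so adjust them to match your generated files and deployment:
```python
import grpc
# The module and stub names below are assumptions; check the files generated by grpc_tools.protoc
import recall_pb2
import recall_pb2_grpc
import feature_pb2
import feature_pb2_grpc

# Recall service: fetch 50 candidate item ids for a user (port from the sample recall config)
recall_channel = grpc.insecure_channel('localhost:8084')
recall_stub = recall_pb2_grpc.RecallStub(recall_channel)
candidates = recall_stub.searchCandidates(recall_pb2.Query(userID=36407, k=50))
print(candidates.candidate)

# Feature service: fetch the stored features of those candidate items (port from the sample feature config)
feature_channel = grpc.insecure_channel('localhost:8082')
feature_stub = feature_pb2_grpc.FeatureServiceStub(feature_channel)
features = feature_stub.getItemFeatures(feature_pb2.IDs(ID=list(candidates.candidate)))
print(features.colNames)
```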
### Scale-out for Big Data
#### Redis Cluster
For large datasets, a standalone Redis instance does not have enough memory to store the whole dataset; data sharding with a Redis Cluster is supported to handle this. You only need to set up a Redis Cluster to make it work.
First, start N Redis instances on N machines.
```bash
redis-server --cluster-enabled yes --cluster-config-file nodes-0.conf --cluster-node-timeout 50000 --appendonly no --save "" --logfile 0.log --daemonize yes --protected-mode no --port 6379
```
On each machine, choose a different port and start another M instances (M >= 1) as the replicas of the above N instances.
Then, run the cluster initialization command on one machine. If you chose M=1 above, use `--cluster-replicas 1`:
```bash
redis-cli --cluster create 172.168.3.115:6379 172.168.3.115:6380 172.168.3.116:6379 172.168.3.116:6380 172.168.3.117:6379 172.168.3.117:6380 --cluster-replicas 1
```
The Redis cluster is then ready to use.
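As a quick sanity check (a sketch assuming redis-py >= 4.1 is installed; the node address is just an example taken from the command above), you can verify the cluster from Python:
```python
from redis.cluster import RedisCluster

# Connect through any node of the cluster created above (address is an example)
rc = RedisCluster(host="172.168.3.115", port=6379)
print(rc.ping())                               # True if the cluster is reachable
print(len(rc.get_nodes()), "nodes discovered in the cluster")
```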
#### Scale Service with Envoy
Each of the services can be scaled out. It is recommended to use the same resources, e.g. a single machine with the same CPU and memory, to test which service is the bottleneck. From empirical observations, vector search and inference are usually the bottlenecks.
##### How to run Envoy:
1. [Download](https://www.envoyproxy.io/docs/envoy/latest/start/install) and deploy Envoy (the steps below use Docker as an example):
* download: `docker pull envoyproxy/envoy-dev:21df5e8676a0f705709f0b3ed90fc2dbbd63cfc5`
2. Run the command: `docker run --rm -it -p 9082:9082 -p 9090:9090 envoyproxy/envoy-dev:79ade4aebd02cf15bd934d6d58e90aa03ef6909e --config-yaml "$(cat path/to/service-specific-envoy.yaml)" --parent-shutdown-time-s 1000000`
3. Validate: run `netstat -tnlp` to check whether the Envoy process is listening on the port specified in the Envoy config file.
4. For more details on Envoy and a sample procedure, read [envoy](envoy.md).

View file

@ -1,6 +0,0 @@
User Guide
=========================
Getting Started
===========================================

View file

@ -1,2 +0,0 @@
Install Locally
=========================

View file

@ -1,28 +0,0 @@
# Paper
## Paper
* Dai, Jason Jinquan, et al. "BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Cluster." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022. [paper](https://arxiv.org/ftp/arxiv/papers/2204/2204.01715.pdf) [video]() [demo]()
* Dai, Jason Jinquan, et al. "BigDL: A distributed deep learning framework for big data." Proceedings of the ACM Symposium on Cloud Computing. 2019. [paper](https://arxiv.org/abs/1804.05839)
## Citing
If you've found BigDL useful for your project, you may cite the [paper](https://arxiv.org/abs/1804.05839) as follows:
```
@inproceedings{SOCC2019_BIGDL,
title={BigDL: A Distributed Deep Learning Framework for Big Data},
author={Dai, Jason (Jinquan) and Wang, Yiheng and Qiu, Xin and Ding, Ding and Zhang, Yao and Wang, Yanzhang and Jia, Xianyan and Zhang, Li (Cherry) and Wan, Yan and Li, Zhichao and Wang, Jiao and Huang, Shengsheng and Wu, Zhongyuan and Wang, Yang and Yang, Yuhao and She, Bowen and Shi, Dongjie and Lu, Qi and Huang, Kai and Song, Guoqiong},
booktitle={Proceedings of the ACM Symposium on Cloud Computing},
publisher={Association for Computing Machinery},
pages={50--60},
year={2019},
series={SoCC'19},
doi={10.1145/3357223.3362707},
url={https://arxiv.org/pdf/1804.05839.pdf}
}
```

View file

@ -1,2 +0,0 @@
Use Cases
============================

View file

@ -4,10 +4,10 @@
In [speculative](https://arxiv.org/abs/2302.01318) [decoding](https://arxiv.org/abs/2211.17192), a small (draft) model quickly generates multiple draft tokens, which are then verified in parallel by the large (target) model. While speculative decoding can effectively speed up the target model, ***in practice it is difficult to maintain or even obtain a proper draft model***, especially when the target model is finetuned with customized data.
### Self-Speculative Decoding
Built on top of the concept of “[self-speculative decoding](https://arxiv.org/abs/2309.08168)”, BigDL-LLM can now accelerate the original FP16 or BF16 model ***without the need of a separate draft model or model finetuning***; instead, it automatically converts the original model to INT4, and uses the INT4 model as the draft model behind the scene. In practice, this brings ***~30% speedup*** for FP16 and BF16 LLM inference latency on Intel GPU and CPU respectively.
Built on top of the concept of “[self-speculative decoding](https://arxiv.org/abs/2309.08168)”, IPEX-LLM can now accelerate the original FP16 or BF16 model ***without the need of a separate draft model or model finetuning***; instead, it automatically converts the original model to INT4, and uses the INT4 model as the draft model behind the scene. In practice, this brings ***~30% speedup*** for FP16 and BF16 LLM inference latency on Intel GPU and CPU respectively.
### Using BigDL-LLM Self-Speculative Decoding
Please refer to BigDL-LLM self-speculative decoding code snippets below, and the detailed [GPU](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/Speculative-Decoding) and [CPU](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Speculative-Decoding) examples in the project repo.
### Using IPEX-LLM Self-Speculative Decoding
Please refer to IPEX-LLM self-speculative decoding code snippets below, and the detailed [GPU](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Speculative-Decoding) and [CPU](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Speculative-Decoding) examples in the project repo.
```python
model = AutoModelForCausalLM.from_pretrained(model_path,

View file

@ -2,44 +2,44 @@
## General Info & Concepts
### GGUF format usage with BigDL-LLM?
### GGUF format usage with IPEX-LLM?
BigDL-LLM supports running GGUF/AWQ/GPTQ models on both [CPU](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Advanced-Quantizations) and [GPU](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations).
Please also refer to [here](https://github.com/intel-analytics/BigDL?tab=readme-ov-file#latest-update-) for our latest support.
IPEX-LLM supports running GGUF/AWQ/GPTQ models on both [CPU](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Advanced-Quantizations) and [GPU](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Advanced-Quantizations).
Please also refer to [here](https://github.com/intel-analytics/ipex-llm?tab=readme-ov-file#latest-update-) for our latest support.
## How to Resolve Errors
### Fail to install `bigdl-llm` through `pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu`
### Fail to install `ipex-llm` through `pip install --pre --upgrade ipex-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu`
You could try to install BigDL-LLM dependencies for Intel XPU from source archives:
- For Windows system, refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#install-bigdl-llm-from-wheel) for the steps.
- For Linux system, refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#id3) for the steps.
You could try to install IPEX-LLM dependencies for Intel XPU from source archives:
- For Windows system, refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#install-ipex-llm-from-wheel) for the steps.
- For Linux system, refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#id3) for the steps.
### PyTorch is not linked with support for xpu devices
1. Before running on Intel GPUs, please make sure you've prepared environment follwing [installation instruction](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html).
2. If you are using an older version of `bigdl-llm` (specifically, older than 2.5.0b20240104), you need to manually add `import intel_extension_for_pytorch as ipex` at the beginning of your code.
3. After optimizing the model with BigDL-LLM, you need to move model to GPU through `model = model.to('xpu')`.
4. If you have mutil GPUs, you could refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/KeyFeatures/multi_gpus_selection.html) for details about GPU selection.
1. Before running on Intel GPUs, please make sure you've prepared the environment following the [installation instructions](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html).
2. If you are using an older version of `ipex-llm` (specifically, older than 2.5.0b20240104), you need to manually add `import intel_extension_for_pytorch as ipex` at the beginning of your code.
3. After optimizing the model with IPEX-LLM, you need to move model to GPU through `model = model.to('xpu')`.
4. If you have multiple GPUs, you could refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/KeyFeatures/multi_gpus_selection.html) for details about GPU selection.
5. If you do inference using the optimized model on Intel GPUs, you also need to set `to('xpu')` for input tensors.
### Import `intel_extension_for_pytorch` error on Windows GPU
Please refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#error-loading-intel-extension-for-pytorch) for detailed guide. We list the possible missing requirements in environment which could lead to this error.
Please refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#error-loading-intel-extension-for-pytorch) for detailed guide. We list the possible missing requirements in environment which could lead to this error.
### XPU device count is zero
It's recommended to reinstall driver:
- For Windows system, refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#prerequisites) for the steps.
- For Linux system, refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#id1) for the steps.
- For Windows system, refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#prerequisites) for the steps.
- For Linux system, refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#id1) for the steps.
### Error such as `The size of tensor a (33) must match the size of tensor b (17) at non-singleton dimension 2` during the attention forward function
If you are using BigDL-LLM PyTorch API, please try to set `optimize_llm=False` manually when call `optimize_model` function to work around. As for BigDL-LLM `transformers`-style API, you could try to set `optimize_model=False` manually when call `from_pretrained` function to work around.
If you are using the IPEX-LLM PyTorch API, please try to set `optimize_llm=False` manually when calling the `optimize_model` function as a workaround. As for the IPEX-LLM `transformers`-style API, you could try to set `optimize_model=False` manually when calling the `from_pretrained` function as a workaround.
### ValueError: Unrecognized configuration class
This error is not quite relevant to BigDL-LLM. It could be that you're using the incorrect AutoClass, or the transformers version is not updated, or transformers does not support using AutoClasses to load this model. You need to refer to the model card in huggingface to confirm these information. Besides, if you load the model from local path, please also make sure you download the complete model files.
This error is not quite relevant to IPEX-LLM. It could be that you're using the incorrect AutoClass, or the transformers version is not updated, or transformers does not support using AutoClasses to load this model. You need to refer to the model card on Hugging Face to confirm this information. Besides, if you load the model from a local path, please also make sure you have downloaded the complete model files.
### `mixed dtype (CPU): expect input to have scalar type of BFloat16` during inference
@ -62,7 +62,7 @@ You may encounter this error during finetuning on multi GPUs. Please try `sudo a
### Random and unreadable output of Gemma-7b-it on Arc770 Ubuntu 22.04 due to driver and oneAPI mismatch
If driver and OneAPI missmatching, it will lead to some error when BigDL use XMX(short prompts) for speeding up.
If the driver and oneAPI versions mismatch, it will lead to errors when IPEX-LLM uses XMX (short prompts) for speed-up.
The output of `What's AI?` may look like the following:
```
wiedzy Artificial Intelligence meliti: Artificial Intelligence undenti beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng beng

View file

@ -4,7 +4,7 @@
.. note::
Currently ``bigdl-llm`` CLI supports *LLaMA* (e.g., vicuna), *GPT-NeoX* (e.g., redpajama), *BLOOM* (e.g., pheonix) and *GPT2* (e.g., starcoder) model architecture; for other models, you may use the ``transformers``-style or LangChain APIs.
Currently ``ipex-llm`` CLI supports *LLaMA* (e.g., vicuna), *GPT-NeoX* (e.g., redpajama), *BLOOM* (e.g., phoenix) and *GPT2* (e.g., starcoder) model architectures; for other models, you may use the ``transformers``-style or LangChain APIs.
```
## Convert Model

View file

@ -1,6 +1,6 @@
# Finetune (QLoRA)
We also support finetuning LLMs (large language models) using QLoRA with BigDL-LLM 4bit optimizations on Intel GPUs.
We also support finetuning LLMs (large language models) using QLoRA with IPEX-LLM 4bit optimizations on Intel GPUs.
```eval_rst
.. note::
@ -15,7 +15,7 @@ To help you better understand the finetuning process, here we use model [Llama-2
```eval_rst
.. note::
If you are using an older version of ``bigdl-llm`` (specifically, older than 2.5.0b20240104), you need to manually add ``import intel_extension_for_pytorch as ipex`` at the beginning of your code.
If you are using an older version of ``ipex-llm`` (specifically, older than 2.5.0b20240104), you need to manually add ``import intel_extension_for_pytorch as ipex`` at the beginning of your code.
```
First, load model using `transformers`-style API and **set it to `to('xpu')`**. We specify `load_in_low_bit="nf4"` here to apply 4-bit NormalFloat optimization. According to the [QLoRA paper](https://arxiv.org/pdf/2305.14314.pdf), using `"nf4"` could yield better model quality than `"int4"`.
@ -54,11 +54,11 @@ model = get_peft_model(model, config)
```eval_rst
.. important::
Instead of ``from peft import prepare_model_for_kbit_training, get_peft_model`` as we did for regular QLoRA using bitandbytes and cuda, we import them from ``ipex_llm.transformers.qlora`` here to get a BigDL-LLM compatible Peft model. And the rest is just the same as regular LoRA finetuning process using ``peft``.
Instead of ``from peft import prepare_model_for_kbit_training, get_peft_model`` as we did for regular QLoRA using bitsandbytes and CUDA, we import them from ``ipex_llm.transformers.qlora`` here to get an IPEX-LLM compatible PEFT model. The rest is just the same as the regular LoRA finetuning process using ``peft``.
```
```eval_rst
.. seealso::
See the complete examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU>`_
See the complete examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU>`_
```

View file

@ -1,7 +1,7 @@
GPU Supports
================================
BigDL-LLM not only supports running large language models for inference, but also supports QLoRA finetuning on Intel GPUs.
IPEX-LLM not only supports running large language models for inference, but also supports QLoRA finetuning on Intel GPUs.
* |inference_on_gpu|_
* `Finetune (QLoRA) <./finetune.html>`_

View file

@ -25,7 +25,7 @@ output = tokenizer.batch_decode(output_ids)
```eval_rst
.. seealso::
See the complete CPU examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels>`_ and GPU examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels>`_.
See the complete CPU examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels>`_ and GPU examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels>`_.
.. note::
@ -35,7 +35,7 @@ output = tokenizer.batch_decode(output_ids)
model = AutoModelForCausalLM.from_pretrained('/path/to/model/', load_in_low_bit="sym_int5")
See the CPU example `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/More-Data-Types>`_ and GPU example `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types>`_.
See the CPU example `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/More-Data-Types>`_ and GPU example `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types>`_.
```
## Save & Load
@ -50,5 +50,5 @@ new_model = AutoModelForCausalLM.load_low_bit(model_path)
```eval_rst
.. seealso::
See the CPU example `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Save-Load>`_ and GPU example `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Save-Load>`_
See the CPU example `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Save-Load>`_ and GPU example `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Save-Load>`_
```

View file

@ -1,7 +1,7 @@
BigDL-LLM Key Features
IPEX-LLM Key Features
================================
You may run the LLMs using ``bigdl-llm`` through one of the following APIs:
You may run the LLMs using ``ipex-llm`` through one of the following APIs:
* `PyTorch API <./optimize_model.html>`_
* |transformers_style_api|_

View file

@ -1,6 +1,6 @@
# Inference on GPU
Apart from the significant acceleration capabilites on Intel CPUs, BigDL-LLM also supports optimizations and acceleration for running LLMs (large language models) on Intel GPUs. With BigDL-LLM, PyTorch models (in FP16/BF16/FP32) can be optimized with low-bit quantizations (supported precisions include INT4, INT5, INT8, etc).
Apart from the significant acceleration capabilities on Intel CPUs, IPEX-LLM also supports optimizations and acceleration for running LLMs (large language models) on Intel GPUs. With IPEX-LLM, PyTorch models (in FP16/BF16/FP32) can be optimized with low-bit quantizations (supported precisions include INT4, INT5, INT8, etc).
Compared with running on Intel CPUs, some additional operations are required on Intel GPUs. To help you better understand the process, here we use a popular model [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) as an example.
@ -9,14 +9,14 @@ Compared with running on Intel CPUs, some additional operations are required on
```eval_rst
.. note::
If you are using an older version of ``bigdl-llm`` (specifically, older than 2.5.0b20240104), you need to manually add ``import intel_extension_for_pytorch as ipex`` at the beginning of your code.
If you are using an older version of ``ipex-llm`` (specifically, older than 2.5.0b20240104), you need to manually add ``import intel_extension_for_pytorch as ipex`` at the beginning of your code.
```
## Load and Optimize Model
You could choose to use [PyTorch API](./optimize_model.html) or [`transformers`-style API](./transformers_style_api.html) on Intel GPUs according to your preference.
**Once you have the model with BigDL-LLM low bit optimization, set it to `to('xpu')`**.
**Once you have the model with IPEX-LLM low bit optimization, set it to `to('xpu')`**.
```eval_rst
.. tabs::
@ -32,7 +32,7 @@ You could choose to use [PyTorch API](./optimize_model.html) or [`transformers`-
from ipex_llm import optimize_model
model = LlamaForCausalLM.from_pretrained('meta-llama/Llama-2-7b-chat-hf', torch_dtype='auto', low_cpu_mem_usage=True)
model = optimize_model(model) # With only one line to enable BigDL-LLM INT4 optimization
model = optimize_model(model) # With only one line to enable IPEX-LLM INT4 optimization
model = model.to('xpu') # Important after obtaining the optimized model
@ -49,7 +49,7 @@ You could choose to use [PyTorch API](./optimize_model.html) or [`transformers`-
from transformers import LlamaForCausalLM
from ipex_llm.optimize import low_memory_init, load_low_bit
saved_dir='./llama-2-bigdl-llm-4-bit'
saved_dir='./llama-2-ipex-llm-4-bit'
with low_memory_init(): # Fast and low cost by loading model on meta device
model = LlamaForCausalLM.from_pretrained(saved_dir,
torch_dtype="auto",
@ -84,7 +84,7 @@ You could choose to use [PyTorch API](./optimize_model.html) or [`transformers`-
from ipex_llm.transformers import AutoModelForCausalLM
saved_dir='./llama-2-bigdl-llm-4-bit'
saved_dir='./llama-2-ipex-llm-4-bit'
model = AutoModelForCausalLM.load_low_bit(saved_dir) # Load the optimized model
model = model.to('xpu') # Important after obtaining the optimized model
@ -124,5 +124,5 @@ with torch.inference_mode():
```eval_rst
.. seealso::
See the complete examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU>`_
See the complete examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU>`_
```

View file

@ -1,6 +1,6 @@
# LangChain API
You may run the models using the LangChain API in `bigdl-llm`.
You may run the models using the LangChain API in `ipex-llm`.
## Using Hugging Face `transformers` INT4 Format
@ -12,16 +12,16 @@ from ipex_llm.langchain.embeddings import TransformersEmbeddings
from langchain.chains.question_answering import load_qa_chain
embeddings = TransformersEmbeddings.from_model_id(model_id=model_path)
bigdl_llm = TransformersLLM.from_model_id(model_id=model_path, ...)
ipex_llm = TransformersLLM.from_model_id(model_id=model_path, ...)
doc_chain = load_qa_chain(bigdl_llm, ...)
doc_chain = load_qa_chain(ipex_llm, ...)
output = doc_chain.run(...)
```
```eval_rst
.. seealso::
See the examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/LangChain/transformers_int4>`_.
See the examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/transformers_int4>`_.
```
## Using Native INT4 Format
@ -44,14 +44,14 @@ from langchain.chains.question_answering import load_qa_chain
# switch to ChatGLMEmbeddings/GptneoxEmbeddings/BloomEmbeddings/StarcoderEmbeddings to load other models
embeddings = LlamaEmbeddings(model_path='/path/to/converted/model.bin')
# switch to ChatGLMLLM/GptneoxLLM/BloomLLM/StarcoderLLM to load other models
bigdl_llm = LlamaLLM(model_path='/path/to/converted/model.bin')
ipex_llm = LlamaLLM(model_path='/path/to/converted/model.bin')
doc_chain = load_qa_chain(bigdl_llm, ...)
doc_chain = load_qa_chain(ipex_llm, ...)
doc_chain.run(...)
```
```eval_rst
.. seealso::
See the examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/LangChain/native_int4>`_.
See the examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/LangChain/native_int4>`_.
```

View file

@ -11,7 +11,7 @@ You may also convert Hugging Face *Transformers* models into native INT4 format
```python
# convert the model
from ipex_llm import llm_convert
bigdl_llm_path = llm_convert(model='/path/to/model/',
ipex_llm_path = llm_convert(model='/path/to/model/',
outfile='/path/to/output/', outtype='int4', model_family="llama")
# load the converted model
@ -28,5 +28,5 @@ output = llm.batch_decode(output_ids)
```eval_rst
.. seealso::
See the complete example `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models>`_
See the complete example `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models>`_
```

View file

@ -1,6 +1,6 @@
## PyTorch API
In general, you just need one-line `optimize_model` to easily optimize any loaded PyTorch model, regardless of the library or API you are using. With BigDL-LLM, PyTorch models (in FP16/BF16/FP32) can be optimized with low-bit quantizations (supported precisions include INT4, INT5, INT8, etc).
In general, you just need one-line `optimize_model` to easily optimize any loaded PyTorch model, regardless of the library or API you are using. With IPEX-LLM, PyTorch models (in FP16/BF16/FP32) can be optimized with low-bit quantizations (supported precisions include INT4, INT5, INT8, etc).
### Optimize model
@ -16,11 +16,11 @@ Then, just need to call `optimize_model` to optimize the loaded model and INT4 o
```python
from ipex_llm import optimize_model
# With only one line to enable BigDL-LLM INT4 optimization
# With only one line to enable IPEX-LLM INT4 optimization
model = optimize_model(model)
```
After optimizing the model, BigDL-LLM does not require any change in the inference code. You can use any libraries to run the optimized model with very low latency.
After optimizing the model, IPEX-LLM does not require any change in the inference code. You can use any libraries to run the optimized model with very low latency.
### More Precisions
@ -44,7 +44,7 @@ The loading process of the original model may be time-consuming and memory-inten
Continuing with the [example of Llama-2-7b-chat-hf](#optimize-model), we can save the previously optimized model as follows:
```python
saved_dir='./llama-2-bigdl-llm-4-bit'
saved_dir='./llama-2-ipex-llm-4-bit'
model.save_low_bit(saved_dir)
```
#### Load
@ -63,7 +63,7 @@ model = load_low_bit(model, saved_dir) # Load the optimized model
```eval_rst
.. seealso::
* Please refer to the `API documentation <https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/LLM/optimize.html>`_ for more details.
* Please refer to the `API documentation <https://ipex-llm.readthedocs.io/en/latest/doc/PythonAPI/LLM/optimize.html>`_ for more details.
* We also provide detailed examples on how to run PyTorch models (e.g., Openai Whisper, LLaMA2, ChatGLM2, Falcon, MPT, Baichuan2, etc.) using BigDL-LLM. See the complete CPU examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models>`_ and GPU examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models>`_.
* We also provide detailed examples on how to run PyTorch models (e.g., Openai Whisper, LLaMA2, ChatGLM2, Falcon, MPT, Baichuan2, etc.) using IPEX-LLM. See the complete CPU examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models>`_ and GPU examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models>`_.
```

View file

@ -1,7 +1,7 @@
``transformers``-style API
================================
You may run the LLMs using ``transformers``-style API in ``bigdl-llm``.
You may run the LLMs using ``transformers``-style API in ``ipex-llm``.
* |hugging_face_transformers_format|_
* `Native Format <./native_format.html>`_

View file

@ -1,9 +1,9 @@
BigDL-LLM Examples
IPEX-LLM Examples
================================
You can use BigDL-LLM to run any PyTorch model with INT4 optimizations on Intel XPU (from Laptop to GPU to Cloud).
You can use IPEX-LLM to run any PyTorch model with INT4 optimizations on Intel XPU (from Laptop to GPU to Cloud).
Here, we provide examples to help you quickly get started using BigDL-LLM to run some popular open-source models in the community. Please refer to the appropriate guide based on your device:
Here, we provide examples to help you quickly get started using IPEX-LLM to run some popular open-source models in the community. Please refer to the appropriate guide based on your device:
* `CPU <./examples_cpu.html>`_
* `GPU <./examples_gpu.html>`_

View file

@ -1,8 +1,8 @@
# BigDL-LLM Examples: CPU
# IPEX-LLM Examples: CPU
Here, we provide some examples on how you could apply BigDL-LLM INT4 optimizations on popular open-source models in the community.
Here, we provide some examples on how you could apply IPEX-LLM INT4 optimizations on popular open-source models in the community.
To run these examples, please first refer to [here](./install_cpu.html) for more information about how to install ``bigdl-llm``, requirements and best practices for setting up your environment.
To run these examples, please first refer to [here](./install_cpu.html) for more information about how to install ``ipex-llm``, requirements and best practices for setting up your environment.
The following models have been verified on either servers or laptops with Intel CPUs.
@ -10,17 +10,17 @@ The following models have been verified on either servers or laptops with Intel
| Model | Example of PyTorch API |
|------------|-------------------------------------------------------|
| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/llama2) |
| ChatGLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/chatglm) |
| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/mistral) |
| Bark | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/bark) |
| BERT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/bert) |
| Openai Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/openai-whisper) |
| LLaMA 2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/Model/llama2) |
| ChatGLM | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/Model/chatglm) |
| Mistral | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/Model/mistral) |
| Bark | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/Model/bark) |
| BERT | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/Model/bert) |
| Openai Whisper | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/Model/openai-whisper) |
```eval_rst
.. important::
In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through PyTorch API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/More-Data-Types>`_.
In addition to INT4 optimization, IPEX-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through PyTorch API as `example <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/More-Data-Types>`_.
```
@ -28,37 +28,37 @@ The following models have been verified on either servers or laptops with Intel
| Model | Example of `transformers`-style API |
|------------|-------------------------------------------------------|
| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/vicuna) |
| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/llama2) | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2) |
| ChatGLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/PyTorch-Models/Model/chatglm) | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm) |
| ChatGLM2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm2) |
| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mistral) |
| Falcon | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/falcon) |
| MPT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) |
| Dolly-v1 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) |
| Dolly-v2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) |
| Replit Code| [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) |
| RedPajama | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/redpajama) |
| Phoenix | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/phoenix) |
| StarCoder | [link1](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/starcoder) |
| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) |
| Baichuan2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan2) |
| InternLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm) |
| Qwen | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen) |
| Aquila | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila) |
| MOSS | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/moss) |
| Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/whisper) |
| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* | [link1](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/vicuna) |
| LLaMA 2 | [link1](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/llama2) |
| ChatGLM | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm) |
| ChatGLM2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/chatglm2) |
| Mistral | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mistral) |
| Falcon | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/falcon) |
| MPT | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) |
| Dolly-v1 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) |
| Dolly-v2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) |
| Replit Code| [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) |
| RedPajama | [link1](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/redpajama) |
| Phoenix | [link1](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/phoenix) |
| StarCoder | [link1](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/Native-Models), [link2](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/starcoder) |
| Baichuan | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) |
| Baichuan2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan2) |
| InternLM | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/internlm) |
| Qwen | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/qwen) |
| Aquila | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila) |
| MOSS | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/moss) |
| Whisper | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/whisper) |
```eval_rst
.. important::
In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through ``transformers``-style API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/More-Data-Types>`_.
In addition to INT4 optimization, IPEX-LLM also provides other low-bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low-bit optimizations through the ``transformers``-style API, as in this `example <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/More-Data-Types>`_.
```
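A minimal sketch of the same idea through the ``transformers``-style API (the `load_in_low_bit` argument and the precision strings below are assumptions to verify against your installed release):

```python
from ipex_llm.transformers import AutoModelForCausalLM

# load with a non-default low-bit precision instead of load_in_4bit=True
model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_3b_v2",
    load_in_low_bit="sym_int8",   # e.g. "sym_int5", "nf4", ... depending on the release
    trust_remote_code=True)
```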
```eval_rst
.. seealso::
See the complete examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU>`_.
See the complete examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU>`_.
```

View file

@ -1,8 +1,8 @@
# BigDL-LLM Examples: GPU
# IPEX-LLM Examples: GPU
Here, we provide some examples on how you could apply BigDL-LLM INT4 optimizations on popular open-source models in the community.
Here, we provide some examples of how you can apply IPEX-LLM INT4 optimizations to popular open-source models in the community.
To run these examples, please first refer to [here](./install_gpu.html) for more information about how to install ``bigdl-llm``, requirements and best practices for setting up your environment.
To run these examples, please first refer to [here](./install_gpu.html) for information on how to install ``ipex-llm``, its requirements, and best practices for setting up your environment.
```eval_rst
.. important::
@ -16,20 +16,20 @@ The following models have been verified on either servers or laptops with Intel
| Model | Example of PyTorch API |
|------------|-------------------------------------------------------|
| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/llama2) |
| ChatGLM 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/chatglm2) |
| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/mistral) |
| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/baichuan) |
| Baichuan2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/baichuan2) |
| Replit | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/replit) |
| StarCoder | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/starcoder) |
| Dolly-v1 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/dolly-v1) |
| Dolly-v2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/dolly-v2) |
| LLaMA 2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/llama2) |
| ChatGLM 2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/chatglm2) |
| Mistral | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/mistral) |
| Baichuan | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/baichuan) |
| Baichuan2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/baichuan2) |
| Replit | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/replit) |
| StarCoder | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/starcoder) |
| Dolly-v1 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/dolly-v1) |
| Dolly-v2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/Model/dolly-v2) |
```eval_rst
.. important::
In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through PyTorch API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/More-Data-Types>`_.
In addition to INT4 optimization, IPEX-LLM also provides other low-bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low-bit optimizations through the PyTorch API, as in this `example <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models/More-Data-Types>`_.
```
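As a hedged sketch of the GPU flow (assumptions: `optimize_model` with a non-default `low_bit` value, an Intel GPU exposed as the `xpu` device, and a properly sourced oneAPI environment):

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  # registers the "xpu" device
from transformers import AutoModelForCausalLM
from ipex_llm import optimize_model

model = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_3b_v2",
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True)
model = optimize_model(model, low_bit="nf4")  # the "nf4" string is an assumption to verify
model = model.to("xpu")                       # move the optimized model to the Intel GPU
```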
@ -37,34 +37,34 @@ The following models have been verified on either servers or laptops with Intel
| Model | Example of `transformers`-style API |
|------------|-------------------------------------------------------|
| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* |[link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/vicuna)|
| LLaMA 2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2) |
| ChatGLM2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm2) |
| Mistral | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/mistral) |
| Falcon | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/falcon) |
| MPT | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) |
| Dolly-v1 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) |
| Dolly-v2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) |
| Replit | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) |
| StarCoder | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/starcoder) |
| Baichuan | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) |
| Baichuan2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2) |
| InternLM | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm) |
| Qwen | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen) |
| Aquila | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila) |
| Whisper | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/whisper) |
| Chinese Llama2 | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chinese-llama2) |
| GPT-J | [link](https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/gpt-j) |
| LLaMA *(such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.)* |[link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/vicuna)|
| LLaMA 2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2) |
| ChatGLM2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chatglm2) |
| Mistral | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/mistral) |
| Falcon | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/falcon) |
| MPT | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/mpt) |
| Dolly-v1 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v1) |
| Dolly-v2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/dolly_v2) |
| Replit | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/replit) |
| StarCoder | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/starcoder) |
| Baichuan | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan) |
| Baichuan2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/baichuan2) |
| InternLM | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/internlm) |
| Qwen | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/qwen) |
| Aquila | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/aquila) |
| Whisper | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/whisper) |
| Chinese Llama2 | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/chinese-llama2) |
| GPT-J | [link](https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/gpt-j) |
```eval_rst
.. important::
In addition to INT4 optimization, BigDL-LLM also provides other low bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low bit optimizations through ``transformers``-style API as `example <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types>`_.
In addition to INT4 optimization, IPEX-LLM also provides other low-bit optimizations (such as INT8, INT5, NF4, etc.). You may apply other low-bit optimizations through the ``transformers``-style API, as in this `example <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels/More-Data-Types>`_.
```
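A hedged end-to-end sketch with the ``transformers``-style API on an Intel GPU (the `load_in_low_bit="fp8"` choice and the prompt are illustrative assumptions):

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401  # registers the "xpu" device
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_3b_v2",
                                             load_in_low_bit="fp8",
                                             trust_remote_code=True).to("xpu")
tokenizer = AutoTokenizer.from_pretrained("openlm-research/open_llama_3b_v2")

inputs = tokenizer("What is AI?", return_tensors="pt").to("xpu")
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```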
```eval_rst
.. seealso::
See the complete examples `here <https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU>`_.
See the complete examples `here <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU>`_.
```

View file

@ -1,7 +1,7 @@
BigDL-LLM Installation
IPEX-LLM Installation
================================
Here, we provide instructions on how to install ``bigdl-llm`` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
Here, we provide instructions on how to install ``ipex-llm`` and best practices for setting up your environment. Please refer to the appropriate guide based on your device:
* `CPU <./install_cpu.html>`_
* `GPU <./install_gpu.html>`_

View file

@ -1,11 +1,11 @@
# BigDL-LLM Installation: CPU
# IPEX-LLM Installation: CPU
## Quick Installation
Install BigDL-LLM for CPU supports using pip through:
Install IPEX-LLM with CPU support using pip:
```bash
pip install --pre --upgrade bigdl-llm[all] # install the latest bigdl-llm nightly build with 'all' option
pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
```
Please refer to [Environment Setup](#environment-setup) for more information.
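As a quick, hedged sanity check (not part of the official steps), you can confirm the CPU install by loading a small model with INT4 optimization:

```python
# a minimal sketch, assuming the ipex-llm transformers-style API is available
from ipex_llm.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_3b_v2",  # any small Hugging Face model works here
    load_in_4bit=True)                   # convert to IPEX-LLM INT4 format on load
print(type(model))                       # should import and load without errors
```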
@ -17,12 +17,12 @@ Please refer to [Environment Setup](#environment-setup) for more information.
.. important::
``bigdl-llm`` is tested with Python 3.9, 3.10 and 3.11; Python 3.9 is recommended for best practices.
``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11; Python 3.9 is recommended for best practices.
```
## Recommended Requirements
Here list the recommended hardware and OS for smooth BigDL-LLM optimization experiences on CPU:
Listed below are the recommended hardware and OS for a smooth IPEX-LLM optimization experience on CPU:
* Hardware
@ -37,7 +37,7 @@ Here list the recommended hardware and OS for smooth BigDL-LLM optimization expe
## Environment Setup
For optimal performance with LLM models using BigDL-LLM optimizations on Intel CPUs, here are some best practices for setting up environment:
For optimal performance with LLM models using IPEX-LLM optimizations on Intel CPUs, here are some best practices for setting up the environment:
First, we recommend using [Conda](https://docs.conda.io/en/latest/miniconda.html) to create a Python 3.9 environment:
@ -45,10 +45,10 @@ First we recommend using [Conda](https://docs.conda.io/en/latest/miniconda.html)
conda create -n llm python=3.9
conda activate llm
pip install --pre --upgrade bigdl-llm[all] # install the latest bigdl-llm nightly build with 'all' option
pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
```
Then for running a LLM model with BigDL-LLM optimizations (taking an `example.py` an example):
Then, for running an LLM model with IPEX-LLM optimizations (taking `example.py` as an example):
```eval_rst
.. tabs::

View file

@ -1,15 +1,15 @@
# BigDL-LLM Installation: GPU
# IPEX-LLM Installation: GPU
## Windows
### Prerequisites
BigDL-LLM on Windows supports Intel iGPU and dGPU.
IPEX-LLM on Windows supports Intel iGPU and dGPU.
```eval_rst
.. important::
BigDL-LLM on Windows only supports PyTorch 2.1.
IPEX-LLM on Windows only supports PyTorch 2.1.
```
To apply Intel GPU acceleration, there are several prerequisite steps for tool installation and environment preparation:
@ -40,28 +40,28 @@ Intel® oneAPI Base Toolkit 2024.0 installation methods:
Activating your working conda environment will automatically configure oneAPI environment variables.
```
### Install BigDL-LLM From PyPI
### Install IPEX-LLM From PyPI
We recommend using [miniconda](https://docs.conda.io/en/latest/miniconda.html) to create a Python 3.9 environment:
```eval_rst
.. important::
``bigdl-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.9 is recommended for best practices.
``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.9 is recommended for best practices.
```
The easiest ways to install `bigdl-llm` is the following commands:
The easiest way to install `ipex-llm` is with the following commands:
```
conda create -n llm python=3.9 libuv
conda activate llm
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install --pre --upgrade ipex-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
```
### Install BigDL-LLM From Wheel
### Install IPEX-LLM From Wheel
If you encounter network issues when installing IPEX, you can also install BigDL-LLM dependencies for Intel XPU from source archives. First you need to download and install torch/torchvision/ipex from wheels listed below before installing `bigdl-llm`.
If you encounter network issues when installing IPEX, you can also install IPEX-LLM dependencies for Intel XPU from source archives. First, you need to download and install torch/torchvision/ipex from the wheels listed below before installing `ipex-llm`.
Download the wheels on Windows system:
@ -71,14 +71,14 @@ wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/torchv
wget https://intel-extension-for-pytorch.s3.amazonaws.com/ipex_stable/xpu/intel_extension_for_pytorch-2.1.10%2Bxpu-cp39-cp39-win_amd64.whl
```
You may install dependencies directly from the wheel archives and then install `bigdl-llm` using following commands:
You may install dependencies directly from the wheel archives and then install `ipex-llm` using the following commands:
```
pip install torch-2.1.0a0+cxx11.abi-cp39-cp39-win_amd64.whl
pip install torchvision-0.16.0a0+cxx11.abi-cp39-cp39-win_amd64.whl
pip install intel_extension_for_pytorch-2.1.10+xpu-cp39-cp39-win_amd64.whl
pip install --pre --upgrade bigdl-llm[xpu]
pip install --pre --upgrade ipex-llm[xpu]
```
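As a quick, hedged sanity check after the wheel installation (not part of the official steps), you can confirm that the XPU device is visible and that `ipex-llm` imports cleanly:

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401
from ipex_llm.transformers import AutoModel, AutoModelForCausalLM  # noqa: F401

print(torch.__version__)         # expected to match the installed 2.1.0a0+cxx11.abi wheel
print(torch.xpu.is_available())  # True once the GPU driver and oneAPI runtime are set up
```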
```eval_rst
@ -154,7 +154,7 @@ If you met error when importing `intel_extension_for_pytorch`, please ensure tha
### Prerequisites
BigDL-LLM GPU support on Linux has been verified on:
IPEX-LLM GPU support on Linux has been verified on:
* Intel Arc™ A-Series Graphics
* Intel Data Center GPU Flex Series
@ -163,7 +163,7 @@ BigDL-LLM GPU support on Linux has been verified on:
```eval_rst
.. important::
BigDL-LLM on Linux supports PyTorch 2.0 and PyTorch 2.1.
IPEX-LLM on Linux supports PyTorch 2.0 and PyTorch 2.1.
```
```eval_rst
@ -176,7 +176,7 @@ BigDL-LLM GPU support on Linux has been verified on:
.. tabs::
.. tab:: PyTorch 2.1
To enable BigDL-LLM for Intel GPUs with PyTorch 2.1, here are several prerequisite steps for tools installation and environment preparation:
To enable IPEX-LLM for Intel GPUs with PyTorch 2.1, here are several prerequisite steps for tools installation and environment preparation:
* Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.
@ -213,7 +213,7 @@ BigDL-LLM GPU support on Linux has been verified on:
.. note::
You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
You can continue with your working conda environment and install ``bigdl-llm`` as guided in the next section.
You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.
.. note::
@ -269,7 +269,7 @@ BigDL-LLM GPU support on Linux has been verified on:
.. tab:: PyTorch 2.0
To enable BigDL-LLM for Intel GPUs with PyTorch 2.0, here're several prerequisite steps for tools installation and environment preparation:
To enable IPEX-LLM for Intel GPUs with PyTorch 2.0, here are several prerequisite steps for tool installation and environment preparation:
* Step 1: Install Intel GPU Driver version >= stable_775_20_20231219. We highly recommend installing the latest version of intel-i915-dkms using apt.
@ -306,7 +306,7 @@ BigDL-LLM GPU support on Linux has been verified on:
.. note::
You can view the configured environment variables for your environment (e.g. with name ``llm``) by running ``conda env config vars list -n llm``.
You can continue with your working conda environment and install ``bigdl-llm`` as guided in the next section.
You can continue with your working conda environment and install ``ipex-llm`` as guided in the next section.
.. note::
@ -369,19 +369,19 @@ BigDL-LLM GPU support on Linux has been verified on:
sudo ./installer
```
### Install BigDL-LLM From PyPI
### Install IPEX-LLM From PyPI
We recommend using [miniconda](https://docs.conda.io/en/latest/miniconda.html) to create a Python 3.9 environment:
```eval_rst
.. important::
``bigdl-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.9 is recommended for best practices.
``ipex-llm`` is tested with Python 3.9, 3.10 and 3.11. Python 3.9 is recommended for best practices.
```
```eval_rst
.. important::
Make sure you install matching versions of BigDL-LLM/pytorch/IPEX and oneAPI Base Toolkit. BigDL-LLM with Pytorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. BigDL-LLM with Pytorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.
Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.
```
```eval_rst
@ -393,15 +393,15 @@ We recommend using [miniconda](https://docs.conda.io/en/latest/miniconda.html) t
conda create -n llm python=3.9
conda activate llm
pip install --pre --upgrade bigdl-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install --pre --upgrade ipex-llm[xpu] -f https://developer.intel.com/ipex-whl-stable-xpu
.. note::
The ``xpu`` option will install BigDL-LLM with PyTorch 2.1 by default, which is equivalent to
The ``xpu`` option will install IPEX-LLM with PyTorch 2.1 by default, which is equivalent to
.. code-block:: bash
pip install --pre --upgrade bigdl-llm[xpu_2.1] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install --pre --upgrade ipex-llm[xpu_2.1] -f https://developer.intel.com/ipex-whl-stable-xpu
.. tab:: PyTorch 2.0
@ -411,13 +411,13 @@ We recommend using [miniconda](https://docs.conda.io/en/latest/miniconda.html) t
conda create -n llm python=3.9
conda activate llm
pip install --pre --upgrade bigdl-llm[xpu_2.0] -f https://developer.intel.com/ipex-whl-stable-xpu
pip install --pre --upgrade ipex-llm[xpu_2.0] -f https://developer.intel.com/ipex-whl-stable-xpu
```
### Install BigDL-LLM From Wheel
### Install IPEX-LLM From Wheel
If you encounter network issues when installing IPEX, you can also install BigDL-LLM dependencies for Intel XPU from source archives. First you need to download and install torch/torchvision/ipex from wheels listed below before installing `bigdl-llm`.
If you encounter network issues when installing IPEX, you can also install IPEX-LLM dependencies for Intel XPU from source archives. First, you need to download and install torch/torchvision/ipex from the wheels listed below before installing `ipex-llm`.
```eval_rst
.. tabs::
@ -439,8 +439,8 @@ If you encounter network issues when installing IPEX, you can also install BigDL
pip install torchvision-0.16.0a0+cxx11.abi-cp39-cp39-linux_x86_64.whl
pip install intel_extension_for_pytorch-2.1.10+xpu-cp39-cp39-linux_x86_64.whl
# install bigdl-llm for Intel GPU
pip install --pre --upgrade bigdl-llm[xpu]
# install ipex-llm for Intel GPU
pip install --pre --upgrade ipex-llm[xpu]
.. tab:: PyTorch 2.0
@ -460,8 +460,8 @@ If you encounter network issues when installing IPEX, you can also install BigDL
pip install torchvision-0.15.2a0+cxx11.abi-cp39-cp39-linux_x86_64.whl
pip install intel_extension_for_pytorch-2.0.110+xpu-cp39-cp39-linux_x86_64.whl
# install bigdl-llm for Intel GPU
pip install --pre --upgrade bigdl-llm[xpu_2.0]
# install ipex-llm for Intel GPU
pip install --pre --upgrade ipex-llm[xpu_2.0]
```
@ -543,8 +543,8 @@ OSError: libmkl_intel_lp64.so.2: cannot open shared object file: No such file or
Error: libmkl_sycl_blas.so.4: cannot open shared object file: No such file or directory
```
The reason for such errors is that oneAPI has not been initialized properly before running BigDL-LLM code or before importing IPEX package.
The reason for such errors is that oneAPI has not been initialized properly before running IPEX-LLM code or before importing the IPEX package.
* For oneAPI installed using APT or Offline Installer, make sure you execute `setvars.sh` of oneAPI Base Toolkit before running BigDL-LLM.
* For oneAPI installed using APT or Offline Installer, make sure you execute `setvars.sh` of oneAPI Base Toolkit before running IPEX-LLM.
* For PIP-installed oneAPI, activate your working environment and run ``echo $LD_LIBRARY_PATH`` to check if the installation path is properly configured for the environment. If the output does not contain oneAPI path (e.g. ``~/intel/oneapi/lib``), check [Prerequisites](#id1) to re-install oneAPI with PIP installer.
* Make sure you install matching versions of BigDL-LLM/pytorch/IPEX and oneAPI Base Toolkit. BigDL-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. BigDL-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.
* Make sure you install matching versions of ipex-llm/pytorch/IPEX and oneAPI Base Toolkit. IPEX-LLM with PyTorch 2.1 should be used with oneAPI Base Toolkit version 2024.0. IPEX-LLM with PyTorch 2.0 should be used with oneAPI Base Toolkit version 2023.2.

View file

@ -1 +1 @@
# BigDL-LLM Known Issues
# IPEX-LLM Known Issues

View file

@ -1,14 +1,14 @@
# BigDL-LLM in 5 minutes
# IPEX-LLM in 5 minutes
You can use BigDL-LLM to run any [*Hugging Face Transformers*](https://huggingface.co/docs/transformers/index) PyTorch model. It automatically optimizes and accelerates LLMs using low-precision (INT4/INT5/INT8) techniques, modern hardware accelerations and latest software optimizations.
You can use IPEX-LLM to run any [*Hugging Face Transformers*](https://huggingface.co/docs/transformers/index) PyTorch model. It automatically optimizes and accelerates LLMs using low-precision (INT4/INT5/INT8) techniques, modern hardware accelerations and the latest software optimizations.
Hugging Face transformers-based applications can run on BigDL-LLM with one-line code change, and you'll immediately observe significant speedup<sup><a href="#footnote-perf" id="ref-perf">[1]</a></sup>.
Hugging Face transformers-based applications can run on IPEX-LLM with a one-line code change, and you'll immediately observe significant speedup<sup><a href="#footnote-perf" id="ref-perf">[1]</a></sup>.
Here, let's take a relatively small LLM model, i.e [open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2), and BigDL-LLM INT4 optimizations as an example.
Here, let's take a relatively small LLM model, i.e. [open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2), and IPEX-LLM INT4 optimizations as an example.
## Load a Pretrained Model
Simply use one-line `transformers`-style API in `bigdl-llm` to load `open_llama_3b_v2` with INT4 optimization (by specifying `load_in_4bit=True`) as follows:
Simply use one-line `transformers`-style API in `ipex-llm` to load `open_llama_3b_v2` with INT4 optimization (by specifying `load_in_4bit=True`) as follows:
```python
from ipex_llm.transformers import AutoModelForCausalLM
@ -20,7 +20,7 @@ model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path="open
```eval_rst
.. tip::
`open_llama_3b_v2 <https://huggingface.co/openlm-research/open_llama_3b_v2>`_ is a pretrained large language model hosted on Hugging Face. ``openlm-research/open_llama_3b_v2`` is its Hugging Face model id. ``from_pretrained`` will automatically download the model from Hugging Face to a local cache path (e.g. ``~/.cache/huggingface``), load the model, and converted it to ``bigdl-llm`` INT4 format.
`open_llama_3b_v2 <https://huggingface.co/openlm-research/open_llama_3b_v2>`_ is a pretrained large language model hosted on Hugging Face. ``openlm-research/open_llama_3b_v2`` is its Hugging Face model id. ``from_pretrained`` will automatically download the model from Hugging Face to a local cache path (e.g. ``~/.cache/huggingface``), load the model, and convert it to ``ipex-llm`` INT4 format.
It may take a long time to download the model using the API. You can also download the model yourself, and set ``pretrained_model_name_or_path`` to the local path of the downloaded model. This way, ``from_pretrained`` will load and convert directly from the local path without downloading.
```
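As a hedged illustration of the tip above (the local paths are hypothetical, and the `save_low_bit`/`load_low_bit` helpers are assumptions to verify against your installed version), you can load from a local copy once and reuse the converted weights afterwards:

```python
from ipex_llm.transformers import AutoModelForCausalLM

# first run: load from a locally downloaded copy and convert to INT4
model = AutoModelForCausalLM.from_pretrained("/path/to/open_llama_3b_v2",  # hypothetical local path
                                             load_in_4bit=True)
model.save_low_bit("/path/to/open_llama_3b_v2-int4")      # persist the converted weights

# later runs: reload the already-converted model directly, skipping the conversion
model = AutoModelForCausalLM.load_low_bit("/path/to/open_llama_3b_v2-int4")
```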
@ -62,7 +62,7 @@ with torch.inference_mode():
<div>
<p>
<sup><a href="#ref-perf" id="footnote-perf">[1]</a>
Performance varies by use, configuration and other factors. <code><span>bigdl-llm</span></code> may not optimize to the same degree for non-Intel products. Learn more at <a href="https://www.Intel.com/PerformanceIndex">www.Intel.com/PerformanceIndex</a>.
Performance varies by use, configuration and other factors. <code><span>ipex-llm</span></code> may not optimize to the same degree for non-Intel products. Learn more at <a href="https://www.Intel.com/PerformanceIndex">www.Intel.com/PerformanceIndex</a>.
</sup>
</p>
</div>

View file

@ -1,10 +1,10 @@
# BigDL-LLM Benchmarking
# IPEX-LLM Benchmarking
We can do benchmarking for BigDL-LLM on Intel CPUs and GPUs using the benchmark scripts we provide.
We can do benchmarking for IPEX-LLM on Intel CPUs and GPUs using the benchmark scripts we provide.
## Prepare The Environment
You can refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install.html) to install BigDL-LLM in your environment. The following dependencies are also needed to run the benchmark scripts.
You can refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install.html) to install IPEX-LLM in your environment. The following dependencies are also needed to run the benchmark scripts.
```
pip install pandas
@ -13,12 +13,12 @@ pip install omegaconf
## Prepare The Scripts
Navigate to your local workspace and then download BigDL from GitHub. Modify the `config.yaml` under `all-in-one` folder for your own benchmark configurations.
Navigate to your local workspace and then download IPEX-LLM from GitHub. Modify the `config.yaml` under the `all-in-one` folder for your own benchmark configurations.
```
cd your/local/workspace
git clone https://github.com/intel-analytics/BigDL.git
cd BigDL/python/llm/dev/benchmark/all-in-one/
git clone https://github.com/intel-analytics/ipex-llm.git
cd ipex-llm/python/llm/dev/benchmark/all-in-one/
```
## Configure YAML File
@ -55,7 +55,7 @@ Some parameters in the yaml file that you can configure:
## Run on Windows
Please refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration) to configure oneAPI environment variables.
Please refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration) to configure oneAPI environment variables.
```eval_rst
.. tabs::
@ -144,4 +144,4 @@ Please refer to [here](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/i
## Result
After the script finishes running, you can obtain a CSV result file under the current folder. You can mainly look at the results of the columns `1st token avg latency (ms)` and `2+ avg latency (ms/token)` for performance results. You can also check whether the column `actual input/output tokens` is consistent with the column `input/output tokens` and whether the parameters you specified in config.yaml have been successfully applied in the benchmarking.

View file

@ -1,6 +1,6 @@
# Install BigDL-LLM in Docker on Windows with Intel GPU
# Install IPEX-LLM in Docker on Windows with Intel GPU
This guide demonstrates how to install BigDL-LLM in Docker on Windows with Intel GPUs.
This guide demonstrates how to install IPEX-LLM in Docker on Windows with Intel GPUs.
It applies to Intel Core 12th-14th gen integrated GPUs (iGPUs) and Intel Arc Series GPUs.
@ -51,20 +51,20 @@ It applies to Intel Core Core 12 - 14 gen integrated GPUs (iGPUs) and Intel Arc
>Note: During the use of Docker in WSL, Docker Desktop needs to be kept open all the time.
## BigDL LLM Inference with XPU on Windows
### 1. Prepare bigdl-llm-xpu Docker Image
## IPEX-LLM Inference with XPU on Windows
### 1. Prepare ipex-llm-xpu Docker Image
Run the following command in WSL:
```bash
docker pull intelanalytics/bigdl-llm-xpu:2.5.0-SNAPSHOT
docker pull intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT
```
This step will take around 20 minutes depending on your network.
### 2. Start bigdl-llm-xpu Docker Container
### 2. Start ipex-llm-xpu Docker Container
To map the XPU into the container, an example script (docker_setup.sh) could be:
```bash
#!/bin/bash
export DOCKER_IMAGE=intelanalytics/bigdl-llm-xpu:2.5.0-SNAPSHOT
export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.5.0-SNAPSHOT
export CONTAINER_NAME=my_container
export MODEL_PATH=/llm/models[change to your model path]
@ -115,7 +115,7 @@ root@docker-desktop:/# sycl-ls
The output is similar to this:
```bash
Human: What is AI?
BigDL-LLM:
IPEX-LLM:
AI, or Artificial Intelligence, refers to the development of computer systems or machines that can perform tasks that typically require human intelligence. These systems are designed to learn from data and make decisions, or take actions, based on that data.
```

View file

@ -1,4 +1,4 @@
BigDL-LLM Quickstart
IPEX-LLM Quickstart
================================
.. note::
@ -7,9 +7,9 @@ BigDL-LLM Quickstart
This section includes efficient guide to show you how to:
* `Install BigDL-LLM on Linux with Intel GPU <./install_linux_gpu.html>`_
* `Install BigDL-LLM on Windows with Intel GPU <./install_windows_gpu.html>`_
* `Install BigDL-LLM in Docker on Windows with Intel GPU <./docker_windows_gpu.html>`_
* `Install IPEX-LLM on Linux with Intel GPU <./install_linux_gpu.html>`_
* `Install IPEX-LLM on Windows with Intel GPU <./install_windows_gpu.html>`_
* `Install IPEX-LLM in Docker on Windows with Intel GPU <./docker_windows_gpu.html>`_
* `Use Text Generation WebUI on Windows with Intel GPU <./webui_quickstart.html>`_
* `Conduct Performance Benchmarking with BigDL-LLM <./benchmark_quickstart.html>`_
* `Use llama.cpp with BigDL-LLM on Intel GPU <./llama_cpp_quickstart.html>`_
* `Conduct Performance Benchmarking with IPEX-LLM <./benchmark_quickstart.html>`_
* `Use llama.cpp with IPEX-LLM on Intel GPU <./llama_cpp_quickstart.html>`_

View file

@ -1,8 +1,8 @@
# Install BigDL-LLM on Linux with Intel GPU
# Install IPEX-LLM on Linux with Intel GPU
This guide demonstrates how to install BigDL-LLM on Linux with Intel GPUs. It applies to Intel Data Center GPU Flex Series and Max Series, as well as Intel Arc Series GPU.
This guide demonstrates how to install IPEX-LLM on Linux with Intel GPUs. It applies to Intel Data Center GPU Flex Series and Max Series, as well as Intel Arc Series GPU.
BigDL-LLM currently supports the Ubuntu 20.04 operating system and later, and supports PyTorch 2.0 and PyTorch 2.1 on Linux. This page demonstrates BigDL-LLM with PyTorch 2.1. Check the [Installation](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux) page for more details.
IPEX-LLM currently supports the Ubuntu 20.04 operating system and later, and supports PyTorch 2.0 and PyTorch 2.1 on Linux. This page demonstrates IPEX-LLM with PyTorch 2.1. Check the [Installation](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux) page for more details.
## Install Intel GPU Driver
@ -91,14 +91,14 @@ Install the Miniconda as follows if you don't have conda installed on your machi
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/basekit.png" alt="image-20240221102252565" width=100%; />
## Install `bigdl-llm`
## Install `ipex-llm`
* With the `llm` environment active, use `pip` to install `bigdl-llm` for GPU:
* With the `llm` environment active, use `pip` to install `ipex-llm` for GPU:
```
conda create -n llm python=3.9
conda activate llm
pip install --pre --upgrade bigdl-llm[xpu] --extra-index-url https://developer.intel.com/ipex-whl-stable-xpu
pip install --pre --upgrade ipex-llm[xpu] --extra-index-url https://developer.intel.com/ipex-whl-stable-xpu
```
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/create_conda_env.png" alt="image-20240221102252564" width=100%; />
@ -106,7 +106,7 @@ Install the Miniconda as follows if you don't have conda installed on your machi
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/create_conda_env.png" alt="image-20240221102252564" width=100%; />
* You can verify if bigdl-llm is successfully installed by simply importing a few classes from the library. For example, execute the following import command in the terminal:
* You can verify if ipex-llm is successfully installed by simply importing a few classes from the library. For example, execute the following import command in the terminal:
```bash
source /opt/intel/oneapi/setvars.sh
@ -115,7 +115,7 @@ Install the Miniconda as follows if you don't have conda installed on your machi
> from ipex_llm.transformers import AutoModel, AutoModelForCausalLM
```
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/verify_bigdl_import.png" alt="image-20240221102252562" width=100%; />
> <img src="https://llm-assets.readthedocs.io/en/latest/_images/verify_ipex_import.png" alt="image-20240221102252562" width=100%; />
## Runtime Configurations
@ -157,7 +157,7 @@ Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface
conda activate llm
```
* Step 2: If you're running on iGPU, set some environment variables by running the commands below:
> For more details about runtime configurations, refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
> For more details about runtime configurations, refer to [this guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration):
```bash
# Skip this step for PIP-installed oneAPI since the environment has already been configured in LD_LIBRARY_PATH.
source /opt/intel/oneapi/setvars.sh
@ -175,7 +175,7 @@ Now let's play with a real LLM. We'll be using the [phi-1.5](https://huggingface
generation_config = GenerationConfig(use_cache = True)
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b", trust_remote_code=True)
# load Model using bigdl-llm and load it to GPU
# load Model using ipex-llm and load it to GPU
model = AutoModelForCausalLM.from_pretrained(
"tiiuae/falcon-7b", load_in_4bit=True, cpu_embedding=True, trust_remote_code=True)
model = model.to('xpu')
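# --- a hedged sketch of how generation might continue from here; the prompt,
# --- token budget, and synchronization call are illustrative, not from the guide ---
import torch  # safe to re-import if it was already imported earlier in the script

prompt = "What is AI?"
inputs = tokenizer(prompt, return_tensors="pt").to('xpu')
with torch.inference_mode():
    output = model.generate(**inputs,
                            max_new_tokens=32,
                            generation_config=generation_config)
    torch.xpu.synchronize()  # wait for GPU work to finish before decoding
print(tokenizer.decode(output[0], skip_special_tokens=True))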
