[Nano] Correct typos in ReadtheDocs (#5250)
* Correct typos for Nano User Guide page
* Correct typos for BigDL-Nano PyTorch Training Overview page
* Correct typos for BigDL-Nano PyTorch Inference Overview page
* Correct typos for BigDL-Nano PyTorch ONNXRuntime Acceleration Quickstart page
* Correct typos for Windows User Guide page
* Correct typos for BigDL-Nano TensorFlow Training Overview page
* Correct typos for BigDL-Nano TensorFlow Inference Overview page
* Small fix based on Shengsheng's comments
* Fix based on Wang, Yang's comments
Parent: 70b2070e4d
Commit: 7ef605b58f

7 changed files with 55 additions and 55 deletions
@@ -2,16 +2,16 @@
 ## **1. Overview**
-BigDL Nano is a python package to transparently accelerate PyTorch and TensorFlow applications on Intel hardware. It provides a unified and easy-to-use API for several optimization techniques and tools, so that users can only apply a few lines of code changes to make their PyTorch or TensorFlow code run faster.
+BigDL Nano is a Python package to transparently accelerate PyTorch and TensorFlow applications on Intel hardware. It provides a unified and easy-to-use API for several optimization techniques and tools, so that users can only apply a few lines of code changes to make their PyTorch or TensorFlow code run faster.
 ---
 ## **2. Install**
-Note: For windows Users, we recommend using Windows Subsystem for Linux 2 (WSL2) to run BigDL-Nano. Please refer [here](./windows_guide.md) for instructions.
+Note: For windows users, we recommend using Windows Subsystem for Linux 2 (WSL2) to run BigDL-Nano. Please refer [here](./windows_guide.md) for instructions.
 BigDL-Nano can be installed using pip and we recommend installing BigDL-Nano in a conda environment.
-For PyTorch Users, you can install bigdl-nano along with some dependencies specific to PyTorch using the following command.
+For PyTorch Users, you can install bigdl-nano along with some dependencies specific to PyTorch using the following commands.
 ```bash
 conda create -n env
@@ -19,7 +19,7 @@ conda activate env
 pip install bigdl-nano[pytorch]
 ```
-For TensorFlow users, you can install bigdl-nano along with some dependencies specific to TensorFlow using the following command.
+For TensorFlow users, you can install bigdl-nano along with some dependencies specific to TensorFlow using the following commands.
 ```bash
 conda create -n env
@@ -35,7 +35,7 @@ source bigdl-nano-init
 The `bigdl-nano-init` script will export a few environment variables according to your hardware to maximize performance.
-In a conda environment, this will also add this script to `$CONDA_PREFIX/etc/conda/activate.d/`, which will automaticly run when you activate your current environment.
+In a conda environment, `source bigdl-nano-init` will also be added to `$CONDA_PREFIX/etc/conda/activate.d/`, which will automaticly run when you activate your current environment.
 In a pure pip environment, you need to run `source bigdl-nano-init` every time you open a new shell to get optimal performance and run `source bigdl-nano-unset-env` if you want to unset these environment variables.
@@ -45,11 +45,11 @@ In a pure pip environment, you need to run `source bigdl-nano-init` every time y
 #### **3.1 PyTorch**
-BigDL-Nano supports both PyTorch and PyTorch Lightning models and most optimizations requires only changing a few "import" lines in your code and adding a few flags.
+BigDL-Nano supports both PyTorch and PyTorch Lightning models and most optimizations require only changing a few "import" lines in your code and adding a few flags.
 BigDL-Nano uses an extended version of the PyTorch Lightning trainer for integrating our optimizations.
-For example, if you are using a LightingModule, you can use the following code to enable intel-extension-for-pytorch and multi-instance training.
+For example, if you are using a LightningModule, you can use the following code snippet to enable intel-extension-for-pytorch and multi-instance training.
 ```python
 from bigdl.nano.pytorch import Trainer
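# --- A minimal, hedged sketch completing the example truncated by this hunk.
# --- `LitModel`, the random dataset and the exact argument values are illustrative
# --- assumptions, not code taken from the original page.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import LightningModule

class LitModel(LightningModule):
    """Tiny LightningModule used only to make the snippet runnable."""
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(10, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.cross_entropy(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

train_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))), batch_size=8)

# use_ipex=True turns on Intel Extension for PyTorch; num_processes>1 enables
# multi-instance training, as described in the paragraph above.
trainer = Trainer(max_epochs=1, use_ipex=True, num_processes=2)
trainer.fit(LitModel(), train_loader)
```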
@@ -76,11 +76,11 @@ For more details on the BigDL-Nano's PyTorch usage, please refer to the [PyTorch
 ### **3.2 TensorFlow**
-BigDL-Nano supports `tensorflow.keras` API and most optimizations requires only changing a few "import" lines in your code and adding a few flags.
+BigDL-Nano supports `tensorflow.keras` API and most optimizations require only changing a few "import" lines in your code and adding a few flags.
 BigDL-Nano uses an extended version of `tf.keras.Model` or `tf.keras.Sequential` for integrating our optimizations.
-For example, you can conduct a multi-instance training using the following code:
+For example, you can conduct a multi-instance training using the following lines of code:
 ```python
 import tensorflow as tf
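# A minimal, hedged sketch completing the truncated example. The toy model, random
# data and the num_processes keyword on fit() are illustrative assumptions based on
# the TensorFlow training page changed later in this same commit.
import numpy as np
from bigdl.nano.tf.keras import Sequential  # drop-in replacement for tf.keras.Sequential

model = Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

x = np.random.rand(128, 10).astype('float32')
y = np.random.rand(128, 1).astype('float32')

# num_processes launches multi-instance training on a single machine
model.fit(x, y, epochs=1, batch_size=16, num_processes=2)
```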
@@ -18,7 +18,7 @@ To learn more about installation of WSL2, please Follow [this guide](https://doc
 ## Step 2: Install conda in WSL2
-Start a new WSL2 window and setup the user information. Then download and install the conda.
+Start a new WSL2 window and set up the user information. Then download and install the conda.
 ```bash
 wget https://repo.continuum.io/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh
@@ -28,7 +28,7 @@ chmod +x Miniconda3-4.5.4-Linux-x86_64.sh
 ## Step 3: Create a BigDL-Nano env
-Use conda to create a new environment. For example, use `bigdl-nano` as the new environemnt name:
+Use conda to create a new environment. For example, use `bigdl-nano` as the new environment name:
 ```bash
 conda create -n bigdl-nano
@@ -36,9 +36,9 @@ conda activate bigdl-nano
 ```
-## Step 4: Install BigDL Nano from Pypi
+## Step 4: Install BigDL-Nano from Pypi
-You can install BigDL nano from Pypi with `pip`. Specifically, for PyTroch extensions, please run:
+You can install BigDL-Nano from Pypi with `pip`. Specifically, for PyTorch extensions, please run:
 ```
 pip install bigdl-nano[pytorch]
@@ -1,15 +1,15 @@
 # BigDL-Nano PyTorch Inference Overview
-BigDL-Nano provides several APIs which can help users easily apply optimizations on inference pipelines to improve latency and throughput. Currently, performance accelerations are achieved by integrating extra runtimes as inference backend engines or using quantization methods on full-precision trained models to reduce computation during inference. Trainer (`bigdl.nano.pytorch.Trainer`) provides the APIs for all optimizations you need for inference.
+BigDL-Nano provides several APIs which can help users easily apply optimizations on inference pipelines to improve latency and throughput. Currently, performance accelerations are achieved by integrating extra runtimes as inference backend engines or using quantization methods on full-precision trained models to reduce computation during inference. Trainer (`bigdl.nano.pytorch.Trainer`) provides the APIs for all optimizations that you need for inference.
 For runtime acceleration, BigDL-Nano has enabled two kinds of runtime for users in `Trainer.trace()`, ONNXRuntime and OpenVINO.
-For quantization, BigDL-Nano provides only post-training quantization in `trainer.quantize()` for users to infer with models of 8-bit precision. Quantization-Aware Training is not available for now. Model conversion to 16-bit like BF16, and FP16 will be coming soon.
+For quantization, BigDL-Nano provides only post-training quantization in `trainer.quantize()` for users to infer with models of 8-bit precision. Quantization-aware training is not available for now. Model conversion to 16-bit like BF16, and FP16 will be coming soon.
 Before you go ahead with these APIs, you have to make sure BigDL-Nano is correctly installed for PyTorch. If not, please follow [this](../Overview/nano.md) to set up your environment.
 ## Runtime Acceleration
-All available runtime accelerations are integrated in `Trainer.trace(accelerator='onnxruntime'/'openvino')` with different accelerator value. Before you know about BigDL-Nano and any optimizations on inference, taking mobilenetv3 as an example, you may have one script for training and inference like this:
+All available runtime accelerations are integrated in `Trainer.trace(accelerator='onnxruntime'/'openvino')` with different accelerator values. Let's take mobilenetv3 as an example model and here is a short script that you might have before applying any BigDL-Nano's optimizations:
 ```python
 from torchvision.models.mobilenetv3 import mobilenet_v3_small
 import torch
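# A hedged, condensed sketch of how the truncated mobilenetv3 script continues;
# the dummy tensors and shapes are illustrative assumptions, not the original code.
from bigdl.nano.pytorch import Trainer

model = mobilenet_v3_small(num_classes=10)
x = torch.rand(2, 3, 224, 224)

# plain PyTorch inference
y_hat = model(x)

# ONNXRuntime-accelerated inference through BigDL-Nano's Trainer.trace()
ort_model = Trainer.trace(model, accelerator='onnxruntime', input_sample=x)
y_hat_ort = ort_model(x)
```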
@@ -37,7 +37,7 @@ trainer.test(ort_model, dataloader)
 trainer.predict(ort_model, dataloader)
 ```
 ### ONNXRuntime Acceleration
-Before you start with onnxruntime accelerator, you are required to install some onnx packages as follows to set up your environment with ONNXRuntime acceleration.
+Before you start with ONNXRuntime accelerator, you are required to install some ONNX packages as follows to set up your environment with ONNXRuntime acceleration.
 ```shell
 pip install onnx onnxruntime
 ```
@@ -70,7 +70,7 @@ The OpenVINO usage is quite similar to ONNXRuntime, the following usage is for O
 ov_model = Trainer.trace(model, accelerator='openvino', input_sample=x)
 # step 5: use returned model for transparent acceleration
-# The usage is almost the same with any pytorch module
+# The usage is almost the same with any PyTorch module
 y_hat = ov_model(x)
 # validate, predict, test in Trainer also support acceleration
@@ -83,11 +83,11 @@ trainer.predict(ort_model, dataloader)
 ```
 ## Quantization
-Quantization is widely used to compress models to a lower precision, which not only reduces the model size but also accelerates inference. BigDL-Nano provides `Trainer.quantize()` API for users to quickly obtain a quantized model with accuracy control by specifying a few arguments. Intel Neural Compressor (INC) and Post-training Optimization Tools (POT) from OpenVINO toolkit are enabled as options. In the meantime, runtime acceleration is also included directly in the quantization pipeline when using accelerator='onnxruntime'/'openvino' so you don't have to run `Trainer.trace` before quantization.
+Quantization is widely used to compress models to a lower precision, which not only reduces the model size but also accelerates inference. BigDL-Nano provides `Trainer.quantize()` API for users to quickly obtain a quantized model with accuracy control by specifying a few arguments. Intel Neural Compressor (INC) and Post-training Optimization Tools (POT) from OpenVINO toolkit are enabled as options. In the meantime, runtime acceleration is also included directly in the quantization pipeline when using `accelerator='onnxruntime'/'openvino'` so you don't have to run `Trainer.trace` before quantization.
-To use INC as your quantization engine, you can choose accelerator as None or 'onnxruntime'. Otherwise, accelerator='openvino' means using OpenVINO POT to do quantization.
+To use INC as your quantization engine, you can choose accelerator as `None` or `'onnxruntime'`. Otherwise, `accelerator='openvino'` means using OpenVINO POT to do quantization.
-By default, `Trainer.quantize()` doesn't search the tuning space and returns the fully-quantized model without considering the accuracy drop. If you need to search quantization tuning space for a model with accuracy control, you'll have to specify a few arguments to define the tuning space. More instructions in [Quantization with Accuracy control](#quantization-with-accuracy-control)
+By default, `Trainer.quantize()` doesn't search the tuning space and returns the fully-quantized model without considering the accuracy drop. If you need to search quantization tuning space for a model with accuracy control, you'll have to specify a few arguments to define the tuning space. More instructions in [Quantization with Accuracy Control](#quantization-with-accuracy-control)
 ### Quantization using Intel Neural Compressor
 By default, Intel Neural Compressor is not installed with BigDL-Nano. So if you determine to use it as your quantization backend, you'll need to install it first:
@@ -96,7 +96,7 @@ pip install neural-compressor==1.11.0
 ```
 **Quantization without extra accelerator**
-Without extra accelerator, `Trainer.quantize()` returns a pytorch module with desired precision and accuracy. Following the example in [Runtime Acceleration](#runtime-acceleration), you can add quantization as below:
+Without extra accelerator, `Trainer.quantize()` returns a PyTorch module with desired precision and accuracy. Following the example in [Runtime Acceleration](#runtime-acceleration), you can add quantization as below:
 ```python
 q_model = trainer.quantize(model, calib_dataloader=dataloader)
 # run simple prediction with transparent acceleration
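# A hedged completion of the snippet truncated above, assuming the `trainer`, `x`
# and `dataloader` objects from the mobilenetv3 example earlier on this page.
y_hat = q_model(x)
trainer.validate(q_model, dataloader)
trainer.test(q_model, dataloader)
```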
@@ -111,7 +111,7 @@ This is a most basic usage to quantize a model with defaults, INT8 precision, an
 **Quantization with ONNXRuntime accelerator**
-Without the ONNXRuntime accelerator, `Trainer.quantize()` will return a model with compressed precision but running inference in the ONNXRuntime engine. It's also required to install onnxruntime-extensions as a dependency of INC when using ONNXRuntime as backend as well as the dependencies required in [ONNXRuntime Acceleration](#onnxruntime-acceleration):
+With the ONNXRuntime accelerator, `Trainer.quantize()` will return a model with compressed precision and running inference in the ONNXRuntime engine. It's also required to install onnxruntime-extensions as a dependency of INC when using ONNXRuntime as backend as well as the dependencies required in [ONNXRuntime Acceleration](#onnxruntime-acceleration):
 ```shell
 pip install onnx onnxruntime onnxruntime-extensions
 ```
@@ -126,7 +126,7 @@ trainer.validate(ort_q_model, dataloader)
 trainer.test(ort_q_model, dataloader)
 trainer.predict(ort_q_model, dataloader)
 ```
-Using accelerator='onnxruntime' actually equals to converting the model from Pytorch to ONNX firstly and then do quantization on the converted ONNX model:
+Using `accelerator='onnxruntime'` actually equals to converting the model from PyTorch to ONNX firstly and then do quantization on the converted ONNX model:
 ```python
 ort_model = Trainer.trace(model, accelerator='onnxruntime', input_sample=x)
 ort_q_model = trainer.quantize(ort_model, accelerator='onnxruntime', calib_dataloader=dataloader)
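# A hedged completion of the two-step route above (objects reused from the earlier
# sketches on this page): the returned quantized model runs on ONNXRuntime.
y_hat = ort_q_model(x)
trainer.validate(ort_q_model, dataloader)
```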
@@ -139,7 +139,7 @@ trainer.predict(ort_q_model, dataloader)
 ```
 ### Quantization using Post-training Optimization Tools
-The POT(Post-training Optimization Tools) is provided by OpenVINO toolkit. To use POT, you need to install OpenVINO as the same in [OpenVINO acceleration](#openvino-acceleration):
+The POT (Post-training Optimization Tools) is provided by OpenVINO toolkit. To use POT, you need to install OpenVINO as the same in [OpenVINO acceleration](#openvino-acceleration):
 ```shell
 pip install openvino-dev
 ```
@@ -154,7 +154,7 @@ trainer.validate(ov_q_model, dataloader)
 trainer.test(ov_q_model, dataloader)
 trainer.predict(ov_q_model, dataloader)
 ```
-Same as ONNXRuntime, it equals to converting the model from Pytorch to OpenVINO firstly and then doing quantization on the converted OpenVINO model:
+Same as using ONNXRuntime accelerator, it equals to converting the model from PyTorch to OpenVINO firstly and then doing quantization on the converted OpenVINO model:
 ```python
 ov_model = Trainer.trace(model, accelerator='openvino', input_sample=x)
 ov_q_model = trainer.quantize(ov_model, accelerator='openvino', calib_dataloader=dataloader)
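# A hedged completion (objects reused from the sketches above): the POT-quantized
# OpenVINO model is used the same way as the other accelerated models on this page.
y_hat = ov_q_model(x)
trainer.validate(ov_q_model, dataloader)
```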
@@ -179,9 +179,9 @@ A set of arguments that helps to tune the results for both INC and POT quantizat
 - `max_trials`: Maximum trials on the search, if the algorithm can't find a satisfying model, it will exit and raise the error.
 **Accuracy Control with INC**
-There are a few arguments that require only by INC, and you should not specify or modify any of them if you use `accelerator='openvino'`.
-- `tuning_strategy`(optional): it specifies the algorithm to search the tuning space. In most cases, you don't need to change it.
-- `timeout`: Timeout of your tuning. Defaults 0 means endless time for tuning.
+There are a few arguments required only by INC, and you should not specify or modify any of them if you use `accelerator='openvino'`.
+- `tuning_strategy` (optional): it specifies the algorithm to search the tuning space. In most cases, you don't need to change it.
+- `timeout`: Timeout of your tuning. Defaults `0` means endless time for tuning.
 Here is an example to use INC with accuracy control as below. It will search for a model within 1% accuracy drop with 10 trials.
 ```python
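# A hedged sketch of the accuracy-controlled call described above. The metric type
# is an assumption; the other argument names follow the option list in this hunk,
# and the objects are reused from the earlier sketches on this page.
from torchmetrics import Accuracy

q_model = trainer.quantize(
    model,
    calib_dataloader=dataloader,
    metric=Accuracy(),
    accuracy_criterion={'relative': 0.01, 'higher_is_better': True},
    max_trials=10,
)
```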
@@ -6,7 +6,7 @@
 We recommend using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to prepare the environment. Please refer to the [install guide](../../UserGuide/python.md) for more details.
 ```bash
-conda create py37 python==3.7.10 setuptools==58.0.4
+conda create -n py37 python==3.7.10 setuptools==58.0.4
 conda activate py37
 # nightly built version
 pip install --pre --upgrade bigdl-nano[pytorch]
@@ -14,7 +14,7 @@ pip install --pre --upgrade bigdl-nano[pytorch]
 source bigdl-nano-init
 ```
-Before you start with onnxruntime accelerator, you need to install some onnx packages as follows to set up your environment with ONNXRuntime acceleration.
+Before you start with ONNXRuntime accelerator, you need to install some ONNX packages as follows to set up your environment with ONNXRuntime acceleration.
 ```bash
 pip install onnx onnxruntime
 ```
@@ -85,5 +85,8 @@ ort_model = Trainer.trace(model_ft, accelerator="onnxruntime", input_sample=torc
 y_hat = ort_model(x)
 y_hat.argmax(dim=1)
 ```
-- Note
-`ort_model` is not trainable any more, so you can't use like trainer.fit(ort_model, dataloader)
+```eval_rst
+.. note::
+    ``ort_model`` is not trainable any more, so you cannot use it in ``fit`` such as ``trainer.fit(ort_model, dataloader)``.
+```
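Since the traced model cannot be trained, a hedged sketch of the intended workflow is: fine-tune the original model first, then trace a frozen copy for inference only. The `trainer`, `model_ft`, `dataloader` and `x` names mirror the objects used earlier in this quickstart; the input shape is an assumption.

```python
import torch
from bigdl.nano.pytorch import Trainer

# fine-tune the original, trainable model as usual
trainer.fit(model_ft, dataloader)

# trace a frozen copy for ONNXRuntime inference only
ort_model = Trainer.trace(model_ft, accelerator="onnxruntime",
                          input_sample=torch.rand(1, 3, 224, 224))
preds = ort_model(x).argmax(dim=1)

# trainer.fit(ort_model, dataloader)  # would fail: the traced model is not trainable
```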
@@ -6,7 +6,7 @@ We will briefly describe here the major features in BigDL-Nano for PyTorch train
 ### Best Known Configurations
-When you run `source bigdl-nano-init`, BigDL-Nano will export a few environment variables, such as OMP_NUM_THREADS and KMP_AFFINITY, according to your current hardware. Empirically, these environment variables work best for most PyTorch applications. After setting these environment variables, you can just run your applications as usual (`python app.py`) and no additional changes are required.
+When you run `source bigdl-nano-init`, BigDL-Nano will export a few environment variables, such as `OMP_NUM_THREADS` and `KMP_AFFINITY`, according to your current hardware. Empirically, these environment variables work best for most PyTorch applications. After setting these environment variables, you can just run your applications as usual (`python app.py`) and no additional changes are required.
 ### BigDL-Nano PyTorch Trainer
@@ -40,7 +40,7 @@ trainer.fit(lightning_module, train_loader)
 #### Intel® Extension for PyTorch
-Intel Extension for Pytorch (a.k.a. IPEX) [link](https://github.com/intel/intel-extension-for-pytorch) extends PyTorch with optimizations for an extra performance boost on Intel hardware. BigDL-Nano integrates IPEX through the `Trainer`. Users can turn on IPEX by setting `use_ipex=True`.
+[Intel Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch) (a.k.a. IPEX) extends PyTorch with optimizations for an extra performance boost on Intel hardware. BigDL-Nano integrates IPEX through the `Trainer`. Users can turn on IPEX by setting `use_ipex=True`.
 ```python
 from bigdl.nano.pytorch import Trainer
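# A hedged completion of the truncated snippet: enabling IPEX needs only the flag.
# The epoch count and the fit() call are illustrative; lightning_module and
# train_loader are the objects used elsewhere on this page.
trainer = Trainer(max_epochs=10, use_ipex=True)
trainer.fit(lightning_module, train_loader)
```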
@@ -58,7 +58,7 @@ from bigdl.nano.pytorch import Trainer
 trainer = Trainer(max_epoch=10, num_processes=4)
 ```
-Note that the effective batch size multi-instance training is the `batch_size` in your `dataloader` times `num_processes` so the number of iterations of each epoch will be reduced `num_processes` fold. A common practice to compensate for that is to gradually increase the learning rate to `num_processes` times. You can find more details of this trick in the [Facebook paper](https://arxiv.org/abs/1706.02677).
+Note that the effective batch size in multi-instance training is the `batch_size` in your `dataloader` times `num_processes` so the number of iterations of each epoch will be reduced `num_processes` fold. A common practice to compensate for that is to gradually increase the learning rate to `num_processes` times. You can find more details of this trick in this [paper](https://arxiv.org/abs/1706.02677) published by Facebook.
 ### BigDL-Nano PyTorch TorchNano
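To make the linear-scaling advice above concrete, a hedged sketch with hypothetical numbers (the learning-rate handling inside your LightningModule is up to you):

```python
from bigdl.nano.pytorch import Trainer

base_lr = 0.01
num_processes = 4

# linear scaling rule: the effective batch size grows by num_processes,
# so the learning rate is scaled up by the same factor (optionally with warmup)
scaled_lr = base_lr * num_processes

trainer = Trainer(max_epochs=10, num_processes=num_processes)
# pass scaled_lr to the optimizer defined in your LightningModule
```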
@@ -92,7 +92,7 @@ MyNano(use_ipex=True).train(...)
 MyNano(use_ipex=True, num_processes=2, strategy="subprocess").train(...)
 ```
-### Optimized Data pipeline
+### Optimized Data Pipeline
 Computer Vision tasks often need a data processing pipeline that sometimes constitutes a non-trivial part of the whole training pipeline. Leveraging OpenCV and libjpeg-turbo, BigDL-Nano can accelerate computer vision data pipelines by providing a drop-in replacement of torch_vision's `datasets` and `transforms`.
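As a rough illustration of the drop-in idea described above. The module paths below are an assumption mirroring torchvision's layout and should be checked against the Nano API docs; the image folder path is a placeholder.

```python
# torchvision version:
#   from torchvision import transforms
#   from torchvision.datasets import ImageFolder
# hypothetical Nano drop-in with the same call signatures:
from bigdl.nano.pytorch.vision import transforms
from bigdl.nano.pytorch.vision.datasets import ImageFolder

data_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
dataset = ImageFolder("path/to/images", transform=data_transform)
```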
@@ -1,16 +1,16 @@
 # BigDL-Nano TensorFlow Inference Overview
-BigDL-Nano provides several APIs which can help users easily apply optimizations on inference pipelines to improve latency and throughput. Currently, performance accelerations are achieved by integrating extra runtimes as inference backend engines or using quantization methods on full-precision trained models to reduce computation during inference. Keras Model (`bigdl.nano.tf.keras.Model`) and Sequential (`bigdl.nano.tf.keras.Sequential`) provides the APIs for all optimizations you need for inference.
+BigDL-Nano provides several APIs which can help users easily apply optimizations on inference pipelines to improve latency and throughput. Currently, performance accelerations are achieved by integrating extra runtimes as inference backend engines or using quantization methods on full-precision trained models to reduce computation during inference. Keras Model (`bigdl.nano.tf.keras.Model`) and Sequential (`bigdl.nano.tf.keras.Sequential`) provides the APIs for all optimizations that you need for inference.
 For quantization, BigDL-Nano provides only post-training quantization in `Model.quantize()` for users to infer with models of 8-bit precision. Quantization-Aware Training is not available for now. Model conversion to 16-bit like BF16, and FP16 will be coming soon.
-Before you go ahead with these APIs, you have to make sure BigDL-Nano is correctly installed for Tensorflow. If not, please follow [this](../Overview/nano.md) to set up your environment.
+Before you go ahead with these APIs, you have to make sure BigDL-Nano is correctly installed for TensorFlow. If not, please follow [this](../Overview/nano.md) to set up your environment.
 ## Quantization
-Quantization is widely used to compress models to a lower precision, which not only reduces the model size but also accelerates inference. BigDL-Nano provides `Model.quantize()` API for users to quickly obtain a quantized model with accuracy control by specifying a few arguments. `Sequential` has similar usage, so we will only show how to use an instance of `Model` to enable quantization pipeline.
+Quantization is widely used to compress models to a lower precision, which not only reduces the model size but also accelerates inference. BigDL-Nano provides `Model.quantize()` API for users to quickly obtain a quantized model with accuracy control by specifying a few arguments. `Sequential` has similar usage, so we will only show how to use an instance of `Model` to enable quantization pipeline here.
-To use INC as your quantization engine, you can choose accelerator as None or 'onnxruntime'. Otherwise, accelerator='openvino' means using OpenVINO POT to do quantization.
+To use INC as your quantization engine, you can choose accelerator as `None` or `'onnxruntime'`. Otherwise, `accelerator='openvino'` means using OpenVINO POT to do quantization.
-By default, `Model.quantize()` doesn't search the tuning space and returns the fully-quantized model without considering the accuracy drop. If you need to search quantization tuning space for a model with accuracy control, you'll have to specify a few arguments to define the tuning space. More instructions in [Quantization with Accuracy control](#quantization-with-accuracy-control)
+By default, `Model.quantize()` doesn't search the tuning space and returns the fully-quantized model without considering the accuracy drop. If you need to search quantization tuning space for a model with accuracy control, you'll have to specify a few arguments to define the tuning space. More instructions in [Quantization with Accuracy Control](#quantization-with-accuracy-control)
 ### Quantization using Intel Neural Compressor
 By default, Intel Neural Compressor is not installed with BigDL-Nano. So if you determine to use it as your quantization backend, you'll need to install it first:
@@ -18,7 +18,8 @@ By default, Intel Neural Compressor is not installed with BigDL-Nano. So if you
 # We have tested on neural-compressor>=1.8.1,<=1.11.0
 pip install 'neural-compressor>=1.8.1,<=1.11.0'
 ```
-**Quantization without extra accelerator**
+**Quantization without extra accelerator**
 Without extra accelerators, `Model.quantize()` returns a Keras module with desired precision and accuracy. Taking MobileNetV2 as an example, you can add quantization as below:
 ```python
 import tensorflow as tf
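# A hedged sketch completing the truncated MobileNetV2 example: the wrapped Model,
# the random calibration dataset and its batching are illustrative assumptions.
import numpy as np
from bigdl.nano.tf.keras import Model

base = tf.keras.applications.MobileNetV2(weights=None, input_shape=(224, 224, 3), classes=10)
model = Model(inputs=base.inputs, outputs=base.outputs)

calib_ds = tf.data.Dataset.from_tensor_slices(
    (np.random.rand(32, 224, 224, 3).astype('float32'),
     np.random.randint(0, 10, size=(32,)))).batch(8)

# post-training quantization with INC (the default engine when accelerator is None)
q_model = model.quantize(calib_dataset=calib_ds)
```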
@@ -56,7 +57,7 @@ of INC.
 ### Quantization with Accuracy Control
 A set of arguments that helps to tune the results for both INC and POT quantization:
-- `calib_dataset`: A tf.data.Dataset object for calibration. Required for static quantization. It's also used as a validation dataloader.
+- `calib_dataset`: A `tf.data.Dataset` object for calibration. Required for static quantization. It's also used as a validation dataloader.
 - `metric`: A `tensorflow.keras.metrics.Metric` object for evaluation.
 - `accuracy_criterion`: A dictionary to specify the acceptable accuracy drop, e.g. `{'relative': 0.01, 'higher_is_better': True}`
@@ -67,9 +68,9 @@ A set of arguments that helps to tune the results for both INC and POT quantizat
 - `batch`: Specify the batch size of the dataloader. This will only take effect on evaluation. If it's not set, then we use `batch=1` for evaluation.
 **Accuracy Control with INC**
-There are a few arguments that require only by INC.
-- `tuning_strategy`(optional): it specifies the algorithm to search the tuning space. In most cases, you don't need to change it.
-- `timeout`: Timeout of your tuning. Defaults 0 means endless time for tuning.
+There are a few arguments required only by INC.
+- `tuning_strategy` (optional): it specifies the algorithm to search the tuning space. In most cases, you don't need to change it.
+- `timeout`: Timeout of your tuning. Defaults `0` means endless time for tuning.
 - `inputs`: A list of input names. Default: None, automatically get names from the graph.
 - `outputs`: A list of output names. Default: None, automatically get names from the graph.
 Here is an example to use INC with accuracy control as below. It will search for a model within 1% accuracy drop with 10 trials.
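As a rough illustration of the accuracy-controlled call this page is introducing, a hedged sketch reusing the `model` and `calib_ds` objects from the MobileNetV2 sketch earlier; the metric choice is an assumption, the other argument names follow the option list in this hunk.

```python
q_model = model.quantize(
    calib_dataset=calib_ds,
    metric=tf.keras.metrics.SparseCategoricalAccuracy(),
    accuracy_criterion={'relative': 0.01, 'higher_is_better': True},
    max_trials=10,
    batch=8,
)
```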
@@ -1,11 +1,11 @@
 # BigDL-Nano TensorFlow Training Overview
-BigDL-Nano can be used to accelerate TensorFlow Keras applications on training workloads. The optimizations in BigDL-Nano are delivered through BigDL-Nano's `Model` and `Sequential` classes, which have identical APIs with `tf.keras.Model` and `tf.keras.Sequential`. For most cases, you can just replace your `tf.keras.Model` to `bigdl.nano.tf.keras.Model` and `tf.keras.Sequential` to `bigdl.nano.tf.keras.Sequential` to benefits from BigDL-Nano.
+BigDL-Nano can be used to accelerate TensorFlow Keras applications on training workloads. The optimizations in BigDL-Nano are delivered through BigDL-Nano's `Model` and `Sequential` classes, which have identical APIs with `tf.keras.Model` and `tf.keras.Sequential`. For most cases, you can just replace your `tf.keras.Model` with `bigdl.nano.tf.keras.Model` and `tf.keras.Sequential` with `bigdl.nano.tf.keras.Sequential` to benefit from BigDL-Nano.
 We will briefly describe here the major features in BigDL-Nano for TensorFlow training. You can find complete examples here [links to be added]().
 ### Best Known Configurations
-When you install BigDL-Nano by `pip install bigdl-nano[tensorflow]`, intel-tensorflow will be installed in your environment, which has intel's oneDNN optimizations enabled by default; and when you run `source bigdl-nano-init`, it will export a few environment variables, such as OMP_NUM_THREADS and KMP_AFFINITY, according to your current hardware. Empirically, these environment variables work best for most TensorFlow applications. After setting these environment variables, you can just run your applications as usual (`python app.py`) and no additional changes are required.
+When you install BigDL-Nano by `pip install bigdl-nano[tensorflow]`, `intel-tensorflow` will be installed in your environment, which has intel's oneDNN optimizations enabled by default; and when you run `source bigdl-nano-init`, it will export a few environment variables, such as `OMP_NUM_THREADS` and `KMP_AFFINITY`, according to your current hardware. Empirically, these environment variables work best for most TensorFlow applications. After setting these environment variables, you can just run your applications as usual (`python app.py`) and no additional changes are required.
 ### Multi-Instance Training
@@ -38,10 +38,6 @@ model.compile(optimizer='adam',
 model.fit(train_ds, epochs=3, validation_data=val_ds, num_processes=2)
 ```
-Note that, different from the conventions in PyTorch, the effective batch size will not change in TensorFlow multi-instance training, which means it is still the batch size you specify in your dataset. This is because TensorFlow's `MultiWorkerMirroredStrategy` will try to split the batch into multiple sub-batches for different workers. We chose this behavior to match the semantics of TensorFlow distributed training.
+Note that, different from the conventions in [BigDL-Nano PyTorch multi-instance training](./pytorch_train.html#multi-instance-training), the effective batch size will not change in TensorFlow multi-instance training, which means it is still the batch size you specify in your dataset. This is because TensorFlow's `MultiWorkerMirroredStrategy` will try to split the batch into multiple sub-batches for different workers. We chose this behavior to match the semantics of TensorFlow distributed training.
-When you do want to increase your effective batch_size, you can do so by directly changing it in your dataset definition and you may also want to gradually increase the learning rate linearly to the batch_size, as described in the [Facebook paper](https://arxiv.org/abs/1706.02677).
-## TensorFlow Inference
-### Quantization
+When you do want to increase your effective `batch_size`, you can do so by directly changing it in your dataset definition and you may also want to gradually increase the learning rate linearly to the `batch_size`, as described in this [paper](https://arxiv.org/abs/1706.02677) published by Facebook.