Nano: Update examples and tutorials (#6888)

Yishuo Wang 2022-12-08 11:23:48 +08:00 committed by GitHub
parent 044785173e
commit 7e2742cace
6 changed files with 14 additions and 11 deletions

@@ -56,7 +56,7 @@ To enable OpenVINO acceleration, or use POT for quantization, you need to instal
```eval_rst
.. note::
If you encounter ``ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject`` when using the ``Trainer.trace`` or ``Trainer.quantize`` function, you can try to solve it by upgrading ``numpy``:
If you encounter ``ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject`` when using the ``InferenceOptimizer.trace`` or ``InferenceOptimizer.quantize`` function, you can try to solve it by upgrading ``numpy``:
.. code-block:: python

@@ -78,8 +78,8 @@ When you're ready, you can simply append the following part to enable your ONNXR
# you have run `trainer.fit` before trace
# Model has `example_input_array` set
# Model is a LightningModule with any dataloader attached.
from bigdl.nano.pytorch import Trainer
ort_model = Trainer.trace(model_ft, accelerator="onnxruntime", input_sample=torch.rand(1, 3, 224, 224))
from bigdl.nano.pytorch import InferenceOptimizer
ort_model = InferenceOptimizer.trace(model_ft, accelerator="onnxruntime", input_sample=torch.rand(1, 3, 224, 224))
# The usage is almost the same with any PyTorch module
y_hat = ort_model(x)
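
For context, the snippet below is a minimal, self-contained sketch of the pattern this change documents; the torchvision `resnet18` is only a stand-in for the tutorial's fine-tuned `model_ft`.

```python
import torch
from torchvision.models import resnet18
from bigdl.nano.pytorch import InferenceOptimizer

# Stand-in for the tutorial's fine-tuned `model_ft` (an assumption of this sketch).
model_ft = resnet18(pretrained=True).eval()

# Trace the model into an ONNXRuntime-accelerated version.
ort_model = InferenceOptimizer.trace(model_ft,
                                     accelerator="onnxruntime",
                                     input_sample=torch.rand(1, 3, 224, 224))

# The traced model is called like any other PyTorch module.
x = torch.rand(2, 3, 224, 224)
y_hat = ort_model(x)
print(y_hat.argmax(dim=1))
```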

@@ -78,8 +78,8 @@ When you're ready, you can simply append the following part to enable your OpenV
# The argument `input_sample` is not required in the following cases:
# you have run `trainer.fit` before trace
# The Model has `example_input_array` set
from bigdl.nano.pytorch import Trainer
ov_model = Trainer.trace(model_ft, accelerator="openvino", input_sample=torch.rand(1, 3, 224, 224))
from bigdl.nano.pytorch import InferenceOptimizer
ov_model = InferenceOptimizer.trace(model_ft, accelerator="openvino", input_sample=torch.rand(1, 3, 224, 224))
# The usage is almost the same with any PyTorch module
y_hat = ov_model(x)
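
Likewise, a minimal sketch of the OpenVINO path under the same assumption (a torchvision `resnet18` stands in for `model_ft`):

```python
import torch
from torchvision.models import resnet18
from bigdl.nano.pytorch import InferenceOptimizer

model_ft = resnet18(pretrained=True).eval()  # stand-in for the tutorial's model

# Trace into an OpenVINO-accelerated model; `input_sample` fixes the input shape.
ov_model = InferenceOptimizer.trace(model_ft,
                                    accelerator="openvino",
                                    input_sample=torch.rand(1, 3, 224, 224))

y_hat = ov_model(torch.rand(2, 3, 224, 224))  # used like a regular PyTorch module
print(y_hat.argmax(dim=1))
```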

@@ -74,12 +74,13 @@ y_hat.argmax(dim=1)
```
### Step 3: Quantization using Intel Neural Compressor
Quantization is widely used to compress models to a lower precision, which not only reduces the model size but also accelerates inference. BigDL-Nano provides the `Trainer.quantize()` API for users to quickly obtain a quantized model with accuracy control by specifying a few arguments.
Quantization is widely used to compress models to a lower precision, which not only reduces the model size but also accelerates inference. BigDL-Nano provides the `InferenceOptimizer.quantize()` API for users to quickly obtain a quantized model with accuracy control by specifying a few arguments.
Without an extra accelerator, `Trainer.quantize()` returns a PyTorch module with the desired precision and accuracy. You can add quantization as below:
Without an extra accelerator, `InferenceOptimizer.quantize()` returns a PyTorch module with the desired precision and accuracy. You can add quantization as below:
```python
from bigdl.nano.pytorch import InferenceOptimizer
from torchmetrics.functional import accuracy
q_model = trainer.quantize(model, calib_dataloader=train_dataloader, metric=accuracy)
q_model = InferenceOptimizer.quantize(model, calib_data=train_dataloader, metric=accuracy)
# run simple prediction
y_hat = q_model(x)
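
# A minimal, self-contained sketch of the quantization call above, assuming a
# torchvision resnet18 and a random DataLoader stand in for the tutorial's
# `model` and `train_dataloader`; with real data, `metric` drives accuracy-aware tuning.
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

model = resnet18(num_classes=10)
train_dataloader = DataLoader(
    TensorDataset(torch.rand(32, 3, 224, 224), torch.randint(0, 10, (32,))),
    batch_size=8)

# INT8 quantization via Intel Neural Compressor (no extra accelerator).
q_model = InferenceOptimizer.quantize(model,
                                      calib_data=train_dataloader,
                                      metric=accuracy)
y_hat = q_model(torch.rand(2, 3, 224, 224))
print(y_hat.argmax(dim=1))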

@@ -73,12 +73,13 @@ y_hat.argmax(dim=1)
```
### Step 3: Quantization with ONNXRuntime accelerator
With the ONNXRuntime accelerator, `Trainer.quantize()` will return a model with compressed precision that runs inference in the ONNXRuntime engine.
With the ONNXRuntime accelerator, `InferenceOptimizer.quantize()` will return a model with compressed precision that runs inference in the ONNXRuntime engine.
You can add quantization as below:
```python
from bigdl.nano.pytorch import InferenceOptimizer
from torchmetrics.functional import accuracy
ort_q_model = trainer.quantize(model, accelerator='onnxruntime', calib_dataloader=train_dataloader, metric=accuracy)
ort_q_model = InferenceOptimizer.quantize(model, accelerator='onnxruntime', calib_data=train_dataloader, metric=accuracy)
# run simple prediction
y_hat = ort_q_model(x)
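
# A self-contained sketch of the same call with the ONNXRuntime backend; the
# resnet18 model and random DataLoader are stand-ins for the tutorial's
# `model` and `train_dataloader`.
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

model = resnet18(num_classes=10)
train_dataloader = DataLoader(
    TensorDataset(torch.rand(32, 3, 224, 224), torch.randint(0, 10, (32,))),
    batch_size=8)

# Quantize with INC and run inference through the ONNXRuntime engine.
ort_q_model = InferenceOptimizer.quantize(model,
                                          accelerator='onnxruntime',
                                          calib_data=train_dataloader,
                                          metric=accuracy)
y_hat = ort_q_model(torch.rand(2, 3, 224, 224))
print(y_hat.argmax(dim=1))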

@@ -76,8 +76,9 @@ y_hat.argmax(dim=1)
### Step 3: Quantization using Post-training Optimization Tools
Setting `accelerator='openvino'` means using OpenVINO POT for quantization. The quantization can be added as below:
```python
from bigdl.nano.pytorch import InferenceOptimizer
from torchmetrics import Accuracy
ov_q_model = trainer.quantize(model, accelerator="openvino", calib_dataloader=data_loader)
ov_q_model = InferenceOptimizer.quantize(model, accelerator="openvino", calib_data=data_loader)
# run simple prediction
batch = torch.stack([data_set[0][0], data_set[1][0]])
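
# A self-contained sketch of the POT quantization above, assuming a torchvision
# resnet18 and a random dataset stand in for the tutorial's `model`, `data_set`
# and `data_loader`.
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

model = resnet18(num_classes=10)
data_set = TensorDataset(torch.rand(32, 3, 224, 224), torch.randint(0, 10, (32,)))
data_loader = DataLoader(data_set, batch_size=8)

# accelerator="openvino" selects OpenVINO POT; `calib_data` supplies calibration samples.
ov_q_model = InferenceOptimizer.quantize(model,
                                         accelerator="openvino",
                                         calib_data=data_loader)

batch = torch.stack([data_set[0][0], data_set[1][0]])
y_hat = ov_q_model(batch)
print(y_hat.argmax(dim=1))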