doc fix (#4829)
parent 04542bc6db
commit 7dcd79366f
1 changed file with 6 additions and 5 deletions

@@ -92,10 +92,10 @@ By default, `Trainer.quantize()` doesn't search the tuning space and returns the
### Quantization using Intel Neural Compressor

By default, Intel Neural Compressor is not installed with BigDL-Nano. So if you decide to use it as your quantization backend, you'll need to install it first:

```shell
pip install 'neural-compressor>=1.8.1,<=1.11.0'
```

**Quantization without extra accelerator**

Without an extra accelerator, `Trainer.quantize()` returns a PyTorch module with the desired precision and accuracy. Following the example in [Runtime Acceleration](#runtime-acceleration), you can add quantization as below:

```python
q_model = trainer.quantize(model, calib_dataloader=dataloader)
@@ -109,8 +109,9 @@ trainer.predict(q_model, dataloader)
```

This is the most basic usage: it quantizes the model with default settings (INT8 precision) and without searching the tuning space to control the accuracy drop.

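If you need to control the accuracy drop, `Trainer.quantize()` can instead search the tuning space against a validation metric. A minimal sketch of what that might look like is below; the `metric`, `accuracy_criterion`, and `max_trials` parameters are assumptions and should be checked against the BigDL-Nano API reference:

```python
from torchmetrics import Accuracy

# Illustrative sketch (parameter names are assumptions, not confirmed API):
# search the tuning space until the quantized model stays within a 1%
# relative accuracy drop of the original, trying at most 10 configurations.
q_model = trainer.quantize(
    model,
    calib_dataloader=dataloader,
    metric=Accuracy(),  # Accuracy() arguments vary across torchmetrics versions
    accuracy_criterion={'relative': 0.01, 'higher_is_better': True},
    max_trials=10,
)
```
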
**Quantization with ONNXRuntime accelerator**

With the ONNXRuntime accelerator, `Trainer.quantize()` will return a model with compressed precision that runs inference in the ONNXRuntime engine. It's also required to install onnxruntime-extensions as a dependency of INC when using ONNXRuntime as the backend, as well as the dependencies required in [ONNXRuntime Acceleration](#onnxruntime-acceleration):

```shell
pip install onnx onnxruntime onnxruntime-extensions
```

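With those dependencies installed, the call is expected to differ from the basic usage only in the `accelerator` argument. A minimal sketch, assuming `accelerator='onnxruntime'` is the accepted value and reusing `model` and `dataloader` from above:

```python
# Illustrative sketch: quantize to INT8 but run inference through the
# ONNXRuntime engine (assumes 'onnxruntime' is the accepted accelerator value).
q_model = trainer.quantize(model,
                           accelerator='onnxruntime',
                           calib_dataloader=dataloader)

# The returned model is expected to be used the same way as before, e.g.:
trainer.predict(q_model, dataloader)
```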