Fix typo in nano documentation (#5036)
parent 2157af9a03
commit 01e6c62a68

1 changed file with 5 additions and 5 deletions

@@ -98,7 +98,7 @@ pip install neural-compressor==1.11.0
 Without an extra accelerator, `Trainer.quantize()` returns a PyTorch module with the desired precision and accuracy. Following the example in [Runtime Acceleration](#runtime-acceleration), you can add quantization as follows:
 ```python
-q_model = trainer.quanize(model, calib_dataloader=dataloader)
+q_model = trainer.quantize(model, calib_dataloader=dataloader)
 # run simple prediction with transparent acceleration
 y_hat = q_model(x)
 ```
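For readers skimming the fragments above, this hunk assumes setup from earlier sections of the doc. Below is a minimal, hedged sketch of the default (Intel Neural Compressor) quantization path; the import path `bigdl.nano.pytorch.Trainer`, the pytorch-lightning-style constructor, and the toy model and data are assumptions for illustration, not part of the patch.

```python
# Minimal sketch of the default quantization flow shown in this hunk.
# Assumed: Trainer is importable from bigdl.nano.pytorch and accepts
# pytorch-lightning-style kwargs; the model and data are toy placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset
from bigdl.nano.pytorch import Trainer

model = torch.nn.Sequential(
    torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2)
)
x = torch.randn(32, 8)                          # placeholder input batch
dataset = TensorDataset(x, torch.randint(0, 2, (32,)))
dataloader = DataLoader(dataset, batch_size=8)  # doubles as the calibration set

trainer = Trainer(max_epochs=1)
# Post-training quantization with no extra accelerator; the result is
# still a callable pytorch module.
q_model = trainer.quantize(model, calib_dataloader=dataloader)
y_hat = q_model(x)  # run simple prediction with transparent acceleration
```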
@@ -117,7 +117,7 @@ pip install onnx onnxruntime onnxruntime-extensions
 ```
 Still following the example in [Runtime Acceleration](pytorch_inference.md#runtime-acceleration), you can add quantization as follows:
 ```python
-ort_q_model = trainer.quanize(model, accelerator='onnxruntime', calib_dataloader=dataloader)
+ort_q_model = trainer.quantize(model, accelerator='onnxruntime', calib_dataloader=dataloader)
 # run simple prediction with transparent acceleration
 y_hat = ort_q_model(x)
 ```
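A short usage sketch for this hunk, reusing the placeholder `trainer`, `model`, `dataloader`, and `x` from the sketch above; the `trainer.predict` line mirrors the context shown in the next hunk's header.

```python
# Quantize through the ONNXRuntime accelerator instead of the default path;
# placeholders (`trainer`, `model`, `dataloader`, `x`) as in the first sketch.
ort_q_model = trainer.quantize(model, accelerator='onnxruntime',
                               calib_dataloader=dataloader)

y_hat = ort_q_model(x)                     # direct call, transparent acceleration
trainer.predict(ort_q_model, dataloader)   # or batch prediction via the Trainer
```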
@@ -129,7 +129,7 @@ trainer.predict(ort_q_model, dataloader)
 Using accelerator='onnxruntime' is equivalent to first converting the model from PyTorch to ONNX and then quantizing the converted ONNX model:
 ```python
 ort_model = Trainer.trace(model, accelerator='onnxruntime', input_sample=x)
-ort_q_model = trainer.quanize(ort_model, accelerator='onnxruntime', calib_dataloader=dataloader)
+ort_q_model = trainer.quantize(ort_model, accelerator='onnxruntime', calib_dataloader=dataloader)
 
 # run inference with transparent acceleration
 y_hat = ort_q_model(x)
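To make the stated equivalence concrete, the two routes below should produce comparable quantized models; a sketch under the same placeholder assumptions as above. Note that `trace` is called on the `Trainer` class while `quantize` is called on the instance, as in the diff itself.

```python
# One step: trace to ONNX and quantize in a single call.
ort_q_model_a = trainer.quantize(model, accelerator='onnxruntime',
                                 calib_dataloader=dataloader)

# Two steps: trace to ONNX first, then quantize the traced model.
ort_model = Trainer.trace(model, accelerator='onnxruntime', input_sample=x)
ort_q_model_b = trainer.quantize(ort_model, accelerator='onnxruntime',
                                 calib_dataloader=dataloader)
```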
@@ -145,7 +145,7 @@ pip install openvino-dev
 ```
 Take the example in [Runtime Acceleration](#runtime-acceleration) and add quantization:
 ```python
-ov_q_model = trainer.quanize(model, accelerator='openvino', calib_dataloader=dataloader)
+ov_q_model = trainer.quantize(model, accelerator='openvino', calib_dataloader=dataloader)
 # run simple prediction with transparent acceleration
 y_hat = ov_q_model(x)
 ```
@@ -157,7 +157,7 @@ trainer.predict(ov_q_model, dataloader)
 As with ONNXRuntime, this is equivalent to first converting the model from PyTorch to OpenVINO and then quantizing the converted OpenVINO model:
 ```python
 ov_model = Trainer.trace(model, accelerator='openvino', input_sample=x)
-ov_q_model = trainer.quanize(ov_model, accelerator='openvino', calib_dataloader=dataloader)
+ov_q_model = trainer.quantize(ov_model, accelerator='openvino', calib_dataloader=dataloader)
 
 # run inference with transparent acceleration
 y_hat = ov_q_model(x)
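Finally, the quantized OpenVINO model drops into the same prediction flow; a short sketch under the placeholder assumptions above, with `trainer.predict` mirroring this hunk's header context. Passing `accelerator='openvino'` to `quantize` here is an assumption, kept consistent with the one-step OpenVINO call earlier in the doc.

```python
# Two-step OpenVINO route, then prediction; placeholders as in the first
# sketch. accelerator='openvino' in the quantize call is assumed for
# consistency with the one-step OpenVINO example.
ov_model = Trainer.trace(model, accelerator='openvino', input_sample=x)
ov_q_model = trainer.quantize(ov_model, accelerator='openvino',
                              calib_dataloader=dataloader)

y_hat = ov_q_model(x)                     # run inference with transparent acceleration
trainer.predict(ov_q_model, dataloader)   # batch prediction via the Trainer
```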