Nano: improve Howto guide of InferenceOptimizer.optimize (#6924)
* update howto guide for optimizer
* update export model
* update typo
* update based on comments
* fix bug of get_best_model without validation data
* update ut
* update
* update
* fix 600s
* fix
parent 614d2ab289
commit 925874b2cd
2 changed files with 2 additions and 3 deletions
@@ -176,7 +176,7 @@ optimizer.summary()
 The output of `optimizer.summary()` will be something like:
 ```
  -------------------------------- ---------------------- -------------- ----------------------
-|             method             |        status        | latency(ms)  |       accuracy       |
+|             method             |        status        | latency(ms)  |     metric value     |
  -------------------------------- ---------------------- -------------- ----------------------
 |            original            |      successful      |    45.145    |        0.975         |
 |              bf16              |      successful      |    27.549    |        0.975         |
@@ -190,7 +190,7 @@ The output of `optimizer.summary()` will be something like:
 |        onnxruntime_fp32        |      successful      |    20.838    |        0.975*        |
 |    onnxruntime_int8_qlinear    |      successful      |    7.123     |        0.981         |
  -------------------------------- ---------------------- -------------- ----------------------
-* means we assume the precision of the traced model does not change, so we don't recompute accuracy to save time.
+* means we assume the metric value of the traced model does not change, so we don't recompute metric value to save time.
 Optimization cost 60.8s in total.
 ```
 
@@ -284,7 +284,6 @@ The output table of `optimize()` looks like:
 |        onnxruntime_fp32        |      successful      |    3.801     |
 |    onnxruntime_int8_qlinear    |      successful      |    4.727     |
  -------------------------------- ---------------------- --------------
-* means we assume the accuracy of the traced model does not change, so we don't recompute accuracy to save time.
 Optimization cost 58.3s in total.
 ```
 