Nano: improve Howto guide of InferenceOptimizer.optimize (#6924)

* update howto guide for optimizer

* update export model

* update typo

* update based on comments

* fix bug of get_best_model without validation data

* update ut

* update

* update

* fix 600s

* fix
Ruonan Wang 2023-01-03 14:08:05 +08:00 committed by GitHub
parent 614d2ab289
commit 925874b2cd
2 changed files with 2 additions and 3 deletions

View file

@@ -176,7 +176,7 @@ optimizer.summary()
 The output of `optimizer.summary()` will be something like:
 ```
  -------------------------------- ---------------------- -------------- ----------------------
-| method | status | latency(ms) | accuracy |
+| method | status | latency(ms) | metric value |
  -------------------------------- ---------------------- -------------- ----------------------
 | original | successful | 45.145 | 0.975 |
 | bf16 | successful | 27.549 | 0.975 |
@@ -190,7 +190,7 @@ The output of `optimizer.summary()` will be something like:
 | onnxruntime_fp32 | successful | 20.838 | 0.975* |
 | onnxruntime_int8_qlinear | successful | 7.123 | 0.981 |
  -------------------------------- ---------------------- -------------- ----------------------
-* means we assume the precision of the traced model does not change, so we don't recompute accuracy to save time.
+* means we assume the metric value of the traced model does not change, so we don't recompute metric value to save time.
 Optimization cost 60.8s in total.
 ```

View file

@@ -284,7 +284,6 @@ The output table of `optimize()` looks like:
 | onnxruntime_fp32 | successful | 3.801 |
 | onnxruntime_int8_qlinear | successful | 4.727 |
  -------------------------------- ---------------------- --------------
-* means we assume the accuracy of the traced model does not change, so we don't recompute accuracy to save time.
 Optimization cost 58.3s in total.
 ```
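
The two hunks follow one rule: the summary table gets a `metric value` column only when a metric was computed, and the `*` footnote is only meaningful when that column exists. The sketch below illustrates that rule in plain Python. It is not BigDL Nano's implementation; `render_summary`, `FOOTNOTE`, and the row dictionary keys are hypothetical names chosen for this example.

```python
# Hypothetical sketch: NOT the BigDL Nano source, only an illustration of
# the table layout and footnote rule visible in the diff above.
FOOTNOTE = ("* means we assume the metric value of the traced model does not "
            "change, so we don't recompute metric value to save time.")


def render_summary(rows, with_metric):
    """Render a summary table from a list of row dicts.

    Each row has the keys 'method', 'status', 'latency_ms' and, when
    with_metric is True, 'metric' (a string such as '0.975' or '0.975*').
    """
    headers = ["method", "status", "latency(ms)"]
    keys = ["method", "status", "latency_ms"]
    if with_metric:
        headers.append("metric value")
        keys.append("metric")

    cells = [[str(row[k]) for k in keys] for row in rows]
    # Column width: widest entry (header included) plus one space of padding per side.
    widths = [max([len(h)] + [len(c[i]) for c in cells]) + 2
              for i, h in enumerate(headers)]

    rule = " " + " ".join("-" * w for w in widths)

    def fmt(values):
        return "|" + "|".join(v.center(w) for v, w in zip(values, widths)) + "|"

    lines = [rule, fmt(headers), rule] + [fmt(c) for c in cells] + [rule]
    # Emit the footnote only if a metric column exists and some value is starred.
    if with_metric and any(c[-1].endswith("*") for c in cells):
        lines.append(FOOTNOTE)
    return "\n".join(lines)


rows = [
    {"method": "original", "status": "successful",
     "latency_ms": 45.145, "metric": "0.975"},
    {"method": "onnxruntime_fp32", "status": "successful",
     "latency_ms": 20.838, "metric": "0.975*"},
]
print(render_summary(rows, with_metric=True))
print(render_summary(rows, with_metric=False))
```

With `with_metric=False` the metric column and the footnote are both omitted, which matches the second hunk dropping the footnote from a table that has no metric column.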