Nano: improve Howto guide of InferenceOptimizer.optimize (#6924)

* update howto guide for optimizer

* update export model

* update typo

* update based on comments

* fix bug of get_best_model without validation data

* update ut

* update

* update

* fix 600s

* fix
Ruonan Wang 2023-01-03 14:08:05 +08:00 committed by GitHub
parent 614d2ab289
commit 925874b2cd
2 changed files with 2 additions and 3 deletions


@@ -176,7 +176,7 @@ optimizer.summary()
The output of `optimizer.summary()` will be something like:
```
-------------------------------- ---------------------- -------------- ----------------------
-| method | status | latency(ms) | accuracy |
+| method | status | latency(ms) | metric value |
-------------------------------- ---------------------- -------------- ----------------------
| original | successful | 45.145 | 0.975 |
| bf16 | successful | 27.549 | 0.975 |
@@ -190,7 +190,7 @@ The output of `optimizer.summary()` will be something like:
| onnxruntime_fp32 | successful | 20.838 | 0.975* |
| onnxruntime_int8_qlinear | successful | 7.123 | 0.981 |
-------------------------------- ---------------------- -------------- ----------------------
-* means we assume the precision of the traced model does not change, so we don't recompute accuracy to save time.
+* means we assume the metric value of the traced model does not change, so we don't recompute metric value to save time.
Optimization cost 60.8s in total.
```
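For context, the "metric value" column in the summary above only appears when `optimize()` is given validation data and a metric. Below is a minimal sketch of how such a summary is produced, assuming the `bigdl.nano.pytorch.InferenceOptimizer` API of this period; parameter names such as `training_data`, `validation_data`, `metric`, `direction`, `thread_num` and `latency_sample_num` follow the Nano documentation and may differ across versions, and the toy model and data are placeholders for your own:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from bigdl.nano.pytorch import InferenceOptimizer

# Toy stand-ins for a real model and data; replace with your own.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.randn(256, 1, 28, 28)
y = torch.randint(0, 10, (256,))
train_loader = DataLoader(TensorDataset(x, y), batch_size=32)
val_loader = DataLoader(TensorDataset(x, y), batch_size=32)

def accuracy(pred, target):
    # Simple top-1 accuracy; optimize() is assumed to accept a (pred, target) callable.
    return (pred.argmax(dim=1) == target).float().mean()

optimizer = InferenceOptimizer()

# Passing validation_data and a metric is what makes the summary report
# a "metric value" column; direction says whether larger is better.
optimizer.optimize(
    model=model,
    training_data=train_loader,
    validation_data=val_loader,
    metric=accuracy,
    direction="max",
    thread_num=1,
    latency_sample_num=100,
)

# Prints a table like the one above: method / status / latency(ms) / metric value.
optimizer.summary()

# With validation data available, the best model can also be constrained
# by an accuracy criterion (e.g. allow at most 5% metric drop).
best_model, option = optimizer.get_best_model(accuracy_criterion=0.05)
print(option)
```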


@@ -284,7 +284,6 @@ The output table of `optimize()` looks like:
| onnxruntime_fp32 | successful | 3.801 |
| onnxruntime_int8_qlinear | successful | 4.727 |
-------------------------------- ---------------------- --------------
-* means we assume the accuracy of the traced model does not change, so we don't recompute accuracy to save time.
Optimization cost 58.3s in total.
```
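This second table comes from the page where `optimize()` is run without a metric or validation data, so only latency is reported and the asterisk note no longer applies (hence its removal in this hunk). A hedged sketch of that call path, under the same API assumptions as above; the behavior of `get_best_model()` here is inferred from the "fix bug of get_best_model without validation data" item in this commit, not stated in the diff itself:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from bigdl.nano.pytorch import InferenceOptimizer

# Toy stand-ins again; replace with your own model and data.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.randn(256, 1, 28, 28)
y = torch.randint(0, 10, (256,))
train_loader = DataLoader(TensorDataset(x, y), batch_size=32)

optimizer = InferenceOptimizer()

# No validation_data / metric: there is nothing to score, so the summary
# contains only the latency column and no asterisk note, as shown above.
optimizer.optimize(
    model=model,
    training_data=train_loader,
    thread_num=1,
    latency_sample_num=100,
)
optimizer.summary()

# This is the call path touched by the bug fix in this commit: with no
# validation data, get_best_model() presumably selects by latency alone,
# and accuracy_criterion cannot be applied.
best_model, option = optimizer.get_best_model()
print(option)
```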