From c7674f52b2e0716620a5132212abc5b4bfcb1f02 Mon Sep 17 00:00:00 2001
From: Yuwen Hu <54161268+Oscilloscope98@users.noreply.github.com>
Date: Wed, 1 Feb 2023 13:33:54 +0800
Subject: [PATCH] [Nano] Add how-to guide for TensorFlow Keras inference using
 bf16 mixed precision (#7337)

* Add basic structure for tf bf16 infer how-to guide
* Add how-to use BFloat16 mixed precision for TensorFlow Keras inference
* Small fixes
* Add instruction requires box
* Add outputs to show inference time diff
* Small fixes regarding hardware requirements and others
* Update based on comments
* Small fixes
* Small fixes
---
 docs/readthedocs/source/_toc.yml                                 | 1 +
 .../Inference/TensorFlow/tensorflow_inference_bf16.nblink        | 3 +++
 docs/readthedocs/source/doc/Nano/Howto/index.rst                 | 3 ++-
 3 files changed, 6 insertions(+), 1 deletion(-)
 create mode 100644 docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink

diff --git a/docs/readthedocs/source/_toc.yml b/docs/readthedocs/source/_toc.yml
index c0ada445..c8233ef8 100644
--- a/docs/readthedocs/source/_toc.yml
+++ b/docs/readthedocs/source/_toc.yml
@@ -133,6 +133,7 @@ subtrees:
 - file: doc/Nano/Howto/Inference/PyTorch/inference_optimizer_optimize
 - file: doc/Nano/Howto/Inference/TensorFlow/accelerate_tensorflow_inference_onnx
 - file: doc/Nano/Howto/Inference/TensorFlow/accelerate_tensorflow_inference_openvino
+- file: doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16
 - file: doc/Nano/Howto/Inference/TensorFlow/tensorflow_save_and_load_onnx
 - file: doc/Nano/Howto/Inference/TensorFlow/tensorflow_save_and_load_openvino
 - file: doc/Nano/Howto/install_in_colab
diff --git a/docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink b/docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink
new file mode 100644
index 00000000..fa8a1ae6
--- /dev/null
+++ b/docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink
@@ -0,0 +1,3 @@
+{
+    "path": "../../../../../../../../python/nano/tutorial/notebook/inference/tensorflow/tensorflow_inference_bf16.ipynb"
+}
\ No newline at end of file
diff --git a/docs/readthedocs/source/doc/Nano/Howto/index.rst b/docs/readthedocs/source/doc/Nano/Howto/index.rst
index 19f21516..9f37cb07 100644
--- a/docs/readthedocs/source/doc/Nano/Howto/index.rst
+++ b/docs/readthedocs/source/doc/Nano/Howto/index.rst
@@ -41,7 +41,7 @@
 TensorFlow
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 * `How to accelerate a TensorFlow Keras application on training workloads through multiple instances `_
 * |tensorflow_training_embedding_sparseadam_link|_
-* `How to conduct BFloat16 Mixed Precision training in your TensorFlow application `_
+* `How to conduct BFloat16 Mixed Precision training in your TensorFlow Keras application `_
 .. |tensorflow_training_embedding_sparseadam_link| replace:: How to optimize your model with a sparse ``Embedding`` layer and ``SparseAdam`` optimizer
 .. _tensorflow_training_embedding_sparseadam_link: Training/TensorFlow/tensorflow_training_embedding_sparseadam.html
@@ -83,6 +83,7 @@
 TensorFlow
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 * `How to accelerate a TensorFlow inference pipeline through ONNXRuntime `_
 * `How to accelerate a TensorFlow inference pipeline through OpenVINO `_
+* `How to conduct BFloat16 Mixed Precision inference in a TensorFlow Keras application `_
 * `How to save and load optimized ONNXRuntime model in TensorFlow `_
 * `How to save and load optimized OpenVINO model in TensorFlow `_