From c7674f52b2e0716620a5132212abc5b4bfcb1f02 Mon Sep 17 00:00:00 2001
From: Yuwen Hu <54161268+Oscilloscope98@users.noreply.github.com>
Date: Wed, 1 Feb 2023 13:33:54 +0800
Subject: [PATCH] [Nano] Add how-to guide for TensorFlow Keras inference using
 bf16 mixed precision (#7337)

* Add basic structure for tf bf16 infer how-to guide
* Add how-to use BFloat16 mixed precision for TensorFlow Keras inference
* Small fixes
* Add instruction requires box
* Add outputs to show inference time diff
* Small fixes regarding hardware requirements and others
* Update based on comments
* Small fixes
* Small fixes
---
 docs/readthedocs/source/_toc.yml                                 | 1 +
 .../Inference/TensorFlow/tensorflow_inference_bf16.nblink        | 3 +++
 docs/readthedocs/source/doc/Nano/Howto/index.rst                 | 3 ++-
 3 files changed, 6 insertions(+), 1 deletion(-)
 create mode 100644 docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink

diff --git a/docs/readthedocs/source/_toc.yml b/docs/readthedocs/source/_toc.yml
index c0ada445..c8233ef8 100644
--- a/docs/readthedocs/source/_toc.yml
+++ b/docs/readthedocs/source/_toc.yml
@@ -133,6 +133,7 @@ subtrees:
 - file: doc/Nano/Howto/Inference/PyTorch/inference_optimizer_optimize
 - file: doc/Nano/Howto/Inference/TensorFlow/accelerate_tensorflow_inference_onnx
 - file: doc/Nano/Howto/Inference/TensorFlow/accelerate_tensorflow_inference_openvino
+- file: doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16
 - file: doc/Nano/Howto/Inference/TensorFlow/tensorflow_save_and_load_onnx
 - file: doc/Nano/Howto/Inference/TensorFlow/tensorflow_save_and_load_openvino
 - file: doc/Nano/Howto/install_in_colab
diff --git a/docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink b/docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink
new file mode 100644
index 00000000..fa8a1ae6
--- /dev/null
+++ b/docs/readthedocs/source/doc/Nano/Howto/Inference/TensorFlow/tensorflow_inference_bf16.nblink
@@ -0,0 +1,3 @@
+{
+    "path": "../../../../../../../../python/nano/tutorial/notebook/inference/tensorflow/tensorflow_inference_bf16.ipynb"
+}
\ No newline at end of file
diff --git a/docs/readthedocs/source/doc/Nano/Howto/index.rst b/docs/readthedocs/source/doc/Nano/Howto/index.rst
index 19f21516..9f37cb07 100644
--- a/docs/readthedocs/source/doc/Nano/Howto/index.rst
+++ b/docs/readthedocs/source/doc/Nano/Howto/index.rst
@@ -41,7 +41,7 @@
 TensorFlow
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 * `How to accelerate a TensorFlow Keras application on training workloads through multiple instances `_
 * |tensorflow_training_embedding_sparseadam_link|_
-* `How to conduct BFloat16 Mixed Precision training in your TensorFlow application `_
+* `How to conduct BFloat16 Mixed Precision training in your TensorFlow Keras application `_
 .. |tensorflow_training_embedding_sparseadam_link| replace:: How to optimize your model with a sparse ``Embedding`` layer and ``SparseAdam`` optimizer
 .. _tensorflow_training_embedding_sparseadam_link: Training/TensorFlow/tensorflow_training_embedding_sparseadam.html
@@ -83,6 +83,7 @@
 TensorFlow
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 * `How to accelerate a TensorFlow inference pipeline through ONNXRuntime `_
 * `How to accelerate a TensorFlow inference pipeline through OpenVINO `_
+* `How to conduct BFloat16 Mixed Precision inference in a TensorFlow Keras application `_
 * `How to save and load optimized ONNXRuntime model in TensorFlow `_
 * `How to save and load optimized OpenVINO model in TensorFlow `_