The journey of a machine learning model doesn’t end with training; it culminates in deployment. TensorFlow’s robust ecosystem provides tools, notably the SavedModel format and TensorFlow Lite (TFLite), to bridge the gap between training and real-world application, often on devices with limited resources.
A SavedModel plays an important role in building the bridge between the training and deployment phases. A complete TensorFlow program, including weights and computation, is contained within a SavedModel. It does not require the original model-building code to run, which makes it useful for sharing or deploying using various technologies like TFLite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub.

The Power of the TensorFlow SavedModel
Although TensorFlow and Keras began as separate packages, they are now tightly integrated, and the standard practice for saving a full Keras model is the TensorFlow SavedModel format.
A full Keras model consists of multiple components:
- An architecture, or configuration, which specifies what layers the model contains, and how they’re connected.
- A set of weight values (the “state of the model”).
- An optimizer (defined by compiling the model).
- A set of losses and metrics (defined by compiling the model or by calling `add_loss()` or `add_metric()`).
SavedModel Structure
When saved, a SavedModel directory typically contains three key parts: assets/, saved_model.pb, and variables/.
- `saved_model.pb` – This is the most important part for deployment. It stores the model architecture and training configuration, including the optimizer, losses, and metrics. The graph definitions are stored as `MetaGraphDef` protocol buffers.
- `variables/` – This subfolder contains the output from the TensorFlow Saver, i.e. the saved weights. It typically includes files like `variables.data-?????-of-?????` and `variables.index`.
- `assets/` – This subfolder contains auxiliary files such as vocabularies.
Saving the TensorFlow Model
TensorFlow provides straightforward methods for saving a trained model using the standard SavedModel format.
Keras Model Saving Example
Calling Keras's model.save() with a directory path automatically generates a SavedModel folder. This example demonstrates saving a Keras model and reloading it identically:
import numpy as np
import tensorflow as tf
from tensorflow import keras
def get_model():
    # Create a simple model.
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
model = get_model()
# ... Training steps omitted ...
# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")
# It can be used to reconstruct the model identically.
reconstructed_model = keras.models.load_model("my_model")
# The reconstructed model is already compiled and has retained the optimizer state.
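As a quick sanity check, you can confirm that the reconstructed model produces the same outputs as the original (a minimal sketch using random test inputs):
test_input = np.random.random((4, 32)).astype("float32")
np.testing.assert_allclose(
    model.predict(test_input), reconstructed_model.predict(test_input)
)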
General SavedModel API Saving
After training your model, you can also use the lower-level tf.saved_model.save function from the general TensorFlow deployment workflow:
export_dir = 'saved_model/1'
tf.saved_model.save(model, export_dir)
This creates a directory with files and metadata describing the model. For example, when saving MobileNet, the directory structure includes `assets`, `saved_model.pb`, and `variables`.
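To verify what was written, you can list the exported directory and inspect the serving signatures; a minimal sketch, assuming the export_dir defined above:
import os
print(sorted(os.listdir(export_dir)))  # e.g. ['assets', 'saved_model.pb', 'variables']
loaded = tf.saved_model.load(export_dir)
print(list(loaded.signatures.keys()))  # typically ['serving_default']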
Converting the SavedModel to TFLite
Once you have your saved model, you can use the TensorFlow Lite converter to convert it to the optimized TF Lite format.
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()
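If you need an even smaller model, the converter also supports post-training optimizations such as dynamic-range quantization; a minimal, optional sketch (the rest of this walkthrough uses the unoptimized tflite_model):
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()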
The resulting tflite_model (a byte string) can then be written to disk as a single .tflite file that fully encapsulates the model and its saved weights:
import pathlib
tflite_model_file = pathlib.Path('model.tflite')
tflite_model_file.write_bytes(tflite_model)
Now, to compare the sizes of the two models, you can do something like this:
import os
saved_model_size = sum(
    os.path.getsize(os.path.join(dirpath, filename))
    for dirpath, _, filenames in os.walk(export_dir)
    for filename in filenames
)
print(f"SavedModel directory size: {saved_model_size / (1024 * 1024):.2f} MB")
tflite_model_size = os.path.getsize('model.tflite')
print(f"TFLite model file size: {tflite_model_size / (1024 * 1024):.2f} MB")
In our case, this is the difference in sizes:
SavedModel directory size: 33.27 MB
TFLite model file size: 16.12 MB
Performing Inference with TFLite
To use the smaller TFLite model for predictions, you must instantiate a `tf.lite.Interpreter`. If you have the file saved to disk, you can pass its path via the model_path argument:
# Loading using the file path:
interpreter = tf.lite.Interpreter(model_path=str(tflite_model_file))
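If you still have the converted model in memory, you can instead pass the bytes directly via the model_content argument:
# Loading from the in-memory byte string:
interpreter = tf.lite.Interpreter(model_content=tflite_model)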
Once loaded, you can start performing inference. To run inference, you need to get the details of the input and output tensors to the model, set the value of the input tensor, invoke the model, and then get the value of the output tensor.
# Load the TFLite model (as shown above), then allocate tensors before inference.
interpreter.allocate_tensors()
# Get input and output tensor details.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Placeholder input; replace with real data in the shape and dtype the model expects.
to_predict = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
# Set the value of the input tensor
interpreter.set_tensor(input_details[0]['index'], to_predict)
# Invoke the model (run inference)
interpreter.invoke()
# Get the value of the output tensor
tflite_results = interpreter.get_tensor(output_details[0]['index'])
# Process results...
A large part of the skill in running models on embedded systems is formatting your data to match what the model expects. For example, if you use a common model like MobileNet, you might need to decode and resize image data to specific dimensions, such as 224×224 3-channel images. Performing this conversion is a significant part of engineering for ML systems.
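As an illustration, here is a minimal preprocessing sketch, assuming a hypothetical local file example.jpg and MobileNet-style inputs (224×224, 3 channels, pixel values scaled to [-1, 1]):
def load_image(path):
    # Decode the JPEG, resize to 224x224, and scale pixels to [-1, 1].
    img = tf.io.read_file(path)
    img = tf.io.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [224, 224])
    img = img / 127.5 - 1.0
    # Add a batch dimension so the final shape is (1, 224, 224, 3).
    return tf.expand_dims(img, axis=0)
# "example.jpg" is a placeholder path used for illustration only.
to_predict = load_image("example.jpg").numpy().astype(np.float32)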
To see other examples, refer to our Colab notebooks here. If you are interested in training your team on TensorFlow and LiteRT, then give us a shout.