Conversation

pctablet505 (Collaborator) commented Sep 17, 2025

This pull request adds support for exporting Keras models to the LiteRT (TFLite) format, along with improvements to input signature handling and export utility documentation. The changes ensure that LiteRT export is only available when TensorFlow is installed, update the export API and documentation, and enhance input signature inference for various model types.

LiteRT Export Support:

  • Added conditional import of LitertExporter and export_litert in keras/src/export/__init__.py, making LiteRT export available only if TensorFlow is installed.
  • Updated the Model.export method to support the "litert" format, including new options for LiteRT export plus user-facing documentation and an example; raises an informative error if TensorFlow is not installed. A usage sketch follows this list.
  • Registered litert as a lazy module in keras/src/utils/module_utils.py for dynamic import support.
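For illustration, a minimal usage sketch of the new export path (the filepath and input shape are hypothetical; the format name and TensorFlow requirement follow the description above):

```python
import keras

# Build a small model and call it once so it is built/traced.
model = keras.Sequential([keras.layers.Dense(4)])
model(keras.ops.zeros((1, 8)))

# Export to a .tflite artifact; requires TensorFlow to be installed,
# otherwise an informative error is raised.
model.export("model.tflite", format="litert")
```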

Input Signature and Export Utilities:

  • Improved documentation and logic in get_input_signature to clarify behavior for different model types and ensure correct input signature construction for export.
  • Enhanced _infer_input_signature_from_model to handle flexible batch dimensions and ensure compatibility with downstream exporters, always returning a flat list of input specs (sketched below).
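As a sketch of what the flat input-spec list looks like for a two-input functional model (the import path of `get_input_signature` is an assumption based on the files touched in this PR; the shapes are hypothetical):

```python
import keras
# Import path assumed from the export_utils.py changes in this PR.
from keras.src.export.export_utils import get_input_signature

inp_a = keras.Input(shape=(16,), name="a")
inp_b = keras.Input(shape=(8,), name="b")
out = keras.layers.Concatenate()([inp_a, inp_b])
model = keras.Model([inp_a, inp_b], out)

# Per the description above, the result is a flat list of input specs
# with a flexible (None) batch dimension, roughly:
#   [TensorSpec((None, 16), float32), TensorSpec((None, 8), float32)]
specs = get_input_signature(model)
```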

pctablet505 and others added 30 commits August 28, 2025 11:04
Introduces a custom LiteRTExporter for exporting models to TFLite format, bypassing the standard TFLiteConverter. Updates the export API and documentation to support the new 'lite_rt' format, and adds relevant options for custom ops, select TF ops, and optimizations.
Replaces the custom MLIR-based TFLite conversion logic in LiteRTExporter with direct use of the standard TFLiteConverter. Also improves input signature handling for tf.function tracing and updates imports accordingly.
Moved imports of get_input_signature and make_tf_tensor_spec inside functions in saved_model.py to prevent circular imports. Updated EXPORT_FORMATS in export_utils.py to use string references instead of direct imports.
Adds max_sequence_length parameter to input signature generation for sequence models, bounding sequence length for transformer-like architectures. Improves LiteRTExporter with heuristics for complex models, fallback conversion via SavedModel for large models, and exposes max_sequence_length in Model export options. Updates documentation accordingly.
Adds logic to dynamically reduce max_sequence_length for large vocabulary models in export_utils.py to prevent tensor size overflow. In lite_rt_exporter.py, introduces checks and workarounds for models with _DictWrapper issues, and applies memory optimizations for large models during TFLite conversion. These changes improve export reliability and prevent memory errors for models such as Gemma, Llama, and similar architectures.
Removed custom trackable object logic from LiteRTExporter and now save the model directly, simplifying the export process. Also streamlined vocabulary size checks in export_utils to prevent tensor size overflow, removing verbose warnings and redundant comments.
Refactors the TFLite conversion logic in lite_rt_exporter.py to attempt direct conversion first and only fall back to SavedModel if necessary, improving robustness and clarity. Adds a new lite_rt_exporter_simple.py file with a streamlined LiteRTExporter class for direct TFLite export, bypassing complex MLIR conversion paths.
Refactors export_utils and lite_rt_exporter to better detect large vocabulary and Keras-Hub models, applying safer sequence length limits and more robust TFLite conversion paths. Adds heuristics for model type detection, ensures memory safety, and improves handling of TensorFlow introspection issues during export.
Working well with Keras
Eliminates the logic for bounding sequence length in model export utilities and related code paths. The max_sequence_length parameter and associated shape bounding for large vocabulary models are removed from export_utils.py and lite_rt_exporter.py. Updates model export documentation accordingly. Adds a comprehensive test script for Keras Hub LiteRT export, verifying numerical accuracy between original and exported models.
provided, they will be automatically computed.
- `opset_version`: Optional `int`. Specific to `format="onnx"`.
An integer value that specifies the ONNX opset version.
- `allow_custom_ops`: Optional `bool`. Specific to
Collaborator

Maybe we should consider putting all the litert args in a single dict to simplify accounting? litert_kwargs.

Member

I think given that every export format already does this through kwargs we should probably be consistent: either litert_kwargs, onnx_kwargs, tf_saved_model_kwargs, etc., or one final **kwargs that is interpreted per format. No strong preference.

fchollet (Collaborator) left a comment

Thanks for the PR! The code looks good to me.

pctablet505 and others added 3 commits October 13, 2025 10:57
Introduces a litert_kwargs parameter for LiteRT model export, allowing users to specify custom export options such as allow_custom_ops, enable_select_tf_ops, and optimizations. This enhances flexibility when exporting models to the LiteRT format.
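A hedged sketch of the resulting call, using the option names from the commit message above (the `optimizations` value is an assumption; `model` is any built Keras model):

```python
import tensorflow as tf

model.export(
    "model.tflite",
    format="litert",
    litert_kwargs={
        "allow_custom_ops": True,
        "enable_select_tf_ops": True,
        # Assumed to accept standard TFLite converter optimizations.
        "optimizations": [tf.lite.Optimize.DEFAULT],
    },
)
```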
mattdangerw (Member) left a comment

Thanks!

provided, they will be automatically computed.
- `opset_version`: Optional `int`. Specific to `format="onnx"`.
An integer value that specifies the ONNX opset version.
- `litert_kwargs`: Optional `dict`. Specific to
Member

This seems a bit duplicated. We already pass the kwargs given to this function through to the specific export. Maybe we should just use the per-format kwargs that are currently supported?

input_shapes = tree.map_structure(
    lambda spec: spec.shape, self.input_signature
)
self.model.build(input_shapes)
Member

I'm not sure we want this. It looks to me like tf saved model export expects the model to be built:

raise ValueError(
"The layer provided has not yet been built. "
"It must be built before export."
)

and onnx export does too:

raise ValueError(
"The model provided has never called. "
"It must be called at least once before export."
)

We are just going to make things more confusing if one export format attempts to automatically build but the others don't. Let's shoot for consistency.
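A minimal sketch of the consistent check being suggested, mirroring the saved-model error quoted above (placement inside the exporter is an assumption):

```python
# Mirroring the saved-model/ONNX pattern quoted above:
if not self.model.built:
    raise ValueError(
        "The layer provided has not yet been built. "
        "It must be built before export."
    )
```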

Collaborator (Author)

We are using tf_saved_model as an intermediate step to convert to LiteRT. We can't expect the model to be fully built/called/traced when export is called, and leaving that to the user would make the export process more complicated from the user's perspective.

For uniformity, we could change the behaviour in the other formats, like ONNX, too.

Member

> We can't expect the model to be fully built/called/traced when export is called.

I think we can, right? If tf saved model and onnx export expect the model to be built, let's just make the same assumption here. It's a good way to start simple. Basically, we'd expect users to hit this error in some cases, modify their code to call the model on some inputs, and re-export.

Then we could always add this auto-build feature as a follow-up, right? But do it consistently across all export formats.
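Under that assumption, the user-side fix is a one-liner before export (shape and dtype here are placeholders):

```python
import numpy as np

# Call the model once on representative inputs so it gets built/traced,
# then re-export.
model(np.zeros((1, 128), dtype="float32"))
model.export("model.tflite", format="litert")
```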

Moved KerasModelWrapper definition inside LitertExporter for dynamic class creation and removed the old _KerasModelWrapper. Updated import logic for TensorFlow to use module_utils. Improved LiteRT test interpreter selection and simplified test skipping conditions for better backend compatibility.
Updated verbose output in LitertExporter to use io_utils.print_msg instead of print for consistency and better message handling. Warnings about unavailable LiteRT now use the logging module. Improved comments and formatting for clarity.
aot_compile_targets=None,
**kwargs,
):
"""Export the model as a Litert artifact for inference.
Member

I think LiteRT is easier to read than Litert.


# Print compilation report if available
try:
    report = result.compilation_report()
Member

Does this throw an exception if report is not available?
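If it can, one defensive pattern is to scope the try/except to the report call (a sketch; which exceptions `compilation_report` can raise is an assumption):

```python
try:
    report = result.compilation_report()
except Exception:
    # Treat a missing/failed report as "not available" rather than
    # failing the whole export.
    report = None
if report is not None:
    io_utils.print_msg(report)
```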

pctablet505 and others added 11 commits October 16, 2025 15:58
Updated all references from LitertExporter to LiteRTExporter in the export module for consistency and clarity. Also corrected related docstrings and messages to use the LiteRT naming.
Improves error messaging in export_utils.py and refines input signature inference logic. Also corrects code block formatting in model.py documentation.
# Registry for export formats
EXPORT_FORMATS = {
"tf_saved_model": "keras.src.export.saved_model:export_saved_model",
"lite_rt": "keras.src.export.lite_rt_exporter:LiteRTExporter",
Collaborator (Author)

Shall it be named 'lite_rt' or 'litert'?

Collaborator (Author)

Named it "litert".
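For context, the "module:symbol" string values in the registry can be resolved lazily along these lines (a sketch; the helper name is hypothetical):

```python
import importlib

def resolve_export_format(name):
    # EXPORT_FORMATS maps a format name to "package.module:symbol";
    # import the module only when the format is actually requested.
    module_path, symbol = EXPORT_FORMATS[name].split(":")
    module = importlib.import_module(module_path)
    return getattr(module, symbol)

# e.g. resolve_export_format("litert") -> LiteRTExporter
```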

from keras.src.export import export_saved_model

available_formats = ("tf_saved_model", "onnx", "openvino")
available_formats = ("tf_saved_model", "onnx", "openvino", "lite_rt")
Collaborator (Author)

Both tflite and lite_rt could be supported, since both generate the same .tflite format, but lite_rt is supposed to apply further optimizations.

self.kwargs = kwargs

def export(self, filepath):
"""Exports the Keras model to a TFLite file and optionally performs AOT
Collaborator (Author)

LiteRT is just a runtime built on top of TFLite; it generates the same old .tflite file.
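Accordingly, the exported artifact loads with the standard TFLite interpreter (a sketch; the path is illustrative):

```python
import tensorflow as tf

# The LiteRT export produces a standard .tflite flatbuffer, so the
# stock TFLite interpreter can load it.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
print(interpreter.get_input_details())
```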

tflite_model = self._convert_to_tflite(self.input_signature)

if self.verbose:
    final_size_mb = len(tflite_model) / (1024 * 1024)
Collaborator (Author)

_convert_to_tflite returns the serialized model bytes, so len(tflite_model) counts the size in bytes.
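Since the conversion result is plain bytes, writing the artifact is straightforward (a sketch):

```python
# tflite_model holds the serialized flatbuffer returned by
# _convert_to_tflite; writing it to disk is a plain byte dump.
with open(filepath, "wb") as f:
    f.write(tflite_model)
```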
