Model Export to liteRT #21674

Conversation
Introduces a custom LiteRTExporter for exporting models to TFLite format, bypassing the standard TFLiteConverter. Updates the export API and documentation to support the new 'lite_rt' format, and adds relevant options for custom ops, select TF ops, and optimizations.
Replaces the custom MLIR-based TFLite conversion logic in LiteRTExporter with direct use of the standard TFLiteConverter. Also improves input signature handling for tf.function tracing and updates imports accordingly.
Moved imports of get_input_signature and make_tf_tensor_spec inside functions in saved_model.py to prevent circular imports. Updated EXPORT_FORMATS in export_utils.py to use string references instead of direct imports.
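To make the string-reference idea concrete, a registry entry like `"module.path:attr"` can be resolved lazily at dispatch time. This is a minimal sketch of that pattern; the actual helper in export_utils.py may be named and structured differently.

```python
import importlib


def resolve_export_target(ref):
    """Resolve a 'module.path:attr' string reference to the actual object."""
    module_name, attr = ref.split(":")
    return getattr(importlib.import_module(module_name), attr)


# Resolves only when the format is actually requested, avoiding the
# circular imports that direct module-level imports caused.
exporter_cls = resolve_export_target(
    "keras.src.export.lite_rt_exporter:LiteRTExporter"
)
```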
Adds max_sequence_length parameter to input signature generation for sequence models, bounding sequence length for transformer-like architectures. Improves LiteRTExporter with heuristics for complex models, fallback conversion via SavedModel for large models, and exposes max_sequence_length in Model export options. Updates documentation accordingly.
Adds logic to dynamically reduce max_sequence_length for large vocabulary models in export_utils.py to prevent tensor size overflow. In lite_rt_exporter.py, introduces checks and workarounds for models with _DictWrapper issues, and applies memory optimizations for large models during TFLite conversion. These changes improve export reliability and prevent memory errors for models such as Gemma, Llama, and similar architectures.
Removed custom trackable object logic from LiteRTExporter; the model is now saved directly, simplifying the export process. Also streamlined vocabulary size checks in export_utils to prevent tensor size overflow, removing verbose warnings and redundant comments.
Refactors the TFLite conversion logic in lite_rt_exporter.py to attempt direct conversion first and only fall back to SavedModel if necessary, improving robustness and clarity. Adds a new lite_rt_exporter_simple.py file with a streamlined LiteRTExporter class for direct TFLite export, bypassing complex MLIR conversion paths.
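The "direct first, SavedModel fallback" flow described here could look roughly like the sketch below, using the standard `tf.lite.TFLiteConverter` APIs. The function and variable names are illustrative, not the PR's actual code.

```python
# Illustrative sketch: try direct conversion from a traced concrete
# function, and only round-trip through a SavedModel if that fails.
import tempfile

import tensorflow as tf


def convert_to_tflite(model, input_signature):
    try:
        # Direct path: trace the model and convert the concrete function.
        fn = tf.function(model, input_signature=input_signature)
        concrete_fn = fn.get_concrete_function()
        converter = tf.lite.TFLiteConverter.from_concrete_functions(
            [concrete_fn], model
        )
        return converter.convert()
    except Exception:
        # Fallback path: save to disk and convert the SavedModel.
        with tempfile.TemporaryDirectory() as tmp_dir:
            tf.saved_model.save(model, tmp_dir)
            converter = tf.lite.TFLiteConverter.from_saved_model(tmp_dir)
            return converter.convert()
```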
Refactors export_utils and lite_rt_exporter to better detect large vocabulary and Keras-Hub models, applying safer sequence length limits and more robust TFLite conversion paths. Adds heuristics for model type detection, ensures memory safety, and improves handling of TensorFlow introspection issues during export.
Working well with Keras
Eliminates the logic for bounding sequence length in model export utilities and related code paths. The max_sequence_length parameter and associated shape bounding for large vocabulary models are removed from export_utils.py and lite_rt_exporter.py. Updates model export documentation accordingly. Adds a comprehensive test script for Keras Hub LiteRT export, verifying numerical accuracy between original and exported models.
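The numerical-accuracy check that the test script performs could look roughly like the following sketch; the toy model, shapes, and tolerances here are illustrative, and the `format="litert"` name follows this PR.

```python
import numpy as np
import tensorflow as tf
import keras

# Stand-in model; the actual test targets Keras Hub models.
inputs = keras.Input((8,))
model = keras.Model(inputs, keras.layers.Dense(4)(inputs))
model.export("model.tflite", format="litert")  # format name per this PR

x = np.random.rand(1, 8).astype("float32")
expected = np.asarray(model(x))  # original Keras outputs

# Run the exported artifact through the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
actual = interpreter.get_tensor(out["index"])

# Verify the exported model matches the original numerically.
np.testing.assert_allclose(actual, expected, rtol=1e-5, atol=1e-5)
```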
keras/src/models/model.py
Outdated
```python
    provided, they will be automatically computed.
- `opset_version`: Optional `int`. Specific to `format="onnx"`.
    An integer value that specifies the ONNX opset version.
- `allow_custom_ops`: Optional `bool`. Specific to
```
Maybe we should consider putting all the litert args in a single dict to simplify accounting? `litert_kwargs`.
I think given that every export format already does this through kwargs, we should probably be consistent: either `litert_kwargs`, `onnx_kwargs`, `tf_save_model_kwargs`, etc., or one final `**kwargs` that is interpreted per format. No strong preference.
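For concreteness, the two API shapes under discussion would give call sites roughly like these; both calls are hypothetical and assume an already-built `model`.

```python
# Option A: a per-format kwargs dict.
model.export(
    "model.tflite",
    format="litert",
    litert_kwargs={"allow_custom_ops": True},
)

# Option B: one trailing **kwargs interpreted per format.
model.export("model.tflite", format="litert", allow_custom_ops=True)
```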
Thanks for the PR! The code looks good to me.
Introduces a litert_kwargs parameter for LiteRT model export, allowing users to specify custom export options such as allow_custom_ops, enable_select_tf_ops, and optimizations. This enhances flexibility when exporting models to the LiteRT format.
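A usage sketch of the options named in this commit; the three keys come from the commit message, while the value types (e.g. the `tf.lite.Optimize` list) are assumptions for illustration.

```python
import tensorflow as tf

# Export with custom LiteRT options via the new litert_kwargs parameter.
model.export(
    "model.tflite",
    format="litert",
    litert_kwargs={
        "allow_custom_ops": True,
        "enable_select_tf_ops": True,
        "optimizations": [tf.lite.Optimize.DEFAULT],
    },
)
```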
Thanks!
```python
    provided, they will be automatically computed.
- `opset_version`: Optional `int`. Specific to `format="onnx"`.
    An integer value that specifies the ONNX opset version.
- `litert_kwargs`: Optional `dict`. Specific to
```
This seems a bit duplicated. We already pass the general kwargs given to this function through to the specific export. Maybe we should just use the per-format kwargs that are currently supported?
```python
input_shapes = tree.map_structure(
    lambda spec: spec.shape, self.input_signature
)
self.model.build(input_shapes)
```
I'm not sure we want this? It looks to me like tf saved model export expects the model to be built
keras/keras/src/export/saved_model.py
Lines 151 to 154 in 3137cb0
```python
raise ValueError(
    "The layer provided has not yet been built. "
    "It must be built before export."
)
```
and onnx export
keras/keras/src/export/onnx.py
Lines 79 to 82 in 3137cb0
```python
raise ValueError(
    "The model provided has never called. "
    "It must be called at least once before export."
)
```
We are just going to make things more confusing if one export format attempts to automatically build but no others do. Let's shoot for consistency.
We are using tf_saved_model as an intermediate step to convert to LiteRT. We can't expect the model to be fully built/called/traced when export is called, and leaving that to the user would make the export process more complicated from the user's perspective. For uniformity, we could change the behaviour in the other formats, like ONNX, too.
> We can't expect the model to be fully built/called/traced while calling export.
I think we can, right? If tf saved model and ONNX export expect the model to be built, let's just make the same assumption here. It's a good way to start simple. Basically, we'd expect users to hit this error in some cases, modify their code to call the model on some inputs, and re-export (see the sketch below).
Then we could always add this auto-build feature as a follow-up, right? But do it more consistently across export formats.
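The suggested user flow, sketched for a subclassed model (the case where "not built" errors actually occur); the model and input shape are placeholders, and `format="litert"` follows this PR.

```python
import numpy as np
import keras


class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = keras.layers.Dense(8)

    def call(self, x):
        return self.dense(x)


model = MyModel()
# One forward pass builds the model so export can trace it.
model(np.zeros((1, 128), dtype="float32"))
model.export("model.tflite", format="litert")
```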
Moved KerasModelWrapper definition inside LitertExporter for dynamic class creation and removed the old _KerasModelWrapper. Updated import logic for TensorFlow to use module_utils. Improved LiteRT test interpreter selection and simplified test skipping conditions for better backend compatibility.
Updated verbose output in LitertExporter to use io_utils.print_msg instead of print for consistency and better message handling. Warnings about unavailable LiteRT now use the logging module. Improved comments and formatting for clarity.
keras/src/export/litert.py
Outdated
```python
    aot_compile_targets=None,
    **kwargs,
):
    """Export the model as a Litert artifact for inference.
```
I think `LiteRT` is easier to read than `Litert`.
```python
# Print compilation report if available
try:
    report = result.compilation_report()
```
Does this throw an exception if report is not available?
Updated all references from LitertExporter to LiteRTExporter in the export module for consistency and clarity. Also corrected related docstrings and messages to use the LiteRT naming.
Improves error messaging in export_utils.py and refines input signature inference logic. Also corrects code block formatting in model.py documentation.
keras/src/export/export_utils.py
Outdated
```python
# Registry for export formats
EXPORT_FORMATS = {
    "tf_saved_model": "keras.src.export.saved_model:export_saved_model",
    "lite_rt": "keras.src.export.lite_rt_exporter:LiteRTExporter",
```
Should it be named 'lite_rt' or 'litert'?
Named it "litert".
keras/src/models/model.py
Outdated
```diff
 from keras.src.export import export_saved_model

-available_formats = ("tf_saved_model", "onnx", "openvino")
+available_formats = ("tf_saved_model", "onnx", "openvino", "lite_rt")
```
`tflite` and `lite_rt` may both be supported, as both generate the same `tflite` format, but `lite_rt` is supposed to be further optimized.
```python
self.kwargs = kwargs

def export(self, filepath):
    """Exports the Keras model to a TFLite file and optionally performs AOT
```
LiteRT is just a runtime built on top of TFLite; it generates the same old `.tflite` file.
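Since the artifact is a plain `.tflite` file, it loads in either runtime. A minimal sketch; the `ai-edge-litert` package name is an assumption about the environment.

```python
import tensorflow as tf

# TFLite runtime bundled with TensorFlow:
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

# Or, equivalently, the standalone LiteRT runtime (assumed package):
# from ai_edge_litert.interpreter import Interpreter
# interpreter = Interpreter(model_path="model.tflite")
```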
```python
tflite_model = self._convert_to_tflite(self.input_signature)

if self.verbose:
    final_size_mb = len(tflite_model) / (1024 * 1024)
```
`_convert_to_tflite` returns the serialized bytes, so `len(tflite_model)` counts the bytes.
This pull request adds support for exporting Keras models to the LiteRT (TFLite) format, along with improvements to input signature handling and export utility documentation. The changes ensure that LiteRT export is only available when TensorFlow is installed, update the `export` API and documentation, and enhance input signature inference for various model types.

LiteRT Export Support:
- Added `LitertExporter` and `export_litert` in `keras/src/export/__init__.py`, making LiteRT export available only if TensorFlow is installed.
- Updated the `Model.export` method to support the `"litert"` format, including new options for LiteRT export plus user-facing documentation and an example. Raises an informative error if TensorFlow is not installed. [1] [2] [3] [4]
- Registered `litert` as a lazy module in `keras/src/utils/module_utils.py` for dynamic import support.

Input Signature and Export Utilities:
- Updated `get_input_signature` to clarify behavior for different model types and ensure correct input signature construction for export. [1] [2]
- Updated `_infer_input_signature_from_model` to handle flexible batch dimensions and ensure compatibility with downstream exporters, always returning a flat list of input specs.
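The lazy-module registration mentioned above follows the pattern already used in `keras/src/utils/module_utils.py`. This is a minimal sketch of that pattern; the exact class and the registration line for litert in this PR may differ, and the package name is an assumption.

```python
import importlib


class LazyModule:
    """Defers the actual import until an attribute is first accessed."""

    def __init__(self, name):
        self.name = name
        self.module = None

    def __getattr__(self, attr):
        # Import on first attribute access, then delegate.
        if self.module is None:
            self.module = importlib.import_module(self.name)
        return getattr(self.module, attr)


litert = LazyModule("ai_edge_litert")  # assumed package name
```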