Add D-FINE to KerasHub #2318
base: master
Conversation
The tests fail due to pending reviews on the HGNetV2 dependency. Once it is merged, the D-FINE PR will be open for review and the tests will pass, as demonstrated in the notebook.
/gemini review
Code Review
This pull request introduces the D-FINE model to KerasHub, including its architecture, layers, tests, and a checkpoint conversion script. The implementation is comprehensive and well-structured. I've provided a few suggestions to improve code clarity, maintainability, and correctness. Overall, this is a solid contribution.
@divyashreepathihalli @mattdangerw D-FINE is ready for its first round of reviews!
Thanks! Nice work. Just some initial comments.
In general, now that this is up and working, let's see where we can cut complexity. Anything we can do to save lines of code (without playing code golf) will probably help keep this maintainable in the future.
    def __init__(
        self,
        decoder_in_channels,
This is a pretty massive list of arguments.
First, let's try to whittle this down to the stuff we absolutely need. For some of the _scale/_temperature/etc. arguments, it might be fine to leave them as default args on the relevant component layers and not expose them if they won't vary in any of our presets (we can always add them later if someone asks!).
Second, we could consider taking in sub-models directly. We did this for SD3, which was facing a similar arg explosion. Maybe not for everything, but could consider this for the encoder/decoder and HGNetV2Backbone. Take a look at how SD3 does this https://github.com/keras-team/keras-hub/blob/master/keras_hub/src/models/stable_diffusion_3/stable_diffusion_3_backbone.py#L641-L654. Anything that we take in as an argument must be exposed as a public symbol.
But overall, let's try to simplify this arg list as much as we can. We want a practical list of things users might actually change, not an exhaustive list of everything that could possibly change.
Agreed!
I've kept the "practical list from the user's perspective" approach you mentioned above in mind. I'll walk you through my thought process so you can evaluate the approach, since I'd love to know your thoughts.
As you suggested, HGNetV2Backbone can be passed as an argument, which I've implemented following the SD3 pattern you shared, including the serialization and deserialization paradigm. That said, I think the following makes it unnecessary to do the same for the encoder and the decoder:
a. I checked for invariants across the 13 KerasHub presets for D-FINE and found many low-level arguments that users wouldn't practically need to vary, such as the {}_prob, {}_scale, and {}_temperature args, which I've removed.
b. I've retained a select few arguments even though they are invariant across this set of presets, for example lqe_hidden_dim, because users may still want to configure them. Likewise, the denoising box_noise_scale and label_noise_ratio are seemingly invariant, but depending on a user's training scenario they might want to tune them, so I've kept them in.
This way we don't play code golf, and users don't have to instantiate the encoder and decoder separately, which keeps the API symmetric (otherwise it might feel inconsistent, e.g., why instantiate only the encoder manually?). At the same time, this reduces the argument count: we trim roughly 30 arguments just by doing this.
I also rechecked for numerics consistency, inference, preset loading, etc., and everything is maintained, so we're good.
I'd love to know your thoughts on the same, thanks!
    # Metadata for loading pretrained model weights.
    backbone_presets = {
probably easiest to leave this empty and only add once we have the actual kaggle handles
I’ve done it, but may I ask why we can’t have the presets with placeholders for the Kaggle handles just yet? Thanks!
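For context, the reviewer's suggestion amounts to shipping an empty registry until real handles exist. A minimal sketch of what that looks like; the commented-out entry is hypothetical (the preset name and handle are placeholders, not real Kaggle paths):

```python
# Metadata for loading pretrained model weights.
# Kept empty until the weights are actually uploaded to Kaggle;
# a placeholder handle would fail at `from_preset()` load time,
# which is presumably why real handles are wanted first.
backbone_presets = {}

# Hypothetical shape of an entry once a real handle exists:
# backbone_presets = {
#     "dfine_large_coco": {
#         "metadata": {"description": "D-FINE-L trained on COCO."},
#         "kaggle_handle": "kaggle://<org>/<model>/keras/<preset>/<version>",
#     },
# }
```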
Thanks for the reviews @mattdangerw. Yes, let's definitely cut down the complexity wherever possible for maintainability; I'll look into it!
@mattdangerw Could you please check whether all your comments have been addressed when you have the time? Thanks a lot!
Description of the change
Welcome D-FINE to the KerasHub family of models!
D-FINE, a powerful real-time object detector, sets a new state-of-the-art benchmark for object detection on KerasHub. It achieves outstanding localization precision by redefining the bounding box regression task in DETR models. Additionally, it incorporates lightweight optimizations in computationally intensive modules and operations, striking a better balance between speed and accuracy. Specifically, D-FINE-L/X achieves 54.0%/55.8% AP on the COCO dataset at 124/78 FPS on an NVIDIA T4 GPU. When pretrained on Objects365, D-FINE-L/X attains 57.1%/59.3% AP, surpassing all existing real-time detectors.
Closes the second half of, and thus completes, issue #2271.
KerasHub's D-FINE in action
Colab Notebook
D-FINE: Complete Workflow with Predictions and Numerics Matching
Checklist