
Add Fast Image Processor for Idefics3 #37045


Open: wants to merge 6 commits into main

Conversation

rootonchair (Contributor)

What does this PR do?

Related #36978

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions github-actions bot marked this pull request as draft March 27, 2025 16:25
github-actions (bot)

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@rootonchair rootonchair marked this pull request as ready for review March 27, 2025 16:32
@github-actions github-actions bot requested review from ydshieh and yonigozlan March 27, 2025 16:33
@yonigozlan left a comment (Member)

Hi @rootonchair , thanks for contributing this, overall looks really good! There is an ongoing discussion on what to do with Lanczos resampling, but in the meantime I'm curious what diff you get with Bicubic when testing against the slow processor (using Lanczos) in test_slow_fast_equivalence and test_slow_fast_equivalence_batched?

```diff
@@ -296,7 +296,7 @@ def __init__(
         do_convert_rgb: bool = True,
         do_resize: bool = True,
         size: Dict[str, int] = None,
-        resample: PILImageResampling = PILImageResampling.LANCZOS,
+        resample: PILImageResampling = PILImageResampling.BICUBIC,
```
@yonigozlan (Member)

We shouldn't change the default resampling method in the slow processor. I hadn't seen a processor with Lanczos as default before, and I realize that it is not supported in torch/torchvision. I'd say the best course of action for now is to keep using Lanczos for slow image processing, use Bicubic for fast, and add a warning_once in the fast processor telling users to fall back to slow for exact processing. You can follow the ongoing discussion on this here: #35206 (comment)


rootonchair commented Apr 1, 2025

> Hi @rootonchair , thanks for contributing this, overall looks really good! There is an ongoing discussion on what to do with Lanczos resampling, but in the meantime I'm curious what diff you get with Bicubic when testing against the slow processor (using Lanczos) in test_slow_fast_equivalence and test_slow_fast_equivalence_batched?

@yonigozlan thanks for your thorough reviews. So I did a benchmark that measures MAE between LANCZOS and other resamplers. As you can see, BICUBIC is the most suitable approximation among others

| Fast resampler | test_slow_fast_equivalence | test_slow_fast_equivalence_batched |
| --- | --- | --- |
| BICUBIC | 0.018636604771018028 | 0.05995668098330498 |
| BILINEAR | 0.05067451298236847 | 0.13565397262573242 |
| NEAREST | 0.2290235459804535 | 0.5087724924087524 |
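For context, the figures above are the mean absolute error (MAE) between the pixel outputs of the slow (LANCZOS) processor and the fast processor with each candidate resampler. A minimal sketch of the metric itself, using made-up pixel values (the real benchmark flattens full image tensors from both processors):

```python
def mean_absolute_error(a, b):
    """MAE over two equal-length flat sequences of pixel values."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical pixel values, for illustration only.
slow_pixels = [0.10, 0.50, 0.90]   # slow processor (LANCZOS)
fast_pixels = [0.12, 0.48, 0.91]   # fast processor (BICUBIC)
mae = mean_absolute_error(slow_pixels, fast_pixels)
```

A lower MAE means the fast resampler's output is closer to the slow Lanczos reference, which is why BICUBIC wins in the table above.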

@yonigozlan (Member)

> @yonigozlan thanks for your thorough reviews. So I did a benchmark that measures MAE between LANCZOS and other resamplers. As you can see, BICUBIC is the most suitable approximation among others

Very cool, thanks for that. Looks like Bicubic is the way to go indeed. It's not ideal to have non-negligible differences in the processing, but I guess it's the best we can do for now.

@yonigozlan left a comment (Member)

Thanks for iterating! Only thing left is adding a warning_once before forcing Bicubic resample, then LGTM!

Edit: the slow_fast_equivalence tests don't seem to pass on the CI. We might need a higher threshold for this specific case.

Edit2: we're also missing adding Idefics3ImageProcessorFast to docs/source/en/model_doc/idefics3.md, looks like something went wrong with the transformers-cli script.

Comment on lines 292 to 296:

```python
def _assertEquivalence(self, a, b):
    self.assertTrue(torch.allclose(a, b, atol=1e-1))
    self.assertLessEqual(torch.mean(torch.abs(a - b)).item(), 1e-3)
```
@yonigozlan (Member)
Nice 🤗

```python
# For an example of a fast image processor requiring more complex augmentations, see `LlavaNextImageProcessorFast`.

# Default values should be checked against the slow image processor
# None values left after checking can be removed
resample = PILImageResampling.BICUBIC
```
@yonigozlan (Member)

Let's keep Lanczos here, and override the resize function to force Bicubic if resample is Lanczos, with a warning_once so users are aware of the issue. See my comment here: #37140 (comment)

@rootonchair (Contributor, Author)

I have added the warning to the resize method. If all is good, I will apply the same changes to the Flava PR.

@yonigozlan left a comment (Member)

Hi @rootonchair ! I'm currently working on a refactor of the processing in idefics2/idefics3/smolvlm family which will impact this PR, here's the PR in question: #37291

Basically it will change the processing of batched images to fully flattened images, so it should simplify this PR. It looks like you're already handling image processing this way though, so maybe you won't have much to change.

In any case, I advise waiting for this refactor to be merged before iterating on this PR, I'll ping here when it's done!
