Skip to content

Add the ability to parse workflows from AVIF images #4420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

FerrahWolfeh
Copy link

@FerrahWolfeh FerrahWolfeh commented Jul 11, 2025

Adds the ability to import workflows directly from avif files. Specially useful when using specific nodes that allow the user to specify the output format of the saved image or some tools that allow converting png files into AVIF while preserving their metadata.

The added code works normally as expected, though I plan to create a more robust implementation in the future by reusing some code from the isobmff.ts file.

┆Issue is synchronized with this Notion page by Unito

@FerrahWolfeh FerrahWolfeh requested a review from a team as a code owner July 11, 2025 00:33
@christian-byrne
Copy link
Contributor

Wow, a PR adding support for open, royalty-free image file format that people have requested.

And you are searching by using the binary tags instead of just decoding it all as text and searching for string matches linearly. That's seriously amazing.

Was this inspired by Pillow adding AVIF support last week?

@FerrahWolfeh
Copy link
Author

FerrahWolfeh commented Jul 11, 2025

Wow, a PR adding support for open, royalty-free image file format that people have requested.

And you are searching by using the binary tags instead of just decoding it all as text and searching for string matches linearly. That's seriously amazing.

Was this inspired by Pillow adding AVIF support last week?

Wait, did Pillow just add AVIF support? That's pretty neat!

Also, thanks for the compliment, I really appreciate that.

Currently I'm using a "brute-forcey" way of finding the exif tag inside the AVIF header. Going the binary route is actually faster and opens a way for me to correctly scan for the boxes in the future, as sometimes, the location of the mdat box and the exif header may vary.

@christian-byrne
Copy link
Contributor

christian-byrne commented Jul 11, 2025

Wait, did Pillow just add AVIF support? That's pretty neat!

Yep. I think from python-pillow/Pillow#8858, AVIF is supported without requiring specific wheel. If you run pip install -U Pillow then start ComfyUI, AVIF images will be loadable, it's awesome

Selection_1677

Currently I'm using a "brute-forcey" way of finding the exif tag inside the AVIF header. Going the binary route is actually faster and opens a way for me to correctly scan for the boxes in the future, as sometimes, the location of the mdat box and the exif header may vary.

I totally agree. If you look at some of our decoding implementations in https://github.com/Comfy-Org/ComfyUI_frontend/tree/main/src/scripts/metadata, you will find some lazy approaches. Your approach seems proper and should be 2-3 magnitudes of order faster. I will do some research into AVIF format, I wrote the ISOBMFF decoder somewhat recently and had a lot of fun.

@christian-byrne
Copy link
Contributor

If you are able to create a simple AVIF with a workflow embedded, you can add a test case by just adding the file to this folder: https://github.com/Comfy-Org/ComfyUI_frontend/tree/main/browser_tests/assets

then just putting the filename in this list:

test.describe('Load Workflow in Media', () => {
const fileNames = [
'workflow.webp',
'edited_workflow.webp',
'no_workflow.webp',
'large_workflow.webp',
'workflow.webm',
// Skipped due to 3d widget unstable visual result.
// 3d widget shows grid after fully loaded.
// 'workflow.glb',
'workflow.mp4',
'workflow.mov',
'workflow.m4v',
'workflow.svg'
]

It will run the browser emulation and test dropping the AVIF onto the graph and loading the workflow.

@FerrahWolfeh FerrahWolfeh requested a review from a team as a code owner July 11, 2025 22:42
@christian-byrne christian-byrne added the New Browser Test Expectations New browser test screenshot should be set by github action label Jul 12, 2025
@christian-byrne
Copy link
Contributor

Oh sorry I forgot to mention that we generate the screenshots in the GitHub test runner automatically by adding the "New Browser Test Expectations" label. I will generate the screenshot now!

@webfiltered
Copy link
Contributor

N.B. Workflow needs to be approved after adding label.

@christian-byrne christian-byrne added New Browser Test Expectations New browser test screenshot should be set by github action and removed New Browser Test Expectations New browser test screenshot should be set by github action labels Jul 13, 2025
@christian-byrne
Copy link
Contributor

Give us a moment to fix this, apologies.

@FerrahWolfeh
Copy link
Author

FerrahWolfeh commented Jul 14, 2025

Is there anything I need to do with the code or the snapshots? Because I'm pretty lost about what should be done on my end...

@webfiltered
Copy link
Contributor

Is there anything I need to do with the code or the snapshots? Because I'm pretty lost about what should be done on my end...

Nope! Not unless you want to set your monitor to a MUCH smaller res, nuke all your settings, and upload a new screenshot? 😃

I'll just do a quick follow-up afterwards.

@webfiltered
Copy link
Contributor

Oh ah, actually - could you delete the existing playwright screenshot? Will be able to merge aftter it's all tested, but we may as well not add it to the commit history seeing as it's about to be deleted.

@FerrahWolfeh
Copy link
Author

Ok, done.

Copy link
Contributor

@webfiltered webfiltered left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM w/minor changes - thank you for this!

@christian-byrne
Copy link
Contributor

christian-byrne commented Jul 15, 2025

Should we just merge this and add the screenshot in followup? Then again, we want the update expectations to be runnable by any contributor, especially with new PR requirements.

@webfiltered
Copy link
Contributor

webfiltered commented Jul 15, 2025

Was literally halfway through merging this yesterday when internet problems started.

For the root issue, the action would need to run on the fork. Which means it can't be based on a PR label. But should still be very doable.

@christian-byrne
Copy link
Contributor

Okay nice, we can merge as part of 1.25 then later today!

@FerrahWolfeh
Copy link
Author

FerrahWolfeh commented Jul 17, 2025

For the root issue, the action would need to run on the fork. Which means it can't be based on a PR label. But should still be very doable.

Alright, I can do that. But which workflow in the actions menu should I run then if at all?

@webfiltered
Copy link
Contributor

Alright, I can do that. But which workflow in the actions menu should I run then if at all?

We'll need to fix our workflows to be fork-friendly. We're good to merge this - just waiting for some bug fixes to go in, so this can land in the version after that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
New Browser Test Expectations New browser test screenshot should be set by github action
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants