-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dry run for backfill #45062
base: main
Are you sure you want to change the base?
Add dry run for backfill #45062
Conversation
a7a1efd
to
d288934
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed with daniel, maybe a separate endpoint makes more sense to avoid mixed returned type BackfillResponse | BackfillDryRunResponse
on the same endpoint. That's hard to handle for clients.
Created a separate endpoint for dry run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looking nice.
A few improvement suggestions.
Note: As PR #45312 has been merged, the code formatting rules have changed for new UI. Please rebase and re-run pre-commit checks to ensure that formatting in folder airflow/ui is adjusted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looking good.
A few suggestions.
I would wait for @dstandish approval before merging that, just to be sure that the backfill logic is correct. (It looks good to me)
airflow/models/backfill.py
Outdated
@@ -158,72 +151,125 @@ def validate_sort_ordinal(self, key, val): | |||
def _create_backfill_dag_run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you not make changes to this function but just keep it simple and essentially do the following:
Do just like done for CLI, i.e. in _do_dry_run
, but add the extra step of checking whether it would actually create the run? I.e. extra logic check for existence?
We can use the same function for both cli and api.
essentially, we just need to return the list "these rows would be created". I don't think we need to modify the path where we actually create the runs at this time. Let me know what you think of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation is based on the following comments.
#45062 (comment)
#45062 (comment)
This can be used by both cli and api. The modifications to _create_backfill_dag_run
shouldn't affect anything related to existing cli or api functionality. Are there any potential downsides to modifying this path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are really close to it if I understand correctly.
Daniel suggestion will clearly split code path for dry_run
vs non_dry_run
instead of having both intertwined which I think is better.
_create_backfill
signature is now correct. Then we don't modify the _create_backfill_dag_run
but we just need to extract the dry_run
logic inside its specific function _do_dry_run
. I think it will be cleaner because _create_backfill_dag_run
is already complicated, and adding the dry_run
stuff adds even more to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed and clarified the approach with @dstandish. Have implemented the same.
Please take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only thing i changed is to not lock the dag runs for dry run since it's not necessary.
Closes #44395
Response: