Fix S3ToGCSOperator deferrable mode to return list of copied files #63533

Open
yuseok89 wants to merge 3 commits into apache:main from yuseok89:s3-to-gcs-deferrable-return-files
Conversation

yuseok89 (Contributor) commented Mar 13, 2026

This PR addresses functionality originally proposed in #49768. Since that PR has been stalled, and #11323 is effectively blocked by it, I've proceeded with this implementation in the hope of unblocking progress.

If the original author of #49768 returns and continues that work, I can close this PR so that the original one proceeds instead. If not, would it be acceptable to merge this PR so that deferrable S3→GCS transfers can return the list of copied files and downstream tasks can consume them via XCom?

S3ToGCSOperator in deferrable mode now returns the list of copied file paths (same as non-deferrable mode), so downstream tasks can consume them via XCom. This fixes the previous inconsistency where deferrable=True did not return any value.

Verified in testing: both deferrable=True and deferrable=False return file names, not URIs. Returning URIs instead could be addressed in a separate PR if desired.
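
To illustrate the consistency described above, here is a minimal sketch using a toy stand-in class. It is not the real `S3ToGCSOperator`, and it does not show how this PR actually carries the file list through deferral. The class name `ToyTransfer` and the `"files"` key in the event payload are assumptions made purely for illustration.

```python
from __future__ import annotations

from typing import Any


class ToyTransfer:
    """Hypothetical stand-in for illustration; not the real S3ToGCSOperator."""

    def execute(self, context: dict[str, Any], files: list[str]) -> list[str]:
        # Non-deferrable path: the copy happens synchronously and the
        # file names are returned, so they end up in XCom.
        return list(files)

    def execute_complete(
        self, context: dict[str, Any], event: dict[str, Any]
    ) -> list[str]:
        # Deferrable path: called after the trigger fires.
        if event["status"] == "error":
            raise RuntimeError(event["message"])
        # Assumption: the file list travels in the event under "files".
        return list(event["files"])


op = ToyTransfer()
sync_result = op.execute({}, ["a.csv", "b.csv"])
deferred_result = op.execute_complete(
    {}, {"status": "success", "message": "done", "files": ["a.csv", "b.csv"]}
)
```

In this sketch both paths return the same list of file names, which is the behaviour the PR description aims for in the real operator.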

Screenshots

Deferrable state

image

XCom

image
Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)
    • Cursor

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@boring-cyborg boring-cyborg bot added the area:providers and provider:google (Google (including GCP) related issues) labels Mar 13, 2026
@yuseok89 yuseok89 marked this pull request as ready for review March 13, 2026 23:52
@yuseok89 yuseok89 requested a review from shahar1 as a code owner March 13, 2026 23:52
SameerMesiah97 (Contributor) left a comment

Looks good. Just one nit if you want to implement it.

potiuk (Member) commented Mar 16, 2026

@yuseok89 This PR has a few issues that need to be addressed before it can be reviewed — please see our Pull Request quality criteria.

Issues found:

  • ⚠️ Testing requirements: The two new tests (test_execute_complete_success_returns_copied_files and the updated test_execute_complete_success) both pass context=mock.MagicMock() without a spec or autospec. Per the code review checklist, unspecced mocks silently accept any attribute access and can hide real bugs. The context mock is inherited from the pre-existing test pattern, but the new test also follows the same unspecced pattern.
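
As a rough illustration of the concern about unspecced mocks, the sketch below contrasts a bare `MagicMock` with one built from a spec. `TaskContext` is a hypothetical class invented for this example; it is not Airflow's real context type.

```python
from unittest import mock


class TaskContext:
    """Hypothetical context type, used only to give the mock a spec."""

    ds: str = "2026-03-16"


# An unspecced mock accepts any attribute access, including typos.
loose = mock.MagicMock()
loose_value = loose.not_a_real_attribute  # no error is raised

# A specced mock only allows attributes that exist on the spec.
strict = mock.MagicMock(spec=TaskContext)
strict.ds  # fine: "ds" exists on TaskContext

try:
    strict.not_a_real_attribute
    typo_detected = False
except AttributeError:
    typo_detected = True
```

With the spec in place, a misspelled or removed attribute fails the test instead of silently returning another mock.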

Note: Your branch is 67 commits behind main. Some check failures may be caused by changes in the base branch rather than by your PR. Please rebase your branch and push again to get up-to-date CI results.

What to do next:

  • The comment informs you what you need to do.
  • Fix each issue, then mark the PR as "Ready for review" in the GitHub UI - but only after making sure that all the issues are fixed.
  • There is no rush — take your time and work at your own pace. We appreciate your contribution and are happy to wait for updates.
  • Maintainers will then proceed with a normal review.

If you have questions, feel free to ask on the Airflow Slack.

@yuseok89 yuseok89 force-pushed the s3-to-gcs-deferrable-return-files branch from c762874 to eeb1839 Compare March 16, 2026 13:02
@yuseok89 yuseok89 force-pushed the s3-to-gcs-deferrable-return-files branch from a54e0ac to 5ed2630 Compare March 16, 2026 13:12
yuseok89 (Contributor, Author) commented

@potiuk
Thanks for the review. I initially kept MagicMock for consistency with the pre-existing test pattern, but I've updated the related execute_complete tests to use context={} instead. I've also rebased onto main as suggested.
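
A small sketch of why `context={}` is the stricter choice: a plain dict fails loudly on any unexpected lookup, whereas a `MagicMock` context lets it pass. Both callbacks below are hypothetical stand-ins, not the operator's real `execute_complete`, and the `"files"` event key is an assumption.

```python
from unittest import mock


def complete(context, event):
    """Hypothetical callback that never touches the context."""
    return event["files"]


def buggy_complete(context, event):
    """Hypothetical callback with a hidden dependency on context["ti"]."""
    context["ti"].xcom_push(key="files", value=event["files"])
    return event["files"]


event = {"files": ["a.csv"]}

# A callback that ignores the context works with an empty dict.
result = complete({}, event)

# A MagicMock context hides the unexpected lookup...
hidden = buggy_complete(mock.MagicMock(), event)

# ...while an empty dict surfaces it as a KeyError.
try:
    buggy_complete({}, event)
    surfaced = False
except KeyError:
    surfaced = True
```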

Labels

area:providers, provider:google (Google (including GCP) related issues)

3 participants