-
Notifications
You must be signed in to change notification settings - Fork 332
Create rollback and set snapshot APIs #758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
chinmay-bhat
wants to merge
15
commits into
apache:main
Choose a base branch
from
chinmay-bhat:rollback_set_current_snapshot_op
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+282
−70
Open
Changes from all commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
f22d46d
support rollback and set current snapshot operations
chinmay-bhat 45c25db
add tests
chinmay-bhat ca63831
use tbl.history() instead of ancestors_of()
chinmay-bhat f7e192a
improve docstrings
chinmay-bhat 6859fa4
Revert "use tbl.history() instead of ancestors_of()"
chinmay-bhat ea0e645
find ancestor before timestamp
chinmay-bhat dc4028b
update tests
chinmay-bhat 7fba98b
small fix
chinmay-bhat 1f4a404
fix test error
chinmay-bhat 59f1626
fixes based on review
chinmay-bhat 7c7907b
add parameter to control when transaction is committed
chinmay-bhat e563b7e
move _set_ref_snapshot
chinmay-bhat 386496f
changes after review
chinmay-bhat 5adccb9
move test and use constants
chinmay-bhat 8885f78
fix docstring and returns
chinmay-bhat File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to Java implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only issue here is that
self.commit
will commit the transaction if theManageSnapshot
object comes fromiceberg-python/pyiceberg/table/__init__.py
Lines 1508 to 1521 in 2252e71
where
autocommit
is set to true.One possible way to fix this is that we can add additional parameters in
transaction._apply
to override the autocommit behavior and call that directly here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated! Now there's an extra parameter
commit_transaction_now
that defaults to True, and we override it to False when staged refs need to be applied without commiting the transaction.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I'm re-opening this resolved conversation, since I don't think adding the additional parameter is enough.
Say, in the future, we have more APIs like:
The updates and requirements would be :
(SetSnapshotRefUpdate(action='set-snapshot-ref', ref_name='test_branch_min_snapshots_to_keep', type='branch', snapshot_id=71191752302974125, max_ref_age_ms=None, max_snapshot_age_ms=None, min_snapshots_to_keep=None), SetSnapshotRefUpdate(action='set-snapshot-ref', ref_name='test_branch_min_snapshots_to_keep', type='branch', snapshot_id=71191752302974125, max_ref_age_ms=None, max_snapshot_age_ms=None, min_snapshots_to_keep=2))
(AssertRefSnapshotId(type='assert-ref-snapshot-id', ref='test_branch_min_snapshots_to_keep', snapshot_id=None), AssertRefSnapshotId(type='assert-ref-snapshot-id', ref='test_branch_min_snapshots_to_keep', snapshot_id=71191752302974125))
The 2nd requirement will fail with a
CommitFailedException
as the branch would be missing.With
_commit_if_ref_updates_exist()
, thetransaction.table_metadata
would get updated, but when the transaction exits, it will try tocommit_transaction()
which runs_do_commit()
which runs_commit_table()
.In
_commit_table()
, for non-REST catalogs, we_update_and_stage_table()
where we check the requirements with current table metadata, here the 2nd requirement fails.To fix this, we might consider one of the following solutions:
transaction._apply
identify the differences between current table metadata and staged metadata, and only pass those differences inself._updates
, while not sending the ref updates requirements (since we've already validated them once intransaction._apply
) OR_update_and_stage_table()
to iteratively apply the update with corresponding requirement and always check the requirements with updated_metadata. This is easier than (1), but only serves non-REST catalogs. ORcommit_if_ref_exists()
, the Transaction commits to the table. This would be expensive IMO, but the result would remain atomic and correct, with minimal changes in the PR.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chinmay-bhat Thank you so much for digging into this issue! I think you've made a great point. I am thinking of a similar solution like your first point: to derive a list of requirements when we commit the transaction: https://github.com/apache/iceberg/blob/d69ba0568a2e07dfb5af233350ad5668d9aef134/core/src/main/java/org/apache/iceberg/UpdateRequirements.java#L50-L58
This will save us from manually specifying requirements for every
UpdateTableMetadata
definition and also prevent the problems described above.Let me research more on this and get back to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @HonahX, should I make a new issue for this? Since changing how we specify requirements is not strictly in the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @chinmay-bhat. Sorry for the long wait🙏. I was distracted by other stuff and some blocking issues for 0.7.0 release. Yes, please feel free to create an issue to further discuss it. I can reply to that when I get something.