@osipovartem (Contributor) commented Jan 6, 2026

Based on Embucket/iceberg-rust#52 and Embucket/iceberg-rust#53.

  • For an overwrite operation we need to save the filtered data files for compatibility with other Iceberg engines, since they take into account all deleted data files and their stats within the current snapshot.
    Spark and Snowflake create an additional manifest to track all deleted files for compatibility:
Record 1:
  manifest_path: s3://...-m1.avro
  manifest_length: 7944
  partition_spec_id: 0
  content: 0
  sequence_number: 2
  min_sequence_number: 2
  added_snapshot_id: 8234522747398339109
  added_files_count: 1
  existing_files_count: 0
  deleted_files_count: 0
  added_rows_count: 6001215
  existing_rows_count: 0
  deleted_rows_count: 0
  partitions: []
  key_metadata: None

Record 2:
  manifest_path: s3://...-m0.avro
  manifest_length: 7942
  partition_spec_id: 0
  content: 0
  sequence_number: 2
  min_sequence_number: 2
  added_snapshot_id: 8234522747398339109
  added_files_count: 0
  existing_files_count: 0
  deleted_files_count: 1
  added_rows_count: 0
  existing_rows_count: 0
  deleted_rows_count: 6001215
  partitions: []
  key_metadata: None
  • This mechanism can also be used in the future to restore removed data.
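
The two manifest-list records above can be read as one overwrite snapshot split into an "added" manifest and a "deleted" manifest. The following is a minimal sketch of that bookkeeping, not the code from this PR: `DataFile`, `ManifestSummary`, `summarize_overwrite`, and the file paths are hypothetical names chosen for illustration, and only the counter fields shown in the dump are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry (path and row count only)."""
    path: str
    row_count: int


@dataclass
class ManifestSummary:
    """Counter fields mirroring the manifest-list records in the dump above."""
    added_files_count: int = 0
    existing_files_count: int = 0
    deleted_files_count: int = 0
    added_rows_count: int = 0
    existing_rows_count: int = 0
    deleted_rows_count: int = 0


def summarize_overwrite(removed, added):
    """Return (added_manifest, deleted_manifest) summaries for one overwrite.

    Assumption: files written by the overwrite go into one manifest, and the
    files it filtered out are recorded as deleted entries in a second one.
    """
    added_manifest = ManifestSummary(
        added_files_count=len(added),
        added_rows_count=sum(f.row_count for f in added),
    )
    deleted_manifest = ManifestSummary(
        deleted_files_count=len(removed),
        deleted_rows_count=sum(f.row_count for f in removed),
    )
    return added_manifest, deleted_manifest


# Hypothetical paths; the row count is taken from the records above.
old_files = [DataFile("data/old-0.parquet", 6001215)]
new_files = [DataFile("data/new-0.parquet", 6001215)]

added_manifest, deleted_manifest = summarize_overwrite(
    removed=old_files, added=new_files
)
print(added_manifest)
print(deleted_manifest)
```

Keeping the deleted entries in their own manifest means a reader of the snapshot can see which files were removed and how many rows they held without diffing against the parent snapshot, which is also what would make a later restore possible.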

@osipovartem osipovartem merged commit a3f8c73 into main Jan 6, 2026
3 checks passed
@osipovartem osipovartem deleted the aosipov/fix_merge_into_compatability branch January 6, 2026 14:54