feat: Declarative TableMetadata Builder #1362

Open
wants to merge 2 commits into
base: main

c-thiel (Collaborator) commented May 21, 2025

Which issue does this PR close?

Currently there is no way to create TableMetadata declaratively other than deserializing it from JSON.
The TableMetadataBuilder works mostly imperatively, with methods like add_schema and add_partition_spec that mutate state incrementally. Incremental modification is what most users need, but it is also helpful to offer a type-safe way to create a new TableMetadata object from all of its attributes without going through JSON.
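
To illustrate the difference between the two styles, here is a minimal sketch using toy stand-in types. `ToyMetadata`, `ToyIncrementalBuilder`, and `toy_from_parts` are hypothetical names invented for this example; they are not the iceberg-rust API, and the real builder's checks are more involved.

```rust
use std::collections::HashSet;

// Toy stand-ins for illustration; not the real iceberg-rust types or checks.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub schema_id: i32,
}

#[derive(Debug, PartialEq)]
pub struct ToyMetadata {
    pub schemas: Vec<Schema>,
    pub current_schema_id: i32,
}

/// Imperative style: every call mutates the builder and re-validates.
#[derive(Default)]
pub struct ToyIncrementalBuilder {
    schemas: Vec<Schema>,
    current_schema_id: Option<i32>,
}

impl ToyIncrementalBuilder {
    pub fn add_schema(mut self, schema: Schema) -> Result<Self, String> {
        // Checked on every single call.
        if self.schemas.iter().any(|s| s.schema_id == schema.schema_id) {
            return Err(format!("duplicate schema id {}", schema.schema_id));
        }
        self.schemas.push(schema);
        Ok(self)
    }

    pub fn set_current_schema(mut self, id: i32) -> Result<Self, String> {
        if !self.schemas.iter().any(|s| s.schema_id == id) {
            return Err(format!("unknown schema id {}", id));
        }
        self.current_schema_id = Some(id);
        Ok(self)
    }

    pub fn build(self) -> Result<ToyMetadata, String> {
        let current_schema_id = self
            .current_schema_id
            .ok_or_else(|| "current schema not set".to_string())?;
        Ok(ToyMetadata { schemas: self.schemas, current_schema_id })
    }
}

/// Declarative style: the caller already holds every attribute,
/// so all invariants are validated once, in a single pass.
pub fn toy_from_parts(
    schemas: Vec<Schema>,
    current_schema_id: i32,
) -> Result<ToyMetadata, String> {
    let mut seen = HashSet::new();
    for s in &schemas {
        if !seen.insert(s.schema_id) {
            return Err(format!("duplicate schema id {}", s.schema_id));
        }
    }
    if !seen.contains(&current_schema_id) {
        return Err(format!("unknown schema id {}", current_schema_id));
    }
    Ok(ToyMetadata { schemas, current_schema_id })
}
```

In this toy model both paths produce the same value; the declarative one simply avoids replaying each attribute as a separate mutation.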

My concrete use case: Lakekeeper stores TableMetadata in Postgres. We extract the data in a type-safe way, so we end up with various objects such as a list of schemas, partition_specs, last_updated_ms, and all other fields required to build a new TableMetadata. As we already have Rust types, we don't want to serialize them to JSON first. We also can't use the TableMetadataBuilder, as its incremental nature would result in significant overhead.

Design considerations

Instead of using the builder approach, we could also add another method try_from_parts to impl TableMetadata. We have this alternative approach implemented here: https://github.com/lakekeeper/iceberg-rust/blob/a8b6509775b139c92772a999b0cbca637e274a03/crates/iceberg/src/spec/table_metadata.rs#L676-L740

I feel that a builder is less intrusive, though. I don't like adding UnnormalizedTableMetadata, but I didn't find a way to avoid this type without writing significantly more code that would need to be maintained. Because the builder in this PR is derived, the approach is almost maintenance-free.

What changes are included in this PR?

  • Introduce a new derived TableMetadataDeclarativeBuilder
  • Introduce UnnormalizedTableMetadata as an intermediate type that is converted to TableMetadata via .try_normalize()
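
The two-step flow described in the list above could look roughly like the following sketch. The type names `TableMetadataDeclarativeBuilder`, `UnnormalizedTableMetadata`, and the method `try_normalize` come from this PR; everything else is an assumption made for illustration: the fields are reduced to three, the builder is hand-written rather than derived, the setter and `build` names are guesses, and errors are plain `String`s.

```rust
use std::collections::HashMap;

// Simplified stand-ins for illustration; not the real iceberg-rust types.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub schema_id: i32,
}

#[derive(Debug, PartialEq)]
pub struct TableMetadata {
    schemas: HashMap<i32, Schema>,
    current_schema_id: i32,
    last_updated_ms: i64,
}

impl TableMetadata {
    pub fn current_schema(&self) -> &Schema {
        // Present by construction: try_normalize verified the id.
        &self.schemas[&self.current_schema_id]
    }

    pub fn last_updated_ms(&self) -> i64 {
        self.last_updated_ms
    }
}

/// All attributes supplied at once, but not yet validated.
#[derive(Debug)]
pub struct UnnormalizedTableMetadata {
    schemas: Vec<Schema>,
    current_schema_id: i32,
    last_updated_ms: i64,
}

/// Hand-written stand-in for the derived builder (setter names assumed).
#[derive(Default)]
pub struct TableMetadataDeclarativeBuilder {
    schemas: Option<Vec<Schema>>,
    current_schema_id: Option<i32>,
    last_updated_ms: Option<i64>,
}

impl TableMetadataDeclarativeBuilder {
    pub fn schemas(mut self, schemas: Vec<Schema>) -> Self {
        self.schemas = Some(schemas);
        self
    }

    pub fn current_schema_id(mut self, id: i32) -> Self {
        self.current_schema_id = Some(id);
        self
    }

    pub fn last_updated_ms(mut self, ms: i64) -> Self {
        self.last_updated_ms = Some(ms);
        self
    }

    /// Only checks that every field was provided; no cross-field validation.
    pub fn build(self) -> Result<UnnormalizedTableMetadata, String> {
        Ok(UnnormalizedTableMetadata {
            schemas: self.schemas.ok_or_else(|| "schemas missing".to_string())?,
            current_schema_id: self
                .current_schema_id
                .ok_or_else(|| "current_schema_id missing".to_string())?,
            last_updated_ms: self
                .last_updated_ms
                .ok_or_else(|| "last_updated_ms missing".to_string())?,
        })
    }
}

impl UnnormalizedTableMetadata {
    /// Validates cross-field invariants once and produces the final type.
    pub fn try_normalize(self) -> Result<TableMetadata, String> {
        let mut schemas = HashMap::new();
        for schema in self.schemas {
            if schemas.insert(schema.schema_id, schema).is_some() {
                return Err("duplicate schema id".to_string());
            }
        }
        if !schemas.contains_key(&self.current_schema_id) {
            return Err(format!("current schema id {} not found", self.current_schema_id));
        }
        Ok(TableMetadata {
            schemas,
            current_schema_id: self.current_schema_id,
            last_updated_ms: self.last_updated_ms,
        })
    }
}
```

Keeping the unvalidated intermediate type separate means the builder itself can stay mechanical (and therefore derivable), while all invariants live in one place, `try_normalize`.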

Are these changes tested?

Yes.

c-thiel (Collaborator, Author) commented May 21, 2025

@liurenjie1024 @Xuanwo Looking forward to your thoughts!
