feat: Declarative TableMetadata Builder #1362
Open
Which issue does this PR close?
Currently we don't have a way to create TableMetadata declaratively other than deserializing it from JSON. The TableMetadataBuilder mostly works imperatively, with methods like add_schema and add_partition_spec that mutate the state incrementally. While incremental modification is what most users need, it is also helpful to offer a type-safe way to create a new TableMetadata object from all of its attributes, without going through JSON deserialization.
My concrete use case: Lakekeeper stores TableMetadata in Postgres. We extract the data in a type-safe way, so we end up with various objects such as a list of schemas, partition_specs, the last_updated_ms, and all other fields required to build a new TableMetadata. As we already have Rust types, we don't want to serialize them to JSON first. We also can't use the TableMetadata builder, as its incremental nature would result in significant overhead.
Design considerations
Instead of using the builder approach, we could also add another method, try_from_parts, to impl TableMetadata. We have this alternative approach implemented here: https://github.com/lakekeeper/iceberg-rust/blob/a8b6509775b139c92772a999b0cbca637e274a03/crates/iceberg/src/spec/table_metadata.rs#L676-L740
I feel that a builder is less intrusive, though. I don't like adding UnnormalizedTableMetadata, but I didn't find a way to skip this type without writing significantly more code that needs to be maintained. Because of the derived builder in this PR, the approach is almost maintenance-free.
What changes are included in this PR?
TableMetadataDeclarativeBuilder
UnnormalizedTableMetadata as an intermediate type that is converted to TableMetadata via .try_normalize()
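To illustrate the idea behind the intermediate type, here is a minimal sketch of the "collect all parts, then normalize once" pattern. It is not the code in this PR: the structs below are heavily simplified, hypothetical stand-ins that only borrow the names UnnormalizedTableMetadata, TableMetadata and try_normalize, and the fields and validation rules are assumptions chosen for illustration. The PR derives a builder for the intermediate type; the sketch fills in a plain struct instead to stay dependency-free.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; NOT the real iceberg-rust types.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    schema_id: i32,
    field_names: Vec<String>,
}

/// Validated, normalized metadata (assumed shape for this sketch).
#[derive(Debug, PartialEq)]
struct TableMetadata {
    location: String,
    last_updated_ms: i64,
    current_schema_id: i32,
    schemas: HashMap<i32, Schema>,
}

/// All parts supplied at once, e.g. as loaded from a database.
/// Nothing has been validated at this point.
#[derive(Debug)]
struct UnnormalizedTableMetadata {
    location: String,
    last_updated_ms: i64,
    current_schema_id: i32,
    schemas: Vec<Schema>,
}

impl UnnormalizedTableMetadata {
    /// Validate and normalize everything in a single pass instead of
    /// mutating a builder one schema at a time.
    fn try_normalize(self) -> Result<TableMetadata, String> {
        let mut schemas = HashMap::new();
        for schema in self.schemas {
            let id = schema.schema_id;
            // Assumed rule: schema ids must be unique.
            if schemas.insert(id, schema).is_some() {
                return Err(format!("duplicate schema id {id}"));
            }
        }
        // Assumed rule: the current schema must be among the given schemas.
        if !schemas.contains_key(&self.current_schema_id) {
            return Err(format!(
                "current schema id {} not found",
                self.current_schema_id
            ));
        }
        Ok(TableMetadata {
            // Assumed normalization: strip trailing slashes from the location.
            location: self.location.trim_end_matches('/').to_string(),
            last_updated_ms: self.last_updated_ms,
            current_schema_id: self.current_schema_id,
            schemas,
        })
    }
}
```

A caller that already holds typed parts would populate UnnormalizedTableMetadata directly and call try_normalize() once, so there is neither a JSON round trip nor repeated per-step validation.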
Are these changes tested?
yes