NameMapping flattens the names and causes a.b field to collide with child b field of field a #935

@sungwy

Apache Iceberg version

None

Please describe the bug 🐞

According to the Iceberg documentation on Column Projection:

A name may contain . but this refers to a literal name, not a nested field. For example, a.b refers to a field named a.b, not child field b of field a.

The current implementation of NameMapping flattens names by joining each parent and child name with a "." separator. This causes name collisions between fields that should not collide with each other.

For example, in this flat map a field named a.b collides with the child field b of field a.
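The collision can be reproduced with a minimal sketch. The `Field` class and `flatten` helper below are hypothetical stand-ins written for illustration; they are not pyiceberg's `MappedField` or `_IndexByName`, and only mimic the dot-joined flattening described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Field:
    """Simplified stand-in for a mapped field (hypothetical, not pyiceberg's MappedField)."""
    field_id: int
    names: List[str]
    fields: List["Field"] = field(default_factory=list)


def flatten(fields: List[Field], prefix: str = "") -> Dict[str, int]:
    """Index field IDs by dot-joined path, mimicking the flattening described above."""
    index: Dict[str, int] = {}
    for f in fields:
        for name in f.names:
            full = f"{prefix}.{name}" if prefix else name
            # A later entry with the same key silently overwrites an earlier one.
            index[full] = f.field_id
            index.update(flatten(f.fields, full))
    return index


schema = [
    Field(1, ["a"], [Field(2, ["b"])]),  # struct "a" with child "b"
    Field(3, ["a.b"]),                   # top-level field literally named "a.b"
]

flat = flatten(schema)
```

In this sketch both the nested child and the literal field map to the key "a.b", so only one of them survives in the flat dict.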

We should update the _field_by_name() and find() methods of NameMapping to use a tree structure instead of a flat dict, and traverse the tree to retrieve the MappedField for the provided name.

@cached_property
def _field_by_name(self) -> Dict[str, MappedField]:
    return visit_name_mapping(self, _IndexByName())

def find(self, *names: str) -> MappedField:
    name = ".".join(names)
    try:
        return self._field_by_name[name]
    except KeyError as e:
        raise ValueError(f"Could not find field with name: {name}") from e
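One possible shape for the tree-based lookup is sketched below. It is an illustration under assumptions, not the actual fix: `Node` and the free function `find` are hypothetical and stand in for `MappedField` and `NameMapping.find()`. The idea is to match each path segment literally at one level of the tree instead of joining the segments into a single key.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Simplified stand-in for a mapped field (hypothetical, not pyiceberg's MappedField)."""
    field_id: int
    names: List[str]
    fields: List["Node"] = field(default_factory=list)


def find(fields: List[Node], *names: str) -> Node:
    """Walk the tree one path segment at a time; each segment is matched literally."""
    if not names:
        raise ValueError("At least one name is required")
    current: Optional[Node] = None
    level = fields
    for segment in names:
        # Look only among the children of the previously matched field.
        current = next((f for f in level if segment in f.names), None)
        if current is None:
            raise ValueError(f"Could not find field with path: {names!r}")
        level = current.fields
    return current


mapping = [
    Node(1, ["a"], [Node(2, ["b"])]),  # struct "a" with child "b"
    Node(3, ["a.b"]),                  # top-level field literally named "a.b"
]

nested = find(mapping, "a", "b")   # child "b" of field "a"
literal = find(mapping, "a.b")     # top-level field named "a.b"
```

With this approach `find(mapping, "a", "b")` and `find(mapping, "a.b")` resolve to different fields, because a "." inside a segment is never treated as a path separator.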
