Closed
Description
Feature Request / Improvement
When engines such as Daft read from the `Table` object (see `scan_iceberg`), it would be great if PyIceberg handled time travel transparently.
For example, to query an Iceberg table at a specific commit or timestamp, we can use PyIceberg to time travel to the particular snapshot-id or timestamp and then pass it into the engine.
There are several options to achieve this:
- Construct a `Table` object with the metadata of a specific `Snapshot`. Maybe a function like `Table.as_of(snapshot_id/timestamp) -> Table`. This will make time travel transparent to the engine.
- Pass the `Snapshot` object to the engine. The function `Table.snapshot_by_id -> Snapshot` already exists and returns a `Snapshot`, which represents a specific Iceberg commit. The engine will need to be able to read from both `Snapshot` and `Table`.
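
To make the first option concrete, here is a minimal sketch that uses stand-in `Table` and `Snapshot` dataclasses, not PyIceberg's real classes. The `as_of` method, its keyword arguments, and the `current_snapshot_id` field are hypothetical names for illustration. The timestamp rule (pick the latest snapshot committed at or before the given time) is an assumption about the desired semantics.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """Stand-in for an Iceberg snapshot (one commit). Not the PyIceberg class."""
    snapshot_id: int
    timestamp_ms: int


@dataclass(frozen=True)
class Table:
    """Stand-in for a table: every known snapshot plus the one being read."""
    snapshots: tuple[Snapshot, ...]
    current_snapshot_id: Optional[int] = None

    def snapshot_by_id(self, snapshot_id: int) -> Optional[Snapshot]:
        # Return the snapshot with this id, or None if it is unknown.
        return next((s for s in self.snapshots if s.snapshot_id == snapshot_id), None)

    def as_of(
        self,
        *,
        snapshot_id: Optional[int] = None,
        timestamp_ms: Optional[int] = None,
    ) -> Table:
        """Hypothetical API: return a copy of the table pinned to one snapshot."""
        if (snapshot_id is None) == (timestamp_ms is None):
            raise ValueError("pass exactly one of snapshot_id or timestamp_ms")
        if snapshot_id is not None:
            snap = self.snapshot_by_id(snapshot_id)
        else:
            # Assumed rule: latest snapshot committed at or before the timestamp.
            candidates = [s for s in self.snapshots if s.timestamp_ms <= timestamp_ms]
            snap = max(candidates, key=lambda s: s.timestamp_ms, default=None)
        if snap is None:
            raise ValueError("no matching snapshot")
        # The original table is left untouched; the engine receives the copy.
        return replace(self, current_snapshot_id=snap.snapshot_id)


table = Table(
    snapshots=(Snapshot(1, 1_000), Snapshot(2, 2_000), Snapshot(3, 3_000)),
    current_snapshot_id=3,
)
pinned = table.as_of(timestamp_ms=2_500)
print(pinned.current_snapshot_id)  # prints 2
```

Because `as_of` returns another `Table`, an engine that already accepts a `Table` would need no changes under this option, whereas the second option requires the engine to accept a `Snapshot` as well.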
Happy to explore other options as well.