Support parquet file seek and tell in pyarrow for data checkpointing #45218

Open
aireenmei opened this issue Jan 10, 2025 · 0 comments

Comments

@aireenmei

Describe the enhancement requested

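As the Parquet format becomes popular for generative AI training, we need seek and tell support in pyarrow so that a data iterator's position in a Parquet file can be saved in a checkpoint and restored when training resumes. Something like a Python API for arrow::io::RandomAccessFile would work.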

Component(s)

Parquet, Python
