This issue was moved to a discussion.

Generic OS Error when writing Delta on S3 bucket with rust engine #3197

Closed
montanarograziano opened this issue Feb 9, 2025 · 2 comments
Labels
binding/python Issues for the Python package bug Something isn't working

Comments

@montanarograziano

Environment

Delta-rs version: deltalake package version 0.24.0

Binding:

Environment: Python 3.10.0

  • Cloud provider:
  • OS: macOS Sequoia 15.2
  • Other:

Bug

What happened:
Probably related to #2639.
I'm trying to create a Delta table in an S3 bucket using write_deltalake, but using rust as the engine causes an OSError (source: error sending request for url). This doesn't happen when using pyarrow.
Here's a snippet of what I'm doing. In this case I'm trying to overwrite an existing Delta table (due to errors), processing it in chunks as it doesn't fit in memory.

import os

import polars as pl
from deltalake import write_deltalake

# `df` (a polars LazyFrame) and `path` (the S3 table URI) are defined elsewhere.
storage_options = {
    "AWS_ACCESS_KEY_ID": os.environ.get("AWS_ACCESS_KEY_ID", ""),
    "AWS_SECRET_ACCESS_KEY": os.environ.get("AWS_SECRET_ACCESS_KEY", ""),
    "AWS_REGION": os.environ.get("AWS_DEFAULT_REGION", ""),
}
# The first non-empty chunk overwrites the table; later chunks are appended.
mode = "overwrite"
for year in [1970, 2023, 2024]:
    for month in range(1, 13):
        # Materialize one (year, month) chunk at a time to limit memory use.
        part = df.filter(pl.col("year") == year, pl.col("month") == month).collect()
        if not part.is_empty():
            print(year, month)
            write_deltalake(
                path,
                part.to_arrow(),
                mode=mode,
                storage_options=storage_options,
                engine="rust",
            )
            mode = "append"

And here's the exception:

[Image: screenshot of the exception]

What you expected to happen:

How to reproduce it:

More details:

@montanarograziano montanarograziano added the bug Something isn't working label Feb 9, 2025
@ion-elgreco
Collaborator

ion-elgreco commented Feb 9, 2025

The Rust writer is currently a bit slower at writing. Setting a longer timeout, for example storage_options = {"timeout": "60s"}, should help by giving the ObjectStore requests time to finish before they error out.
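
For readers applying this to the snippet in the issue, here is a minimal sketch of adding the suggested "timeout" key to the existing S3 options. The helper name build_storage_options is hypothetical and only for illustration; the "60s" value is the one suggested above and may need tuning for larger chunks.

```python
import os


def build_storage_options(timeout: str = "60s") -> dict[str, str]:
    """Return the S3 options from the issue plus a request timeout.

    `build_storage_options` is a hypothetical helper, not part of deltalake.
    """
    return {
        "AWS_ACCESS_KEY_ID": os.environ.get("AWS_ACCESS_KEY_ID", ""),
        "AWS_SECRET_ACCESS_KEY": os.environ.get("AWS_SECRET_ACCESS_KEY", ""),
        "AWS_REGION": os.environ.get("AWS_DEFAULT_REGION", ""),
        # Longer request timeout, as suggested in the comment above.
        "timeout": timeout,
    }


storage_options = build_storage_options()
```

The resulting dict would then be passed to write_deltalake through storage_options=storage_options, exactly as in the original loop.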

@rtyler rtyler added the binding/python Issues for the Python package label Feb 9, 2025
@montanarograziano
Author

Thank you, that worked!

@delta-io delta-io locked and limited conversation to collaborators Feb 14, 2025
@ion-elgreco ion-elgreco converted this issue into discussion #3217 Feb 14, 2025
