Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(duckdb): set secrets via SecretsManager #9740

Open
gforsyth opened this issue Jul 31, 2024 Discussed in #9633 · 7 comments
Open

feat(duckdb): set secrets via SecretsManager #9740

gforsyth opened this issue Jul 31, 2024 Discussed in #9633 · 7 comments
Labels
duckdb The DuckDB backend feature Features or general enhancements

Comments

@gforsyth
Copy link
Member

Discussed in #9633

Originally posted by andersbogsnes July 18, 2024
Now that DuckDB has implemented a proper credentials manager, is there some way of supporting that through Ibis natively?

https://duckdb.org/docs/configuration/secrets_manager

Today you'd either have to go via FSSpec which is going to be slower than native Duckdb, or make sure to run a .raw_sql('CREATE SECRET ...') beforehand.

Maybe some kind of set_credentials option for the backend?

@gforsyth gforsyth changed the title feat(duckdb); set DuckDB Secrets via SecretsManager feat(duckdb): set DuckDB Secrets via SecretsManager Jul 31, 2024
@gforsyth gforsyth changed the title feat(duckdb): set DuckDB Secrets via SecretsManager feat(duckdb): set secrets via SecretsManager Jul 31, 2024
@cpcloud
Copy link
Member

cpcloud commented Jul 31, 2024

Wonder if we could also or instead of set_credentials add secrets kwargs to connect?

@gforsyth gforsyth added feature Features or general enhancements duckdb The DuckDB backend labels Aug 7, 2024
@IndexSeek
Copy link
Member

This is something that came up for me recently! I used the .raw_sql('CREATE SECRET ...') approach with the DuckDB backend to connect to an S3-compatible storage provider.

Other backends, like Snowflake and Microsoft SQL Server, support creating SECRET/CREDENTIAL, but they are used slightly differently than the DuckDB implementation.

I share this in hopes that it may help determine a method name or if this should be supported via kwargs on connect instead. 💭

@cboettig
Copy link

is .raw_sql still our only option for any S3 secrets?

@gforsyth
Copy link
Member Author

@cboettig -- no, DuckDB currently uses the AWS SDK for authentication, but it can be a bit spotty in what is supported.

The most reliable way is to set access key, secret key and (critically) region/default region via environment variables.

@cboettig
Copy link

cboettig commented Oct 24, 2024

@gforsyth thanks! sorry I don't entirely follow.

e.g. this works (bucket is public, should be reproducible):

import ibis
con = ibis.duckdb.connect()

con.raw_sql(f'''
CREATE OR REPLACE SECRET secret (
    TYPE S3,
    ENDPOINT 'sdsc.osn.xsede.org',
    URL_STYLE 'path'
);
''')

path = "s3://bio230014-bucket01/challenges/forecasts/bundled-parquet/project_id=neon4cast"
con.read_parquet(path + "/**")

but is quite verbose and not particularly pythonic. I wasn't clear how to do this with env vars. What's the best way to do this in ibis at the moment?

@gforsyth
Copy link
Member Author

Hey @cboettig !

I thought you were thinking of secrets for accessing private buckets, in which case, usually setting these environment variables does the trick: https://docs.aws.amazon.com/cli/v1/userguide/cli-configure-envvars.html

However, while there is AWS_ENDPOINT_URL for specifying custom endpoints, I don't know if there's a way to tell the AWS SDK to use a specific URL_STYLE with an environment variable, so you might just be stuck with the non-pythonic method for now.

@cboettig
Copy link

@gforsyth Thanks for the reply!

yeah, looks like we're stuck. The AWS SDK supports addressing_style in config files (https://docs.aws.amazon.com/cli/latest/topic/s3-config.html) but I can't get duckdb to read config directly (??). I know gdal supports AWS_VIRTUAL_HOSTING arg for this. arrow is the best with this, automatically turning virtual hosting on only if an endpoint override is unset (https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duckdb The DuckDB backend feature Features or general enhancements
Projects
Status: backlog
Development

No branches or pull requests

4 participants