Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(reader): add range named parameter #57

Merged
merged 11 commits into from
Feb 6, 2025

Conversation

mharrisb1
Copy link
Contributor

@mharrisb1 mharrisb1 commented Feb 4, 2025

Closes #56

Adds range as a named parameter so a user can query specific range in sheet.

Usage (from docs):

-- Read a spreadsheet using a specific range
select * from read_gsheet('11QdEasMWbETbFVxry-SsD8jVcdYIT1zBQszcF84MdE8', sheet='Sheet1', range='A1:B7');
-- or using A1 notation
select * from read_gsheet('11QdEasMWbETbFVxry-SsD8jVcdYIT1zBQszcF84MdE8', sheet='Sheet1!A1:B7');

Possible next steps:

test/sql/read_gsheet.test Outdated Show resolved Hide resolved
test/sql/read_gsheet.test Show resolved Hide resolved
src/gsheets_read.cpp Outdated Show resolved Hide resolved
@archiewood
Copy link
Member

archiewood commented Feb 4, 2025

Also, should probably remove the limitation in the README regarding ranged reading in this PR!

@archiewood
Copy link
Member

The other thing we could look to support here, would be when the range is specified in the URL itself, which then allows you to do this without the explicit read_gsheet() function too

FROM 'https://docs.google.com/spreadsheets/d/11QdEasMWbETbFVxry-SsD8jVcdYIT1zBQszcF84MdE8/edit?gid=644613997#gid=644613997&range=D10:E25'

CleanShot 2025-02-04 at 13 46 00

@mharrisb1
Copy link
Contributor Author

mharrisb1 commented Feb 4, 2025

Keeping track of the TODOs here:

  • Test ranges should not start with first row
  • Go ahead and add handling for ! in sheet name
  • Remove note about ranges in README
  • Parse range from URL
  • Open new issue for adding range support to COPY TO

@mharrisb1
Copy link
Contributor Author

Alright, just pushed some new changes. I got handling quoted sheet names with ! char to work and added URL extraction

@mharrisb1
Copy link
Contributor Author

I also added project configs for my editor to disable auto-formatting. Looks I accidentally formatted the main README. Should be fixed now

@archiewood
Copy link
Member

archiewood commented Feb 6, 2025

It's quite easy to break duckdb using this parameter, if for whatever reason you do not select a valid range

Only selected one cell in range
D select * from 'https://docs.google.com/spreadsheets/d/1Fkp9Ki5Wtk23vyH89s_9IQ0cbhPLnAA1BvmHKAJcWT8/edit?gid=0#gid=0&range=c1';
JSON does not contain expected fields
Raw JSON string: {"majorDimension":"ROWS","range":"Sheet1!C1"}
libc++abi: terminating
Abort trap: 6
Only selected top left cell
D select * from read_gsheet('https://docs.google.com/spreadsheets/d/1Fkp9Ki5Wtk23vyH89s_9IQ0cbhPLnAA1BvmHKAJcWT8/edit?gid=0#gid=0', range='A1');
INTERNAL Error:
Failed to bind "read_gsheet": Table function must return at least one column

Stack Trace:

0        duckdb::Exception::Exception(duckdb::ExceptionType, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 64
1        duckdb::InternalException::InternalException(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 20
2        duckdb::InternalException::InternalException<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) + 64
3        duckdb::Binder::BindTableFunctionInternal(duckdb::TableFunction&, duckdb::TableFunctionRef const&, duckdb::vector<duckdb::Value, true>, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, duckdb::Value, duckdb::CaseInsensitiveStringHashFunction, duckdb::CaseInsensitiveStringEquality, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, duckdb::Value>>>, duckdb::vector<duckdb::LogicalType, true>, duckdb::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, true>) + 1716
4        duckdb::Binder::Bind(duckdb::TableFunctionRef&) + 2360
5        duckdb::Binder::Bind(duckdb::TableRef&) + 320
6        duckdb::Binder::BindNode(duckdb::SelectNode&) + 68
7        duckdb::Binder::BindNode(duckdb::QueryNode&) + 140
8        duckdb::Binder::Bind(duckdb::QueryNode&) + 176
9        duckdb::Planner::CreatePlan(duckdb::SQLStatement&) + 156
10       duckdb::ClientContext::CreatePreparedStatementInternal(duckdb::ClientContextLock&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, duckdb::optional_ptr<std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, duckdb::BoundParameterData, duckdb::CaseInsensitiveStringHashFunction, duckdb::CaseInsensitiveStringEquality, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, duckdb::BoundParameterData>>>, true>) + 484
11       duckdb::ClientContext::CreatePreparedStatement(duckdb::ClientContextLock&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, duckdb::optional_ptr<std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, duckdb::BoundParameterData, duckdb::CaseInsensitiveStringHashFunction, duckdb::CaseInsensitiveStringEquality, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, duckdb::BoundParameterData>>>, true>, duckdb::PreparedStatementMode) + 916
12       duckdb::ClientContext::PendingStatementInternal(duckdb::ClientContextLock&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, duckdb::PendingQueryParameters const&) + 128
13       duckdb::ClientContext::PendingStatementOrPreparedStatement(duckdb::ClientContextLock&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, duckdb::shared_ptr<duckdb::PreparedStatementData, true>&, duckdb::PendingQueryParameters const&) + 208
14       duckdb::ClientContext::PendingStatementOrPreparedStatementInternal(duckdb::ClientContextLock&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, duckdb::shared_ptr<duckdb::PreparedStatementData, true>&, duckdb::PendingQueryParameters const&) + 1556
15       duckdb::ClientContext::PendingQueryInternal(duckdb::ClientContextLock&, duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, duckdb::PendingQueryParameters const&, bool) + 120
16       duckdb::ClientContext::PendingQuery(duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, duckdb::BoundParameterData, duckdb::CaseInsensitiveStringHashFunction, duckdb::CaseInsensitiveStringEquality, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, duckdb::BoundParameterData>>>&, bool) + 192
17       duckdb::ClientContext::PendingQuery(duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, bool) + 64
18       duckdb::Connection::PendingQuery(duckdb::unique_ptr<duckdb::SQLStatement, std::__1::default_delete<duckdb::SQLStatement>, true>, bool) + 64
19       duckdb_shell_sqlite3_prepare_v2 + 976
20       duckdb_shell::ShellState::ExecuteSQL(char const*, char**) + 148
21       duckdb_shell::ShellState::RunOneSqlLine(char*) + 248
22       duckdb_shell::ShellState::ProcessInput() + 1148
23       main + 3436
24       start + 2360

This error signals an assertion failure within DuckDB. This usually occurs due to unexpected conditions or errors in the program's logic.
For more information, see https://duckdb.org/docs/dev/internal_errors

This is an existing bug, if you for example selected an empty sheet: It's just easier to select an empty range now.

These are full DB crashes, so I think we should try to handle them more gracefully.

Could be in a separate PR though, this is not introducing any regressions.

Archie 99.0

# Test single value from range
# NOTE: *must* use `header=false` to avoid uncaught bind error
Copy link
Member

@archiewood archiewood Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah i just saw this comment. I wonder why this is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must be a bug caused by empty results. I'll need to do some debugging to see if maybe the shape of the fetch sheets response is different for single cell ranges or if the error comes from something more simple. Will investigate and put up an issue once I've tracked down the souce

@archiewood archiewood merged commit 708512c into evidence-dev:main Feb 6, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add "range" parameter to read_gsheet function
2 participants