feat: Allow selectors.of_type("array<struct>") #10724

Closed
1 task done
Riezebos opened this issue Jan 25, 2025 · 2 comments
Labels
feature Features or general enhancements

Comments

@Riezebos
Contributor

Is your feature request related to a problem?

import ibis
from ibis import _, selectors as s

t = ibis.memtable([{
    "id": "1",
    "array_struct_col": [{"a": "b"}],
    "array_int_col": [1, 2, 3],
}])
t.mutate(s.across(s.of_type("array<struct>"), _.cast("array<json>")))
# ParseError: expected '<' at 0:12

What is the motivation behind your request?

I am trying to mutate all array columns whose elements are structs, without touching array columns that hold any other data type.

Describe the solution you'd like

From my perspective as a user, the logical solution would be the one in my example code. It's already possible to do s.of_type("array") and s.of_type("struct"). But a different solution is also fine by me.

I found a workaround for now, but it seems a bit inefficient:

t = t.mutate(s.across(s.of_type("array"), lambda col: col.cast("array<json>") if col.unnest().type().is_struct() else col))

What version of ibis are you running?

9.5.0

What backend(s) are you using, if any?

DuckDB

Code of Conduct

  • I agree to follow this project's Code of Conduct
@Riezebos Riezebos added the feature Features or general enhancements label Jan 25, 2025
@cpcloud
Member

cpcloud commented Jan 25, 2025

You can do this with the where selector, which should be a bit more concise:

In [18]: from ibis.interactive import *

In [19]: t = ibis.memtable([{
    ...:     "id": "1",
    ...:     "array_struct_col": [{"a": "b"}],
    ...:     "array_int_col": [1,2,3]
    ...: }])

In [20]: t.select(s.where(lambda c: c.type().is_array() and c.type().value_type.is_struct()))
Out[20]:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ array_struct_col         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<struct<a: string>> │
├──────────────────────────┤
│ [{...}]                  │
└──────────────────────────┘

In [21]: t.mutate(s.across(s.where(lambda c: c.type().is_array() and c.type().value_type.is_struct()), _.length()))
Out[21]:
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ id     ┃ array_struct_col ┃ array_int_col        ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ string │ int64            │ array<int64>         │
├────────┼──────────────────┼──────────────────────┤
│ 1      │                1 │ [1, 2, ... +1]       │
└────────┴──────────────────┴──────────────────────┘

@cpcloud cpcloud closed this as completed Jan 25, 2025
@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Jan 25, 2025
@Riezebos
Contributor Author

Thanks!

TIL Array.value_type exists, good to know!
