Seal Array trait #9092
alamb
left a comment
BTW while this is technically an API change and should wait for the next major release, I think it is important enough to merge for 57.2.0 (a minor release) and will not be disruptive (as I don't think anyone has implemented this). I had a brief look around geoarrow as it is well designed in my opinion and a non-trivial extension on top of I didn't see any
kylebarron
left a comment
I tried to implement Array for external types but was never able to get the downcasting working because of the closed nature of DataType. So I don't think it was previously possible to implement Array externally: sealing the array just makes it explicit
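The downcasting problem described above can be sketched with the standard library alone. This is a simplified model, not arrow-rs code: `DataType`, `Array`, `Int32Array`, `ExternalArray`, and `sum` here are hypothetical stand-ins. It assumes a kernel that dispatches on a closed data-type enum and then downcasts to the single concrete type it expects for that variant.

```rust
use std::any::Any;

// Closed set of logical types, standing in for a closed `DataType` enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
}

// Simplified stand-in for an array trait with dynamic downcasting.
trait Array {
    fn data_type(&self) -> DataType;
    fn as_any(&self) -> &dyn Any;
}

// The "built-in" array type that kernels know about.
struct Int32Array(Vec<i32>);

impl Array for Int32Array {
    fn data_type(&self) -> DataType {
        DataType::Int32
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// An "external" implementation. The enum is closed, so it has to
// report one of the existing data types.
struct ExternalArray;

impl Array for ExternalArray {
    fn data_type(&self) -> DataType {
        DataType::Int32
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// A kernel dispatches on the data type, then downcasts to the one
// concrete type it associates with that data type.
fn sum(array: &dyn Array) -> Option<i64> {
    match array.data_type() {
        DataType::Int32 => {
            // Fails for `ExternalArray`: same data type, different concrete type.
            let ints = array.as_any().downcast_ref::<Int32Array>()?;
            Some(ints.0.iter().map(|v| i64::from(*v)).sum())
        }
    }
}
```

In this sketch the mismatch surfaces as `None`; a kernel that unwraps the downcast would panic instead.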
Such a shame getting this trait sealed. I was finding it very convenient for implementing a GPU-based
Co-authored-by: Andrew Lamb <[email protected]>
I presume you're referring to https://github.com/gabotechs/libcudf-rs/blob/main/src/column_view.rs#L125. If so, why not just define your own trait? It looks like you're only using a very limited subset of Array anyway, and passing that array into any arrow-rs kernel will at best cause it to panic (as downcast won't work correctly).
@tustvold It's mainly for wiring it up with https://github.com/apache/datafusion. As DataFusion moves Not saying that the change in this PR does not make sense though, I believe it does, but I wonder what could be the alternative. Maybe letting DataFusion be the one that exposes a customizable trait for transporting data?
When implementing the variant extension type, we also tried to use So the Aside: the same 1:1 constraint is also why
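Could it pass ArrayData instead? Should be easy enough to do e.g. StructArray::from(record_batch).into_data() and then RecordBatch::from(StructArray::from(array_data))?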
I am expecting an issue shortly for this PR, and then I think we should merge it to avoid confusion. I realize this will cause @gabotechs some pain downstream, but let's figure out a better API for backing arrays with GPU memory as a follow on |
gabotechs
left a comment
Could it pass ArrayData instead? Should be easy enough to do e.g. StructArray::from(record_batch).into_data() and then RecordBatch::from(StructArray::from(array_data))?
Yeah, I think I can do either this or something similar, thanks for the options! This change makes sense to me 👍
Thanks again for everyone's thoughts. Also thanks to @shinmao for the report |
Which issue does this PR close?
try_binary when Array is implemented incorrectly in external crate #9106
Rationale for this change
This trait is not meant to be overridden, and doing so will break many kernels in sometimes subtle ways.
What changes are included in this PR?
Seals the Array trait to prevent implementation outside of arrow-array.
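For readers unfamiliar with sealing, here is a minimal sketch of the usual Rust sealed-trait pattern. It is an illustration only, not the arrow-array source: the `array` module, `private::Sealed`, this toy `Int32Array`, and `total_len` are all hypothetical names. A public trait takes a supertrait that lives in a private module, so code outside the module can use the trait but cannot implement it.

```rust
mod array {
    // Private module: `Sealed` is `pub`, but its path cannot be named
    // from outside `array`, so nothing outside can implement it.
    mod private {
        pub trait Sealed {}
    }

    /// Public trait: callable by anyone, implementable only where
    /// `private::Sealed` is reachable (that is, inside this module).
    pub trait Array: private::Sealed {
        fn len(&self) -> usize;

        fn is_empty(&self) -> bool {
            self.len() == 0
        }
    }

    pub struct Int32Array {
        values: Vec<i32>,
    }

    impl Int32Array {
        pub fn new(values: Vec<i32>) -> Self {
            Self { values }
        }
    }

    // Both impls live inside the module that owns the seal.
    impl private::Sealed for Int32Array {}

    impl Array for Int32Array {
        fn len(&self) -> usize {
            self.values.len()
        }
    }
}

use array::{Array, Int32Array};

/// Downstream code can still accept the sealed trait as a trait object.
fn total_len(arrays: &[&dyn Array]) -> usize {
    arrays.iter().map(|a| a.len()).sum()
}

// Outside `array`, the following would not compile, because the
// required supertrait `private::Sealed` cannot be implemented:
//
// struct MyArray;
// impl Array for MyArray { fn len(&self) -> usize { 0 } }
```

The trade-off matches the discussion above: callers keep the full trait API, while every implementor is one the owning crate controls, so kernels can rely on their downcasts succeeding.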
Are these changes tested?
Are there any user-facing changes?