-
Notifications
You must be signed in to change notification settings - Fork 926
Refactor build_array_reader
into a struct
#7521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
403a636
to
5f42aac
Compare
5f42aac
to
9690125
Compare
build_array_reader
into a struct
@@ -45,7 +45,7 @@ mod struct_array; | |||
#[cfg(test)] | |||
mod test_util; | |||
|
|||
pub use builder::build_array_reader; | |||
pub(crate) use builder::ArrayReaderBuilder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally find the use of pub use
in non pub modules confusing to reason about what is pub and what is not. I think using pub(crate)
to make it explicit this is crate private makes it easier to understand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thank you @alamb.
Thanks for the review @zhuqi-lucas |
Note to reviewers
Looking at the whitespace blind diff (https://github.com/apache/arrow-rs/pull/7521/files?w=1) I think makes it easier to see what is changed in this PR (changed the plumbing of how parameters are passed around rather than any logic changes)
Which issue does this PR close?
Rationale for this change
I am trying to avoid potentially decoding arrays twice
Applying
ArrowPredicate
is sometimes slower than filtering afterwards. Part of the reason for this is that filter columns are decoded twice.I want a way to inject a pre-calculated filter result into the record batch decoding machinery and one way I found that works well is to provide an
ArrayBuilder
instance that uses the cached result. I found it convenient to have a struct on which to hang the cache rather than a bunch of free functionsYou can see how this is used here:
Also, even if we choose not to go with the result
What changes are included in this PR?
ArrayBuilder
s into a new structAre there any user-facing changes?
No, this code is entirely internal (e.g. this code is not public: https://docs.rs/parquet/latest/parquet/?search=array_builder)