feat: support array_repeat
#1680
The remaining problem is that DataFusion (DF) and Spark return different values when count is null: Spark returns NULL.
        .map(|x| *x as usize)
        .collect::<Vec<_>>();

    let mut nulls = NullBufferBuilder::new(count_array.len());
This is the actual fix: build a null buffer so that the result is null when count is null.
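To make the null-buffer idea concrete, here is a minimal sketch that uses only the standard library. It is not the code from this PR: `ListColumn` and `array_repeat_column` are hypothetical names, plain `Vec`s stand in for Arrow's values, offsets, and null buffer, and repeating a null element as null entries is an assumption about the branch that the diff excerpt cuts off.

```rust
/// Hypothetical stand-in for an Arrow list column:
/// flattened child values, row offsets, and one validity flag per row.
#[derive(Debug, PartialEq)]
struct ListColumn<T> {
    values: Vec<Option<T>>,
    offsets: Vec<usize>,
    validity: Vec<bool>,
}

/// Repeats `elements[i]` `counts[i]` times for every row.
/// `None` models SQL NULL for both the element and the count.
fn array_repeat_column<T: Clone>(
    elements: &[Option<T>],
    counts: &[Option<usize>],
) -> ListColumn<T> {
    assert_eq!(elements.len(), counts.len());
    let mut values = Vec::new();
    let mut offsets = vec![0];
    let mut validity = Vec::with_capacity(counts.len());

    for (element, count) in elements.iter().zip(counts) {
        // Plays the role of `nulls.append(!count_array.is_null(row_index))`:
        // the row is valid only when its count is not null.
        validity.push(count.is_some());
        // A null count contributes no child values, so the row's slot is empty.
        let n = count.unwrap_or(0);
        // Assumption: a null element is repeated as null entries.
        values.extend(std::iter::repeat(element.clone()).take(n));
        offsets.push(values.len());
    }

    ListColumn { values, offsets, validity }
}
```

In this model a null count and a zero count both produce an empty slot in `offsets`; only the validity flag tells them apart, which is why the null buffer is needed.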
Pull Request Overview
This PR introduces support for the "array_repeat" scalar function. Key changes include:
- Adding the new function implementation in a dedicated module (array_repeat).
- Updating the scalar function registry to include "array_repeat".
- Enhancing integration tests to verify the new function behavior.
Reviewed Changes
Copilot reviewed 5 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
native/spark-expr/src/comet_scalar_funcs.rs | Added registration for the "array_repeat" UDF in the scalar functions. |
native/spark-expr/src/array_funcs/mod.rs | Included the new array_repeat module and its public export. |
native/spark-expr/src/array_funcs/array_repeat.rs | Implements the logic for the "array_repeat" function. |
native/core/src/execution/planner.rs | Added tests to validate the behavior of "array_repeat". |
native/core/src/execution/operators/scan.rs | Minor documentation formatting updates. |
Files not reviewed (3)
- spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala: Language not supported
- spark/src/main/scala/org/apache/comet/serde/arrays.scala: Language not supported
- spark/src/test/scala/org/apache/comet/CometArrayExpressionSuite.scala: Language not supported
Comments suppressed due to low confidence (1)
native/core/src/execution/planner.rs:2957
- Consider adding additional test cases for array_repeat where the input column is of list type, as current tests only verify behavior for non-list (scalar) arrays.
fn test_array_repeat() {
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1680      +/-   ##
============================================
+ Coverage     56.12%   58.84%   +2.71%
- Complexity      976     1082     +106
============================================
  Files           119      126       +7
  Lines         11743    12608     +865
  Branches       2251     2363     +112
============================================
+ Hits           6591     7419     +828
- Misses         4012     4018       +6
- Partials       1140     1171      +31
One clarification (for my own understanding), otherwise lgtm.
    for (row_index, &count) in count_vec.iter().enumerate() {
        nulls.append(!count_array.is_null(row_index));
        let repeated_array = if array.is_null(row_index) {
Should the result be a null array if the count is zero?
It would be an empty array; I added a test case for this as well.
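For readers following the thread, the distinction between a null result and an empty array can be sketched per row as below. This is an illustrative model, not the PR's implementation: `array_repeat_row` is a hypothetical helper, `None` stands for SQL NULL, and the handling of a null element (repeated as null entries) is an assumption.

```rust
/// Hypothetical row-level model of the behavior discussed in this thread.
/// `None` stands for SQL NULL.
fn array_repeat_row<T: Clone>(
    element: Option<T>,
    count: Option<usize>,
) -> Option<Vec<Option<T>>> {
    // A null count makes the whole result null.
    let n = count?;
    // A zero count falls through and yields an empty, non-null array.
    // Assumption: a null element is repeated as null entries.
    Some(vec![element; n])
}
```

So `array_repeat_row(Some(7), Some(0))` gives an empty array, while `array_repeat_row(Some(7), None)` gives a null result.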
Thanks @parthchandra for the review
Which issue does this PR close?
Replaces #1205.
Closes #1347
Rationale for this change
What changes are included in this PR?
How are these changes tested?