Enabling reduce_then_scan for "Set" family of scan APIs #1879
First pass of comments. I think I generally understand the algorithm and the logic seems good to me.
Signed-off-by: Dan Hoeflinger <[email protected]>
wording and constexpr Co-authored-by: Adam Fidel <[email protected]>
It is only initialized with a constexpr, later updated at runtime Signed-off-by: Dan Hoeflinger <[email protected]>
01ee3a9 to adc6cc5
I realize I never expanded the set of reviewers beyond a small group, though I know we agreed not to spend too much effort on perfecting this approach (for performance) before merging. I've now added a few more for visibility.
A few small styling nitpicks. I am ready to approve after.
LGTM
Enables the reduce_then_scan algorithm for the set family of APIs:
set_intersection
set_union
set_difference
set_symmetric_difference
Adds the high-level helpers required to convert the set family to use reduce_then_scan on GPUs with subgroups of size 32.
Also makes the set family consistent with the other scan algorithms by selecting the algorithm at the __parallel_* level of function call.

This provides significant performance improvements, especially at small sizes of n. At larger sizes of n, this is an improvement over the existing algorithm, but not by a lot.

There is still significant room for improvement here; this algorithm is very inefficient.
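For readers unfamiliar with the pattern, here is a rough serial model of how a set operation can be phrased as reduce-then-scan. It is only a sketch, not the code in this PR: the function name `set_intersection_rts` and the `chunk` parameter (standing in for the work given to one work-item or subgroup) are hypothetical, and it assumes both inputs are sorted, contain no duplicates, and that `chunk` is greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical serial model of a reduce-then-scan set_intersection.
// Assumes: a and b are sorted, have no duplicates, and chunk > 0.
std::vector<int> set_intersection_rts(const std::vector<int>& a,
                                      const std::vector<int>& b,
                                      std::size_t chunk)
{
    // Predicate: is v present in b? Searches all of b every time,
    // which mirrors the inefficiency described in this PR.
    auto keep = [&](int v) {
        auto it = std::lower_bound(b.begin(), b.end(), v);
        return it != b.end() && *it == v;
    };

    const std::size_t n = a.size();
    const std::size_t num_chunks = (n + chunk - 1) / chunk;

    // Reduce: each chunk counts how many of its elements survive.
    std::vector<std::size_t> counts(num_chunks, 0);
    for (std::size_t c = 0; c < num_chunks; ++c) {
        const std::size_t end = std::min(n, (c + 1) * chunk);
        for (std::size_t i = c * chunk; i < end; ++i)
            counts[c] += keep(a[i]) ? 1 : 0;
    }

    // Scan: an exclusive prefix sum turns counts into output offsets.
    std::vector<std::size_t> offsets(num_chunks, 0);
    std::exclusive_scan(counts.begin(), counts.end(), offsets.begin(),
                        std::size_t{0});

    const std::size_t total =
        num_chunks == 0 ? 0 : offsets.back() + counts.back();
    std::vector<int> out(total);

    // Write: each chunk re-evaluates the predicate and writes its
    // survivors starting at its own offset.
    for (std::size_t c = 0; c < num_chunks; ++c) {
        const std::size_t end = std::min(n, (c + 1) * chunk);
        std::size_t pos = offsets[c];
        for (std::size_t i = c * chunk; i < end; ++i)
            if (keep(a[i]))
                out[pos++] = a[i];
    }
    return out;
}
```

Note that the predicate runs twice per element (once to count, once to write), so the cost of each search into the second set matters a great deal.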
Three future things to try (from easiest to hardest) to improve the set family within the reduce then scan alg:
__pstl_lower_bound, __shars_lower_bound to see if it provides benefit.

Right now, we are wasting lots of memory accesses and comparisons on sections of the second set which should be able to be ruled out from knowledge we have access to at the time we are making the search (in the "Reduce" predicate). We also lose the "blocking" cache advantage of reduce_then_scan, because the search of the second set always starts from the beginning of the list.
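As one illustration of the kind of narrowing that might be possible (not something this PR implements), a chunk of the first set could bound its searches using the fact that both sets are sorted. The helper name `count_matches_windowed` is hypothetical, and the sketch assumes sorted inputs without duplicates and `last <= a.size()`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: count the elements of a[first, last) that also
// occur in b. Assumes a and b are sorted with no duplicates.
std::size_t count_matches_windowed(const std::vector<int>& a,
                                   std::size_t first, std::size_t last,
                                   const std::vector<int>& b)
{
    if (first >= last)
        return 0;

    // Every possible match for this chunk lies inside [lo, hi) of b,
    // so the rest of b never needs to be touched.
    auto lo = std::lower_bound(b.begin(), b.end(), a[first]);
    auto hi = std::upper_bound(lo, b.end(), a[last - 1]);

    std::size_t count = 0;
    for (std::size_t i = first; i < last; ++i) {
        // a is sorted, so each search can resume where the last ended.
        lo = std::lower_bound(lo, hi, a[i]);
        if (lo != hi && *lo == a[i])
            ++count;
    }
    return count;
}
```

Restricting each chunk to a window of the second set is also what would let neighbouring chunks touch neighbouring memory, which is the "blocking" behaviour mentioned above.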