-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate single await-push command per buffer for all local chunks #324
base: master
Are you sure you want to change the base?
Conversation
This brings await-pushes in line with pushes, where we already compute the union of all regions required by remote chunks executed on the same node.
Check-perf-impact results: (241536968c1afd627708e72c171c89f6) 🚀 Significant speedup (<0.80x) in some microbenchmark results: benchmark intrusive graph dependency handling with N nodes - 100 / creating nodes Relative execution time per category: (mean of relative medians)
|
Pull Request Test Coverage Report for Build 12435190957Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the small comment i made this looks good.
for(const auto& [box, wcs] : local_sources) { | ||
// Note that we initialize all buffers as fresh, so this doesn't trigger for uninitialized reads | ||
if(!box.empty() && !wcs.is_fresh()) { missing_part_boxes.add(box); } | ||
if(!box.empty() && !wcs.is_fresh()) { missing_parts_boxes.push_back(box); } | ||
} | ||
|
||
// There is data we don't yet have locally. Generate an await push command for it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me this comment now seems a bit out of place since we don't generate the await push command in this if.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the correct thing to do!
@@ -273,6 +273,13 @@ class region_builder { | |||
m_boxes.push_back(box); | |||
} | |||
|
|||
void add(const box_vector<Dims>& boxes) & { | |||
m_boxes.reserve(m_boxes.size() + boxes.size()); | |||
for(const auto& b : boxes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe have a comment here that we use a loop rather than insert(end)
to skip over empty boxes, something which add(region)
does not need to do.
[only if my comment below does not apply]
auto& required_boxes = per_buffer_required_boxes[bid]; // allow default-insert | ||
required_boxes.add(missing_parts_boxes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't this just be moved into the loop in L401? Then we don't need missing_parts_boxes
(or region_builder::add(box_vector)
for that matter)
This brings await-pushes in line with pushes, where we already compute the union of all regions required by remote chunks executed on the same node.