Switch to distributed scheduling model #186
Conversation
I have gone through everything (that isn't already covered by the r-tree PR), but not fully in depth. I have purposely not commented on most things already marked as TODO or NOCOMMIT in the code, since those are obviously already known.
Overall, this looks good for the alpha release to me. distributed_graph_generator::generate_distributed_commands has lots of excellent comments, but it is still not easy to follow because it does so much. Post-0.4 it would be nice if we managed to somehow refactor or split it up.
src/buffer_transfer_manager.cc
Outdated
assert(pkg.get_command_type() == command_type::await_push);
const auto& data = std::get<await_push_data>(pkg.data);

GridRegion<3> expected_region;
for(size_t i = 0; i < data.num_subranges; ++i) {
	expected_region = GridRegion<3>::merge(expected_region, subrange_to_grid_box(data.region[i]));
}
This warning seems a bit extreme for an HPC (rather than security-critical) codebase.
Force-pushed from 4655582 to 48f293c.
clang-tidy made some suggestions
Unfortunately I do not have time at the moment to review this. One thing I would like to see for the final release is distributed_graph_generator being renamed to command_graph_generator (or similar, +bikeshedding), because we have two (and in the future three) different graph representations!
Force-pushed from 48f293c to 4dde523.
Without reviewing the tests:
Force-pushed from dc8de68 to b77a20d.
Check-perf-impact results: (cc07e36dc65a2e9d9a3deec584afa9e9) ❓ No new benchmark data submitted. ❓
clang-tidy made some suggestions
Force-pushed from 0d327c1 to c1f8111.
Check-perf-impact results: (9e900a306dc9f17e4a27439205a7680c)
Overall relative execution time: 0.72x (mean of relative medians)
Force-pushed from c1f8111 to 2fbc0ff.
This replaces Celerity's master/worker scheduling model with a fully distributed approach: each node still processes all chunks, but commands are only generated for locally executed chunks, as well as for pushing data to other nodes as needed.

Instead of tracking where the newest data resides within the distributed system at all times, each node now only determines which parts of the data are currently up to date (based on task-level write requirements), as well as which parts it "owns" (based on writes by locally executed chunks). Read requirements on stale data by local chunks generate await-push commands, which can now be satisfied by any number of incoming transfers (as nodes do not know who currently owns the data). Conversely, reads of locally owned data by remote chunks generate push commands.

Co-authored-by: Fabian Knorr <fabian.knorr@dps.uibk.ac.at>
Force-pushed from 2fbc0ff to 68433b4.
As discussed today, this is an early draft of the new distributed scheduling model implementation. I apologize for the state of the codebase right now, but maybe you can still take a quick look to see if you can spot anything major. As I'm still in the process of migrating all existing command-graph-related tests to the new architecture (with the help of @fknorr - thanks!), the codebase currently has to support both the legacy graph_generator and the new distributed_graph_generator side by side, so there is quite a bit of code duplication and hackery.

The most interesting/relevant parts to review are probably distributed_graph_generator (obviously), as well as maybe buffer_transfer_manager and the new testing infrastructure. Figuring out all data dependencies is still quite the mess, although one could argue it's slightly better because we have one level of indirection less (no "per node" data maps anymore). I hear @fknorr has some ideas for how this could be improved at some point in the future.