Fix MPI tag creation #661
Conversation
Overall this looks right to me. At some point, I do think we should rethink how we do the boundary variables more deeply than this. But this at least gets us to a "correct" implementation.
Thanks for digging into this!
Doesn't this still leave open the possibility that the |
Good point. I added an appropriate check with an error message.
While going through this PR again, I think this needs additional input from @brryan. For … So it looks to me that some things are not properly aligned.
Is the swarm thing not automatically fixed by just setting a bit for swarms? Or alternatively, making a new communicator for swarms?
Yes, there are various straightforward fixes. I think it'd make sense to go with the one that @brryan considers most appropriate given the other open PRs -- especially thinking ahead in terms of supporting multiple Swarms.
Thanks for pinging me on this -- yes I would imagine this PR would potentially cause issues for swarms. I think @Yurlungur / @pgrete's solution of one communicator per swarm is a good path forward. Right now all particle variables are always communicated simultaneously so there shouldn't be any additional subtleties to worry about at the moment. I am out of the office this week but I can think through this in detail on Monday. Probably the tracers example with MPI is the best place to test these changes, since that has both cc vars and a swarm, but I don't think that's included in our automated testing currently.
src/bvals/bvals_swarm.cpp
Outdated
@@ -41,7 +41,7 @@ namespace parthenon {
 BoundarySwarm::BoundarySwarm(std::weak_ptr<MeshBlock> pmb)
     : bswarm_index(), pmy_block(pmb), pmy_mesh_(pmb.lock()->pmy_mesh) {
 #ifdef MPI_PARALLEL
-  swarm_id_ = pmb.lock()->pbval->bvars_next_phys_id_;
+  swarm_id_ = pmb.lock()->pbval->AdvanceCounterPhysID(1);
-  swarm_id_ = pmb.lock()->pbval->AdvanceCounterPhysID(1);
+  swarm_id_ = pmb.lock()->pbswarm->AdvanceCounterPhysID(1);
Yes this is a bug, thanks for noticing this @pgrete!
@pgrete Looking at this now, my thinking for moving forward with swarm phys ids is to:

int BoundarySwarm::AdvanceCounterPhysID() {
#ifdef MPI_PARALLEL
  int start_id = bvars_next_phys_id_;
  bvars_next_phys_id_ += 1;
  PARTHENON_REQUIRE_THROWS(
      bvars_next_phys_id_ < 32,
      "Next phys_id would be >= 32, which is currently not supported as Parthenon only "
      "reserves 5 bits for phys_id. Please open an issue on GitHub if you see this "
      "message to discuss options.")
  return start_id;
#else
  return 0;
#endif
}

In this pattern each swarm has a specific phys_id. One thing this prevents is different communication patterns for different subsets of a swarm's variables. Does that seem reasonable to you? I certainly could be missing something.
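As background for the `< 32` check above: under the pre-PR tag layout that appears later in this diff, `(lid << 11) | (bufid << 5) | phys`, the phys id occupies the lowest 5 bits, so only 32 distinct ids fit. The helper below is a hypothetical illustration of that constraint, not Parthenon code:

```c++
#include <cassert>

// Hypothetical helper mirroring the pre-PR tag layout: phys occupies the lowest
// 5 bits, so only 32 distinct phys ids fit before they would spill into the
// bufid bit field above them.
int OldStyleTagSketch(int lid, int bufid, int phys) {
  assert(phys >= 0 && phys < (1 << 5));    // 5 bits reserved for phys
  assert(bufid >= 0 && bufid < (1 << 6));  // 6 bits for the buffer id
  return (lid << 11) | (bufid << 5) | phys;
}
```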
Won't this fail unless there's a separate communicator for swarms vs grid variables?
Oh yes, you're right... I suppose we now need either (1) a separate communicator for swarms or (2) a bit reserved in the tag for swarm vs cc variable.
I am in favor of a separate communicator for swarms, which frees up more bits for variables in their communicator.
I now added a separate communicator (in addition to the tag fix, which is technically not required [any more]).
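For readers unfamiliar with the mechanism, the sketch below illustrates the "separate communicator" idea with plain MPI: `MPI_Comm_dup` creates a communication context whose messages cannot match those on the parent communicator, even if tags coincide. This is an illustration of the MPI feature, not the code added in this PR (names here are hypothetical):

```c++
#include <mpi.h>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);

  // Duplicate the world communicator: same group of ranks, but messages sent
  // on swarm_comm can only be matched by receives posted on swarm_comm, so tag
  // values may overlap with those used for cell-centered variables without any
  // risk of mismatched messages.
  MPI_Comm swarm_comm;
  MPI_Comm_dup(MPI_COMM_WORLD, &swarm_comm);

  // ... post swarm sends/receives on swarm_comm instead of MPI_COMM_WORLD ...

  MPI_Comm_free(&swarm_comm);
  MPI_Finalize();
  return 0;
}
```

The PR applies the same idea by giving swarm communication its own communicator, so swarm traffic can never be confused with cell-centered variable traffic.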
@pgrete Thanks for adding the swarm communicator! Unrelated (or not) I just ran into an issue today with particles hanging with MPI (on my heavily modified branches), I'm guessing due to |
With respect to whether we need |
I don't think Phoebus or rage make this assumption. But I think in |
Yes, anywhere we are working with |
LGTM! A long-overdue fix.
- ctest -R regression_mpi_test:output_hdf5
# and one that uses swarms
- make -j${J} particle-leapfrog
- ctest -R regression_mpi_test:particle_leapfrog |
Thanks!
Quick update: the fix I tried didn't work after reintroducing the |
@brryan @Yurlungur this is ready for complete re-review now.

[1] I think it may be worth making sure we always trigger the full suite before merging any open PR in the future.
Also pinging @lroberts36: this is ready to be tested against the MPI issue you encountered for the sparse comm.
Thanks @pgrete, I'll try to look at this today.

> [1] I think it may be worth making sure we always trigger the full suite before merging any open PR in the future.

I think it depends. For infrastructure PRs I think that's for sure. For smaller changes probably not necessary. We can use our judgement.
One more comment/conundrum regarding the failing regression test on Spock (#659). The …
So I'm now wondering if there's something else we're missing (here).
Update: …
in both cases. So something is definitely off with the MPI comm (and may or may not be system/env dependent).
Oof. I guess we should figure out what's going on here before merge. I also still need to run it through Phoebus to see if that breaks anything... though if |
Overall looks reasonable to me, but I want to run it through Phoebus and understand what is causing the tests to fail before merge.
This means that any kind of communication (most prominently flux correction and ghost zone
exchange) of a given variable at a given stage should not be interleaved with any other
modifications/communication of said variable as it may result in undefined behavior. |
We can send and then modify before receiving, right? Although obviously you probably don't want to do that.
It all depends on the driver design (and how the tasks are ordered in one or more task regions).
My intention with this comment was that people should be really mindful of the async nature of tasks in combination with shared data (flux, coarse, ghost zone buffers) across stages/containers of a single variable.
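To make the hazard concrete, here is a hypothetical fragment (not Parthenon code) showing the classic version of this problem with plain MPI: once a nonblocking send is posted on a shared buffer, that buffer must not be written again until the send completes.

```c++
#include <mpi.h>

// Hypothetical sketch: reusing one send buffer across stages.
void SharedBufferHazardSketch(MPI_Comm comm, int dest, int tag) {
  double buf[8] = {0.0};

  MPI_Request req;
  MPI_Isend(buf, 8, MPI_DOUBLE, dest, tag, comm, &req);

  // Writing to buf here (e.g. filling it for the next stage) before the send
  // completes is erroneous per the MPI standard:
  // buf[0] = 1.0;   // undefined behavior if uncommented

  MPI_Wait(&req, MPI_STATUS_IGNORE);

  // Only after MPI_Wait (or a successful MPI_Test) is it safe to reuse buf.
  buf[0] = 1.0;
}
```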
// Technically we could query MPI_TAG_UB here. However, for now we use a fixed (lower)
// bound guaranteed by the standard to avoid additional function calls or adding a new
// internal upper limit variable.
I am in favor of just using the standard-guaranteed value forever. This is better for portability. Otherwise I can imagine the code dying on some machines and not others...
Technically, MPI_TAG_UB should provide the right info for the given MPI lib at compile time.
Practically, I'm also in favor of keeping the hard-coded limit for now until this actually becomes an issue.
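For reference, this is roughly what querying the implementation-specific limit looks like at runtime via the `MPI_TAG_UB` attribute (a minimal standalone sketch, not part of this PR; the PR keeps the hard-coded, standard-guaranteed minimum of 32767):

```c++
#include <cstdio>
#include <mpi.h>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);

  // MPI_TAG_UB is a predefined attribute of MPI_COMM_WORLD. The standard only
  // guarantees a value of at least 32767; implementations may allow much more.
  void *val = nullptr;
  int flag = 0;
  MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &val, &flag);
  if (flag) {
    std::printf("Largest usable tag: %d\n", *static_cast<int *>(val));
  }

  MPI_Finalize();
  return 0;
}
```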
// TODO(someone): double check if the following TODO is still applicable with the
// "restrict-in-one" machinery. The original code added `vbvar` to
// `pmb->pbval->bvars`, but we already do this in the constructor. Also, we
// potentially don't iterate over `bvars` any more in the "restrict-in-one" case. |
RestrictBoundaries is called per meshblock because the physical boundary condition machinery is still per-meshblock. We need to move the boundary stuff to be "in-one" for this TODO to become non-relevant.
@@ -485,6 +478,11 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages,

  mesh_data.SetMeshPointer(this);

  resolved_packages = ResolvePackages(packages);
oops good catch
// TODO(mpi people) do we still need the pbval part? Discuss also in the context of
// variables other than cell variables, see comment above on communicators.
Is there any non-"in one" communication we still do?
If I understood your comment above correctly, then RestrictBoundaries may be the last place.
However, as far as I can tell the following line can independently be removed because the vbvar ctor already enlists the vbvars in the pbval list.
That seems reasonable.
Are you referring to the failed tests on Spock? I'm still not sure what's going on there but I was not able to reproduce the issue anywhere else so far.
I ran a 2d blast wave in Phoebus with the updated machinery, and 24 MPI ranks. Things seemed to work just fine. Might be good to run a more strenuous test such as a radiation test though, if @brryan or @lroberts36 feel up to it.
Since the updated machinery seems to not break Phoebus's integration, I'm calling this good.
@brryan or @lroberts36 do you also want to give this PR another look before merging?
LGTM. I also have this merged into #663.
-  return (lid << 11) | (bufid << 5) | phys;
+int BoundaryBase::CreateBvalsMPITag(int lid, int bufid) {
+  PARTHENON_REQUIRE_THROWS(lid >= 0 && bufid >= 0, "Ids are expected to be positive.")
+  int tag = (lid << 6) | bufid;
Should we be saving more bits for the bufid now that there are no physics bits?
I don't think this is required at the moment as we have a max. of 57 buffers globally.
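To spell out the bit budget implied by the new layout (a hypothetical helper, not the Parthenon implementation): `bufid` keeps the low 6 bits, i.e. room for 64 buffers, comfortably above the 57 mentioned, and the resulting tag has to stay within the standard-guaranteed minimum `MPI_TAG_UB` of 32767.

```c++
#include <cassert>

// Hypothetical helper mirroring the new tag layout: bufid in the low 6 bits,
// lid in the bits above. Not the Parthenon implementation.
int NewStyleTagSketch(int lid, int bufid) {
  assert(lid >= 0 && bufid >= 0);
  assert(bufid < (1 << 6));   // 6 bits for bufid -> up to 64 buffers
  int tag = (lid << 6) | bufid;
  assert(tag <= 32767);       // standard-guaranteed minimum value of MPI_TAG_UB
  return tag;
}
```

With 6 bits spent on `bufid`, roughly 9 bits remain for `lid` before exceeding the guaranteed minimum.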
src/mesh/mesh.hpp
Outdated
@@ -212,6 +212,11 @@ class Mesh {
  // size of default MeshBlockPacks
  int default_pack_size_;

#ifdef MPI_PARALLEL
  // Global map of MPI comms for separate variables
  std::map<std::string, MPI_Comm> mpi_comm_map_;
Maybe should be unordered_map, although I doubt it matters.
I had a misplaced const that did nothing and I am about to fix it. Will update this at the same time.
Thanks for double checking with Phoebus. Pulling the trigger now so that |
PR Summary
Fixes #660
PR Checklist