Update ORTE to support PMIx v3 #4854
Conversation
@ggouaillardet @jjhursey @jsquyres You guys might want to take a look at this one - it is a backport from the PMIx reference server. I didn't want to hold off until the full debugger support is completed, so I hoped to break the work down into smaller, digestible chunks. Still, I don't want to let OMPI diverge too far! I'm assuming OMPI wants the debugger support and the "instant on" features, so this is likely something the community wants committed. If not, just close it.
@artpol84 FYI
MTT results: […]

MTT without this patch - looks like the patch is impacting comm_spawn somehow: […]
@ggouaillardet @jjhursey I'm not sure why no-disconnect is hanging - it might be a common problem with loop-spawn. I only see it on multi-node jobs, so it has something to do with a race condition. Unfortunately, I cannot chase it down as it only appears in OMPI with comm_spawn. Can someone take a look so we can get this updated?
@rhc54 I cannot even run a simple […]
the difference I was able to spot is in:

```c
/* if all local contributions have been received,
 * let the local host's server know that we are at the
 * "fence" point - they will callback once the [dis]connect
 * across all participants has been completed */
if (trk->def_complete &&
    pmix_list_get_size(&trk->local_cbs) == trk->nlocal) {
    rc = pmix_host_server.connect(trk->pcs, trk->npcs, trk->info, trk->ninfo, cbfunc, trk);
} else {
    rc = PMIX_SUCCESS;
}
```
fwiw, if I comment out the […]

I just noticed you pushed a new commit, and I will double check whether it affects the current behavior.
the new commit did not fix the issue on multiple nodes
Hmmm... it works fine for me, so there must be some difference between our setups. Let me guess - your mpirun executes on a non-compute node (i.e., has no procs on it)? If so, I'll debug that scenario.
@rhc54 not really ... but if I […]
Okay, so in your scenario mpirun and the "parent" are on one node, and the spawned child is on another? I suspect this is the scenario that is causing those tests to fail. They have a lot of spawns in them, and eventually they fill the local node and overflow to the other node - and then hang. So it sounds like we may have an easy way to reproduce the problem. Let me poke at it a bit.
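For reference, the failing pattern boils down to a parent that repeatedly spawns and disconnects children until placement spills onto the second node. Here is a generic sketch of that kind of reproducer - this is not the actual loop_spawn test source, and the iteration count and self-respawn via argv[0] are illustrative:

```c
/* Generic loop-spawn sketch: the parent repeatedly spawns a single
 * child (itself), which immediately disconnects and exits.  With
 * enough iterations, children eventually land on the second node. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm parent, child;
    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (MPI_COMM_NULL != parent) {
        /* we are a spawned child - disconnect and leave */
        MPI_Comm_disconnect(&parent);
        MPI_Finalize();
        return 0;
    }

    for (int i = 0; i < 32; i++) {   /* enough spawns to overflow the node */
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &child, MPI_ERRCODES_IGNORE);
        fprintf(stderr, "parent: spawn %d connected\n", i);
        MPI_Comm_disconnect(&child);  /* collective with the child */
    }

    MPI_Finalize();
    return 0;
}
```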
also […] from […]
Weird - makes me wonder again if perhaps this is just a race condition, and the placement of procs just causes you to fall on one side or another.
yes, this is my scenario.
all I can say is I reproduce the issue 100% of the time. so let me put it this way: […]
@rhc54 I think I have a better understanding of where things differ. In […], but in this branch, […]. In […]
Ah, excellent!! Thanks so much for digging into this! I'll address it.
This is a point-in-time update that includes support for several new PMIx features, mostly focused on debuggers and "instant on":

* Initial prototype support for PMIx-based debuggers. For the moment, this is restricted to using the DVM. Supports direct launch of apps under debugger control, and indirect launch using prun as the intermediate launcher. Includes the ability for debuggers to control the environment of both the launcher and the spawned app procs. Work continues on completing support for indirect launch.
* IO forwarding for tools. Output of apps launched under tool control is directed to the tool and printed there - includes support for XML formatting and output to files. Stdin can be forwarded from the tool to apps, but this hasn't been implemented in ORTE yet.
* Fabric integration for "instant on". Enables collection of network "blobs" to be delivered to network libraries on compute nodes prior to local proc spawn. The infrastructure is in place - the implementation will come later.
* Harvesting and forwarding of envars. Enables network plugins to harvest envars and include them in the launch msg for setting the environment prior to local proc spawn (see the sketch below). Currently, only OmniPath is supported. PMIx MCA params control which envars are included, and also allow envars to be excluded.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
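To illustrate the harvesting step: a rough sketch of scanning the launcher's environment for matching names and loading them into PMIx envar structs destined for the launch msg. pmix_envar_t and PMIX_ENVAR_LOAD are PMIx v3 constructs, but the prefix filter and the harvest_envars() helper are assumptions for illustration, not the actual pnet/opa implementation:

```c
/* Sketch: harvest envars whose names match a prefix (e.g. "OPA_")
 * and load them into pmix_envar_t structs for the launch msg. */
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <pmix_common.h>

extern char **environ;

static size_t harvest_envars(const char *prefix, pmix_envar_t *out, size_t max)
{
    size_t n = 0;
    for (char **e = environ; NULL != *e && n < max; e++) {
        if (0 == strncmp(*e, prefix, strlen(prefix))) {
            char *eq = strchr(*e, '=');
            if (NULL == eq) {
                continue;   /* malformed entry - skip it */
            }
            char *name = strndup(*e, (size_t)(eq - *e));
            PMIX_ENVAR_LOAD(&out[n], name, eq + 1, ':');
            free(name);
            n++;
        }
    }
    return n;   /* caller packs these into the launch msg */
}
```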
The current code path for PMIx_Resolve_peers and PMIx_Resolve_nodes executes a threadshift in the preg components themselves. This is done to ensure thread safety when called from the user level. However, it causes thread-stall when someone attempts to call the regex functions from _inside_ the PMIx code base, should the call occur from within an event. Accordingly, move the threadshift to the client-level functions and make the preg components just execute their algorithms.

Create a new pnet/test component to verify that the preg code can be safely accessed - set that component to be selected only when the user directly specifies it. The new component will be used to validate various logical extensions during development, and can then be discarded.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 456ac7f)
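The thread-shift pattern being moved here amounts to queuing the request as an event on the library's progress thread and doing the work in the event callback. A minimal sketch written against raw libevent follows - the real code uses PMIx's internal PMIX_THREADSHIFT machinery, and the names shift_caddy_t, resolve_peers, and resolve_shifted are illustrative assumptions, not the PMIx API:

```c
/* Minimal thread-shift sketch using raw libevent (2.x). */
#include <event2/event.h>
#include <event2/event_struct.h>
#include <event2/thread.h>
#include <pthread.h>
#include <stdio.h>

typedef struct {
    struct event ev;     /* event used to shift into the progress thread */
    const char *regex;   /* the work to be performed */
} shift_caddy_t;

static struct event_base *base;

/* Runs in the progress thread - safe to touch library-internal
 * state and execute the preg algorithm directly */
static void resolve_shifted(evutil_socket_t sd, short flags, void *cbdata)
{
    shift_caddy_t *cd = (shift_caddy_t *)cbdata;
    printf("progress thread: resolving \"%s\"\n", cd->regex);
    /* ... run the parsing algorithm, then invoke the caller's cbfunc ... */
}

/* Client-level entry point: the shift happens once, here, instead of
 * inside each preg component */
static void resolve_peers(shift_caddy_t *cd, const char *regex)
{
    cd->regex = regex;
    event_assign(&cd->ev, base, -1, EV_WRITE, resolve_shifted, cd);
    event_active(&cd->ev, EV_WRITE, 1);   /* queue for the progress thread */
}

static void *progress_thread(void *arg)
{
    event_base_dispatch(base);   /* runs queued events, then returns */
    return NULL;
}

int main(void)
{
    evthread_use_pthreads();   /* threads share the event base */
    base = event_base_new();

    /* shift a request; in the real library the progress thread is
     * already spinning before any request arrives */
    shift_caddy_t cd;
    resolve_peers(&cd, "node[2:0-3]");

    pthread_t tid;
    pthread_create(&tid, NULL, progress_thread, NULL);
    pthread_join(tid, NULL);

    event_base_free(base);
    return 0;
}
```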
Looks like that last commit got it: […]
@ggouaillardet Thanks again!
Committing per discussion on the teleconf of 2/26. |