
Failure while mpirun job depends on the order of the hosts #4516

Closed
karasevb opened this issue Nov 20, 2017 · 13 comments

@karasevb
Member

Background information

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

v3.0.x
master

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone

./configure --prefix=`pwd`/install --enable-orterun-prefix-by-default --with-slurm --with-pmi --with-ucx

Please describe the system on which you are running

  • Operating system/version:
    RedHat 7.2
  • Computer hardware:
    Intel dual socket Broadwell
  • Network type:
    IB

Details of the problem

Running on nodes node1,node2 works well, but if the order of the nodes is changed to node2,node1, the run fails:

ssh node1
mpirun --bind-to core --map-by node -H node2,node1 -np 2  $HPCX_MPI_DIR/tests/osu-micro-benchmarks-5.3.2/osu_allreduce
--------------------------------------------------------------------------
[node2:13941] Error: pml_yalla.c:95 - recv_ep_address() Failed to receive EP address
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Not found" (-13) instead of "Success" (0)
@karasevb
Member Author

This issue was introduced with commit fe9b584

@jsquyres
Member

@karasevb FYI: I filed a dup of this in #4726 (and closed it; leaving this one open). Just mentioning it here as a cross reference -- when this issue is fixed, we should check to make sure that the case cited in #4726 works as well.

@rhc54
Contributor

rhc54 commented Jan 17, 2018

FWIW: I found the problem. The daemon on node2 incorrectly assigns rank=0 to node1 and rank=1 to itself, while the daemon on node1 (correctly) assigns rank=0 to node2 and rank=1 to itself. Thus, the daemons start two rank=1 procs, and no rank=0 as they each think the other one launched it.
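To make the divergence concrete, here is a minimal sketch in C (illustrative only; this is not the actual ORTE mapping code, and the round-robin --map-by node assignment below is an assumption based on this report) of how two daemons that see the node list in different orders end up owning conflicting ranks:

#include <stdio.h>

/* Illustrative only: round-robin ("--map-by node") assignment of ranks to
 * hosts, driven purely by the order of the node list each daemon sees. */
static void assign_ranks(const char *who, const char *nodes[], int nnodes, int nprocs)
{
    for (int rank = 0; rank < nprocs; rank++) {
        printf("%s: rank %d -> %s\n", who, rank, nodes[rank % nnodes]);
    }
}

int main(void)
{
    /* mpirun on node1 was given "-H node2,node1" ... */
    const char *seen_by_node1_daemon[] = { "node2", "node1" };
    /* ... but the daemon on node2 rebuilds the list with itself reordered. */
    const char *seen_by_node2_daemon[] = { "node1", "node2" };

    assign_ranks("daemon on node1", seen_by_node1_daemon, 2, 2);
    assign_ranks("daemon on node2", seen_by_node2_daemon, 2, 2);
    /* The output shows both daemons claiming rank 1 locally and neither
     * starting rank 0 -- matching the behavior described above. */
    return 0;
}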

@jsquyres
Member

@rhc54 Cool -- thanks! Would this also explain what was happening in #4726?

@karasevb Can you fix this, perchance?

@rhc54
Contributor

rhc54 commented Jan 17, 2018

Yeah. The race condition is due to the shortness of "hostname" execution. If it finishes fast enough on the first node, then the second one receives notification that the other proc died and correctly computes its local rank. You can reproduce it 100% of the time by using a longer-running executable.
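If you prefer not to rely on /bin/sleep, any trivially long-running program works as the launched executable; for example, a hypothetical stand-in such as:

/* spin.c: a deliberately slow, non-MPI program to launch under mpirun so
 * that both daemons are still alive while local ranks are being computed. */
#include <unistd.h>

int main(void)
{
    sleep(5);   /* long enough to remove the race seen with "hostname" */
    return 0;
}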

@gpaulsen gpaulsen added this to the v3.1.0 milestone Jan 18, 2018
@jsquyres
Member

I did some testing today: I ran this scenario:

master$ ssh hostA
hostA$ mpirun -H hostA,hostB sleep 1
hostA$ mpirun -H hostB,hostA sleep 1

Per @rhc54's comment above (#4516 (comment)), hostname is a test program short enough that the problem only shows up as a race condition, whereas a longer-running test program should make it occur 100% of the time. So I ran sleep 1 for my tests today, which should be long enough to trigger it every time.

Results:

  • On master: problem occurs 100% of the time, as predicted.
  • On v3.0.0: cannot reproduce the problem after 100 runs.
  • On v3.0.x head: cannot reproduce the problem after 100 runs.
  • On v3.x head: cannot reproduce the problem after 100 runs.

I also checked: my original testing (on #4726) was on master, not any of the v3.x.y branches.

So AFAICT, this problem exists on master but does not exist (or no longer exists?) on any of the 3.x.y release branches.

@jladd-mlnx @karasevb Just because I can't reproduce it doesn't mean that the problem isn't still there on the v3.x.y branches (as indicated by @karasevb's initial report). Can you try to reproduce on the v3.x.y branches?

If no one can reproduce on the v3.x.y branches, then we just change the labels/milestone on this issue.

@gpaulsen
Member

I tested with an older Open MPI build from v3.0.x and couldn't reproduce it at all on ppc64le.
On master I saw the problem occur 100% of the time on ppc64le.

Based on Jeff's testing, I'm removing blocker and targets other than master.

@karasevb
Member Author

Reproduced with v3.1.x (9885c21):

head$ ssh hostA
hostA$ ./mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -H hostA,hostB -np 2 ./ring_c
<it works>
hostA$ ./mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 -H hostB,hostA -np 2 ./ring_c
<hang>
proc at hostA:
(gdb) bt
#0  0x00007fb76ec049a8 in opal_sys_timer_get_cycles () at ../../../../opal/include/opal/sys/x86_64/timer.h:39
#1  0x00007fb76ec04f2b in opal_timer_linux_get_cycles_sys_timer () at timer_linux_component.c:232
#2  0x00007fb76eb4677a in opal_progress_events () at runtime/opal_progress.c:181
#3  0x00007fb76eb4687d in opal_progress () at runtime/opal_progress.c:242
#4  0x00007fb7607833ca in ompi_request_wait_completion (req=0x871580) at ../../../../ompi/request/request.h:413
#5  0x00007fb760784849 in mca_pml_ob1_recv (addr=0x7ffebfcf8b58, count=1, datatype=0x601080 <ompi_mpi_int>, src=0, tag=201,
    comm=0x601280 <ompi_mpi_comm_world>, status=0x0) at pml_ob1_irecv.c:135
#6  0x00007fb76f80b05b in PMPI_Recv (buf=0x7ffebfcf8b58, count=1, type=0x601080 <ompi_mpi_int>, source=0, tag=201,
    comm=0x601280 <ompi_mpi_comm_world>, status=0x0) at precv.c:79
#7  0x0000000000400a53 in main (argc=1, argv=0x7ffebfcf8c58) at ring_c.c:52
(gdb) frame 7
#7  0x0000000000400a53 in main (argc=1, argv=0x7ffebfcf8c58) at ring_c.c:52
52              MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD,
(gdb) p rank
$1 = 1
proc at hostB:
(gdb) bt
#0  opal_progress () at runtime/opal_progress.c:257
#1  0x00007f6b994b83ca in ompi_request_wait_completion (req=0x1926380) at ../../../../ompi/request/request.h:413
#2  0x00007f6b994b9849 in mca_pml_ob1_recv (addr=0x7ffd96362018, count=1, datatype=0x601080 <ompi_mpi_int>, src=0, tag=201,
    comm=0x601280 <ompi_mpi_comm_world>, status=0x0) at pml_ob1_irecv.c:135
#3  0x00007f6bac94505b in PMPI_Recv (buf=0x7ffd96362018, count=1, type=0x601080 <ompi_mpi_int>, source=0, tag=201,
    comm=0x601280 <ompi_mpi_comm_world>, status=0x0) at precv.c:79
#4  0x0000000000400a53 in main (argc=1, argv=0x7ffd96362118) at ring_c.c:52
(gdb) frame 4
#4  0x0000000000400a53 in main (argc=1, argv=0x7ffd96362118) at ring_c.c:52
52              MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD,
(gdb) p rank
$1 = 1

From the backtraces above we can see that both procs have the same rank (rank 1).
We are working on this.
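For reference, the hang is exactly what you would expect when rank 0 is missing. The code below paraphrases the structure of Open MPI's examples/ring_c.c (simplified to a single trip around the ring; not the verbatim source): only rank 0 injects the first message, so two procs that both believe they are rank 1 block forever in the MPI_Recv shown at ring_c.c:52 in the backtraces.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, next, prev, message, tag = 201;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    next = (rank + 1) % size;
    prev = (rank + size - 1) % size;

    if (0 == rank) {
        /* Only rank 0 starts the ring. If no proc was launched as rank 0,
         * nothing is ever sent. */
        message = 10;
        MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
    }

    /* This is the receive seen in the backtraces: with two rank-1 procs
     * and no rank 0, it can never complete. */
    MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    if (0 != rank) {
        /* Forward the message once around the ring. */
        MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
    }

    printf("rank %d of %d got message %d\n", rank, size, message);
    MPI_Finalize();
    return 0;
}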

@karasevb
Member Author

@artpol84

@artpol84
Contributor

artpol84 commented Feb 2, 2018

@karasevb the fix was merged into master.
Please PR to release branches.

@rhc54
Contributor

rhc54 commented Feb 3, 2018

Also need to include 73ef976

@karasevb
Member Author

karasevb commented Feb 5, 2018

Done for v3.1: #4787.
The v3.0 branch does not have this issue.

@karasevb
Member Author

The related PRs have been merged; this issue can be closed.

karasevb referenced this issue Apr 9, 2018
Fixed the desync of job nodelists between mpirun and the orted
daemons. The issue was observed when using RSH launching, because the
user can provide an arbitrary order of nodes relative to the HNP
placement. The mpirun process propagates the daemons' nodelist order
to the nodes, but the HNP itself assembles the nodelist based on the
user-provided order. As a result, rank assignment was calculated
differently on the orteds and on mpirun.

Consider the following example:
* The user launches mpirun on node cn2.
* The hostlist is cn1,cn2,cn3,cn4; ppn=1.
* mpirun passes the hostlist cn[2:2,1,3-4]@0(4) to the orteds.
As a result, mpirun will assign rank 0 to cn1 while the orteds will
assign rank 0 to cn2 (because the orteds see cn2 as the first element
in the node list).

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
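To restate the commit's example as a tiny sketch (illustrative only; the two orderings below are taken from the commit message, while the mapping loop is an assumption, not ORTE code), rank 0 lands on different nodes depending on which view of the list is used:

#include <stdio.h>

int main(void)
{
    /* User-supplied order, which mpirun used for its own rank computation. */
    const char *mpirun_view[] = { "cn1", "cn2", "cn3", "cn4" };
    /* Order encoded in the regex cn[2:2,1,3-4]@0(4) sent to the daemons:
     * the HNP node (cn2) comes first. */
    const char *orted_view[]  = { "cn2", "cn1", "cn3", "cn4" };

    for (int rank = 0; rank < 4; rank++) {          /* ppn = 1 */
        printf("rank %d: mpirun places it on %s, orted places it on %s\n",
               rank, mpirun_view[rank], orted_view[rank]);
    }
    return 0;
}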