
v4.0.0 hangs for simple message send & recv in mca_btl_vader_component_progress? #6258

Closed · q-p opened this issue Jan 9, 2019 · 22 comments

@q-p commented Jan 9, 2019


Background information

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

v4.0.0

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Installed from source, but the same error occurs with the installation from homebrew on macOS. The configure options were all default (i.e. ./configure --prefix=... && make -j10 && make install).

Both Open MPI and the example were compiled with the system compiler, gcc 4.3.4.

Please describe the system on which you are running

  • Operating system/version: Linux (SLED11SP2)
  • Computer hardware: x86_64 (Intel Xeon E3-1276)
  • Network type: n/a (Ethernet)

Details of the problem

A relatively simple case (reproducer attached) involving two processes -- one doing a (non-blocking) send followed by a wait, the other doing a matching (non-blocking) recv followed by a wait -- hangs once the message exceeds a certain size (6185592 bytes => OK, 6185593 bytes => hang).

When the send & recv are changed to their blocking counterparts, the hang still occurs.

The problem did not occur in previous versions of Open MPI, in particular 3.1.3 seems fine.

#include <stdio.h>

#include "mpi.h"

static const MPI_Datatype Datatype = MPI_PACKED;
static const int Tag = 42;
static const int RecvProc = 0;
static const int SendProc = 1;

// 6185592 does not hang w/ Open-MPI 4.0.0, 6185593 does hang in the Wait()s
#define MessageSize (6185592 + 1)
static unsigned char data[MessageSize] = {0};

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  MPI_Comm comm = MPI_COMM_WORLD;
  
  int myID = 0;
  int nProcs = 1;
  MPI_Comm_size(comm, &nProcs);
  MPI_Comm_rank(comm, &myID);

  if (nProcs != 2)
  {
    if (myID == 0)
      printf("Must be run on 2 procs\n");
    MPI_Finalize();
    return -1;
  }

  int result = 0;
  if (myID == RecvProc)
  {
    MPI_Status probeStatus;
    result = MPI_Probe(SendProc, MPI_ANY_TAG, comm, &probeStatus);
    printf("[%i] MPI_Probe => %i\n", myID, result);
    int size = 0;
    result = MPI_Get_count(&probeStatus, Datatype, &size);
    printf("[%i] MPI_Get_count => %i, size = %i\n", myID, result, size);

    MPI_Request recvRequest;
    result = MPI_Irecv(data, size, Datatype, SendProc, Tag, comm, &recvRequest);
    printf("[%i] MPI_Irecv(size = %i) => %i\n", myID, size, result);
    MPI_Status recvStatus;
    result = MPI_Wait(&recvRequest, &recvStatus);
    printf("[%i] MPI_Wait => %i\n", myID, result);
  }
  else
  { // myID == SendProc
    MPI_Request sendRequest;
    result = MPI_Isend(data, MessageSize, Datatype, RecvProc, Tag, comm, &sendRequest);
    printf("[%i] MPI_Isend(size = %i) => %i\n", myID, MessageSize, result);
    MPI_Status sendStatus;
    result = MPI_Wait(&sendRequest, &sendStatus);
    printf("[%i] MPI_Wait => %i\n", myID, result);
  }

  printf("[%i] Done\n", myID);
  MPI_Finalize();
  return 0;
}

open-mpi4_hang_repo.c.zip
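For reference, building and running the reproducer is nothing special; roughly the following (the output name is just an example):

$ mpicc open-mpi4_hang_repo.c -o repro
$ mpirun -np 2 ./repro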

@q-p (Author) commented Jan 9, 2019

Hang also occurs with current master acc2a70

@jsquyres (Member) commented Jan 9, 2019

You said your network type is "Ethernet" -- does that mean you're using the TCP BTL?

@q-p (Author) commented Jan 9, 2019

I don't think so. I'm simply using mpirun -np 2 ./a.out on my local machine.

@ghackebeil commented:

I came here to post this exact issue (also narrowed it down to a 6185592 byte threshold). I'm glad someone else already typed it up. I'm on macOS 10.14.2. OpenMPI was installed with homebrew and configured with "--disable-silent-rules --enable-ipv6 --with-libevent=/usr/local/opt/libevent".

Here is my test program:

#include <mpi.h>
#include <iostream>

#define N 6185593

int main()
{
  char data[N];
  MPI_Init(NULL, NULL);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if (rank == 0) {
    MPI_Send(data, N, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
  } else if (rank == 1) {
    MPI_Status status;
    MPI_Recv(data, N, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &status);
    std::cout << status._ucount << " " << status.MPI_ERROR << std::endl;
  }

  MPI_Finalize();

  return 0;
}

which I try running with mpirun -np 2 ./a.out.

@ghackebeil commented:

Also worth noting that the issue does not occur when I test with MPICH (installed with homebrew)

@steve-ord commented Feb 26, 2019

I would like to say I also see this one: the same thing with 4.0.0 from homebrew on Mojave (10.14.1). A quite complex MPI application works fine on other platforms and used to work on the Mac; I recently updated to a brand-new machine, and now a send/receive pair deadlocks once the message size passes a particular threshold. I haven't checked the size precisely, but it is comparable to the OP's. When I check the processes by attaching with lldb and move up a couple of frames for clarity, I get:

For one half:

frame #4: 0x000000010fcb6626 libmpi.40.dylib`MPI_Recv + 398
libmpi.40.dylib`MPI_Recv:
    0x10fcb6626 <+398>: movl   %eax, %ebx
    0x10fcb6628 <+400>: xorl   %eax, %eax
    0x10fcb662a <+402>: testl  %ebx, %ebx
    0x10fcb662c <+404>: je     0x10fcb671d               ; <+645>

And for the other end:

frame #4: 0x000000010e545043 libmpi.40.dylib`MPI_Send + 372
libmpi.40.dylib`MPI_Send:
    0x10e545043 <+372>: xorl   %ecx, %ecx
    0x10e545045 <+374>: movl   %eax, %r14d
    0x10e545048 <+377>: testl  %r14d, %r14d
    0x10e54504b <+380>: jne    0x10e54507d               ; <+430>

If burrowing down the stack further would help I'd happily oblige.

BTW, they have both received SIGSTOP when I attach, but that is probably a coincidence.
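In case it helps, the way I attach and pull out those frames is roughly the following (the pid placeholder is whatever ps reports for the hung rank):

$ lldb -p <pid>
(lldb) thread backtrace
(lldb) frame select 4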

@dionhaefner commented:

Same problem here (OSX, OpenMPI 4.0 via Homebrew). For anyone arriving here looking for a workaround:

You can use a different BTL, e.g.

$ mpirun --mca btl self,sm,tcp

or (if you don't have sm)

$ mpirun --mca btl self,tcp
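In both cases the rest of the command line stays the same; for the reproducer above that would be something like:

$ mpirun --mca btl self,tcp -np 2 ./a.out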

@jsquyres (Member) commented:

Can you all try the latest 4.0.1rc nightly snapshot tarball?

https://www.open-mpi.org/nightly/v4.0.x/

@bosilca (Member) commented Mar 11, 2019

I can reproduce on different flavors of OSX, but not on Linux. The issue seems to come from vader: if I force the use of TCP (--mca btl tcp,self), the program completes correctly. I'll take a look.

@q-p (Author) commented Mar 11, 2019

@jsquyres Same problem with 4.0.1rc1

@bosilca I'm having this problem on Linux, and as far as I can tell (using -mca btl_base_verbose 100) I'm using the tcp btl.
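For reference, the command I use to check the BTL selection is roughly:

$ mpirun -mca btl_base_verbose 100 -np 2 ./a.out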

@jsquyres (Member) commented:

@gpaulsen @hppritcha This has the potential to be a v4.0.x blocker. I have marked it as so to make sure it isn't missed. Please evaluate.

@jsquyres (Member) commented:

Do we know if this happens on v3.0.x or v3.1.x? I ask because we're just about to do RCs for those 2.

@dionhaefner commented:

Not happening on 3.1 for me.

@jsquyres (Member) commented:

...answering my own question...

I am able to replicate on v4.0.0, v4.0.1rc1, and v4.0.x HEAD on my MBP MacOS 10.14.3. I am not able to replicate with v3.0.x HEAD and v3.1.x HEAD.

@bosilca (Member) commented Mar 12, 2019

Yesterday I updated to OSX 10.14.3 and gcc 7.4.0. I cannot replicate this issue anymore.

@jsquyres (Member) commented:

FWIW, I'm at 10.14.3, and I can replicate. But I am using the MacOS gcc (i.e., clang), not a homebrew gcc:

$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/c++/4.2.1
Apple LLVM version 10.0.0 (clang-1000.11.45.5)
Target: x86_64-apple-darwin18.2.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

@jsquyres (Member) commented:

@bosilca and I have investigated:

  • He had a large vader shared memory size on his Mac, which is why he stopped seeing the issue. When he changed it back to the default, he was able to replicate again.
  • We found that the problem is that mca_bml_base_alloc() is failing in mca_pml_ob1_send_fin() after around 4 MB.

Looks like at least f62d26d, 6ffc7cc, and b51c8f8 were missed coming over to v4.0.x from master. These appear to be the main ones we need; there may be one or two more that would be worthwhile to bring over. PR inbound shortly...
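Consistent with the first bullet above, enlarging the vader shared-memory segment appears to hide the problem as well. A sketch only, not a recommended fix (parameter name and size given from memory; check ompi_info for the exact knob):

$ mpirun --mca btl_vader_segment_size 268435456 -np 2 ./a.out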

@jsquyres (Member) commented:

Also -- we confirmed: this is not an issue for master. It's just commits that we didn't bring over to v4.0.x.

hppritcha added a commit that referenced this issue Mar 15, 2019
v4.0.x: Cherry-pick fixes for issue #6258 from master (vader fixes)
@hppritcha (Member) commented:

The missing commits have been cherry-picked to v4.0.x. Closing.

@karenibowman commented:

Hello! I am a student just learning MPI, but I encountered a similar hang situation (just on MPI_Probe) when a previous MPI_Isend was sending size 0. Do you think that is something worth opening an issue for / exploring further?

Thanks!
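A minimal sketch of the pattern I mean (illustrative only, not my actual code; run with mpirun -np 2 as above):

#include <stdio.h>

#include "mpi.h"

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 1)
  { // send an empty (size 0) message
    char dummy = 0;
    MPI_Request req;
    MPI_Isend(&dummy, 0, MPI_BYTE, 0, 42, MPI_COMM_WORLD, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
  }
  else if (rank == 0)
  { // probe for it -- this is where I see the hang
    MPI_Status probeStatus;
    MPI_Probe(1, MPI_ANY_TAG, MPI_COMM_WORLD, &probeStatus);
    printf("[%i] MPI_Probe returned\n", rank);
  }

  MPI_Finalize();
  return 0;
}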

@karenibowman commented:

The hang likewise occurred in mca_btl_vader_component_progress

@jsquyres (Member) commented:

We just released Open MPI v4.0.1 yesterday (https://www.mail-archive.com/announce@lists.open-mpi.org/msg00122.html); can you please try again with that version?

If the problem persists, please open a new issue (vs. commenting on a closed issue). Thanks!
