Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

btl/tcp: Don't use mca_btl_tcp_put. #8984

Conversation

awlauria
Copy link
Contributor

Use btl/base active messaging protocols instead.

This was included/discussed a bit in #8756, but that PR has some other issues.
It might be easier to get changes in piecemeal.

Fixes #8983

Signed-off-by: Austen Lauria awlauria@us.ibm.com

Use btl/base active messaging protocols instead.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
@awlauria awlauria requested review from bosilca and hjelmn May 19, 2021 20:30
@gpaulsen gpaulsen added this to the v5.0.0 milestone Jun 17, 2021
@gpaulsen
Copy link
Member

@bosilca Can you please take a look?

@hjelmn
Copy link
Member

hjelmn commented Jul 14, 2021

Hmm, how does this impact the large message two-sided performance when using TCP?

Also, if it is better to disable btl/tcp put you should just remove it from the default for btl_flags. That way it can be turned on if needed.

@awlauria
Copy link
Contributor Author

@hjelmn doing a code inspection, I don't see a consumer of btl_put except for osc/rdma, but I could be wrong on that.

If it's not the case, is it possible to turn this off only for osc and leave it on for point-to-point?

@gpaulsen
Copy link
Member

gpaulsen commented Aug 3, 2021

@hjelmn ^^^ ?

@gpaulsen
Copy link
Member

@hjelmn Can you please take another look? This apparently works around a wrong answer in one-sided, so it's desirable as a band-aid for v5.0.0 until we can get a better fix.

@gpaulsen gpaulsen requested review from wckzhang and removed request for bosilca August 26, 2021 13:49
@gpaulsen
Copy link
Member

@wckzhang Are you also familiar with BTL/TCP? Could you also please take a look? v5.0.0rc1 is less than one month away.

@wckzhang
Copy link
Contributor

We don't know why the TCP BTL put is incorrect? So we're disabling it in in favor of other protocols as a temporary fix? Is the wrong answer difficult to debug?

@awlauria
Copy link
Contributor Author

I'll dig up the test case, it was a simple one.

If I remember right, the test was doing accumulate in a loop, and the accumulates were completing out of order with this path.

@hjelmn
Copy link
Member

hjelmn commented Aug 26, 2021

btl_put is used by pml/ob1 as well.

@wckzhang
Copy link
Contributor

I'm pretty wary of just disabling a feature like this. I feel like this is a place where the "temporary" solution would just become permanent since nobody will want to commit time for a real fix.

@hjelmn
Copy link
Member

hjelmn commented Sep 2, 2021

Exactly. Though I am concerned that this behavior is incorrect even for ob1. I am not expecting to have time to look tomorrow or Monday.

@rhc54
Copy link
Contributor

rhc54 commented Sep 2, 2021

would just become permanent since nobody will want to commit time for a real fix.

The counter to that statement is: since nobody will want to commit time to create a fix, this would block all future OMPI releases...which is why we propose the alternative. If you truly feel that this feature must work for OMPI to release, then it sounds to me like you are taking responsibility for making that happen in a timely fashion - yes?

@bosilca
Copy link
Member

bosilca commented Sep 2, 2021

The proposed solution has the unfortunate side-effect of decreasing the performance for two-sided communications over TCP, a quite drastic "alternative".

I tried to follow the discussion across the issues and past PR, but it is not clear to me what the root cause is. Can someone summarize it?

@awlauria
Copy link
Contributor Author

awlauria commented Sep 2, 2021

This test fails under IBM mtt with a wrong answer, here:

mpirun  --np 2  --mca pml ob1 --mca btl self,tcp --mca osc rdma,sm --mca ompi_display_comm
MPI_Finalize --prefix /tmp/mpiczar/scratch/ompi-5.0.x_xl/installs/luoy/install onesided/c_post_start

step 99916, rank 0 last element not updated
step 99916, rank 1 last element not updated
step 99917, rank 0 last element not updated
step 99917, rank 1 last element not updated
step 99918, rank 0 last element not updated
step 99918, rank 1 last element not updated
step 99919, rank 0 last element not updated
step 99919, rank 1 last element not updated
step 99920, rank 1 last element not updated
step 99920, rank 0 last element not updated
step 99921, rank 0 last element not updated
step 99921, rank 1 last element not updated
step 99922, rank 0 last element not updated
step 99922, rank 1 last element not updated
step 99923, rank 1 last element not updated
step 99923, rank 0 last element not updated
step 99924, rank 0 last element not updated
step 99924, rank 1 last element not updated
step 99925, rank 0 last element not updated
step 99925, rank 1 last element not updated
step 99926, rank 0 last element not updated
step 99926, rank 1 last element not updated
step 99927, rank 1 last element not updated
step 99927, rank 0 last element not updated
step 99928, rank 0 last element not updated
step 99928, rank 1 last element not updated
step 99929, rank 0 last element not updated
step 99929, rank 1 last element not updated
step 99930, rank 1 last element not updated
step 99930, rank 0 last element not updated
step 99931, rank 0 last element not updated
step 99931, rank 1 last element not updated
step 99932, rank 0 last element not updated
step 99932, rank 1 last element not updated
step 99933, rank 1 last element not updated
step 99933, rank 0 last element not updated
step 99934, rank 0 last element not updated
step 99934, rank 1 last element not updated
step 99935, rank 1 last element not updated
step 99935, rank 0 last element not updated
step 99936, rank 0 last element not updated
step 99936, rank 1 last element not updated
step 99937, rank 0 last element not updated
step 99937, rank 1 last element not updated
step 99938, rank 1 last element not updated
step 99938, rank 0 last element not updated
step 99939, rank 1 last element not updated
step 99939, rank 0 last element not updated
step 99940, rank 0 last element not updated
step 99940, rank 1 last element not updated
step 99941, rank 1 last element not updated
step 99941, rank 0 last element not updated
step 99942, rank 0 last element not updated
step 99942, rank 1 last element not updated
step 99943, rank 1 last element not updated
step 99943, rank 0 last element not updated
step 99944, rank 0 last element not updated
step 99944, rank 1 last element not updated
step 99945, rank 0 last element not updated
step 99945, rank 1 last element not updated
step 99946, rank 0 last element not updated
step 99946, rank 1 last element not updated
step 99947, rank 1 last element not updated
step 99947, rank 0 last element not updated
step 99948, rank 0 last element not updated
step 99948, rank 1 last element not updated
step 99949, rank 1 last element not updated
step 99949, rank 0 last element not updated
step 99950, rank 1 last element not updated
step 99950, rank 0 last element not updated
step 99951, rank 0 last element not updated
step 99951, rank 1 last element not updated
step 99952, rank 1 last element not updated
step 99952, rank 0 last element not updated
step 99953, rank 0 last element not updated
step 99953, rank 1 last element not updated
step 99954, rank 1 last element not updated
step 99954, rank 0 last element not updated
step 99955, rank 1 last element not updated
step 99955, rank 0 last element not updated
step 99956, rank 0 last element not updated
step 99956, rank 1 last element not updated
step 99957, rank 1 last element not updated
step 99957, rank 0 last element not updated
step 99958, rank 0 last element not updated
step 99958, rank 1 last element not updated
step 99959, rank 0 last element not updated
step 99959, rank 1 last element not updated
step 99960, rank 0 last element not updated
step 99960, rank 1 last element not updated
There were 16044 transfer errors
There were 18044 transfer errors
Host 0 [c656f6n09] ranks 0 - 1
 host | 0
======|=====
    0 : tcp

Connection summary: (btl)
  on-host:  all connections are tcp
  off-host: all connections are n/a

Here's the test source: https://github.com/open-mpi/ompi-tests/blob/master/ibm/onesided/c_post_start.c

My thinking was this path shouldn't be used for the osc/rdma path, but perhaps I am wrong and there is an alternative fix. I am not sure if this path can be disabled only for osc and not pml/ob1 - I'm not familiar with it enough.

@awlauria awlauria closed this Oct 25, 2021
@awlauria awlauria deleted the fix_c_post_start_wrong_answer_single_node branch March 17, 2022 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

osc rdma/tcp: Wrong answer with c_post_start in MTT
6 participants