v4.1.x: libnbc fix for iallreduce count*extent overflowing int #10094

Merged (1 commit) on Mar 14, 2022

@markalle markalle commented Mar 8, 2022

In the libnbc iallreduce ring algorithm at -np 4, if the datatype is
MPI_LONG_LONG (8 bytes) and the count is around 1.5 billion, so that the
buffer totals 12 GB, count*extent overflowed an int and the offsets for
some of the iterations went negative.
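To illustrate the arithmetic, here is a minimal standalone sketch. It is not the libnbc code: the helper names `seg_offset_narrow` and `seg_offset_wide` are hypothetical, and the sketch assumes the ring splits the buffer into `count / nprocs` elements per segment and that a segment's byte offset is `seg * segcount * extent`. The narrow variant emulates a 32-bit offset by explicit truncation, so the demonstration itself avoids undefined behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of ring segment `seg`, truncated to
 * 32 bits the way an int-typed offset would wrap on common platforms.
 * The product is computed in 64 bits and then narrowed explicitly, so
 * this function never performs signed overflow itself (the final
 * conversion to int32_t is implementation-defined, and wraps on
 * two's-complement targets). */
int32_t seg_offset_narrow(int count, int nprocs, int seg, int extent)
{
    assert(nprocs > 0);
    int segcount = count / nprocs;
    int64_t bytes = (int64_t)seg * segcount * extent;
    return (int32_t)(uint32_t)bytes;
}

/* Hypothetical helper: the same offset kept in a 64-bit type. In MPI
 * code, MPI_Aint or ptrdiff_t would be the natural choice. */
int64_t seg_offset_wide(int count, int nprocs, int seg, int64_t extent)
{
    assert(nprocs > 0);
    int64_t segcount = count / nprocs;
    return (int64_t)seg * segcount * extent;
}
```

Under these assumptions, with count = 1500000000, nprocs = 4 and extent = 8, segment 1 starts at byte 3000000000. That exceeds INT_MAX, so the narrow variant yields -1294967296, while segments 2 and 3 wrap to positive (but still wrong) values. This matches the report that only some iterations produced negative offsets.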

gist for testcase:
https://gist.github.com/markalle/61e05fed6de4cd201d5e7d22b0c175a1
% mpicc -o x iallreduce_overflow.c
% mpirun -np 4 --mca coll_libnbc_iallreduce_algorithm 1 ./x 12000000000

The testcase picks a random number of bytes for the allreduce buffer if one
isn't specified on the command line.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
(cherry picked from commit b8d6b6b)


markalle commented Mar 8, 2022

bot:mellanox retest

@jsquyres changed the title from "libnbc fix for iallreduce count*extent overflowing int" to "v4.1.x: libnbc fix for iallreduce count*extent overflowing int" on Mar 9, 2022
@jsquyres added this to the v4.1.3 milestone on Mar 9, 2022
@jsquyres requested a review from gpaulsen on March 9, 2022 14:30
@jsquyres merged commit 21a2855 into open-mpi:v4.1.x on Mar 14, 2022