MueLu: Notay unit tests fail in parallel #7860

Closed
mayrmt opened this issue Aug 18, 2020 · 9 comments
Labels
pkg: MueLu type: bug The primary issue is a bug in Trilinos code or tests

Comments

@mayrmt
Member

mayrmt commented Aug 18, 2020

Bug Report

@trilinos/muelu

Description

When running the MueLu unit tests with Tpetra enabled (and Epetra disabled), the unit tests pass on 1 MPI rank but fail on 4 MPI ranks. In particular, the test NotayAggregation_double_int_int_Kokkos_Compat_KokkosSerialWrapperNode_InitialAggregation2D_UnitTest fails with this screen output:

2: 94. NotayAggregation_double_int_int_Kokkos_Compat_KokkosSerialWrapperNode_InitialAggregation2D_UnitTest ... 
2:  version: MueLu development
2:  numUnaggregatedNodes = 0 == 0 = 0 : passed
2:  sizes[i] = 2 == expected = 2 : passed
2:  sizes[i] = 2 == expected = 2 : passed
2:  sizes[i] = 2 == expected = 2 : passed
2:  sizes[i] = 1 == expected = 1 : passed
2:  NOTE: Global reduction shows failures on other processes!
2:  (rerun with --output-to-root-rank-only=-1 to see output
2:  from other processes to see what process failed!)
2:  NOTE: Unit test failed on processes = {1, 2}
2:  (rerun with --output-to-root-rank-only=<procID> to see output
2:  from individual processes where the unit test is failing!)
2:  [FAILED]  (0.000335 sec) NotayAggregation_double_int_int_Kokkos_Compat_KokkosSerialWrapperNode_InitialAggregation2D_UnitTest
2:  Location: /ssd/codes/mayrmt_trilinos/src-trilinos/packages/muelu/test/unit_tests/NotayAggregationFactory.cpp:126

All other Notay unit tests pass.

Additional Information

I have set the -D Teuchos_GLOBALLY_REDUCE_UNITTEST_RESULTS:BOOL=ON option. When setting this option to OFF, all unit tests pass.
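To compare the two behaviours in the same build tree, the option can be flipped in place and the tests rerun. This is a minimal sketch, assuming the commands are issued from the top of an already configured Trilinos build directory; the rebuild step is included on the assumption that the option affects compiled code.

```shell
# Assumption: the current directory is an existing, configured Trilinos build tree.
# Re-run CMake in place, changing only the cached option named in this report.
cmake -D Teuchos_GLOBALLY_REDUCE_UNITTEST_RESULTS:BOOL=OFF .

# Rebuild so the changed setting takes effect (generator-independent form).
cmake --build .

# Then rerun the unit tests from MueLu's unit test directory, as in the steps below.
ctest -R UnitTests
```

Switching the value back to `ON` and repeating the same three commands should restore the failing behaviour described above.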

Steps to Reproduce

  1. SHA1: 855be51
  2. Configure script: do-configure-trilinos_muelu_tpetra.txt
  3. Configure log: configure.txt
  4. Build log: build passes just fine.
  5. Navigate to MueLu's unit test directory and run ctest -R UnitTests
  6. Run log: see above
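The log suggests rerunning with `--output-to-root-rank-only=-1` or `--output-to-root-rank-only=<procID>` to see what happens on the failing ranks 1 and 2. The helper below is a sketch that only prints the corresponding command lines so they can be checked before use. The executable name `MueLu_UnitTestsTpetra.exe` and the `mpirun -np 4` launcher are assumptions, since neither appears in this report; adjust both to the local build.

```shell
#!/usr/bin/env bash
# Hypothetical names, not taken from the report: adjust to your build and MPI setup.
exe="./MueLu_UnitTestsTpetra.exe"
launcher="mpirun -np 4"

# Print the rerun command for a given rank selector:
#   -1        show output from every rank
#   <procID>  show output from that rank only
rerun_cmd() {
  printf '%s\n' "${launcher} ${exe} --output-to-root-rank-only=$1"
}

rerun_cmd -1   # all ranks
rerun_cmd 1    # first failing rank reported in the log
rerun_cmd 2    # second failing rank reported in the log
```

Each printed line can be pasted into a shell in the directory that contains the test executable.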
@mayrmt mayrmt added type: bug The primary issue is a bug in Trilinos code or tests pkg: MueLu labels Aug 18, 2020
@mayrmt
Member Author

mayrmt commented Aug 18, 2020

@csiefer2, you added the Notay aggregation in the first place (PRs #6865 and #7291). Do you have any more insight into this error?

@mayrmt
Member Author

mayrmt commented Aug 25, 2020

Fix has been merged to master via PR #7899. Closing.

@mayrmt mayrmt closed this as completed Aug 25, 2020
@cgcgcg
Contributor

cgcgcg commented Aug 25, 2020

Is this actually fixed? Looking at the dashboard, it seems that it's still failing in a couple of builds.

@mayrmt
Member Author

mayrmt commented Aug 25, 2020

@cgcgcg It was merged to master as of today. So, wait until tomorrow and check the dashboard again. If it's still failing then, let's reopen.

@mayrmt
Member Author

mayrmt commented Aug 26, 2020

@cgcgcg Dashboard as of last night seems to show this test to be passing: https://testing.sandia.gov/cdash/index.php?project=Trilinos&date=2020-08-26&subproject=MueLu

@cgcgcg
Contributor

cgcgcg commented Aug 26, 2020

I was going to show this as an example that the issue is still there: https://testing.sandia.gov/cdash/test/27665978
But instead, it looks like several of our builds are stuck on some old SHA :-/

@mayrmt
Member Author

mayrmt commented Aug 26, 2020

I'll leave it up to you to reopen if we're not sure whether this has really been fixed. I have to admit that I currently don't have a clear overview of the different dashboards.

@mayrmt
Member Author

mayrmt commented Aug 27, 2020

@cgcgcg I had another look at the dashboard (https://testing.sandia.gov/cdash/index.php?subproject=MueLu&project=Trilinos) and it seems that the Notay test is not failing anymore.

@cgcgcg
Contributor

cgcgcg commented Aug 27, 2020

Yup, Jonathan fixed our testing :-)

@jhux2 jhux2 added this to MueLu Aug 12, 2024
@jhux2 jhux2 moved this to Done in MueLu Aug 12, 2024