-
Notifications
You must be signed in to change notification settings - Fork 573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Select set of builds for initial mandatory auto PR testing process #2317
Comments
I sent the following email. I will give it until the week of 3/12/2-18 after the SIAM PP conference to have this meeting. That should be urgent enough. To: sandia-trilinos-developers@software.sandia.gov Hello Sandia Trilinos Developers, Following up on the action item I was given at the Trilinos Developers Planning meeting on Monday with notes listed at:
I am going to organize a discussion on the selection of the set of builds the auto PR testing process should be using before it becomes the required way to test and push to the Trilinos ‘develop’ branch. Therefore, if you are interested in being part of this discussion, please add your name in the first comment in the existing list under the “Description” section in the new issue: and then I will try to organize a meeting of all the interested parties. Since this selection of build will impact every Trilinos developer and every close customer and collaboration of Trilinos, it is important that we get input from many different people in making this selection. Thanks, -Ross |
Any machine used for PR testing must be accessible (either directly or by an identical system) by all developers so they can fix issues that PR testing exposes. |
I'm with @csiefer2 -- for example, if we intend to support Windows builds, the Windows Dashboard build should be a VM that we can access (either log into, or download and run on our own) and use for testing. |
I don't think there is no way that we would make a Windows build a auto PR build or even a CI build anytime soon. There are so many other builds that more important than that. But I think the rule should be that before any build can enter the "Nightly", "Clean", or "Auto PR" category, that it must be easy for any Sandia staff member to access a machine where the build can be exactly reproduced. I agree with @csiefer2 and @mhoemmen on this point. |
FYI: @trilinos/muelu It appears that the current auto PR testing GCC 4.8.4 posting to CDash never runs any MueLu tests as shown at: Not sure why that is because the standard CI build set up for last 1.5 has been running MueLu tests without any issues (except for real failures) as shown in recent history at: |
FYI: @trilinos/anasazi It also appears that the auto PR builds are not running any Anazazi tests either as shown by: |
FYI: @trilinos/ifpack2 It also appears that the auto PR builds are not running any Ifpack2 tests as shown by: |
@bartlettroscoe Thank you for the reminder about the MueLu, Ifpack2, and Anasazi tests. These packages had test failures a while back and were disabled for that reason (needed 100% passing). Some of these issues I believe have been resolved. I am testing that now. |
Just the failing tests should be disabled, not all the tests for a package. You write a GitHub issue for the failing test(s), then disable just those tests. Generally one wants to disable with a scapulae, not a machete. |
Agreed. At the time I didn't want the issues to be lost and not notice that the tests were disabled. Much of this has to do with ETI. |
If the developers of those packages don't care to fix the broken tests for GitHub issues associated with their packages then that is on them. And you only disable the tests just for that particular build, not for all builds. Look at the git log file for It is important to fix this quickly because developers might be thinking that the auto PR builds are testing Ifpack2, Anasasi, and MueLu but they are not. This means that the auto PR builds have even a higher chance of breaking tests in these packages. |
Wait, what?!? Ifpack2 and MueLu tests were totally disabled?!? |
Hello, it is after SIAM PP. Can we set up a meeting on this? |
@jwillenbring and/or @bmpersc, Do you guys want to be included in this meeting? If so, please add your GitHub IDs to the list in the above Description field. I will then try to set up a meeting using people's SNL calendars to select a time that works for everyone. (Or if I can't find a time, I will set up a dreaded doodle.com poll.) |
One issue to consider is that according to: GCC 4.8.4 should implement OpenMP 3.1 while GCC 4.9.3 should implement OpenMP 4.0. Is that an issue for testing OpenMP code with Trilinos? How much difference is there in the Kokkos implementation of threading of OpenMP 3.1 vs OpenMP 4.0? Or is this not a concern because updates to Kokkos do testing with many different compilers before merging to Trilinos 'develop'? If we are not worried about the OpenMP 3.1 vs. OpenMP 4.0 issue, then a single GCC 4.8.4 build for auto PR testing would seem to be okay. |
@bartlettroscoe - OpenMP3.1 would be good for now because we have a range of platforms where this is the most up to date complete supported standard. |
@trilinos/shylu developers, Why are there no tests being run for any of the ShyLU packages in the existing auto PR and CI builds:
? These packages are listed a Primary Tested (PT) packages but they don't have any tests? |
@bartlettroscoe each package in each of ShyLU's subpackages is marked either ST or EX: I'm not sure why they are being listed as PT packages, the use of
in the Opening a new issue after speaking with @srajama1 to get packages migrated to PT testing, #2375 Edit: Added reference to ShyLU issue. |
NOTE: Defects like #2391 show why we need to be enabling OpenMP in our auto PR build. |
CC: @trilinos/zoltan2 I was just reminded today by #2397 that the auto PR builds based on SEMS should also enable the Scotch TPL. That CI build (and the builds derived from it) are the only automated builds of Trilinos that enable Scotch and therefore are the only builds that run these tests that depend on Scotch. See #2065. |
@mhoemmen said:
That makes sense. I just remembered that the GCC C++11 threads library requires you link in the pthread lib in order to work. This is used, for example, in the threaded testing of the new thread-safe Teuchos Memory Management class mode. What is the impact then of enabling the Pthread TPL and OpenMP as long as Kokkos just uses OpenMP? |
@bartlettroscoe enabling the Pthread TPL and Edit: the most common configuration enables the Kokkos OpenMP and Serial backends. |
@ibaned, does this mean that we should or should not allow the enable of the Pthread TPL when we configure Trilinos with OpenMP or Serial Kokkos backends? @mhoemmen, other than some than some tests that need |
It was requested that we use GCC 4.9.3 heades with Intel 17.0.1 builds of Trilinos (see trilinos#2317 and trilinos#2463).
The Pthread TPL is fine to enable regardless of Kokkos backends. It doesn't cause any issues with OpenMP or otherwise. It is also fine to disable, unless the Kokkos Threads backend is used. |
@bartlettroscoe wrote:
I actually searched Trilinos for |
@bartlettroscoe My above comments suggest that maybe we don't need the Pthread TPL at all. I've been fine with it enabled but perhaps it's not necessary. On the other hand, sometimes OpenMP implementations or other TPLs need it, so I'm not sure we should just turn it off by default. |
@mhoemmen, the experience with the Teuchos MM classes thread-safe work with that you need to explicitly link in the |
@bartlettroscoe good point; not sure if actual C++11 implementations are supposed to require that (vs. the |
FYI: As pointed out by @etphipp in #2628 (comment), setting:
seems to fix the problem of OpenMP threads all binding to the same core on a RHEL6 machine. Perhaps this will let us enable OpenMP for the GCC 4.8.4 auto PR build being set up in #2462? |
FYI: As mentioned in the new issue #2674, EMPIRE is now enabling the options:
Therefore, the PR builds should enable these as well. This is the same argument for why we agreed to add the "Experimental" enables for Xpetra and MueLu described above. |
This was the agreement as part trilinos#2317. NOTE: This is using 'mpiexec --bind-to none ...' to avoid pinning the threads in differnet MPI ranks to the same cores. See trilinos#2422.
This was the agreement as part trilinos#2317. NOTE: This is using 'mpiexec --bind-to none ...' to avoid pinning the threads in differnet MPI ranks to the same cores. See trilinos#2422.
@bartlettroscoe Since the PR builds have been running for some time I propose we close this ticket and deal with additional issues in other tickets. |
@jwillenbring, sure, the only remaining build is the CUDA build in #2464 and we have a plan for that (see the issue). Closing. |
CC: @trilinos/framework
Next Action Status
This ship has sailed on "Initial" a long time ago. The only remaining build in the CUDA PR build and that is being tracked in #2464.
Description
This story is to involve interested members of the Trilinos team to collaborate to select the best build configurations in order to allow the new Trilinos auto PR testing and merging tool and process (#1155) to become the mandatory way to test and push to the main Trilinos ‘develop’ branch (#2312). This selection must be done considering that there are limited computing resources (on the Jenkins build farms) to run automated builds. So while we would like to run many different useful builds as part of pre-push automated PR testing, we have to be strategic about what builds we run where to not overwhelm current capacity. As more build machines are added to the Jenkins build farm, more builds can be added to the auto PR testing process.
This story is a follow on from the action item:
in the 2018-02-26 Trilinos Planning Meeting.
The set of Trilinos team members interested in being part of this discussion (and meetings) include:
Since this selection of these builds will impact every Trilinos developer and every close customer and collaboration of Trilinos, it is important that we get input from many different people in making this selection.
Definition of Done
Related Issues
Task
.cmake
files for each proposed configuration and set up nightly builds sumitting to "Specialized" CDash Track/Group:a. GCC 4.8.4: See Modify existing GCC 4.8.4 CI build to match selected auto PR build #2462
b. Intel 17.x: See Set up new Intel 17.x build to use as auto PR build #2463
c. CUDA: See Set up a CUDA build for an auto PR build #2464
The text was updated successfully, but these errors were encountered: