WeeklyTelcon_20201123
Austen Lauria edited this page Nov 24, 2020 · 12 revisions
- Dialup Info: (Do not post to public mailing list or public wiki)
- Jeff Squyres (Cisco)
- Harumi Kuno (HPE)
- Hessam Mirsadeghi (NVIDIA)
- Austen Lauria (IBM)
- Howard Pritchard (LANL)
- Ralph Castain (Intel)
- Todd Kordenbrock (Sandia)
- William Zhang (AWS)
- Brendan Cunningham (Intel)
- Raghu Raja (AWS)
- Naughton III, Thomas (ORNL)
- Michael Heinz (Intel)
- Matthew Dosanjh (Sandia)
- David Bernholdt (ORNL)
- Akshay Venkatesh (NVIDIA)
- Aurelien Bouteiller (UTK)
- Christoph Niethammer (HLRS)
- Edgar Gabriel (UH)
- George Bosilca (UTK)
- Joseph Schuchart (HLRS)
- Josh Hursey (IBM)
- Noah Evans (Sandia)
- Geoffrey Paulsen (IBM)
- Joshua Ladd (nVidia/Mellanox)
- Artem Polyakov (nVidia/Mellanox)
- Tomislav Janjusic (nVidia/Mellanox)
- Brandon Yates (Intel)
- Charles Shereda (LLNL)
- David Bernhold (ORNL)
- Erik Zeiske (HPE)
- Geoffroy Vallee (ARM)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Nathan Hjelm (Google)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Xin Zhao (nVidia/Mellanox)
- mohan (AWS)
Blockers All Open Blockers
Review v4.0.x Milestones v4.0.5
- No v4.0 rc this week.
Issue #8246: ROMIO/Lustre -
- Thought 4.0.x was on track for an RC, but the RMs now want a better understanding of the Lustre problem.
- Need to do a ROMIO refresh to 3.3.2. Lots of changes between 3.3 and 3.3.2.
- Pretty large delta for a release branch.
- Want to get a better understanding of what's going on before another rc.
- It may be that the right thing to do is put this on 4.1.x instead of 4.0.x.
- Thinking of putting all unit tests for ROMIO into IBM folder. It might help catch this issue earlier.
- This is the highest priority for the RMs. Howard will start testing the new ROMIO this week to see if it fixes the issue.
Issue #8217: Memory leaks -
- Do we have a PR on this?
- Asked the creator of the ticket; we don't think he has created a PR yet.
- Would be easy for us to create the PR.
- Howard/Geoff Paulsen will try to do this patch next week.
- Almost all of this patch will apply to 4.1.x as well.
Issue #8252:
- Thomas Naughton found an issue with UCX in the OSU benchmarks. Issue opened.
Review v4.1.x Milestones v4.1.0
- Was close to an rc. Had the tarballs ready. But #8246 is holding it up now.
- If upgrading ROMIO is part of the solution, it is best to put it in now.
- Other than that, the RMs believe they have everything ready for an rc.
- Going to go ahead and release an rc anyway, so please test it!
- Not going to lose anything if the RMs do another rc with the new ROMIO.
- Ralph is still getting a flood of warnings on v4.1.x.
- Jeff Squyres will take a look again.
Review v5.0.0 Milestones v5.0.0
- No updates from the RMs. Haven't met in a couple of weeks due to conflicting schedules.
- Ralph has updated PMIx/PRRTE pointers.
MTT master failures:
- MTT compile failures with Clang.
- Invalid window failures.
- Jeff Squyres will ask Nathan Hjelm. These are happening because OSC pt2pt is gone.
- Attribute tests reporting an invalid communicator.
- Other than that, MTT looks pretty clean on master.
- Jeff: Docs issue
- Sphinx / ReadTheDocs / RST going well. READMEs done. Working on the FAQ. Man pages will come later (waiting for students to finish their part).
- Doing some minor restructuring.
- We could really use a definitive list in the README section
(i.e., near the top of the docs) about:
- What Operating Systems are supported
- What Network stacks are supported
- What versions of 3rd-party libraries are supported:
- PMIx
- PRRTE
- hwloc
- libevent
- Jeff/George: State of the State Of the Union
- Howard: ROMIO issue: some problem with UCX...?
- Howard: some other smaller random issues
- ECP update
- George said he asked, and has not gotten an answer yet. Will re-ping.
- Ralph would ideally want to do PMIx the same way.
- Not a done deal in ECP land - was querying interest. If ECP doesn't happen, at worst would be a delay. No downside to trying.
- Don't want to go too much later than January. If we start getting to Feb/March/April, it makes Supercomputing in November a little more difficult.
- If ECP decides not to go with this, can do a stand-alone OMPI webinar in January.