WeeklyTelcon_20180327
Geoffrey Paulsen edited this page Jan 15, 2019
·
1 revision
- Dialup Info: (Do not post to public mailing list or public wiki)
- Oops, forgot to gather attendance this week, but had a good quorum.
- Ralph not here.
Review All Open Blockers
Review v2.x Milestones v2.1.3
- v2.1.4 - Oct 15th,
- Merged in a bunch of stuff.
- One-sided multithreaded bugs that came up.
- Doesn't feel like it's worth fixing in v2.1.x, so instead pulled configury changes from v2.0 to v2.1.x
- Release v2.1.3 - Saint Patty's day release.
Review v3.0.x Milestones v3.0.1
- Merged last-minute fix from IBM. PR4955
- IBM's MTT looked much better last night.
- write memory barrier in vader.
- PR4977 causes corruption. Nathan just discovered, will PR today, possibly release tomorrow.
Review v3.1.x Milestones v3.1.0
- Brian got no useful feedback, so not interested in releasing.
- some appear to be getting a segfault.
- PR4977 causes corruption. Nathan just discovered, will PR today, possibly release tomorrow.
- Cisco is seeing a number of Spawn issues on v3.1.x in MTT.
- Most of these were oversubscription issues, needed ini changes
- Still seeing another 100 or so failures; he expects these have the same cause, but they could be new regressions.
- cisco will look at and file issues tonight.
- Nathan says looking good, except for PR4977.
- Josh Ladd says it looks good as well.
Review Master Master Pull Requests
- Looks low on MTT results, still fighting with MTT download problem.
- downloads resolved, shut off old download.
- OSC compiler failure (gcc v4.1.2) at Absoft - 32bit build.
- Nathan will look into it. Fixed in PR4941, which is just merging now.
- believe he fixed this, but Absoft MTT failed.
- they may need new perl/json module
- Jeff is contacting Absoft about MTT
- Gave up on clean warning build on 32bit builds.
- Open MPI cleanup PR4953 - what do we want to delete from Open MPI v4.0
- C++ bindings (have been deleted from the spec for 10 years)
- Some are trivially yes - MPI_Address - trivial to replace.
- Should start asking the users the question
- Some people won't know and will just say they want to be conservative, which gives us false positives about whether users are actually using these.
- PR failing for real reasons, need to understand.
- likes that we don't use deprecated functions internally.
- Some tests use deprecated funcs, need to take a sweep there.
- a macro to make them not appear in mpi.h
- Don't compile mpiC and friends.
- Is anyone USING the C++ bindings (packagers probably turn it on by default) - but could give data on what libs use it.
- Not mutually exclusive. disable and 12 months later delete.
- Put this on the FAQ - this is what was deleted, and here's how you change your code.
- LB and UB, Address, C++ bindings.
- Here's a list of things that we no longer will build by default.
- What about talking to MPICH to do the same thing?
- All be in the same room in June.
- Nathan could bring this message to MPICH.
- Giles's question about our versioning compatibility
- He was thinking we said versions would work with other OMPI versions in a minor series.
- We thought that MPI and OSHMEM apps can be forward compatible within the same series.
- Use cases:
- cluster - same version of Open MPI everywhere
- direct launch in containers everything the same version.
- singularity containers - orted one version, app/libs running a different.
- docker style - mpirun outside of container, app and libs all different versions than mpirun
- Amazon recommends that all pieces of orte/mpi be same version in container
- not everyone agrees.
- Cross-version mpirun interoperability (Open MPI version | PMIx version):
- v2.0.x | v1.1.5
- v2.1.x | v1.2.5+
- v3.0.x | v2.0.3+
- ------------ newer: any client/server combo. PMIx v2.1.x fixed cross compatibility for dstore and other stuff.
- v3.1.x | v2.1.x
- master | v3.0.x
- Nathan - we should version ORTE differently (we do for sharedlibs now, but not for orte 'project')
- Brian thought we agreed that for OMPI v3.0+ we were going to support cross-version for both singularity and docker.
- Really talking about mpirun ONLY, not cases where some nodes have OMPI vX and others OMPI vY.
- Agreed that app has to link against same version of libOMPI everywhere
- Agreed that all ORTEDs need to be same everywhere.
- Need to decide if we want Version of orte in mpirun/launcher and orte in orted different - to support docker or not.
- singularity mpirun is in the same boat as orteds.
- PMIx is still in the picture too.
- Brian thought we agreed that OMPI v3.0 and beyond would support both docker and singularity use cases.
- (not necessarily what he wants)
- Discuss this next week.
- A lot of mixed cases here.
- need to make sure someone documents.
- Certainly not being tested.
- Need to start testing version compatibility compiled with / run with.
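As a rough illustration of the "compiled with / run with" testing mentioned above, the Open MPI to PMIx pairing from these notes can be written as a lookup table, and the mixed-version launcher/app pairs enumerated from it. This is only a sketch: the names `PMIX_BY_OMPI`, `pmix_major`, and `compat_test_matrix` are hypothetical, and the code says nothing about which pairs actually interoperate; it only lists the combinations a test run would need to cover.

```python
from itertools import product

# Open MPI series -> PMIx version, copied from the table in these notes.
PMIX_BY_OMPI = {
    "v2.0.x": "v1.1.5",
    "v2.1.x": "v1.2.5+",
    "v3.0.x": "v2.0.3+",
    "v3.1.x": "v2.1.x",
    "master": "v3.0.x",
}


def pmix_major(ompi_series):
    """Return the PMIx major version number paired with an Open MPI series."""
    return int(PMIX_BY_OMPI[ompi_series].lstrip("v").split(".")[0])


def compat_test_matrix(series):
    """Hypothetical helper: every (mpirun series, app series) pair that differs.

    Same-version pairs are skipped because they are the ordinary,
    already-tested case; the result is the list of mixed combinations.
    """
    return [(launcher, app)
            for launcher, app in product(series, repeat=2)
            if launcher != app]


if __name__ == "__main__":
    for launcher, app in compat_test_matrix(list(PMIX_BY_OMPI)):
        print(f"mpirun {launcher} (PMIx {PMIX_BY_OMPI[launcher]}) "
              f"-> app {app} (PMIx {PMIX_BY_OMPI[app]})")
```

With the five series in the table this yields 20 mixed pairs, which gives a sense of how quickly the test matrix grows if every combination is to be covered.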
Review Master MTT testing
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon,
- Cisco, ORNL, UTK, NVIDIA