faqs: what versions of OpenMPI work with Flux? #94

Merged (3 commits) on Apr 2, 2021
18 changes: 18 additions & 0 deletions faqs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -115,3 +115,21 @@ The interesting part of the versioning comes from the multi-repo structure. Flux
A 'flux' meta-package (such as in spack or distro package managers) that pulls in compatible versions of the various sub-packages/repos is also versioned independently of any of its subcomponents. The situation is similar for the flux-docs repo and the documentation on readthedocs. Each repo has its own documentation, which is tagged and released along with the code, but the high-level "meta" documentation has its own versioning that is divorced from the versioning of any particular sub-package or repo.

.. TODO: we should make a table and put it in the docs too

----------------------------------------
What versions of OpenMPI work with Flux?
----------------------------------------

Flux plugins were added to OpenMPI in version 3.0.0. Generally, these plugins enable OpenMPI major versions 3 and 4 to work with Flux, provided OpenMPI was configured with the Flux plugins enabled. To check whether your installation includes them, run:

.. code-block:: console

   $ ompi_info | grep flux
   MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.0.3)
   MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component v4.0.3)
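
The same check can be scripted. The following is a minimal sketch, assuming the ``ompi_info`` output has the shape shown above; the helper name ``flux_components`` is hypothetical and not part of Flux or OpenMPI. It parses text that you have already captured from ``ompi_info`` and reports which frameworks have a Flux component.

```python
import re

# Matches lines such as:
#   MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.0.3)
# Assumption: the line layout follows the sample output shown above.
_MCA_FLUX_LINE = re.compile(
    r"MCA\s+(?P<framework>\w+):\s+flux\s+\(.*Component v(?P<version>[\d.]+)\)"
)


def flux_components(ompi_info_output):
    """Return {framework: component_version} for each Flux MCA component found.

    Hypothetical helper: it only parses text, it does not run ompi_info itself.
    """
    found = {}
    for line in ompi_info_output.splitlines():
        match = _MCA_FLUX_LINE.search(line)
        if match:
            found[match.group("framework")] = match.group("version")
    return found


sample = """\
MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.0.3)
MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component v4.0.3)
"""
print(flux_components(sample))
# prints {'pmix': '4.0.3', 'schizo': '4.0.3'}
```

An empty result suggests that OpenMPI was built without the Flux plugins.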

Unfortunately, `an OpenMPI bug <https://github.com/open-mpi/ompi/issues/6730>`_ broke the Flux plugins in OpenMPI versions 3.0.0-3.0.4, 3.1.0-3.1.4, and 4.0.0-4.0.1. The `fix <https://github.com/open-mpi/ompi/pull/6764/commits/d4070d5f58f0c65aef89eea5910b202b8402e48b>`_ was backported, so versions 3.0.5+, 3.1.5+, and 4.0.2+ are not affected.

A different `OpenMPI bug <https://github.com/open-mpi/ompi/pull/8380>`_ caused MPI programs to segfault in ``MPI_Finalize`` when the UCX PML was used. `The fix <https://github.com/open-mpi/ompi/pull/8380>`_ was backported to 4.0.6 and 4.1.1. If you use the UCX PML with OpenMPI, we recommend 4.0.6+ or 4.1.1+.
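
The version guidance above can be expressed as a small check. This is only a sketch: the function names are hypothetical, version strings are assumed to be plain ``major.minor.patch`` (no release-candidate suffixes), and "4.0.6+" and "4.1.1+" are read as "that release or later within the same series".

```python
def parse_version(text):
    """Turn 'major.minor.patch' into a tuple of ints (plain versions only)."""
    return tuple(int(part) for part in text.split(".")[:3])


# Per release series: the first release that carries each fix, as listed above.
FLUX_PLUGIN_FIX = {(3, 0): (3, 0, 5), (3, 1): (3, 1, 5), (4, 0): (4, 0, 2)}
UCX_FINALIZE_FIX = {(4, 0): (4, 0, 6), (4, 1): (4, 1, 1)}


def flux_plugins_usable(version):
    """True if this release has the Flux plugins and is outside the broken ranges."""
    v = parse_version(version)
    if v < (3, 0, 0) or v >= (5, 0, 0):
        return False  # plugins were added in 3.0.0 and dropped for 5.0
    first_fixed = FLUX_PLUGIN_FIX.get(v[:2])
    # Series not listed as broken (assumption: e.g. 4.1.x) count as usable.
    return first_fixed is None or v >= first_fixed


def meets_ucx_recommendation(version):
    """True if this release is 4.0.6+ or 4.1.1+ within its own series."""
    v = parse_version(version)
    first_fixed = UCX_FINALIZE_FIX.get(v[:2])
    return first_fixed is not None and v >= first_fixed


for candidate in ("3.0.4", "3.1.5", "4.0.1", "4.0.6", "4.1.1"):
    print(candidate, flux_plugins_usable(candidate), meets_ucx_recommendation(candidate))
```

Keeping the thresholds in two small tables makes it easy to adjust them if further backports are released.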

For the upcoming 5.0 release, the OpenMPI project dropped the Flux plugins along with the abstraction layer that contained them. Two solutions are being discussed, neither of which is implemented yet: `Running PRRTE inside a Flux allocation <https://github.com/flux-framework/flux-core/issues/3539>`_ and `Implementing a PMIx job shell plugin <https://github.com/flux-framework/flux-core/issues/3536>`_.
1 change: 1 addition & 0 deletions spell.en.pws
Original file line number Diff line number Diff line change
Expand Up @@ -477,3 +477,4 @@ dmesg
eventlog
eventlogs
nodelist
backported