PMIx packaging #4072

Closed
rhc54 opened this issue Aug 10, 2017 · 26 comments
Assignees
Labels
question RTE Issue likely is in RTE or PMIx areas

Comments

@rhc54
Contributor

rhc54 commented Aug 10, 2017

@opoplawski @amckinstry @bmwiedemann

We have a growing number of packages integrating against PMIx. I know @opoplawski was packaging it separately at one point for use by OMPI, but may have stopped. I was wondering if you folks would be willing to begin distributing PMIx on its own so that SLURM, OMPI, MPICH, etc. can be built against it?

@rhc54 rhc54 added the question label Aug 10, 2017
@rhc54 rhc54 self-assigned this Aug 10, 2017
@bmwiedemann
Contributor

I think the general wish in distributions is to have smaller packages that can be updated individually, which also reduces code duplication. For example, in openSUSE we split a 300MB texlive package into many small packages in the past, and we got the 'poppler' package to handle PDF rendering, so that we no longer have to do security updates for xpdf and several other packages that previously kept their own copy.

When renaming or splitting rpm packages, you can use the Obsoletes and Provides lines in .spec files so that all other packages keep working.
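
As a rough illustration of the Obsoletes and Provides lines mentioned above, a renamed or split-out package might carry something like the following fragment. The package name `oldpkg` and the version bound are placeholders, not names taken from any real distribution.

```spec
# In the spec of the replacement package. "oldpkg" and the version bound
# are placeholders chosen for illustration only.
Provides:  oldpkg = %{version}-%{release}
Obsoletes: oldpkg < 2.0
```

With lines like these, packages that still require `oldpkg` can be satisfied by the new package, and the package manager can replace the old package on upgrade.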

@amckinstry

I'd have no problems with doing so.
I'll investigate further what needs to be done.

@rhc54
Contributor Author

rhc54 commented Aug 11, 2017

Thanks! Please let us know if there is anything we can do to help, or to make things easier.

Our only dependency is on libevent, and I have verified that we can build/run against both libevent 2.0.22 and 2.1.8.

@amckinstry

So. PMIx has been accepted into the Debian unstable archive, and builds ok on most archs (trivial build fix for Hurd required, patch to follow).
I've uploaded a version of openmpi-3.0 to Debian experimental which builds against the external PMIx. We also build against the external libevent (currently 2.1).

@rhc54
Contributor Author

rhc54 commented Nov 17, 2017

Thank you! Is there some mechanism by which you would like us to notify you of PMIx releases?

@opoplawski
Contributor

opoplawski commented Nov 17, 2017 via email

@nmorey
Contributor

nmorey commented Nov 20, 2017

The only issue I can foresee with PMIx being packaged separately is compatibility.
Between the 3 versions of OpenMPI we provide, MPICH and SLURM, I doubt they will all be using the same version...

@rhc54
Contributor Author

rhc54 commented Nov 20, 2017

The PMIx community recognized that problem about a year ago and began addressing it. Cross-version support is provided starting with the release of version 2.1, and extends backwards to v1.2.5 per the following rules:

  • v1.2.5, v2.0.2 => requires that the PMIx server be at or above the version of the client
  • v2.1 and above => any combination of client/server

We can support earlier versions, and relax the requirements on v1.2.5 and v2.0.2, but doing so requires that the user provide some parameters and/or configuration options, so we don't recommend it. These rules allow for automatic handshake resolution of the version differences.
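
The rules above can be sketched as a small shell check. This is only one reading of them: it assumes "any combination" applies when both sides are at v2.1 or above, and that otherwise the server must be at or above the client. The function names are made up for this example, and `sort -V` is assumed to be available (GNU coreutils provides it).

```shell
#!/bin/sh
# version_ge A B: succeed when version A is at or above version B.
# Relies on "sort -V" (version sort), which GNU coreutils provides.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# pmix_covered V: succeed when V falls under the automatic rules
# (v1.2.5 or later, and v2.0.2 or later within the 2.0 series).
pmix_covered() {
    case $1 in
        2.0|2.0.*) version_ge "$1" 2.0.2 ;;
        *)         version_ge "$1" 1.2.5 ;;
    esac
}

# pmix_compatible CLIENT SERVER: apply the rules as interpreted above.
pmix_compatible() {
    pmix_covered "$1" && pmix_covered "$2" || return 1
    # Both sides at v2.1 or above: any combination works.
    if version_ge "$1" 2.1 && version_ge "$2" 2.1; then
        return 0
    fi
    # Otherwise the server must be at or above the client's version.
    version_ge "$2" "$1"
}

# Example: a v1.2.5 client talking to a v2.1.1 server.
if pmix_compatible 1.2.5 2.1.1; then
    echo "client 1.2.5 / server 2.1.1: covered by the automatic rules"
fi
```

A packager could run a check like this over the PMIx versions used by each MPI and resource manager build to see which pairs would need the manual parameters.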

@amckinstry

For clarity: are there (now or expected to be) multiple implementations of pmix, or just pmi?

@rhc54
Contributor Author

rhc54 commented Nov 22, 2017

We are unaware of any plans to create an alternative implementation. The reference implementation includes the ability to use proprietary binary plugins to mitigate any need for someone to do so.

@rhc54
Contributor Author

rhc54 commented Dec 21, 2017

Question from the SLURM folks: starting at version 2.0, PMIx installs two backward compatibility libraries (libpmi and libpmi2) that translate the PMI-1 and PMI-2 calls to their PMIx equivalents. This allows applications that have hardcoded the name of the PMI library to still use libpmix where supported. There are some speed benefits in doing so, and people have asked for that option.

Unfortunately, SLURM's rpm also installs libraries of those names, so which implementation you get depends on the order of installation. SchedMD proposes that we both split the PMI-1 and PMI-2 libraries into a separate rpm (they called theirs "libpmi") with an explicit "conflict" directive against the other to avoid confusion. We thought perhaps ours would be named "pmix-compat" to avoid confusion with "libpmix", and would contain only our libpmi.* and libpmi2.*.

However, we only distribute an "all-in-one" source rpm and source tarballs, never binary, so I believe (and admit ignorance up front) that this split would have to be done by you folks?

Any thoughts/guidance on this would be much appreciated.
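
For illustration, the proposed split might look roughly like the spec fragment below. It assumes a main package named `pmix`, so that the subpackage comes out as `pmix-compat`; the summary text and file globs are guesses, and only the `libpmi` name comes from the SchedMD proposal described above.

```spec
# Sketch of a compat subpackage inside a spec whose main package is "pmix".
# "libpmi" is the SLURM-side package name mentioned above; everything else
# here (summary text, file globs) is an assumption for illustration.
%package compat
Summary:   PMI-1 and PMI-2 compatibility libraries implemented on top of PMIx
Requires:  %{name}%{?_isa} = %{version}-%{release}
Conflicts: libpmi

%description compat
libpmi and libpmi2 libraries that translate PMI-1 and PMI-2 calls to PMIx.

%files compat
%{_libdir}/libpmi.so*
%{_libdir}/libpmi2.so*
```

The `Conflicts` line is what keeps both implementations from being installed side by side, so the result no longer depends on installation order.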

@amckinstry

This problem came up within Debian. See the discussion at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882033

TLDR: It was solved by splitting the binary packages: libpmix2 ships libpmix.so.2* library;
libpmi2-pmix ships the pmix implementation of libpmi2, and the Debian alternatives system can be used to select which libpmi2 library gets used by default.

Similarly for the -dev libraries for include files and symlinks for *.so files.

@rhc54
Contributor Author

rhc54 commented Jan 2, 2018

Ah, okay - I read thru it, sounds like you have things well in hand. Please let us know if there is anything we can do to make life easier.

@rhc54
Contributor Author

rhc54 commented Jan 2, 2018

@koomie do you have a similar issue in OpenHPC?

@pkovacs
Contributor

pkovacs commented Jan 7, 2018

A problem tangential to the use of external pmix:

The pmix project allows you to configure it such that its headers can be installed anywhere you like. When a project contains more than a single header, it is desirable to be able to install those headers to a subdirectory of, e.g. /usr/include. The following is a perfectly reasonable configure for pmix:

# pmix configure
./configure --includedir=/usr/include/pmix ...

The above installs the pmix headers exactly as indicated, to /usr/include/pmix.

The problem is on the OMPI side: you cannot configure OMPI to find headers in the above directory because its configure script appends an unwanted /include, so

# ompi configure
./configure --with-pmix=/usr/include/pmix ...

looks in /usr/include/pmix/include, doesn't find the headers and cannot continue.

I would like to be able to configure OMPI to use the pmix headers in /usr/include/pmix.

The relevant OMPI m4 code is in opal_check_pmi.m4 and opal_check_withdir.m4.
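
To make the request concrete, here is a minimal shell sketch of the kind of header search described above. It is not the OMPI m4 code; the function name and the candidate order are assumptions for illustration.

```shell
#!/bin/sh
# find_pmix_include DIR: print the first candidate directory that holds pmix.h.
# The candidate order is an assumption for illustration, not OMPI's m4 logic.
find_pmix_include() {
    for candidate in "$1/include" "$1/include/pmix" "$1"; do
        if [ -f "$candidate/pmix.h" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example: accept the header directory itself as the argument.
find_pmix_include /usr/include/pmix || echo "pmix.h not found" >&2
```

Checking DIR itself as the last candidate is what would let a value such as /usr/include/pmix work without an extra /include being appended.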

@pkovacs
Contributor

pkovacs commented Jan 8, 2018

There are actually several problems with the m4 that searches for external pmix. In addition to the header problem I mentioned above, it also does not look in DIR/lib64 at all for libpmix, but only DIR/lib.

I am going to file a PR to correct these.
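
The DIR/lib64 gap can be illustrated the same way. The sketch below is not the m4 code either; the function name is made up, and trying lib64 before lib is an assumption.

```shell
#!/bin/sh
# find_pmix_libdir DIR: print the first of DIR/lib64, DIR/lib holding libpmix.
# Trying lib64 first is an assumption for illustration.
find_pmix_libdir() {
    for candidate in "$1/lib64" "$1/lib"; do
        # Match libpmix.so as well as versioned names such as libpmix.so.2.
        for lib in "$candidate"/libpmix.so*; do
            if [ -e "$lib" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Example: look under a /usr prefix.
find_pmix_libdir /usr || echo "libpmix not found under /usr" >&2
```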

@pkovacs
Contributor

pkovacs commented Jan 8, 2018

Looks like master already has code committed to fix the DIR/lib64 problem, so I just addressed the DIR/include/pmix issue with #4683.

@nmorey
Contributor

nmorey commented Mar 16, 2018

@rhc54 Is there any chance that openmpi 2.x will support pmix 2.1.x?
I started looking into packaging PMIx by itself for SUSE.
As we support both openmpi 2.X and 3.X, we would ideally need a version that can be used by both.
As PMIx 2.1 is compatible with any other PMIx version, it feels like the natural choice.

@rhc54
Contributor Author

rhc54 commented Mar 16, 2018

Just checked and the answer, unfortunately, is no - in fact, OMPI 2.x has a check in configure that will error out if a PMIx version greater than 1.x is provided.

We are working to avoid such disconnects in the future - we just didn't catch it in time for OMPI 2.x and cannot really go back to correct it. Is it possible for you to leave OMPI 2.x as a complete package (i.e., using the embedded PMIx), and have OMPI starting with v3 use the independent PMIx package? OMPI is about to release v3.1.0 - it has more flexibility on PMIx, but not looking upward (I think it accepts v2.x down to v1.2.5). We can look at updating that going forward.

@nmorey
Contributor

nmorey commented Mar 16, 2018

Yes we can do that. But openmpi2 is the one that will be in SLES15 and maintained for the next 7+ years so I would have liked to get it on the same track too

@rhc54
Contributor Author

rhc54 commented Mar 16, 2018

Really sorry - I can understand the concern. I have added this to the OMPI devel meeting next week so we can discuss how to make this better for packagers going forward:

https://github.com/open-mpi/ompi/wiki/Meeting-2018-03

@rhc54
Contributor Author

rhc54 commented Mar 16, 2018

I took a quick gander at this and think it can be resolved, but it would require a change and so it won't help with the existing 2.x/3.x releases already out there. Is that a show-stopper? Or should I go ahead and (a) see if the release managers would accept it, and if so (b) build the commit?

@rhc54
Contributor Author

rhc54 commented Mar 16, 2018

@opoplawski Quick question for you. I've been asked by a few folks about PMIx packages available from the various distros. IIRC, you were posting those to Fedora at one time, but I don't see anything past 1.2.2 on the CentOS site. Are you still doing them? Is there anything we can do to help with updated versions?

The key versions people want are 1.2.5, 2.0.3, and 2.1.1 as those are guaranteed to inter-operate, so we'd like to see those available if possible.

@pkovacs
Contributor

pkovacs commented Mar 16, 2018

@rhc54 Ralph I'm the Fedora guy for pmix (and slurm). I have pmix 2.1.0 in Fedora 28/29 and will be bumping to 2.1.1, perhaps this weekend. ompi in Fedora still uses internal pmix.

@rhc54
Contributor Author

rhc54 commented Mar 16, 2018

Oh - I didn't know that! Thanks! I think the container folks would like to see the OMPI rpm use an external PMIx, but we are going to discuss how we make that easier at the devel meeting next week anyway - will get back to you here after that discussion.

@rhc54 rhc54 added the RTE Issue likely is in RTE or PMIx areas label Jun 26, 2018
@rhc54
Contributor Author

rhc54 commented Oct 1, 2018

Closing this as we now have the packager mailing list.

6 participants