
Have configure abort if building for 32 bit #11282

Merged

Conversation

jsquyres
Member

@jsquyres jsquyres commented Jan 9, 2023

Per #11248, have configure abort if it detects that it is building in a 32-bit environment.

The intent here is to see -- especially after v5.0.0 has been released and gets real-world usage -- if anyone cares about 32-bit builds any more (we suspect that no one does). This PR is trivial to revert if someone cares about 32-bit builds.

If v5.0.x gets "enough" real-world usage and no one cares about 32-bit builds, we can start thinking about removing 32-bit infrastructure from within the Open MPI code base.
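
For illustration only, and not the actual diff in this PR: a 32-bit guard in an Autoconf-based configure.ac could be written roughly along these lines, using the standard AC_CHECK_SIZEOF and AC_MSG_ERROR macros.

    # Hypothetical sketch -- not the actual change in this PR.
    # Probe the pointer size of the build target and abort for 32-bit builds.
    AC_CHECK_SIZEOF([void *])
    AS_IF([test "$ac_cv_sizeof_void_p" -lt 8],
          [AC_MSG_ERROR([32-bit builds are no longer supported as of v5.0.0.
    Please build in a 64-bit environment, or see issue #11248 for the discussion.])])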

Per open-mpi#11248, have configure fail
if we're building for a 32-bit environment.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Also explicitly mention that 32-bit builds are no longer supported as
of v5.0.0.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
@jsquyres
Member Author

Brought this up on the 10 Jan 2023 webex: no one cared, and no one wanted to be a 32-bit maintainer.

Sent this to the devel list for a wider distribution. If no one steps up as maintainer by Fri, 13 Jan 2023, we'll merge this PR (and PR it to v5.0.x).

@wickberg

If it helps justify this, Slurm deprecated 32-bit support four years ago, and we've yet to hear of anyone complaining. Although we do provide a flag (--enable-deprecated) to bypass that check:

SchedMD/slurm@90b551f
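
For reference, an opt-in escape hatch like Slurm's --enable-deprecated flag could be wired into a configure.ac roughly as follows (a hedged sketch, not Slurm's actual code; only the option name is taken from the comment above):

    # Sketch: let users explicitly opt back in to deprecated 32-bit builds.
    AC_ARG_ENABLE([deprecated],
      [AS_HELP_STRING([--enable-deprecated],
                      [allow deprecated configurations such as 32-bit builds])],
      [enable_deprecated=$enableval], [enable_deprecated=no])
    AC_CHECK_SIZEOF([void *])
    AS_IF([test "$ac_cv_sizeof_void_p" -lt 8 && test "$enable_deprecated" != yes],
          [AC_MSG_ERROR([32-bit support is deprecated; use --enable-deprecated to build anyway])])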

@bertwesarg
Member

@wickberg does this mean you cannot submit i386 jobs with SLURM anymore? If not, I do not see the connection to this issue.

@jsquyres jsquyres marked this pull request as ready for review January 18, 2023 18:02
@jsquyres
Member Author

bot:aws-v2:retest

1 similar comment
@jsquyres
Member Author

bot:aws-v2:retest

@rhc54
Contributor

rhc54 commented Jan 19, 2023

If no one steps up as maintainer by Fri, 13 Jan 2023, we'll merge this PR (and PR it to v5.0.x).

So it is a week after your deadline - any conclusions?

@bwbarrett
Member

@rhc54 the conclusion appears to be that we're going to commit this change. I think CI brokenness and travel have just delayed clicking the button.

@jsquyres the CI all passes now, so you should be good to go.

@rhc54
Contributor

rhc54 commented Jan 19, 2023

Thanks - I have committed the PMIx version and backported it to the v4.2 branch for inclusion in the upcoming release.

@jsquyres jsquyres merged commit 5dc2398 into open-mpi:main Jan 21, 2023
@jsquyres jsquyres deleted the pr/disable-32-bit-builds-in-configure branch January 21, 2023 00:28
@glaubitz

glaubitz commented Feb 3, 2023

So, every 32-bit architecture here will then fail to build once openmpi 5.0.0 is uploaded to Debian unstable:

https://buildd.debian.org/status/package.php?p=openmpi&suite=sid

Since the package has several reverse dependencies, it will make those reverse dependencies unbuildable.

Thus, your change would break the following large list of packages on Debian unstable for 32-bit architectures:

glaubitz@suse-laptop:~> ssh mirror.ftp-master.debian.org "dak rm -Rn openmpi"
Will remove the following packages from unstable:

libopenmpi-dev |    4.1.4-3 | amd64, arm64, armel, armhf, i386, mips64el, mipsel, ppc64el, s390x
libopenmpi3 |    4.1.4-3 | amd64, arm64, armel, armhf, i386, mips64el, mipsel, ppc64el, s390x
   openmpi |    4.1.4-3 | source
openmpi-bin |    4.1.4-3 | amd64, arm64, armel, armhf, i386, mips64el, mipsel, ppc64el, s390x
openmpi-common |    4.1.4-3 | all
openmpi-doc |    4.1.4-3 | all

Maintainer: Alastair McKinstry <mckinstry@debian.org>

------------------- Reason -------------------

----------------------------------------------

Checking reverse dependencies...
# Broken Depends:
abinit: abinit
abyss: abyss [amd64 arm64 mips64el ppc64el s390x]
aces3: aces3
adios: libadios-bin
       python3-adios
ampliconnoise: ampliconnoise
apbs: apbs
arpack: libparpack2
boost1.74: libboost-graph-parallel1.74.0
           libboost-mpi-python1.74.0
           libboost-mpi1.74.0
boost1.81: libboost-graph-parallel1.81.0
           libboost-mpi-python1.81.0
           libboost-mpi1.81.0
bornagain: bornagain
           python3-bornagain
cctools: coop-computing-tools
code-saturne: code-saturne-bin
combblas: libcombblas1.16.0
cp2k: cp2k
deal.ii: libdeal.ii-9.4.0 [amd64 ppc64el s390x]
dolfin: libdolfin2019.2
        libdolfin64-2019.2
        python3-dolfin-real
        python3-dolfin64-real
dolfinx-mpc: libdolfinx-mpc0.5 [amd64 arm64 armel armhf i386 ppc64el s390x]
dune-grid: libdune-grid-dev
dune-pdelab: libdune-pdelab-dev [amd64 arm64 armel armhf i386 mips64el ppc64el s390x]
dune-uggrid: libdune-uggrid-dev
eckit: libeckit-dev [amd64 arm64 mips64el ppc64el s390x]
       libeckit0d [amd64 arm64 mips64el ppc64el s390x]
elkcode: elk-lapw
elpa: libelpa19
espresso: quantum-espresso [amd64 arm64 armhf i386 mips64el mipsel ppc64el s390x]
esys-particle: esys-particle
examl: examl
eztrace: libeztrace0 [amd64 arm64 armhf i386 mips64el ppc64el s390x]
fckit: libfckit0d [amd64 arm64 mips64el ppc64el s390x]
fenics-dolfinx: libdolfinx-complex0.5
                libdolfinx-real0.5
                python3-dolfinx-complex
                python3-dolfinx-real
fenicsx-performance-tests: fenicsx-performance-tests
ffindex: ffindex
fftw: fftw2
      sfftw2
fftw3: libfftw3-mpi3
fiat-ecmwf: fiat-utils [amd64 arm64 mips64el ppc64el s390x]
            libfiat-0 [amd64 arm64 mips64el ppc64el s390x]
form: form
freefem++: freefem++
           libfreefem++
garli: garli-mpi
genomicsdb: genomicsdb-tools [amd64 mips64el]
            libgenomicsdb-jni [amd64 mips64el]
            libgenomicsdb0 [amd64 mips64el]
gerris: gerris
        libgfs-1.3-2
getdp: getdp
       getdp-sparskit
gfsview: gfsview-batch
         libgfsgl0
gpaw: gpaw
gretl: gretl [amd64 arm64 s390x]
gromacs: gromacs [amd64 arm64 mips64el ppc64el s390x]
         libgromacs7 [amd64 arm64 mips64el ppc64el s390x]
gyoto: gyoto-bin
       yorick-gyoto
h5py: python3-h5py-mpi
hdf5: libhdf5-openmpi-103-1
      libhdf5-openmpi-dev
      libhdf5-openmpi-fortran-102
hpcc: hpcc
hyphy: hyphy-mpi
hypre: libhypre-2.26.0
       libhypre64-2.26.0 [amd64 arm64 mips64el ppc64el s390x]
       libhypre64m-2.26.0 [amd64 arm64 mips64el ppc64el s390x]
iqtree: iqtree [amd64 i386]
lammps: lammps
        liblammps0
libghemical: libghemical5v5 [amd64 arm64 i386 mips64el mipsel ppc64el s390x]
liggghts: libliggghts3
          liggghts
lrslib: mplrs
mathgl: libmgl-mpi8
med-fichier: libmed11
meep-mpi-default: libmeep-mpi-default30
meep-openmpi: libmeep-openmpi30
              meep-openmpi
molds: molds [amd64 arm64 armhf i386 mips64el ppc64el s390x]
mpb: mpb-mpi
mpgrafic: mpgrafic
mpi-defaults: mpi-default-bin
              mpi-default-dev
mpi4py: python3-mpi4py
mpqc: libsc-dev
      libsc7v5
      mpqc [amd64 arm64 i386 mips64el mipsel ppc64el s390x]
mrmpi: libmrmpi1
mshr: libmshr-dev [amd64 arm64 i386 mips64el ppc64el s390x]
      libmshr2019.2 [amd64 arm64 i386 mips64el ppc64el s390x]
      libmshr64-2019.2 [amd64 arm64 i386 mips64el ppc64el s390x]
mumps: libmumps-5.5
       libmumps-64pord-5.5
       libmumps-64pord-ptscotch-5.5
       libmumps-ptscotch-5.5
       mumps-test
murasaki: murasaki-mpi
music: libmusic1v5
       music-bin
       python3-music
netcdf-parallel: libnetcdf-mpi-19
                 libnetcdf-pnetcdf-19 [amd64 arm64 mips64el ppc64el s390x]
netgen: libnglib-6.2 [amd64 arm64 armel armhf i386 mips64el s390x]
        netgen [amd64 arm64 armel armhf i386 mips64el s390x]
netpipe: netpipe-openmpi [amd64 arm64 armel armhf i386 ppc64el]
neuron: neuron [amd64 arm64 armel armhf i386 ppc64el s390x]
        neuron-dev [amd64 arm64 armel armhf i386 ppc64el s390x]
nwchem: nwchem-openmpi
octave-mpi: octave-mpi
open-coarrays: libcaf-openmpi-3
               libcoarrays-openmpi-dev
openfoam: libopenfoam [amd64 arm64 armhf i386 mips64el mipsel ppc64el s390x]
openmx: openmx
opm-grid: libopm-grid [amd64 arm64 ppc64el]
          libopm-grid-bin [amd64 arm64 ppc64el]
opm-simulators: libopm-simulators [amd64 arm64 ppc64el]
                libopm-simulators-bin [amd64 arm64 ppc64el]
                python3-opm-simulators [amd64 arm64 ppc64el]
opm-upscaling: libopm-upscaling [amd64 arm64 ppc64el]
               libopm-upscaling-bin [amd64 arm64 ppc64el]
otf: otf-trace
p4est: libp4est-2.2 [amd64 arm64 mips64el ppc64el s390x]
       libp4est-sc-2.2 [amd64 arm64 mips64el ppc64el s390x]
palabos: libplb1
paraview: paraview [amd64 arm64 i386 ppc64el s390x]
          python3-paraview [amd64 arm64 i386 ppc64el s390x]
parmetis/non-free: libparmetis4.0
                   parmetis-test
petsc: libpetsc-complex3.18
       libpetsc-complex3.18-dbg
       libpetsc-complex3.18-dev
       libpetsc-real3.18
       libpetsc-real3.18-dbg
       libpetsc-real3.18-dev
       libpetsc64-complex3.18
       libpetsc64-complex3.18-dbg
       libpetsc64-complex3.18-dev
       libpetsc64-real3.18
       libpetsc64-real3.18-dbg
       libpetsc64-real3.18-dev
petsc4py: python3-petsc4py-64-complex3.18
          python3-petsc4py-64-real3.18
          python3-petsc4py-complex3.18
          python3-petsc4py-real3.18
phyml: phyml
pnetcdf: libpnetcdf0d [amd64 arm64 mips64el ppc64el s390x]
         pnetcdf-bin [amd64 arm64 mips64el ppc64el s390x]
prime-phylo: prime-phylo
purify: purify [amd64 arm64 armel armhf i386 mipsel ppc64el s390x]
pyhst2/contrib: python3-pyhst2-cuda [amd64]
pysph: python3-pysph [amd64 arm64]
python-escript: python3-escript-mpi [amd64 arm64 armel armhf i386 mipsel ppc64el s390x]
pyzoltan: python3-pyzoltan [amd64 arm64 ppc64el s390x]
r-cran-metamix: r-cran-metamix
ray: ray
relion: relion
relion-cuda/contrib: relion-cuda [amd64]
                     relion-gui-cuda [amd64]
rheolef: librheolef1 [amd64 arm64 i386 mips64el mipsel ppc64el s390x]
         rheolef [amd64 arm64 i386 mips64el mipsel ppc64el s390x]
rmpi: r-cran-rmpi
ruby-mpi: ruby-mpi
scalapack: libscalapack-openmpi2.2
           scalapack-mpi-test
scotch: libptscotch-7.0
        ptscotch
silo-llnl: libsiloh5-0
slepc: libslepc-complex3.18
       libslepc-real3.18
       libslepc64-complex3.18
       libslepc64-real3.18
slepc4py: python3-slepc4py-64-complex3.18
          python3-slepc4py-64-real3.18
          python3-slepc4py-complex3.18
          python3-slepc4py-real3.18
sopt: libsopt-dev
      libsopt3.0
spfft: libspfft1
spooles: libspooles2.2
starpu: libstarpumpi-1.3-3
        starpu-examples
starpu-contrib/contrib: libstarpu-contribmpi-1.3-3 [amd64]
                        starpu-contrib-examples [amd64]
stopt: libstopt-dev
       libstopt5
       python3-stopt
sundials: libsundials-nvecparallel-hypre6
          libsundials-nvecparallel-mpi6
          libsundials-nvecparallel-petsc6
superlu-dist: libsuperlu-dist-dev
              libsuperlu-dist8
tachyon: libtachyon-openmpi-0
         libtachyon-openmpi-0-dev
tree-puzzle: tree-ppuzzle
trilinos: libtrilinos-amesos-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-amesos2-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-anasazi-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-aztecoo-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-belos-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-epetra-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-epetraext-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-galeri-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-ifpack-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-ifpack2-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-isorropia-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-ml-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-moertel-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-muelu-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-nox-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-pamgen-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-phalanx-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-pike-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-piro-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-pliris-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-rtop-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-rythmos-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-stokhos-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-stratimikos-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-teko-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-teuchos-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-thyra-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-tpetra-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-trilinoscouplings-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-triutils-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-xpetra-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-zoltan-13.2 [amd64 arm64 ppc64el s390x]
          libtrilinos-zoltan2-13.2 [amd64 arm64 ppc64el s390x]
vtk6: libvtk6.3
vtk7: libvtk7.1p
vtk9: libvtk9.1
      python3-vtk9
yade: libyade [amd64 i386 s390x]
yorick: yorick-mpy-openmpi [amd64 arm64 armel armhf i386 mipsel ppc64el s390x]

# Broken Build-Depends:
abyss: libopenmpi-dev
adios: libopenmpi-dev
       openmpi-bin
aevol: libopenmpi-dev
amgcl/contrib: libopenmpi-dev
ampliconnoise: libopenmpi-dev
armci-mpi: libopenmpi-dev
atlas-ecmwf: libopenmpi-dev
bitshuffle: openmpi-bin
code-saturne: libopenmpi-dev
ectrans: libopenmpi-dev
examl: libopenmpi-dev
ffindex: libopenmpi-dev
         openmpi-bin
ga: libopenmpi-dev
garli: libopenmpi-dev
genomicsdb: libopenmpi-dev
hdf5: libopenmpi-dev
iitii: libopenmpi-dev
meep-openmpi: libopenmpi-dev
              openmpi-bin
meson: openmpi-bin
mpi-defaults: libopenmpi-dev (1.4.3-2.1 >=)
              openmpi-bin (1.4.3-2.1 >=)
murasaki: libopenmpi-dev
music: libopenmpi-dev
netgen: libopenmpi-dev
netpipe: libopenmpi-dev
nwchem: libopenmpi-dev
open-coarrays: libopenmpi-dev
phyml: libopenmpi-dev
prime-phylo: libopenmpi-dev
python-escript: libopenmpi-dev
                openmpi-bin
r-cran-metamix: libopenmpi-dev
relion: libopenmpi-dev
relion-cuda/contrib: libopenmpi-dev
scalapack: libopenmpi-dev
sopt: libopenmpi-dev
tachyon: libopenmpi-dev
tree-puzzle: libopenmpi-dev
trilinos: libopenmpi-dev
          openmpi-bin
yorick: libopenmpi-dev

Dependency problem found.

glaubitz@suse-laptop:~>

Not sure whether you really want to do that.

@rhc54
Contributor

rhc54 commented Feb 3, 2023

So just to be clear: you are advocating that every software package maintain 32-bit support for all eternity - even if nobody is asking for it? Otherwise, this isn't a discussion of 32-bit or no-32-bit - it is simply a question of when the backward breakage occurs.

@glaubitz

glaubitz commented Feb 3, 2023

So just to be clear: you are advocating that every software package maintain 32-bit support for all eternity - even if nobody is asking for it? Otherwise, this isn't a discussion of 32-bit or no-32-bit - it is simply a question of when the backward breakage occurs.

No, I am not advocating for that. I am advocating against breaking software intentionally like what this PR does.

There is a difference between »we need to drop this code because it puts a big maintenance burden on us« and »I don't want you to use my software on target system XYZ«.

@rhc54
Contributor

rhc54 commented Feb 3, 2023

Okay - not trying to be confrontational about this, but the reason we dropped it is the first one. We are a volunteer organization, and we have no volunteers willing to maintain 32-bit support. We know it is broken. So I'm curious to know how you suggest we proceed?

We could, I guess, not release OMPI v5, or we could advise Debian not to include it any more. Neither of those seems satisfactory or helpful to the general community.

Given that 32-bit support will not be fixed, how then would you suggest we resolve the dilemma? Letting people try to build it - and fail - seems counterproductive. Not sure what the Debian maintainer would/could do with that situation.

Any other suggestions would be welcome.

@glaubitz

glaubitz commented Feb 3, 2023

If you can let us know how it's broken, we're happy to help in fixing it. There are several downstream projects and developers that still work on 32-bit targets, even more exotic ones such as hppa or m68k. Most of these people are volunteers, too.

In most cases, developers will just send patches to the corresponding upstream projects if they run into any issues. I know that both Debian and Gentoo developers are doing that regularly for many projects.

You don't have to spend any time or effort in maintaining 32-bit support. However, if someone is sending you a patch to fix an issue on a 32-bit target, it would be nice if you could accept it.

FWIW, Debian is even working on rebootstrapping 32-bit ARM with 64-bit timestamp support, so I expect there will be enough testing and development on 32-bit targets for the foreseeable future.

@rhc54
Contributor

rhc54 commented Feb 3, 2023

We can discuss it - no promises on what we'll decide. All sounds kinda strange, and it would mean putting release timelines outside our own control, so that has to be considered as well.

@jsquyres
Member Author

jsquyres commented Feb 6, 2023

@glaubitz Can you clarify something for me? I have no doubt that there are valid 32-bit use cases of general computing in a variety of environments. My question is: do these use cases involve Open MPI and/or HPC-class applications? I.e., do you have citeable cases where users are using Open MPI for computation in 32 bit-only environments?

I'm not asking to be snarky; I'm asking because HPC applications that are used for academic, scientific, and/or commercial purposes tend to require a lot of compute power, and therefore tend to stay within the last few years of computing hardware. The transition to 64 bit in commodity hardware happened ~20 years ago (per https://en.wikipedia.org/wiki/64-bit_computing); supporting >= 20-year-old hardware is pretty far outside of the envelope of what we typically try to support with Open MPI.

We did conduct an informal search to see if we could find any academic, scientific, or commercial users who still use Open MPI in a 32-bit environment, and failed to find any. It certainly wasn't a comprehensive search, and perhaps there is some actual real-world usage of 32-bit MPI out there. If we could find out exactly who those users are and what their use cases are, that would be most helpful to the discussion. Thanks!

@hjelmn
Member

hjelmn commented Feb 6, 2023

Figured I would join in on this discussion despite not being that active at this time.

32-bit support is not in any way simple to maintain, even if there are incoming patches. We have components throughout the code base with separate code to handle these archaic architectures. Ideally, we want to clean them all out because modern HPC has long since moved on to 64-bit. Even if there are patches coming in, I don't think it is feasible long term to maintain the support. i386, ppc32, arm32, etc. are just not worth the effort if 99.999% of usage is on 64-bit.

Also, and I know I have said this a lot, if Open MPI 5.0.0 and beyond no longer support 32-bit, it doesn't mean there is not an Open MPI version that will run on these architectures. Yes, 32-bit systems will not be able to use new MPI features, but that is to be expected when using something old. Maybe, to keep a working version around for a bit, it would be worth taking new patches into 4.1.x for 32-bit support.

@glaubitz

glaubitz commented Feb 6, 2023 via email

@glaubitz

glaubitz commented Feb 6, 2023 via email

@rhc54
Contributor

rhc54 commented Feb 6, 2023

I believe what this all boils down to is what I said in the beginning - this isn't a question of if we discontinue 32-bit support, but rather when we discontinue it. We can't maintain it forever as that is simply an unrealistic expectation, even if there are outside people willing to contribute patches. As Nathan has indicated, the code just gets too messy.

By our own rules, any discontinuation must occur at a major version change. So the question we have to decide is: do we discontinue it at v5? Is there a compelling reason to wait for v6? At the moment, the decision was made to do it for v5, but we'll discuss that again.

As I said before, Linux distributions usually don’t work like that. We cannot ship different package versions for each architecture.

Not sure I fully understand this statement. I can download different versions of OMPI for Debian if I want to - been doing that for quite some time, though I confess to mostly using other distros (and so perhaps Debian has limitations I'm unfamiliar with). I do agree this is true for the default version, but that isn't relevant to this discussion.

Within one distribution version, a package version is the same across all architectures.

This seems like a rather odd and somewhat arbitrary constraint. I'm not sure how much weight we should give it in our decision process, but we'll see what people think.

@rhc54
Contributor

rhc54 commented Feb 6, 2023

Just for grins, I went to the Debian official package site and found the following OMPI versions available:

Package libopenmpi-dev

    [stretch (oldoldstable)](https://packages.debian.org/stretch/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    2.0.2-2: amd64 arm64 armel armhf i386 mips mips64el mipsel ppc64el s390x
    [buster (oldstable)](https://packages.debian.org/buster/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    3.1.3-11: amd64 arm64 armel armhf i386 mips mips64el mipsel ppc64el s390x
    [bullseye (stable)](https://packages.debian.org/bullseye/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    4.1.0-10: amd64 arm64 armel armhf i386 mips64el mipsel ppc64el s390x
    [bookworm (testing)](https://packages.debian.org/bookworm/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    4.1.4-3: amd64 arm64 armel armhf i386 mips64el mipsel ppc64el s390x
    [sid (unstable)](https://packages.debian.org/sid/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    4.1.4-3: alpha amd64 arm64 armel armhf hppa i386 ia64 m68k mips64el mipsel ppc64 ppc64el riscv64 s390x sh4 sparc64 x32

So one could literally go back to install the old OMPI v2.0.2 release, if so inclined, for any architecture. I therefore don't see the issue with Nathan's proposal - seems like Debian supports it. Agreed, someone interested in 32-bit would need to look for the older versions, but that seems appropriate given the status of 32-bits in general.

It does, though, have the secondary impact of constraining such users to the older versions of the dependent packages. Not sure that is an OMPI responsibility, though something worth pondering.

@jsquyres
Member Author

jsquyres commented Feb 6, 2023

As outlined earlier in this discussion, the problem is the reverse dependencies. Distributions usually do not ship architecture-specific versions of a package; all supported architectures ship the same version in their rolling release/unstable distributions. If openmpi did stop working on 32-bit targets, it would mean that the reverse dependencies eventually become unbuildable due to unsatisfied build dependencies, because they build-depend on the development files of OpenMPI.

Understood. But at least a bunch of those packages with build dependencies on Open MPI are:

  1. HPC/MPI applications.
    • Any package that requires libopenmpi-dev probably needs mpi.h, mpif.h, and/or the Fortran mpi or mpi_f08 modules.
    • That means that they're MPI/HPC applications, and that's the open question: does anyone really run these in 32-bit environments any more?
  2. Packages that provide support for HPC/MPI applications
    • For example, boost is a generalized C++ API library, but it has a section of its API explicitly for wrapping the MPI API in the boost abstractions.
    • MPI support in boost can simply be turned off, and then the rest of the boost library still works just fine.
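
As a concrete, hedged illustration of that second point: with Boost's own build tool, the MPI wrapper is opt-in and can simply be skipped; exact invocations vary by Boost version and by how a distribution packages it.

    # Boost.MPI is only built if "using mpi ;" is added to user-config.jam;
    # it can also be excluded explicitly when invoking b2:
    ./bootstrap.sh
    ./b2 --without-mpi stage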

While I don't know every single one of the packages that you cite in your list, I'd be willing to bet that the majority of them are either MPI applications or HPC-related applications for which it's not clear that there's any use case in 32-bit any more. Put differently: do people really have clusters of 32-bit environments where they run HPC applications any more?

I guess it boils down to: are there still real-world use cases for HPC/MPI in 32-bit platforms? Or is this just a generalized desire to keep the support matrix for all the packages as large as possible?

Out of curiosity: What architecture-specific code are you shipping to support 32-bit PowerPC, for example?

Is there any inline assembly being used?

Yes.

In some of the gnarliest bits of the code, we have some assembly where the standardized libraries do not provide high enough performance. This is what High Performance Computing is all about, after all. 😄

@hjelmn and others can speak in more detail about these parts of the code base than I can. @hjelmn's point above is not incorrect: 32 bit support is not a trivial thing. If none of our users are using it, it's hard to justify the effort to support it.

As I said before, Linux distributions usually don’t work like that. We cannot ship different package versions for each architecture.

Different package versions are released in different distribution versions. Within one distribution version, a package version is the same across all architectures.

Someone recently replied on the devel mailing list that they would be willing to support 32 bit -- we'll see where that discussion goes. It is possible that 32 bit will become supported in v5.0.x.

@amckinstry

For Debian, we currently support OpenMPI and MPICH as MPI implementations.

As @glaubitz points out, the issue is reverse-deps and the fact that we implement only one version of a package in a distribution (release). Disabling 32-bit means either (1) dropping the reverse-dep or MPI support for that package (conditionally for 32-bit only), or (2) switching MPI implementation for that arch.
Note that we have code support in Debian to have different default MPI distributions per arch. It just so happens that OpenMPI is currently the default for all archs.

So if 32-bit is removed for OMPI 5.0, we will probably move to MPICH as the default for i386, etc.
It's not possible to have two OpenMPIs in Debian at the same time, and the dependency structure means that while 4.1.4 binaries for i386/arm etc. are present in the archive to download, dependencies and reverse-deps will have changed, so it won't really be possible to use 4.1.4 in a Debian >= 13 system.
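
As a purely hypothetical sketch of the per-architecture default mentioned above (not Debian's actual mpi-defaults machinery), the selection could look something like this in a packaging script, assuming dpkg-architecture is available:

    # Hypothetical: choose the default MPI implementation per architecture.
    arch="$(dpkg-architecture -qDEB_HOST_ARCH)"
    case "$arch" in
        amd64|arm64|ppc64el|s390x|mips64el|riscv64)
            default_mpi=openmpi ;;
        *)
            # 32-bit architectures fall back to an implementation that still supports them.
            default_mpi=mpich ;;
    esac
    echo "Default MPI for $arch: $default_mpi"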

One possibility is introducing a "new OpenMPI4" package that is just built for 32-bit, i.e. having 3 implementations (OMPI, MPICH, OMPI4) as different packages providing libmpi-dev etc. This is technically possible; it depends on
(1) security support for OMPI4 continuing
(2) dep package support continuing (e.g. will pmix etc. continue for 32-bit)?

@coldtobi

coldtobi commented Feb 7, 2023

Just for grins, I went to the Debian official package site and found the following OMPI versions available:

Package libopenmpi-dev

    [stretch (oldoldstable)](https://packages.debian.org/stretch/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    2.0.2-2: amd64 arm64 armel armhf i386 mips mips64el mipsel ppc64el s390x
    [buster (oldstable)](https://packages.debian.org/buster/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    3.1.3-11: amd64 arm64 armel armhf i386 mips mips64el mipsel ppc64el s390x
    [bullseye (stable)](https://packages.debian.org/bullseye/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    4.1.0-10: amd64 arm64 armel armhf i386 mips64el mipsel ppc64el s390x
    [bookworm (testing)](https://packages.debian.org/bookworm/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    4.1.4-3: amd64 arm64 armel armhf i386 mips64el mipsel ppc64el s390x
    [sid (unstable)](https://packages.debian.org/sid/libopenmpi-dev) (libdevel): high performance message passing library -- header files
    4.1.4-3: alpha amd64 arm64 armel armhf hppa i386 ia64 m68k mips64el mipsel ppc64 ppc64el riscv64 s390x sh4 sparc64 x32

So one could literally go back to install the old OMPI v2.0.2 release, if so inclined, for any architecture. I therefore don't see the issue with Nathan's proposal - seems like Debian supports it. Agreed, someone interested in 32-bit would need to look for the older versions, but that seems appropriate given the status of 32-bits in general.

Those are all packages from different releases. Stretch is Debian 9, Buster is Debian 10, Bullseye is Debian 11. You cannot (easily) mix packages from different releases.

And it doesn't help for the reverse dependencies, as packages must be built against the packages in the target release.
You cannot (as in "it is not available on the machines building the packages") compile a package for the next release (Debian 12, Bookworm, coming this summer) with a package from Debian 8, 9, 10, 11…

@ggouaillardet
Contributor

ggouaillardet commented Feb 7, 2023

@amckinstry What is Debian's plan for MPICH w.r.t. PMIx?

If mpich is built on top of PMIx (this is definitely an option, but I cannot remember if this is mandatory or only optional) and PMIx drops 32-bit support as well, does it mean a bumpy road for MPI on 32-bit systems?

@amckinstry

@ggouaillardet Yes, if pmix goes 64-bit only, it complicates matters. We do arch-dependent package and configuration deps already (e.g. ucx is only available on [amd64 arm64 ppc64el], but mpi is built with it on those).
Pmix is definitely optional in MPICH, not used in the ch4 devices I think, only ch3.

@rhc54
Contributor

rhc54 commented Feb 7, 2023

FWIW: PMIx has never made any effort to have 32-bit support, so I'm rather surprised to hear folks implying that earlier versions supported it. So far as I know, there has been no meaningful change that would have impacted 32 vs 64 bit, so either (a) it never supported 32-bit builds, or (b) it was inadvertently broken at some point.

Either way, I'm curious about Debian's plans for the future. Are you saying you must support 32-bit operations for all eternity? So when we switch the norm from today's 64-bit machines to tomorrow's 128-bit machines, you still plan to require everyone to support 32-bit?

If not, then again, this is a question of when and not if we drop 32-bit support. So what drives the decision point?

@rhc54
Contributor

rhc54 commented Feb 7, 2023

pmix is definitely optional in MPICH, not used in the ch4 devices I think, only ch3.

It's the other way around - ch4 can operate with the old PMI-1, but is limited by it. Of course, for an old 32-bit arch, it won't matter - you are already pretty constrained.

@amckinstry

@rhc54 I don't think Debian will support 32-bit "for eternity", and indeed I've had to drop 32-bit support for a bunch of numerical packages that have ceased to care about non-64 bit.

MPI is a more complex beast, as is shown by the reverse-dep list @glaubitz posted. There are a bunch of libraries that use MPI that would cause widespread breakage if 32-bit MPI were gone.

My current thinking is that we'd move to MPICH (either per-arch or as the default) to retain 32-bit compatibility;
when we drop it would probably be driven by security support: i.e. I could create an OpenMPI4 package stuck at 4.1.x to hold onto 32-bit support (and have the normal OpenMPI progress to 5, 6, ...) but only for as long as there was upstream 4.x support for potential security bugs.

@jsquyres
Member Author

jsquyres commented Feb 7, 2023

@amckinstry What's your thought on how many users actually run MPI-based apps in 32-bit environments? The list that @glaubitz posted included at least a fair number of MPI applications -- would it really impact any real users if those MPI applications became unavailable in 32-bit environments?

Put differently: yes, it looks like a long, scary list of applications that would disappear if 32-bit MPI disappears. But I think the operative question is: given the typical environments where HPC/MPI applications are run, would anyone notice? (that's a genuine question, not a snarky question)

@rhc54
Contributor

rhc54 commented Feb 7, 2023

I don't think Debian will support 32-bit "for eternity", and indeed I've had to drop 32-bit support for a bunch of numerical packages that have ceased to care about non-64 bit.

Not surprising - and I'm not trying to be snarky here either. Just trying to figure out if this whole discussion is based on user demand, or simple inertia. I think the questions from @jsquyres directly address this.

Just to be clear, since it is PMIx that is getting dragged over the coals here. We (PMIx) have never made any conscious effort to support 32-bit (as nobody has ever asked for that environment), but we also have no philosophical issue with it. We just never cared since (to our knowledge) nobody uses it there. If someone wants to dive in and make it work, fine - we just don't have anyone interested/willing to do so.

I've raised the question of whether OMPI itself even supports 32-bit today, and nobody can say for certain either way. The only reason PMIx is getting raised here is because it is the first compile point that raises the flag - so we cannot get to the rest of the OMPI code to see if it is likewise broken for 32-bit. Ditto for PRRTE (which is also part of OMPI v5).

@amckinstry

@jsquyres It's a good question: I doubt there's anyone seriously doing HPC on 32-bit; I'd like to hear otherwise.
The exception worth thinking about is/was the "x32" arch: this is an ABI with 32-bit integers, pointers, etc., but on x86_64 hardware, for a smaller cache footprint and smaller executables. You get more registers and cache-friendlier code if your task fits inside 4 GB; if not, you just run amd64 on the same hardware. But I don't see much effort in the x32 port.

It's mostly a complexity issue: the scary list is not just applications but libraries. Reworking them all to handle MPI/non-MPI builds is significant breakage. So it's mostly inertia behind avoiding this.

@rhc54
Contributor

rhc54 commented Feb 7, 2023

FWIW: I have released PMIx v4.2.3, which is the PMIx being discussed here (it is the minimal level required by OMPI v5):

https://github.com/openpmix/openpmix/releases/tag/v4.2.3

I removed the configure "abort if 32-bit" check so PMIx (a) doesn't stand in the way of this conversation, and (b) doesn't get blamed for OMPI losing its "default" status on Debian. Note that the code still doesn't build as 32-bit, so all we've done is move the failure from the "configure" step to the "make" step, which doesn't feel quite right to me...but at least gets me out of the middle of this debate. 😄

If someone wants to work on enabling 32-bit in PMIx, please feel free to do so and submit pull requests. Note that it may be a few months before the next point release, barring someone making it an urgent priority (cookies and other goodies help with such requests).

@ggouaillardet
Contributor

@rhc54 I might have chosen my words poorly and I apologize if you felt "dragged" into this conversation and/or you thought I was somehow trying to make PMIx the scapegoat here.

I was simply trying to point out a potential issue that could impact the two major MPI implementations on 32-bit systems, in order to emphasize that the important question is (as you already elaborated on) "when" (Open) MPI will no longer be available on upcoming 32-bit systems. I was not trying to discuss the "if", and I had zero intent to point fingers at "why/who should be blamed".

@rhc54
Contributor

rhc54 commented Feb 8, 2023

My apologies, Gilles - I wasn't pointing back to you or anyone. I was only explaining (in a somewhat tongue-in-cheek manner) that I don't want PMIx to "block" OMPI 32-bit support with its configure settings. Barring someone actually resolving the 32-bit warnings Jeff reported, we still won't be able to cleanly compile - which means OMPI can't do so either - but that is a separate question.

@jsquyres
Member Author

jsquyres commented Feb 8, 2023

FWIW: there is a user on the devel list who is looking into submitting a PR to PMIx for the 32 bit issues: https://www.mail-archive.com/devel@lists.open-mpi.org/msg21447.html

@rhc54
Contributor

rhc54 commented Feb 8, 2023

FWIW: I reverted the 32-bit configure abort in the PMIx master branch to make it easier for people to work on those issues: openpmix/openpmix#2960

@jsquyres
Member Author

jsquyres commented Feb 8, 2023

FWIW: I reverted the 32-bit configure abort in the PMIx master branch to make it easier for people to work on those issues: openpmix/openpmix#2960

I actually referred to your comment about that (above) in the thread on the devel mailing list 😄 https://www.mail-archive.com/devel@lists.open-mpi.org/msg21457.html

@rhc54
Contributor

rhc54 commented Feb 8, 2023

Yeah, but my earlier comment was me removing it from the v4.2 release - not from the master branch. 😜
