Introduce a notion of toolchain-neutral software #570

Open
gribozavr opened this issue Mar 26, 2013 · 17 comments

@gribozavr
Contributor

It does not make much sense to build many variants of tools that don't have a public ABI, for example CMake, Subversion, Git, Mercurial. These packages can be built once and used with any toolchain.

Currently there is a 'dummy-dummy' toolchain that allows this, but only if no other package depends on such toolchain-neutral software. If there is a dependency, a special easyconfig has to be created and built.

A simple solution is to allow easyconfigs to be marked toolchain-neutral, and to allow these packages to satisfy dependencies of software that is being built with a non-dummy toolchain.
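
To make the proposal concrete, here is a minimal Python sketch of how dependency resolution could treat such a marker. The `toolchain_neutral` flag, the `Build` class and `resolve_dependency` are hypothetical names invented for illustration, not part of EasyBuild, and the package versions and toolchain names are only examples.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Build:
    """An installed build of a package (hypothetical model, not EasyBuild's own data structures)."""
    name: str
    version: str
    toolchain: str                   # e.g. "ictce-4.0.6" or "dummy-dummy" (illustrative values)
    toolchain_neutral: bool = False  # hypothetical marker proposed in this issue


def resolve_dependency(name: str, version: str, toolchain: str,
                       installed: Iterable[Build]) -> Optional[Build]:
    """Prefer a build made with the requested toolchain; otherwise accept a toolchain-neutral one."""
    candidates = [b for b in installed if b.name == name and b.version == version]
    # First choice: a build made with exactly the toolchain of the software being built.
    for build in candidates:
        if build.toolchain == toolchain:
            return build
    # Fallback: any build explicitly marked as toolchain-neutral.
    for build in candidates:
        if build.toolchain_neutral:
            return build
    return None


installed = [
    Build("CMake", "2.8.4", "dummy-dummy", toolchain_neutral=True),
    Build("zlib", "1.2.7", "goalf-1.1.0"),
]
```

Under these assumptions, the CMake build satisfies a CMake dependency for any toolchain, while the zlib build only matches requests made for its own toolchain.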

@boegel
Member

boegel commented Mar 26, 2013

There are a couple of caveats here, mainly the compilers/libs used to build this toolchain-neutral (tc-n) software.

To avoid problems when a certain toolchain is used in combination with the tc-n software, the latter has to be fully statically linked.

The request seems valuable though...

@gribozavr
Contributor Author

Or it can be built with rpath.

@fgeorgatos
Collaborator

FYI: xbesseron has also been suggesting/promoting going in the direction of this issue.

If we do so, I think I'd favor promoting static linking.

Why? Dynamic linking can be a complicated business in itself, even when using rpath:
http://gcc.gnu.org/ml/gcc-help/2008-06/msg00118.html # "8 ways to leave your linker"
(after all, rpath only pins directories, not individual library versions or the functions they provide!).

I.e. my concern is that the dynamic aspect may become an annoyance...
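
One way to see which of the two approaches a given binary ended up with is to look at its dynamic section. The sketch below parses text in the format printed by `readelf -d` and collects the `NEEDED`, `RPATH` and `RUNPATH` entries; the function name and the sample input are made up for illustration, and a real check would feed it the actual `readelf` output for the binary in question.

```python
import re

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
#  0x000000000000000f (RPATH)    Library rpath: [/opt/tools/lib]
ENTRY = re.compile(r"\((NEEDED|RPATH|RUNPATH)\)\s+[^\[]*\[([^\]]*)\]")


def summarize_dynamic_section(readelf_output):
    """Collect shared-library and run-path entries from `readelf -d` style text."""
    summary = {"NEEDED": [], "RPATH": [], "RUNPATH": []}
    for line in readelf_output.splitlines():
        match = ENTRY.search(line)
        if not match:
            continue
        tag, value = match.groups()
        if tag == "NEEDED":
            summary[tag].append(value)
        else:
            # Run paths are colon-separated lists of directories.
            summary[tag].extend(part for part in value.split(":") if part)
    return summary


# Illustrative input only, not taken from a real binary.
sample = """
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [/opt/tools/lib:/opt/tools/lib64]
"""
summary = summarize_dynamic_section(sample)
```

A fully statically linked binary has no dynamic section at all, so every list comes back empty. Note that this only confirms the point made above: an rpath entry names directories to search, not the specific library versions that will be found there.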

@gribozavr
Contributor Author

We are talking about linking against OS-provided versions of libraries. These libraries should have a stable ABI, so I don't see an issue here yet.

@stdweird
Contributor

imho toolchain-neutral software is only needed for build requirements of (sub)toolchains, nothing else. the big issue with those is that OS dependencies are a pain to define / determine, and regardless of the ABI, if the OS version is too old, you are screwed.
i do not see it as an issue to rebuild those with a toolchain, even if the rebuilds are maybe not strictly required. the fact that you don't have to care about static/dynamic linking etc. is more than enough reason to just rebuild them. after all, it's just a few extra modules and some extra storage; in return you have worry-free tools to offer to your users.

@fgeorgatos
Collaborator

+1 to stdweird's latest comment: I favor the "worry-free", "debugless" approach in terms of people's man-hours;
a full rebuild imposes only a slightly higher expense of system time/space (and with future tools like Lmod it makes plenty of sense to do it anyhow).

@fgeorgatos
Collaborator

what is valuable to keep promoting as part of this issue is the notion that we don't really need to bother with providing, say, a2ps for 4 different toolchains; although I'm the guy creating all those, I'd be the first to admit that it is quite overkill.

We may still have reasons to provide the different builds, but we should consolidate the easyconfigs, at least.

@fgeorgatos
Collaborator

this is just to confirm that there is good merit in this issue, and I expect it to pop up during the next hackathon, if debuggers & some performance tools are to be discussed. In particular, members of this bundle could be handled differently:
https://github.com/hpcugent/easybuild-easyconfigs/tree/master/easybuild/easyconfigs/h/HPCBIOS_Debuggers

the reproducibility argument is still there, which could be something like a hash in a versionsuffix!
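
As a rough illustration of the "hash in a versionsuffix" idea, the Python sketch below derives a short suffix from the list of build dependencies. The function name, the choice of SHA-256, the 8-character length and the example packages are all assumptions made for this sketch; none of this is an existing EasyBuild feature.

```python
import hashlib


def builddeps_versionsuffix(builddeps, digest_len=8):
    """Return a short, order-independent suffix derived from (name, version) pairs."""
    # Sort so that the same set of build dependencies always yields the same suffix.
    canonical = ";".join(f"{name}={version}" for name, version in sorted(builddeps))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "-" + digest[:digest_len]


# Example values are illustrative only.
suffix = builddeps_versionsuffix([("GCC", "4.8.2"), ("CMake", "2.8.12")])
```

The suffix changes whenever any build dependency changes, which records what was used for the build without spelling out every dependency in the module name; the price is that the name itself is no longer human-readable.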

Having said that, I am not very convinced of the purpose of commit 05deccc (i.e. introducing toolchains for Debuggers and Profilers), since only the tools that fiddle with the MPI stack itself (e.g. Scalasca) should have dynamic library dependencies on toolchains etc. I've never been convinced, really ;-)

As proof of this, notice that DDT and TotalView are statically linked (and probably that is the real objective).

@geimer
Contributor

geimer commented Feb 20, 2014

I really support the idea of providing at least the option to create toolchain-neutral software. There are certainly good reasons for going one or the other route, but in the end it should be the admin's choice...

@Bart-Ver

On 20/02/14 23:27, geimer wrote:

I really support the idea of providing at least the /option/ to create
toolchain-neutral software. There are certainly good reasons for going
one or the other route, but in the end it should be the admin's choice...

That is true if there is only one admin.
We have more than 5 people installing software. EB is sometimes a little
strict, but it is consistent. The more choice my colleagues and I have,
the more it will become a mess (again).

Just my humble opinion,
Bart



Dr. Bart Verleye
Centre for e-Research
Level G, Room 409-G21
24 SYMONDS ST
Auckland 1010
New Zealand
+64 (0) 9 923 9740 ext 89740

@boegel
Member

boegel commented Feb 22, 2014

@Bart-CER: How about using a system-wide EasyBuild configuration file, and agreeing on the policy not to fiddle with the configuration otherwise (e.g. env vars or command line options to override the config file)?
EasyBuild should allow people to achieve the things they want, with reasonable defaults. EasyBuild is not a team manager. ;-)

@stdweird
Contributor

some more feedback after the julich hackathon:

in my opinion toolchain neutral software implies that static binaries are produced, with no dependencies on anything external. the "no dependencies" can be loosened a bit (eg assume bash is available), but this quickly becomes another (arbitrary) distinction between what can and cannot be assumed to be present.
and, for completeness, the version suffix for toolchain neutral software should probably include the build dependencies (eg Doxygen/0.1-dummy-dummy-GCC-4.8.2, but this becomes a naming issue quickly).

toolchain neutral software however is not "software provided by the OS wrapped by easybuild" (the whole "why can't i use the gcc on my system instead of being forced to compile one from scratch" discussion). we should use proper terminology here (i'd call it system software).
EB should provide an easy way to generate EB-compliant modules around existing 3rd party modules or system software, but this is to be avoided at all cost. in particular, before EB generates this module, it should run the sanity checks. problems with these system software modules are beyond easybuild's control, and this should also be made clear to whoever wants to go this way. (and i really hope we can avoid support for the case that the system provides gcc (the c compiler) and not gfortran, and that users want to fake the GCC subtoolchain, or have EB only install gfortran)

and a 3rd remark: easybuild should make an effort to specify minimal (sub)toolchain requirements in the easyconfigs. this effort could be part of moving to the new format 2.0 (i don't think it's wise to modify the current format 1.0 files). eg if a software package is truly only dependent on GCC, the toolchain requirements should specify this; and easybuild should provide a way to either install it with the subtoolchain or figure out a matching toolchain and use that toolchain (as is done now).
in an extreme case, the current toolchain info is dropped and everything becomes a (build)dependency.
EB can figure out based on the dependencies what toolchain would have been specified. the naming will become a mess (the main reason why toolchains are used), but this might be solved by hierarchical modules.

@boegel
Member

boegel commented May 7, 2014

One (other) use case of this is PerfExpert (cfr. easybuilders/easybuild-easyconfigs#839).

The installed PerfExpert module should be toolchain-neutral, in the sense that you should be able to use it together with any other module, regardless of which compiler toolchain that module was built with.
In this particular case, you can't get away with a statement like "just build PerfExpert with whatever toolchain was used to build the software package you're analysing". Building PerfExpert with e.g. ictce is a no-go, mostly because of its dependencies (e.g. ROSE even requires GCC 4.4.x or an older GCC 4.x (but not too old)), but locking down the build dependencies (e.g. the compiler) used for building PerfExpert and all deps is important for reproducibility.

Another aspect of this is that ideally, PerfExpert should be provided via a single module (not a module that loads a bunch of other modules as dependencies), to avoid problems with dependencies that are common for other applications (e.g. Boost).
Only one version of an application (regardless of toolchain) can be loaded at a time via a module; however, the linker is able to correctly handle multiple versions of the Boost library being available at runtime (in $LD_LIBRARY_PATH), so even without static linking, 'collapsing' dependencies together has proper use cases...

@fgeorgatos
Collaborator

Hi Kenneth, all,

fyi. the argument you made is exactly of the same type I have been
promoting all along last year, as regards debuggers' easyconfigs!

For the same reason, I have never been really convinced about this commit on HPCBIOS_Debuggers:
easybuilders/easybuild-easyconfigs@05deccc
We may not really want multiple builds of such tools, when exactly one does the job equally well.
If you agree with that statement, that commit could be reverted (pending an aligned treatment of GDB)!

On Wed, May 7, 2014 at 3:30 PM, Kenneth Hoste notifications@github.com wrote:

One (other) use case of this is PerfExpert (cfr.
easybuilders/easybuild-easyconfigs#839).

The installed PerfExpert module should be toolchain-neutral, in the sense
that you should be able to use together with any other module, regardless
of with which compiler toolchain it was built.

Of course, the reproducibility argument still applies, so nothing wrong with the concept
of actually confining the build with specific compiler versions, libraries etc.
We have discussed this again before and during the JSC hackathon and see merit:
#570 (comment)

To summarize:

  • debugger tools like DDT, TotalView and members of the same family
    are best delivered as a single build, preferably statically linked.
  • Contrast that with performance tools, which may need to be aligned with the compiler
    or MPI stack variant and MUST be recompiled (currently EB focuses only on the latter use case)

There are quite a few of us (@xbesseron, @gribozavr, @geimer, @georgets, @fgeorgatos)
interested to see how this PR will evolve, because it has good impact on future work!

@citibeth

citibeth commented Jan 3, 2016

Since every piece of software has to be built with SOME toolchain, I don't understand what marking an .eb as "toolchain-neutral" would mean.

Maybe the right approach is to allow dependencies to allow a wildcard toolchain when specifying dependencies. For example, netCDF has a build dependency on CMake. It doesn't matter which toolchain was used to build CMake, as long as bin/cmake runs. One could then build these basic tools with some kind of "plain vanilla" toolchain.

@boegel
Member

boegel commented Jan 3, 2016

@citibob: marking something as 'toolchain-neutral' basically means that the end result doesn't depend on the toolchain at runtime in any way

We already support the wildcard you suggest, in the sense that we have support for resolving dependencies while taking subtoolchains into account, cfr. http://easybuild.readthedocs.org/en/latest/Manipulating_dependencies.html#minimal-toolchains.

The toolchain-neutral idea goes a bit further though, in some sense. If something is toolchain-neutral, you only need a single build of it, and then it can be used in a build with any (other) toolchain.

@citibeth

citibeth commented Jan 3, 2016

We already support the wildcard you suggest

An interesting feature, but not quite the same. Suppose I built CMake with
Clang, and now I'm compiling netCDF with GCC, which has a build dependency
on CMake. As far as I can tell, --minimal-toolchains will not be able to
use the Clang version of CMake, since Clang is not an ancestor of the GCC
toolchain. The wildcard I'm suggesting would work for ANY toolchain.
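
The difference can be sketched in a few lines of Python. This is a simplified model of the behaviour described above, not EasyBuild's actual resolution code: the `PARENT` hierarchy, the function names and the set of installed builds are all hypothetical.

```python
# Hypothetical, simplified hierarchy: each toolchain maps to its parent subtoolchain.
PARENT = {
    "foss": "gompi",
    "gompi": "GCC",
    "GCC": "dummy",
    "Clang": "dummy",
}


def ancestors(toolchain):
    """Yield the toolchain itself, then each subtoolchain down to the root."""
    while toolchain is not None:
        yield toolchain
        toolchain = PARENT.get(toolchain)


def find_minimal(name, toolchain, installed):
    """Subtoolchain-style lookup: only builds made with the toolchain or an ancestor match."""
    for tc in ancestors(toolchain):
        if (name, tc) in installed:
            return (name, tc)
    return None


def find_wildcard(name, installed):
    """Wildcard lookup: any build of the package matches, whatever toolchain built it."""
    matches = sorted(build for build in installed if build[0] == name)
    return matches[0] if matches else None


# CMake was built with Clang only.
installed = {("CMake", "Clang")}
```

In this model, a GCC-based build of netCDF finds no CMake through `find_minimal`, because Clang is not on the path from GCC to the root, whereas `find_wildcard` accepts the Clang-built CMake.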

If something is toolchain-neutral, you only need a single build of it,
and then it can be used in a build with any (other) toolchain.

I think the core issue here is, WHO declares a property of a piece of
software:
a) If it's a wildcard, then the USER declares that the toolchain used
doesn't matter.
b) If it's a toolchain-neutral feature, then the PRODUCER declares the
toolchain used doesn't matter.

I'm fearful that EasyBuild might be running down the path of an
ill-conceived collection of ad-hoc dependency-matching mechanisms. One big
problem with --minimal-toolchains is it's an EB config parameter, not a
property of an .eb file itself. If its use breaks ANY .eb files (likely),
then people will turn it off. Add on three or four such "features," and it
becomes hard to tell what will match with what.

I would recommend the following:

  1. Figure out if it is feasible to stop using LD_LIBRARY_PATH and use
    RPATH instead, as is used with Spack. This would seem to be a prerequisite
    for any serious mix-n-match between toolchains. (For example, what if our
    Clang-compiled CMake doesn't work with the LD_LIBRARY_PATH required to
    compile NetCDF with our toolchain?)

  2. Think through the dependency matching problem carefully and come up
    with a consistent, systematic way to match dependencies. The system would
    need to provide:

    a) In specifying a dependency (toolchain, lib, build or otherwise), a
    standard way to specify what you're asking for. The standard needs to
    allow to specify optional toolchains, version ranges, version blackouts,
    wildcards, etc.

    b) In writing an easyblock or easyconfig, a standard way to specify what
    we THINK might be able to use us as a dependency in the future. CMake
    might be able to specify "anyone can use us." A Clang-compiled C library
    might specify "anyone using Clang or GCC can use us." A Fortran library
    compiled with GCC 4.9.3 would specify "only projects compiled with
    GCC-4.9.3 can use us."

    c) A way to match specs of (a) to specs of (b) to provide dependency
    match. For example, MyApp-GCC-4.9.3.eb might specify "I need version 1.5
    or greater of MyLib, I don't care what toolchain it's compiled with." From
    the user side, this can match with just about anything. But this would NOT
    match with MyLib-1.5-GCC-4.8.1.eb because THAT config specified that it
    will NOT work with other versions of GCC.
    This algorithm needs to be simple enough to intuitively understand
    and debug. We should not be left scratching our heads wondering "why did
    EasyBuild match to THAT dependency, and how can I convince it to match to
    the one I really wanted?"
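
The three parts (a), (b) and (c) above can be sketched as follows. This is only one possible reading of the suggestion, written in Python for illustration: `Request`, `Provider`, `usable_by` and `matches` are hypothetical names, version handling is reduced to a minimum version with dotted integers, and version ranges and blackouts are left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


def parse_version(text):
    """Turn '1.5' into (1, 5) so versions compare numerically (dotted integers only)."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Request:
    """(a) What the consumer asks for."""
    name: str
    min_version: str
    toolchain: Optional[str] = None  # None: consumer does not care how the dependency was built


@dataclass(frozen=True)
class Provider:
    """(b) What an available build declares about who may use it."""
    name: str
    version: str
    toolchain: str
    usable_by: Optional[FrozenSet[str]] = None  # None: "anyone can use us"


def matches(request, provider, consumer_toolchain):
    """(c) A provider matches only if both the consumer's and the producer's constraints hold."""
    if request.name != provider.name:
        return False
    if parse_version(provider.version) < parse_version(request.min_version):
        return False
    if request.toolchain is not None and request.toolchain != provider.toolchain:
        return False
    if provider.usable_by is not None and consumer_toolchain not in provider.usable_by:
        return False
    return True


# MyApp, built with GCC-4.9.3, needs MyLib >= 1.5 and does not care about the toolchain.
request = Request("MyLib", "1.5")
strict_lib = Provider("MyLib", "1.5", "GCC-4.8.1", usable_by=frozenset({"GCC-4.8.1"}))
open_lib = Provider("MyLib", "1.6", "Clang-3.7.0", usable_by=frozenset({"Clang-3.7.0", "GCC-4.9.3"}))
```

With these definitions, `strict_lib` is rejected for a GCC-4.9.3 consumer because the producer ruled out other GCC versions, while `open_lib` is accepted. Keeping the rule to four independent checks is one way to keep the outcome easy to explain when a match is unexpected.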

I would suggest someone think through this and propose a system, after
reviewing similar dependency-matching systems out there (i.e. the one in
Spack). Then we weigh in and get a design everyone can live with. Then we
implement it.

