conflicts with system/python/matlab shared libraries #426

Open
KrisThielemans opened this issue Jul 25, 2020 · 3 comments

@KrisThielemans
Member

KrisThielemans commented Jul 25, 2020

This is yet another issue to discuss difficulties with finding the correct libraries. This is based on my experience with our university CentOS system, but the problem will be the same elsewhere. See also UCL/STIR#582 for some pointers.

On our university system, we have a few Python versions in /opt/Python/ and the system version, but that comes with essentially no packages, so I wanted to avoid using that. The versions in /opt/Python come with a lot of libraries in lib (probably conda versions).

To be able to link to Python, we have to specify the Python location during building (in CMake via setting PYTHON_EXECUTABLE etc, which will add for instance /opt/Python/Python-3.6/lib to the linker path). The consequence is that it also finds other things installed there, including for instance HDF5 include/libraries.

So I thought I'd try with USE_SYSTEM_HDF5=ON (see below for more on that).

Option 1: compile libraries without RPATH settings but use LD_LIBRARY_PATH

(This is the default CMake option)

Problem: As our programs/python packages use HDF5, and that was found in /opt/Python/Python-3.6, they won't run unless I include /opt/Python/Python-3.6/lib in LD_LIBRARY_PATH, but that then breaks other programs because they expect more recent libraries for other stuff (e.g. /opt/swig/*/swig)

Initial solution: when using our programs, add /opt/Python/Python-3.6/lib to LD_LIBRARY_PATH (for instance in env_ccppetmr.sh), but make sure this isn't done when you want to run anything else. However, this does NOT work. I get an error like

sirf_resample: /opt/Python/Python-3.6/lib/libgomp.so.1: version `GOMP_4.0' not found (required by sirf_resample)

Reason: during building, gcc linked against its own OpenMP runtime (libgomp), which is more recent than the libgomp distributed with Python-3.6.
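To confirm this kind of mismatch, it can help to compare the GOMP version tags that each libgomp provides. The following is a minimal sketch, not part of the original report: the helper names `gomp_versions` and `has_gomp_version` are made up for illustration, and it simply scans the library file for `GOMP_<number>` strings rather than parsing the ELF symbol-version table properly.

```shell
#!/bin/sh
# Print the distinct GOMP_<n> version tags embedded in a libgomp binary.
# grep -a treats the binary as text; -o prints only the matching part.
gomp_versions() {
    grep -a -o 'GOMP_[0-9][0-9.]*' "$1" | sort -u
}

# Succeed if the library provides the given tag, e.g. GOMP_4.0.
has_gomp_version() {
    gomp_versions "$1" | grep -q -x -F "$2"
}

# Both paths are the ones mentioned in this issue; adjust for your system.
for lib in /usr/lib64/libgomp.so.1 /opt/Python/Python-3.6/lib/libgomp.so.1; do
    if [ -f "$lib" ]; then
        echo "$lib:"
        gomp_versions "$lib" | sed 's/^/  /'
    fi
done
```

If the Python copy lists nothing at or above `GOMP_4.0` while the system copy does, that would match the error message above.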

Solution: figure out where the system libgomp sits and then we can do

LD_PRELOAD=/usr/lib64/libgomp.so.1 sirf_resample

Setting LD_PRELOAD globally is terrible, so we could set LD_PRELOAD only when using our programs as above, or add it to env_ccppetmr.sh (but this feels quite dangerous).
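One way to keep the preload confined to our own programs is a small wrapper function. This is only a sketch: `with_system_gomp` and the `SYSTEM_LIBGOMP` variable are hypothetical names, and the default path is the one from the command above, which may differ on other systems.

```shell
#!/bin/sh
# Path of the system libgomp (the one used in this issue; adjust as needed).
SYSTEM_LIBGOMP=${SYSTEM_LIBGOMP:-/usr/lib64/libgomp.so.1}

# Hypothetical helper: run one command with the system libgomp preloaded.
# LD_PRELOAD is set for that process only, not for the calling shell.
with_system_gomp() {
    if [ -f "$SYSTEM_LIBGOMP" ]; then
        env LD_PRELOAD="$SYSTEM_LIBGOMP${LD_PRELOAD:+:$LD_PRELOAD}" "$@"
    else
        echo "warning: $SYSTEM_LIBGOMP not found, running without LD_PRELOAD" >&2
        "$@"
    fi
}

# Usage: with_system_gomp sirf_resample <arguments>
```

This avoids exporting LD_PRELOAD from env_ccppetmr.sh, at the cost of having to prefix every call.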

Option 2: compile libraries with RPATH settings

This can be achieved by using the CMake option CMAKE_INSTALL_RPATH_USE_LINK_PATH=ON.

Result: executables/libraries have RPATH set, including /opt/Python/Python-3.6/lib
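To see which directories actually ended up in a binary, the dynamic section can be inspected with `readelf -d` (from binutils). The sketch below uses two hypothetical helpers, `rpath_entries` and `show_rpath`, and assumes the usual `Library rpath: [...]` / `Library runpath: [...]` output format. Which of the two tags is present matters: the loader consults RPATH before LD_LIBRARY_PATH, but RUNPATH after it.

```shell
#!/bin/sh
# Read `readelf -d` output on stdin and print each RPATH/RUNPATH directory
# on its own line.
rpath_entries() {
    sed -n 's/.*Library r[a-z]*path: \[\(.*\)\].*/\1/p' | tr ':' '\n'
}

# Print the run-time search path stored in an ELF file (needs binutils).
show_rpath() {
    readelf -d "$1" | rpath_entries
}

# Usage: show_rpath "$(command -v sirf_resample)"
```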

Problems: exactly the same run-time problems as the LD_LIBRARY_PATH option.

Option 3: compile libraries with RPATH settings but saving our own install location only

This means that LD_LIBRARY_PATH is used for anything not in that location (as RPATH is used first).
This is the strategy that @rijobro introduced for SIRF on macOS.
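In plain CMake terms, this could look roughly like the following. It is a sketch under assumptions, not necessarily how the SuperBuild does it: `own_rpath_cmake_args` is a hypothetical helper, and the `lib` subdirectory of the install prefix is assumed to be where our libraries end up.

```shell
#!/bin/sh
# Print CMake cache options that put only <prefix>/lib into the installed
# RPATH. The "lib" subdirectory is an assumption about the install layout.
own_rpath_cmake_args() {
    printf '%s\n' \
        "-DCMAKE_INSTALL_PREFIX=$1" \
        "-DCMAKE_INSTALL_RPATH=$1/lib" \
        "-DCMAKE_INSTALL_RPATH_USE_LINK_PATH=OFF"
}

# Usage (hypothetical prefix and source directory, no spaces in the prefix):
#   cmake $(own_rpath_cmake_args "$HOME/install") /path/to/source
```

With CMAKE_INSTALL_RPATH_USE_LINK_PATH switched off, directories such as /opt/Python/Python-3.6/lib that were only on the link path are not copied into the installed RPATH.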

Result: executables/libraries have RPATH set, but will rely on other mechanisms to find libgomp.so

Problems: I haven't tried this, but my prediction is that SIRF executables will work fine (as the system libgomp.so will be used), but Python will fail with exactly the same run-time problems as the previous options (as the Python libgomp.so will be used).
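A prediction like this can be checked by looking at which libgomp a process has actually mapped. For executables, `ldd "$(command -v sirf_resample)" | grep libgomp` shows what the loader would pick. For a running Python process on Linux, the mapped files are listed in /proc/<pid>/maps. The sketch below filters that listing; `mapped_libs` is a hypothetical helper, and it ignores paths that contain spaces.

```shell
#!/bin/sh
# Read /proc/<pid>/maps content on stdin and print the distinct mapped files
# whose path contains the given name (default: libgomp).
mapped_libs() {
    awk -v name="${1:-libgomp}" 'index($6, name) { print $6 }' | sort -u
}

# Usage for a running process (Linux only):
#   mapped_libs libgomp < /proc/<pid>/maps
# Usage from inside Python, after importing the module of interest:
#   python -c 'import <module>; print(open("/proc/self/maps").read())' | mapped_libs
```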

Ideas for work-arounds (but none really work well):

Work-around 1: use a more recent Python and hope for the best

We can try to use a more recent Python such that its libgomp.so.1 is at least as recent as the system one. This could work until one thing is upgraded somewhere. However, it doesn’t work on our university system as I then get the following when building our software

/usr/lib64/libSM.so: undefined reference to `uuid_generate@UUID_1.0'
/usr/lib64/libSM.so: undefined reference to `uuid_unparse_lower@UUID_1.0'

This presumably is because /opt/Python/Python-3.7 comes with its own libuuid.so. I have no clue really. (Edit: the previous statement about our cluster is not true. It must have happened during experimenting, but I can now successfully build with its Python 3.7)

Work-around 2: build HDF5 (and other things) ourselves and link to that, i.e. USE_SYSTEM_HDF5=OFF

This is our current default strategy. We can then either set LD_LIBRARY_PATH to that directory only, and not to the Python one (current strategy), or include only that directory in the RPATH. Our executables work fine. However,

  • SIRF Python packages still do not work, as Python will load its own libgomp.so, and SIRF packages that use OpenMP fail with the error message above.
  • there are likely to be conflicts between HDF5 libraries in Python unless we build the same HDF5 library version as the one used by Python. Possibly the RPATH option doesn't have this problem. Note also that we have to rebuild everything when using MATLAB to match its HDF5 libraries. See HDF5 conflicts #129 and Need to use same HDF5 version as Matlab #208.

Work-around 3: use a lean Python

Find a Python version that doesn’t insist on distributing its own libraries (i.e. a non-Conda version I guess). (This option is not available for MATLAB).

Then we can set USE_SYSTEM_*=ON for packages and rely on the system package manager. (This is the option chosen on the VM). Of course, this option is not available on university systems.

Work-around 4: go conda all the way

But I haven't found a way to let CMake pick up relevant conda-provided libraries only. Also, this strategy breaks down as soon as you need a non-conda package.

Any other ideas?

@DANAJK
Contributor

DANAJK commented Jul 27, 2020

Docker?

@KrisThielemans
Member Author

Docker?

We cannot use Docker on our university system. Singularity might work, but we haven't made any progress on that. It is one of our tasks.

My feeling though is that containers are excellent for deployment, but don't really help during development. Jupyter notebooks are great, but I suppose not the best tool for development (awkward for debugging). If you're on Linux, you can set up X forwarding. Or if you have a matching environment, you can apparently use Visual Studio Code to talk to your container.

Anyway, this should be discussed further on our mailing list, not in this issue I suppose.

@KrisThielemans
Member Author

oh, and none of the container solutions work with MATLAB.
