Python wheel libnvrtc-builtins.so (#1193)
The locally built version does not depend on it; the version installed from the distributed wheel, however, does. CMake 3.28.3 was used locally on Ubuntu 22.04, while the manylinux2014 container from January 2024 used CMake 3.28.1 and is CentOS 7 based. During pyflamegpu linking, libnvrtc-builtins is present in the link command in both cases, but it is unversioned in both, so the source of the difference is still unclear.
Adding a `readelf` step to both the Ubuntu and manylinux builds on CI should help compare the two.
This could be a difference in gcc, binutils, some environment variable, platform-specific cmake behaviour, platform-specific cuda packages, or something else entirely, and I am unsure where to start looking to actually pin this down. If we can encourage cent/alma not to add the explicit link, then we wouldn't need any other workarounds (although adding the dependency on libnvrtc.so via a python package might still be a good idea, but without also dlopening it, it won't help with manylinux compliance).
Checking linker defaults via local Ubuntu and Alma 8:
Linker commands from CI for Ubuntu 22.04 and CentOS 7 (manylinux2014):
Readelf output from CI: notably the CentOS build includes the versioned `libnvrtc-builtins` dependency while the Ubuntu build does not.
Suggestion: try a build of a simple example on a Sheffield HPC system (CentOS 7) to see if we can replicate this.
A possible fix is to use patchelf to remove the `libnvrtc-builtins.so.MM.mm` dependency at pyflamegpu build time. patchelf is available in manylinux, so this is doable, but getting the command into the right place in our CMake build needs working out.
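As a sketch of that post-build step (the default soname and the helper name are illustrative assumptions, not the project's actual build code), a small Python helper could invoke patchelf:

```python
import shutil
import subprocess


def strip_builtins_dep(so_path, soname="libnvrtc-builtins.so.12.0"):
    """Remove a versioned NEEDED entry from a shared object via patchelf.

    The default soname is illustrative; requires patchelf on PATH.
    Raises RuntimeError if patchelf is not available.
    """
    patchelf = shutil.which("patchelf")
    if patchelf is None:
        raise RuntimeError("patchelf not found on PATH")
    # patchelf --remove-needed drops the matching DT_NEEDED entry in place.
    subprocess.run([patchelf, "--remove-needed", soname, so_path], check=True)
```

Something like this could be wired in as a CMake `POST_BUILD` custom command, or run over the built extension before wheel packaging.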
A build on Stanage of current master (b5173e7) using:
This did not result in `libnvrtc-builtins.so.12` being a dependency:

```console
$ readelf -d ./lib/Release/python/src/pyflamegpu/_pyflamegpu.so

Dynamic section at offset 0x79be2e0 contains 36 entries:
  Tag                Type                 Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libnvrtc.so.12]
 0x0000000000000001 (NEEDED)             Shared library: [libnvJitLink.so.12]
 0x0000000000000001 (NEEDED)             Shared library: [libcuda.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x000000000000000e (SONAME)             Library soname: [_pyflamegpu.so]
 0x000000000000000c (INIT)               0x14c000
 ...
```

So this is not universal on CentOS 7; however, the version of gcc/ld on Stanage is provided by EasyBuild (i.e. not a devtoolset compiler).
Confirmed via a CI run that manylinux_2_28 also results in `libnvrtc-builtins.so.12.0` being linked, so it's not CentOS 7 specific, but either manylinux specific or devtoolset specific? https://github.com/FLAMEGPU/FLAMEGPU2/actions/runs/8850510611/job/24304952167
Could possibly look into other libraries that distribute binaries linked against NVRTC and built on CentOS with devtoolset compilers, to see if they have similar issues with builtins (e.g. pytorch, though they might handle it entirely differently for manylinux compliance; this could even just be downloading a wheel and checking the readelf output). Alternatively, a build on an EL-derived system with a devtoolset-provided host compiler could be worth trying, to narrow down whether it's EL itself or a manylinux container difference. This could be done in a docker container, or on an HPC system with EL-provided compilers (Bede with the native host compiler perhaps, though platform-specific differences might also be a factor then).
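Checking a downloaded wheel's extension module for the versioned dependency could be scripted; this is a generic sketch (function names are assumptions) that parses `readelf -d` output for NEEDED entries:

```python
import re
import subprocess

# Matches the "Shared library: [libfoo.so.N]" part of readelf -d output.
_NEEDED_RE = re.compile(r"Shared library: \[([^\]]+)\]")


def parse_needed(readelf_output):
    """Extract NEEDED shared-library names from `readelf -d` text output."""
    return _NEEDED_RE.findall(readelf_output)


def needed_entries(so_path):
    """Run readelf on a shared object and return its NEEDED entries."""
    out = subprocess.run(
        ["readelf", "-d", so_path], capture_output=True, text=True, check=True
    ).stdout
    return parse_needed(out)
```

Any returned entry starting with `libnvrtc-builtins.so.` and carrying a full `MM.mm` suffix would indicate the problematic versioned link.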
Originally identified as part of #1191. Although nvrtc sonames are now major version only (since 11.3, using `.so.11.2` or `.so.12.0`), `libnvrtc-builtins.so` is still explicitly versioned, and is depended upon by the generic `libnvrtc.so` which we are linking against. In practice, this means that you must currently have the exact version of the CTK installed and available at runtime that was used to build the python wheel, rather than a "compatible" version.

This was not noticed when testing locally, as CMake causes an `RPATH`/`RUNPATH` to be set pointing at the exact location of the shared object, so if it is installed in the same location it is found even if `LD_LIBRARY_PATH` does not point to it.

A workaround has been added to the google-colab notebook in FLAMEGPU/FLAMEGPU2-tutorial-python@47cc1c1, which installs the `nvidia-cuda-nvrtc-cu12==12.0.140` python package to bring in the matching version of `libnvrtc-builtins.so`, and then explicitly imports it via `ctypes.CDLL` to ensure it is loaded.

A robust version of this fix would be to add the exact version of `nvidia-cuda-nvrtc-cu12` (or `nvidia-cuda-nvrtc-cu11`) required to our python packages' `extra_install_requires` in `setup.py`, when the packages are intended to be distributed (i.e. only in our CI, not in all local builds, to avoid binary bloat). In `__init__.py` we would then need to ensure that the appropriate .so is loaded at runtime if it is not found implicitly, i.e. try to import it system wide, catch errors, and if the error is appropriate explicitly load the exact `libnvrtc.so` provided by the python package.

Unfortunately, `nvidia-cuda-nvrtc-cu11` is only provided on pypi for `11.7.99` and `11.8.89`, so this does not work for our 11.2 wheels...

Alternative options include:

- Statically linking `libnvrtc`. Bigger wheels; requires CMake >= 3.26 and CUDA >= 11.5 to do robustly.
- Embedding `libnvrtc.so` and `libnvrtc-builtins.so` in the wheel (like the vis dlls) and ensuring that version is used if needed. Bloated wheels, but avoids pip conflicts. Related to Manylinux compliant wheels (dlopen) #647 (but not exact).
- ... (depending on `nvidia-cuda-nvrtc-cuXX` might still be a good idea).

More detailed investigation notes can be found in #1191.