Compilation: strange behaviour on macos Catalina 10.15.7 #3168

Closed
j-a-ferguson opened this issue Mar 28, 2021 · 17 comments

@j-a-ferguson

Hello, I hope you are well.

I am observing some strange behaviour when I compile both v0.3.14 and v0.3.10 on macos.
I avoided using 0.3.13 as I ran into this issue: [https://github.com//issues/3037].
The compilation proceeds as expected until I reach the tests where I encounter
messages like:

dyld: lazy symbol binding failed: Symbol not found: ___emutls_get_address Referenced from: /usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib Expected in: /usr/lib/libSystem.B.dylib

and

ld: warning: dylib (/usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/libgfortran.dylib) was built for newer macOS version (10.15) than being linked (10.8)

In addition to these messages it seems that the shared library libopenblas_haswellp-r0.3.14.dylib is not built at all, only the static library is built.

I am using the following setup

  • Intel x86_64 mac running Catalina 10.15.7
  • Homebrew gcc version 10.2.0

And I called the following make command:

make CC=gcc-10 FC=gfortran-10 USE_THREADS=0

I have attached the full output of the make command, those linker messages I cited earlier appear at the end of the file.
compilation.txt

Thank you in advance for your time,
Best wishes,
John.

@isuruf
Contributor

isuruf commented Mar 28, 2021

Both of those errors seem like a problem in your toolchain. Have you asked the Homebrew developers (which is where I assume you got gcc-10 and gfortran-10)?

@martin-frbg
Collaborator

Very strange - the link error looks almost spurious, at least I do not understand why it would fail to build only the single-precision complex BLAS3 test after successful compilation of the single and double precision real BLAS3 tests with the exact same options and libraries. And neither of these would require thread-local storage support in any way. As make aborts at this point, the shared library does not get built.

@martin-frbg
Collaborator

Do you have multiple versions of Homebrew gcc installed (if that is even possible), or did you by any chance run make with different parameters and forget to run `make clean` in the top-level OpenBLAS directory before the latest attempt?

@j-a-ferguson
Author

j-a-ferguson commented Mar 28, 2021

Hello Isuru and Martin,

Thanks for getting back to me so quickly!

My setup is perhaps a little atypical for macos as I'm ultimately trying to mimic the environment of the HPC cluster I have access to.

I use Environment Modules (http://modules.sourceforge.net/) to "load" the compiler toolchain into my environment. This does the following (output from module show gcc/10.2.0):

module-whatis   {The GNU Compiler Collection (GCC) Version 10.2.0 }
setenv          GCC_HOME /usr/local/Cellar/gcc/10.2.0_3
setenv          GCC_VERSION 10.2.0
setenv          CC gcc-10
setenv          CXX g++-10
setenv          F90 gfortran-10
setenv          F77 gfortran-10
setenv          FC gfortran-10
prepend-path    PATH /usr/local/Cellar/gcc/10.2.0_3/bin
prepend-path    LD_LIBRARY_PATH /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10
prepend-path    LIBRARY_PATH /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10

where /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10 contains all of the gcc c/c++ runtimes and libraries.

Loading toolchains in this way avoids possible conflicts as only one may be loaded at a time.

To your question, @martin-frbg, about a possible "make clean" error: at first I thought that was a possibility, so I now create a separate directory for each build configuration and copy the unconfigured source tree from OpenBLAS-0.3.14/ into it. This way I avoid corrupting the original source tree when compiling for different toolchains.

If this is a bug in the Homebrew gcc installation then I would be more than happy to open an issue with them and keep you posted.

Let me know if that's something you would like me to do.

@isuruf
Contributor

isuruf commented Mar 28, 2021

The fact that gfortran-10 is looking at /usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib and not /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/libgfortran.5.dylib means you've set up your environment wrong.

@martin-frbg
Collaborator

As isuruf suggested, please dump your environment to check the active settings for LIBRARY_PATH and LD_LIBRARY_PATH, probably at least one of your settings did not work out as intended. I have added a gcc10,gfortran10,USE_THREADS=0 build to our Azure CI in PR #3166 and that job succeeded.
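
One way to carry out the check suggested here is to print each search-path variable one entry per line and flag entries that do not exist. The following is only a minimal sketch; `show_search_path` is a made-up helper name, not part of Environment Modules or Homebrew.

```shell
# Print one entry of a colon-separated search path per line and mark
# entries that are not existing directories.
# show_search_path is a hypothetical helper name, not part of any tool.
show_search_path() {
    name="$1"
    value="$2"
    if [ -z "$value" ]; then
        echo "$name is empty or unset"
        return 0
    fi
    # Split the value on ':' and test each entry in turn.
    printf '%s\n' "$value" | tr ':' '\n' | while IFS= read -r dir; do
        if [ -d "$dir" ]; then
            echo "$name: ok $dir"
        else
            echo "$name: MISSING $dir"
        fi
    done
}

# Example usage:
#   show_search_path LIBRARY_PATH "$LIBRARY_PATH"
#   show_search_path LD_LIBRARY_PATH "$LD_LIBRARY_PATH"
```

Any line marked MISSING points at a directory that the module file added but that is not actually present.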

@j-a-ferguson
Author

Homebrew places symlinks in /usr/local/opt which point to directories in /usr/local/Cellar.
In this case /usr/local/opt/gcc -> ../Cellar/gcc/10.2.0_3.

So in theory /usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib should resolve to /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/libgfortran.5.dylib.
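
Whether a symlinked directory really resolves as expected can be checked from the shell. This is a small sketch under the assumption that a POSIX shell is available; `physical_dir` is a hypothetical helper name.

```shell
# Print the physical (symlink-free) location of a directory, or fail
# if any component of the path is missing or a dangling symlink.
# physical_dir is a hypothetical helper name.
physical_dir() {
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$1" 2>/dev/null && pwd -P )
}

# Example usage (paths taken from this thread):
#   physical_dir /usr/local/opt/gcc/lib/gcc/10
#   physical_dir /usr/local/lib/gcc/10 || echo "dangling or missing"
```

If both calls print the same Cellar directory, the two symlink routes lead to the same libraries; a failure means one of the links is gone.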

But I agree, clearly something is not right with my setup.

I will do some more digging.
Thank you both for your help.

Since this does not appear to actually be a bug in OpenBLAS I'm happy for you to close the issue.

@isuruf
Contributor

isuruf commented Mar 28, 2021

ld: warning: dylib (/usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/libgfortran.dylib) was built for newer macOS version (10.15) than being linked (10.8)

Do you have the `MACOSX_DEPLOYMENT_TARGET` env variable set?

@j-a-ferguson
Author

Ah, I don't. Let me give that a try.

@j-a-ferguson
Author

Setting MACOSX_DEPLOYMENT_TARGET=10.15 gets rid of that warning; however, the error:

dyld: lazy symbol binding failed: Symbol not found: ___emutls_get_address Referenced from: /usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib Expected in: /usr/lib/libSystem.B.dylib

persists.
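
For reference, the deployment target can be set for a single build without changing the rest of the shell session. This is a sketch only; `make_with_target` is a hypothetical wrapper name, and the make arguments are the ones quoted at the top of this issue.

```shell
# Run make with MACOSX_DEPLOYMENT_TARGET set only for that one build.
# make_with_target is a hypothetical wrapper name.
make_with_target() {
    target="$1"
    shift
    # The subshell keeps the export from leaking into the calling shell.
    ( export MACOSX_DEPLOYMENT_TARGET="$target"; make "$@" )
}

# Example usage, mirroring the command from this issue:
#   make_with_target 10.15 CC=gcc-10 FC=gfortran-10 USE_THREADS=0
```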

@isuruf
Contributor

isuruf commented Mar 28, 2021

The ___emutls_get_address error is because gfortran-10 is not using -lemutls_w in the link line, which is weird.

@martin-frbg
Collaborator

Not sure if OSX has nm (or what its equivalent would be), but it might help to find out which, if any, of the system libraries provides this symbol. (Google finds lots of similar issues across several versions of (Homebrew) gcc, and from what I have seen, the general consensus for solving it appears to have been to update or reinstall the compiler package.)

@j-a-ferguson
Author

j-a-ferguson commented Mar 28, 2021

It turns out that symbol ___emutls_get_address is not in /usr/lib/libSystem.B.dylib.
Running nm -a | grep ___emutls_get_address comes up empty.
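
To search several libraries at once for the one that actually defines a symbol, a loop around nm can help. The sketch below assumes nm prints the symbol name as the last field and the type letter just before it; `defines_symbol` is a hypothetical helper name.

```shell
# Report which of the given libraries define a symbol. nm marks
# undefined references with type "U", so those lines are skipped.
# defines_symbol is a hypothetical helper name.
defines_symbol() {
    sym="$1"
    shift
    for lib in "$@"; do
        if nm -g "$lib" 2>/dev/null |
            awk -v s="$sym" '$NF == s && $(NF-1) != "U" { found = 1 } END { exit !found }'
        then
            echo "$lib"
        fi
    done
}

# Example usage (directory taken from this thread):
#   defines_symbol ___emutls_get_address /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/*.dylib
```

A library that only references the symbol is not listed, which separates the provider from its consumers.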

@isuruf
Contributor

isuruf commented Mar 28, 2021

___emutls_get_address is in libemutls_w.a which should be there along side libgfortran.a

@j-a-ferguson
Author

Well, that must be the issue: there is no libemutls_w.a alongside libgfortran in /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/ in the Homebrew installation I have.

@j-a-ferguson
Author

I have found the problem!

It seems to be due to the way homebrew symlinks libraries.

When I run otool -L /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/libgfortran.dylib I get:

libgfortran.5.dylib:
	/usr/local/opt/gcc/lib/gcc/10/libgfortran.5.dylib (compatibility version 6.0.0, current version 6.0.0)
	/usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/libquadmath.0.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)
	/usr/local/lib/gcc/10/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)

The last library, libgcc_s.1.dylib, is in the same directory as libgfortran. However, libgfortran tries to load it via the symlink /usr/local/lib/gcc -> /usr/local/Cellar/gcc/10.2.0_3/lib/gcc, which was removed when I ran brew unlink gcc because I wanted to use Environment Modules to access gcc rather than Homebrew's symlink method.

The library libgcc_s.1.dylib contains the symbol ___emutls_get_address.

So, to summarise: libgfortran was looking for a library that does exist, but via a symlink that did not.

When I run brew link gcc, the symlink /usr/local/lib/gcc -> /usr/local/Cellar/gcc/10.2.0_3/lib/gcc is recreated, and when I then compile OpenBLAS with gcc linked, the tests all pass without any problems!
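
The diagnosis above can be turned into a quick check: feed the output of otool -L into a filter that reports every referenced path that is missing on disk. This is a sketch that assumes the output layout shown earlier in this comment; `missing_install_names` is a hypothetical helper name.

```shell
# Read `otool -L` output on stdin and print every referenced library
# path that does not exist on disk.
# missing_install_names is a hypothetical helper name.
missing_install_names() {
    # Dependency lines are indented; the header line naming the
    # inspected file is not, so it never matches.
    sed -n 's/^[[:space:]][[:space:]]*\(.*\) (compatibility version.*$/\1/p' |
    while IFS= read -r path; do
        case "$path" in
            @*) continue ;;  # @rpath/@loader_path entries are not expanded here
        esac
        [ -e "$path" ] || echo "$path"
    done
}

# Example usage (path taken from this thread):
#   otool -L /usr/local/Cellar/gcc/10.2.0_3/lib/gcc/10/libgfortran.dylib | missing_install_names
```

With gcc unlinked, this would be expected to print the /usr/local/lib/gcc/10/libgcc_s.1.dylib entry. On macOS releases newer than the one in this thread, system libraries may live in the dyld shared cache rather than on disk, so /usr/lib entries can show up here even though they load fine.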

@j-a-ferguson
Author

Sorry for wasting both of your time on a completely avoidable bug.

Thank you both again. Your suggestions were very helpful.
