BUG: GESDD fails when GESVD succeeds, depends on number of threads #3044
Comments
Unusual that a bug would appear in single-threaded mode when there should be less to go wrong. Is "this 204x204 array" available somewhere? (And are you sure that your numpy actually picked up the 0.3.13 you built, rather than whatever Ubuntu has linked through its alternatives mechanism?)
Agreed it's weird. Sorry, I forgot to upload the problematic file: I
[openblas]
libraries = openblas
library_dirs = /opt/OpenBLAS/lib
include_dirs = /opt/OpenBLAS/include
runtime_library_dirs = /opt/OpenBLAS/lib
And it looks like everything is in order:
$ ldd numpy/linalg/lapack_lite.cpython-38-x86_64-linux-gnu.so
...
libopenblas.so.0 => /opt/OpenBLAS/lib/libopenblas.so.0 (0x00007fbd980fc000)
...
$ ls -al /opt/OpenBLAS/lib/libopenblas.so.0
lrwxrwxrwx 1 root root 35 Dec 18 08:38 /opt/OpenBLAS/lib/libopenblas.so.0 -> libopenblas_haswellp-r0.3.13.dev.so
$ ls -al /opt/OpenBLAS/lib/libopenblas_haswellp-r0.3.13.dev.so
-rwxr-xr-x 1 root root 14140208 Dec 18 08:37 /opt/OpenBLAS/lib/libopenblas_haswellp-r0.3.13.dev.so
$ python -c "import numpy; numpy.show_config()"
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/OpenBLAS/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/OpenBLAS/lib']
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/OpenBLAS/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/OpenBLAS/lib']
extra_compile_args = ['-march=native']
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/OpenBLAS/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/OpenBLAS/lib']
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/opt/OpenBLAS/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
runtime_library_dirs = ['/opt/OpenBLAS/lib']
extra_compile_args = ['-march=native']
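As a complement to the `ldd` and `show_config()` output above, the libraries a running process has actually mapped can also be inspected from inside Python. The following is only a minimal sketch: it assumes a Linux-style `/proc/self/maps`, and the helper name `mapped_libraries` is hypothetical (it is not part of NumPy or OpenBLAS).

```python
import os
import re


def mapped_libraries(pattern="openblas", maps_path="/proc/self/maps"):
    """Return the sorted, de-duplicated paths of mapped files matching pattern.

    Assumes a Linux-style /proc filesystem; returns [] where it is missing.
    """
    if not os.path.exists(maps_path):
        return []
    regex = re.compile(pattern, re.IGNORECASE)
    paths = set()
    with open(maps_path) as maps:
        for line in maps:
            fields = line.split(None, 5)
            # The sixth field, when present, is the backing file path.
            if len(fields) == 6 and regex.search(fields[5]):
                paths.add(fields[5].strip())
    return sorted(paths)


if __name__ == "__main__":
    try:
        import numpy  # noqa: F401  (importing loads the BLAS/LAPACK library)
    except ImportError:
        pass
    for path in mapped_libraries():
        # Resolve symlinks such as libopenblas.so.0 to the versioned file.
        print(path, "->", os.path.realpath(path))
```

If the printed path resolves to the build under `/opt/OpenBLAS/lib`, the process is using that build rather than a distribution-provided library.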
And note that I have the same problem with NumPy's linalg calls directly (I just showed SciPy's linalg because it allows switching between the GESDD and GESVD backends, whereas NumPy just uses GESDD, I think):
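For readers who want to try the driver comparison themselves, here is a minimal sketch of switching backends through the `lapack_driver` argument of `scipy.linalg.svd`. It is not the code from the report: the random matrix is only a stand-in for the 204x204 array, and the helper name `compare_svd_drivers` is hypothetical.

```python
import numpy as np
from scipy import linalg


def compare_svd_drivers(a):
    """Return {driver: singular values, or the LinAlgError that was raised}."""
    results = {}
    for driver in ("gesdd", "gesvd"):
        try:
            results[driver] = linalg.svd(a, compute_uv=False, lapack_driver=driver)
        except linalg.LinAlgError as exc:
            # Record the failure instead of aborting the comparison.
            results[driver] = exc
    return results


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((204, 204))  # stand-in, NOT the array from the report
    for driver, outcome in compare_svd_drivers(a).items():
        if isinstance(outcome, Exception):
            print(driver, "failed:", outcome)
        else:
            print(driver, "largest singular value:", outcome[0])
```

On a well-behaved input both drivers should agree to within rounding; the report concerns a specific array for which only GESDD fails.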
Try importing with dtype="double"; at least it converges like a charm in R (where all floats are doubles), like
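A minimal sketch of the double-precision suggestion in NumPy terms, assuming the data was stored in single precision (the thread does not confirm this). The helper name `svd_in_double` is hypothetical and the random matrix is a stand-in for the real data.

```python
import numpy as np


def svd_in_double(a):
    """Compute singular values after promoting the input to float64."""
    a64 = np.asarray(a, dtype=np.float64)
    return np.linalg.svd(a64, compute_uv=False)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a32 = rng.standard_normal((204, 204)).astype(np.float32)  # stand-in data
    s32 = np.linalg.svd(a32, compute_uv=False)  # single-precision LAPACK path
    s64 = svd_in_double(a32)                    # double-precision LAPACK path
    print(s32.dtype, s64.dtype)
    print("max abs difference:", np.max(np.abs(s32 - s64)))
```

NumPy keeps single-precision input in single precision for `svd`, so the explicit promotion is what selects the double-precision routine.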
@brada4 what CPU do you have, in case it matters? Mine is:
Maybe it's a problem that I end up with a
I tried on a Broadwell E5-2620
I have never (manually) upgraded CPU microcode -- are you suggesting that this might be causing the bug? From a quick naive search I already have
So I think it might be up to date already?
It is perfect, let's deal with the software bug ;-)
Cannot reproduce this so far with older versions of gcc
Not reproducible here with gcc 10.2 either (Haswell target, Python 3.6, NumPy 1.14 in a CentOS 8.2 VM on i7-7500U)
How did you get 1.14 to use the latest OpenBLAS master? Did you build 1.14 from scratch for some reason? Or did you I could try to figure out what version of OpenBLAS 1.14 used (it's almost 2 years old) and try using that version of OpenBLAS. If I also can't replicate there, it would give me at least some path to
I took their packaged numpy and overwrote the /usr/lib64/libopenblas.so it linked to (0.3.3 or something equally ancient) with my own build of the current
Not reproduced with a recent snapshot of numpy (1.20.0.dev0+08edcad) built from source against current
Can you retry with a more recent snapshot of OpenBLAS, please? In particular, the next commit after the one you built (c73d8ee) worked around an oddity
On latest master:
Then
And also:
So there is something about the combination of using 1 thread with the GESDD driver on this 204x204 array that causes it to fail.
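Since the failure depends on the thread count, one way to vary it reproducibly is to launch each attempt in a fresh interpreter with `OPENBLAS_NUM_THREADS` set, because OpenBLAS reads that variable when the library is initialized. This is only a sketch: the helper name `run_with_threads` is hypothetical, and the snippet it runs is a placeholder to be replaced by the failing SVD call.

```python
import os
import subprocess
import sys


def run_with_threads(code, num_threads):
    """Run Python source in a fresh interpreter with OPENBLAS_NUM_THREADS set."""
    env = dict(os.environ, OPENBLAS_NUM_THREADS=str(num_threads))
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Placeholder snippet: substitute the failing SVD call here.
    snippet = "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"
    for n in (1, 2, 4):
        returncode, out, err = run_with_threads(snippet, n)
        print(n, "thread(s) -> exit", returncode, out.strip())
```

Running each configuration in its own process avoids any state carried over from a previous thread setting.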