0.3.8 tests returns fatal errors on Skylake with GCC 9.2.0 #2408

akesandgren · 2020-02-11T14:30:33Z

We get multiple fatal errors when running the tests on Skylake systems.
CFLAGS = -O2 -ftree-vectorize -march=native -fno-math-errno

For instance:

 ******* FATAL ERROR - COMPUTED RESULT IS LESS THAN HALF ACCURATE *******
           EXPECTED RESULT   COMPUTED RESULT
       1      0.267033          0.267033    
       2     -0.335365         -0.335365    
       3     -0.308392         -0.308392    
       4      0.347952          0.347952    
       5     -0.476523E-01     -0.476523E-01
       6     -0.200500         -0.200500    
       7      0.276024          0.276024    
       8     -0.416284         -0.416284    
       9      0.419880          0.466533    
      10      0.383916          0.426573    
      11      0.410889          0.456543    
 ******* DGEMV  FAILED ON CALL NUMBER:
   2176: DGEMV ('N', 11,  7, 0.0, A, 12, X, 1, 0.9, Y, 2)         .

 DGBMV  PASSED THE TESTS OF ERROR-EXITS

 ******* FATAL ERROR - COMPUTED RESULT IS LESS THAN HALF ACCURATE *******
           EXPECTED RESULT   COMPUTED RESULT
       1      0.276024          0.276024    
       2     -0.416284         -0.416284    
       3      0.419880          0.419880    
       4      0.872128E-01      0.872128E-01
       5      0.383916          0.383916    
       6      0.168132          0.168132    
       7     -0.344356         -0.344356    
       8     -0.227473         -0.227473    
       9     -0.380320         -0.422577    
      10      0.962038E-01      0.106893    
      11      0.240060          0.266733    
 ******* DGBMV  FAILED ON CALL NUMBER:
   8656: DGBMV ('N', 11,  7,  0,  0, 0.0, A,  2, X, 1, 0.9, Y, 2) .

Any thoughts on that?


akesandgren · 2020-02-11T14:32:07Z

PS: a build on Broadwell with the same config settings works as it should.

akesandgren · 2020-02-11T14:40:17Z

Currently rebuilding without vectorize...

lexming · 2020-02-11T14:40:25Z

I confirm the same error on an Intel Xeon Gold 6126 CPU using GCC 9.2.0.

The build command is make BINARY='64' CC='gcc -ftree-vectorize' FC='gfortran -ftree-vectorize' USE_OPENMP='1' USE_THREAD='1'
Specifically, the compilation option triggering those errors is -ftree-vectorize.

List of failed tests:

DGEMV
DGBMV
DSYMV
DSBMV
DSPMV
cblas_dgemv

akesandgren · 2020-02-11T15:00:34Z

Confirmed: removing -ftree-vectorize makes the problem go away.

akesandgren · 2020-02-11T15:02:31Z

Would be nice if there was a proper fix for this.

martin-frbg · 2020-02-11T15:10:07Z

Disabling the dgemv_n microkernel for SkylakeX in kernel/x86_64/dgemv_n_4.c would be my first bet. (Unless 0.3.7 or earlier worked with -ftree-vectorize - that file is unchanged from 0.3.4 or so)

akesandgren · 2020-02-11T15:13:09Z

It didn't work with vectorize before either. I'm just bringing this up again, since we accidentally forgot to turn it off at the first build.

bartoldeman · 2020-02-11T15:57:50Z

0.3.7 fails too, but not with GCC 8.3.0, only with GCC 9.2.0.

Diazonium · 2020-02-11T16:55:29Z

Similar issues happened before I think, when a newer GCC version started shuffling/clobbering different registers. @wjc404 did a lot of work on the AVX-512 assembly, maybe he has a better idea what goes wrong.

bartoldeman · 2020-02-11T17:02:48Z

The test fails with alpha=0, so I suspect it's actually an issue with DSCAL.
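To make the alpha=0 reasoning concrete: when alpha is zero, y := alpha*A*x + beta*y collapses to y := beta*y, which is just a strided scal over y. The sketch below is a plain C reference, not OpenBLAS code; `scal_ref` and `gemv_n_ref` are hypothetical names, and the special handling of beta == 0 and beta == 1 in reference BLAS is left out. It is consistent with the log above, where each mismatched row is off by the factor beta, e.g. 0.466533 * 0.9 ≈ 0.419880.

```c
#include <stddef.h>

/* y := beta * y over n elements spaced inc_y apart (inc_y > 0 assumed). */
static void scal_ref(size_t n, double beta, double *y, size_t inc_y)
{
    for (size_t i = 0; i < n; i++)
        y[i * inc_y] *= beta;
}

/* Reference y := alpha*A*x + beta*y for a column-major m-by-n matrix A,
 * no transpose, positive increments only. */
static void gemv_n_ref(size_t m, size_t n, double alpha,
                       const double *a, size_t lda,
                       const double *x, size_t inc_x,
                       double beta, double *y, size_t inc_y)
{
    scal_ref(m, beta, y, inc_y);
    if (alpha == 0.0)
        return;                 /* A and x are never touched: pure scal */

    for (size_t j = 0; j < n; j++) {
        double t = alpha * x[j * inc_x];
        for (size_t i = 0; i < m; i++)
            y[i * inc_y] += t * a[i + j * lda];
    }
}
```

With the arguments of the failing call (m=11, n=7, alpha=0.0, lda=12, incx=1, beta=0.9, incy=2), only the scal step runs, so a wrong answer points at the scal kernel rather than at the gemv kernel.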

bartoldeman · 2020-02-11T17:10:13Z

Indeed putting the generic
DSCALKERNEL = ../arm/scal.c
in kernel/x86_64/KERNEL.SKYLAKEX
fixes the failures. Now digging deeper.

martin-frbg · 2020-02-11T17:22:19Z

Passes with a snapshot of gcc 10 ... possible GCC 9 bug? Trying __attribute__(no_tree_vectorize) on the dscal_kernel_8 in dscal_microk_skylakex-2.c now... works.
Now why would the gcc tree vectorizer fall over
https://github.com/xianyi/OpenBLAS/blob/cb6ef49857719b64c5f882e32957f5de2fb1d302/kernel/x86_64/dscal_microk_skylakex-2.c#L35-L44 ?

bartoldeman · 2020-02-11T18:20:56Z

noinline does the trick too. What is weird is that the increment is 2, and if I read the code correctly dscal_kernel_8 should not even be invoked for that case.

akesandgren · 2020-02-12T08:21:05Z

dscal_kernel_8 noinline confirmed as a working solution.
I.e.,
static void dscal_kernel_8( BLASLONG n, FLOAT *alpha, FLOAT *x) __attribute__ ((noinline));

Using
static void dscal_kernel_8( BLASLONG n, FLOAT *alpha, FLOAT *x) __attribute__ ((no_tree_vectorize));
does not work.

martin-frbg · 2020-02-12T09:12:02Z

Strange - no_tree_vectorize definitely worked for me, only difference is that I prepended it as
__attribute__((optimize("no-tree-vectorize"))) static void...

akesandgren · 2020-02-12T11:59:36Z

Ok, will check that way then...

akesandgren · 2020-02-12T13:04:45Z

Confirmed, the no-tree-vectorize works:
static void dscal_kernel_8( BLASLONG n, FLOAT *alpha, FLOAT *x) __attribute__ ((optimize("no-tree-vectorize")));
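For reference, the two attribute placements discussed above can be sketched as follows. This is an illustration on hypothetical functions (`scale8_novec`, `scale8_noinline`) with made-up macro names, not the OpenBLAS source. The guard excludes Clang on the assumption that it does not implement GCC's `optimize` attribute.

```c
/* Wrap the GCC-specific attributes so other compilers still build this. */
#if defined(__GNUC__) && !defined(__clang__)
#define ATTR_NO_TREE_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define ATTR_NO_TREE_VECTORIZE
#endif

#if defined(__GNUC__)
#define ATTR_NOINLINE __attribute__((noinline))
#else
#define ATTR_NOINLINE
#endif

/* Placement 1: attribute prepended directly to the definition. */
ATTR_NO_TREE_VECTORIZE
static void scale8_novec(long n, const double *alpha, double *x)
{
    for (long i = 0; i < n; i++)
        x[i] *= *alpha;
}

/* Placement 2: attribute appended to a separate declaration.
 * GCC accepts the trailing form on declarations, not on definitions. */
static void scale8_noinline(long n, const double *alpha, double *x) ATTR_NOINLINE;

static void scale8_noinline(long n, const double *alpha, double *x)
{
    for (long i = 0; i < n; i++)
        x[i] *= *alpha;
}
```

Both workarounds only change how the compiler treats the function around the asm block; they hide an incorrect constraint list rather than fix it.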

martin-frbg · 2020-02-12T13:38:17Z

As another datapoint, compilation with gcc 7.4 also shows no problem, so it looks as if it is only 9.x that miscompiles the dscal kernel.

bartoldeman · 2020-02-12T14:01:57Z

The issue is in dscal.c; so far OpenBLAS has been lucky that it hasn't caused problems before. Warning: the tabs in the diff below are corrupted. I'll file a PR:

--- dscal.c.orig        2020-02-12 13:53:48.831716193 -0000
+++ dscal.c     2020-02-12 13:55:28.026247370 -0000
@@ -137,10 +137,10 @@
        "jnz    1b                                          \n\t"
 
         :
-          "+r" (n)      // 0
+          "+r" (n),     // 0
+          "+r" (x),     // 1
+          "+r" (x1)     // 2
         :
-          "r" (x),      // 1
-          "r" (x1),     // 2
           "r" (alpha),  // 3
           "r" (inc_x),  // 4
           "r" (inc_x3)  // 5

martin-frbg · 2020-02-12T14:25:31Z

Ouch. Here we go again... I guess the earlier clobber list should have warned me that it is not just n that needs to be flagged as input/output... ~~The same bug is probably in at least some of the other scal kernels that got the (incomplete) fix from #2010 as well.~~

Fix inline asm in dscal: mark x, x1 as clobbered. Fixes #2408

The leaq instructions in dscal_kernel_inc_8 modify x and x1 so they must be declared as input/output constraints, otherwise the compiler may assume the corresponding registers are not modified.
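The constraint rule behind this fix can be shown on a much smaller loop. The sketch below is a hypothetical strided scaling routine (`scale_strided` is not an OpenBLAS function) written with GCC extended asm for x86-64, with a plain C fallback elsewhere. Because the asm advances the pointer and counts `n` down, both are declared as `"+r"` input/output operands; listing `x` as a plain `"r"` input would tell the compiler the register still holds the original pointer afterwards.

```c
/* Scale n elements of x, spaced inc elements apart, by alpha.
 * Requires n > 0 and inc > 0. Illustration only. */
static void scale_strided(long long n, double *x, long long inc, double alpha)
{
#if defined(__x86_64__) && defined(__GNUC__)
    long long inc_bytes = inc * (long long)sizeof(double);

    __asm__ __volatile__ (
        "1:                           \n\t"
        "movsd  (%[x]), %%xmm0        \n\t"
        "mulsd  %[alpha], %%xmm0      \n\t"
        "movsd  %%xmm0, (%[x])        \n\t"
        "addq   %[inc], %[x]          \n\t"  /* the asm modifies x ...   */
        "decq   %[n]                  \n\t"  /* ... and n                */
        "jnz    1b                    \n\t"
        : [n] "+r" (n),                      /* so both are read-write   */
          [x] "+r" (x)
        : [inc] "r" (inc_bytes),             /* never written: inputs    */
          [alpha] "x" (alpha)
        : "cc", "memory", "xmm0"
    );
#else
    /* Portable fallback for other compilers and architectures. */
    for (long long i = 0; i < n; i++)
        x[i * inc] *= alpha;
#endif
}
```

A wrong constraint list like this can stay invisible for years, because it only matters once the compiler inlines the function and reuses the register it believes is unchanged. That would fit the observation that noinline made the failure disappear.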

akesandgren mentioned this issue Feb 11, 2020

{numlib}[GCC/9.2.0] OpenBLAS v0.3.8 easybuilders/easybuild-easyconfigs#9852

Merged

akesandgren changed the title ~~0.3.8 tests returns fatal errors on Skylake~~ 0.3.8 tests returns fatal errors on Skylake with GCC 9.2.0 Feb 11, 2020

lexming mentioned this issue Feb 11, 2020

OpenBLAS installs regardless of failed tests easybuilders/easybuild-easyblocks#1953

Closed

martin-frbg added this to the 0.3.9 milestone Feb 11, 2020

lexming mentioned this issue Feb 12, 2020

OpenBLAS/0.3.8-GCC-9.2.0 loss of accuracy errors in CPUs with AVX512 easybuilders/easybuild-easyconfigs#9861

Closed

martin-frbg closed this as completed in 7ea5e07 Feb 12, 2020

martin-frbg added a commit that referenced this issue Feb 12, 2020

Merge pull request #2410 from bartoldeman/fix-dscal-inline-asm

8a9e9a8

Fix inline asm in dscal: mark x, x1 as clobbered. Fixes #2408
