Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

abs() on structured array gives invalid values in numpy 1.19.0 #16660

Closed
ahnitz opened this issue Jun 22, 2020 · 14 comments · Fixed by #16666
Closed

abs() on structured array gives invalid values in numpy 1.19.0 #16660

ahnitz opened this issue Jun 22, 2020 · 14 comments · Fixed by #16666
Labels
00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery

Comments

@ahnitz
Copy link

ahnitz commented Jun 22, 2020

Since the numpy 1.19.0 release, calculating ABS on a complex field of a structured dtype gives incorrect results.
The exact output looks as if there is an indexing problem.

I am able to reliably reproduce following bug on Travis-CI (but not other machines, such as my personal laptop and local compute cluster). It is unclear if this is hardware or environment specific in some way. I've created as simple an example
as I could. One possible hint is that it seems that Travis has AVX512 in its environment, whereas my test machines do not.

Reproducing code example:

import numpy
import sys

val = [(0, 0., 0, -31.666483, 200, 0., 0.,  1.      ,  -2.8499086-4.852268j  ,  613090),
 (1, 0., 0, 260.91525 ,  42, 0., 0.,  1.      ,  11.672331 -0.39883116j,  787315),
 (1, 0., 0,  52.15155 ,  42, 0., 0.,  1.      ,  -3.9071944-3.9876528j ,  806641),
 (1, 0., 0,  52.430195,  42, 0., 0.,  1.      ,   3.5721765-4.4356694j , 1363540),
 (2, 0., 0, 304.43646 ,  58, 0., 0.,  1.      ,  -0.9019805+6.379318j  ,  787323),
 (3, 0., 0, 299.42108 ,  52, 0., 0.,  1.      ,   6.365695 +3.1266918j ,  787332),
 (4, 0., 0,  39.4836  ,  28, 0., 0.,  9.182192, -16.804018 +9.028097j  ,  787304),
 (4, 0., 0,  76.83787 ,  28, 0., 0.,  1.      ,   3.2854116-5.3171387j , 1321869),
 (5, 0., 0, 143.26366 ,  24, 0., 0., 10.996129,  14.33073  +1.0027167j ,  787299)]
 
 
dtype = [('template_id', '<i8'), ('bank_chisq', '<f4'), ('bank_chisq_dof', '<i8'), ('chisq', '<f4'), ('chisq_dof', '<i8'), ('cont_chisq', '<f4'), ('psd_var_val', '<f4'), ('sg_chisq', '<f4'), ('snr', '<c8'), ('time_index', '<i8')]

vec = numpy.array(val, dtype=dtype)


a = vec['snr']
g = abs(a)

b = vec['snr'].copy()
h = abs(b)

print ((h-g).sum(), a, b, g, h)

# Uh, oh, we got a different answer!
if (h-g).sum() != 0:
    sys.exit(1)

Error message:

Output of the above test script when BROKEN (i.e. numpy 1.19.0)

-52.852776 [ -2.8499086-4.852268j    11.672331 -0.39883116j  -3.9071944-3.9876528j
   3.5721765-4.4356694j   -0.9019805+6.379318j     6.365695 +3.1266918j
 -16.804018 +9.028097j     3.2854116-5.3171387j   14.33073  +1.0027167j ] [ -2.8499086-4.852268j    11.672331 -0.39883116j  -3.9071944-3.9876528j
   3.5721765-4.4356694j   -0.9019805+6.379318j     6.365695 +3.1266918j
 -16.804018 +9.028097j     3.2854116-5.3171387j   14.33073  +1.0027167j ] [5.6272984e+00 1.1715089e+01 1.0000000e+00 0.0000000e+00 0.0000000e+00
 7.2867520e-44 3.9483601e+01 7.6837868e+01 0.0000000e+00] [ 5.6272984 11.679143   5.58279    5.6952267  6.442769   7.092128
 19.07568    6.2502713 14.3657675]

Output the above test script when WORKING (i.e. numpy 1.18.5)

0.0 [ -2.8499086-4.852268j    11.672331 -0.39883116j  -3.9071944-3.9876528j
   3.5721765-4.4356694j   -0.9019805+6.379318j     6.365695 +3.1266918j
 -16.804018 +9.028097j     3.2854116-5.3171387j   14.33073  +1.0027167j ] [ -2.8499086-4.852268j    11.672331 -0.39883116j  -3.9071944-3.9876528j
   3.5721765-4.4356694j   -0.9019805+6.379318j     6.365695 +3.1266918j
 -16.804018 +9.028097j     3.2854116-5.3171387j   14.33073  +1.0027167j ] [ 5.6272984 11.679143   5.58279    5.6952267  6.442769   7.092128
 19.07568    6.2502713 14.3657675] [ 5.6272984 11.679143   5.58279    5.6952267  6.442769   7.092128
 19.07568    6.2502713 14.3657675]

The 'broken' values looks suspiciously like an indexing error, as the second value output is what abs would be if using the wrong fields, and so on.

Numpy/Python version information:

Broken -> numpy == 1.19.0
Last Working -> numpy == 1.18.5

see https://travis-ci.org/github/ahnitz/test-numpy/builds/700834047

@eric-wieser
Copy link
Member

This looks like a vectorization bug to me, possibly introduced in #13885. @r-devulap, thoughts?

We've had bugs like this appear elsewhere, and the cause is usually that the vectorized code assumes the data is contiguous. You'd see the same effect comparing arr[::2] and arr[::2].copy() if this were the case, without needing to involve structured arrays at all.

@charris charris added this to the 1.19.1 release milestone Jun 22, 2020
@r-devulap
Copy link
Member

This was introduced in #15408. The vectorized code assumes that strides is a multiple of the element size, which breaks in this case of mixed data types. I will have a fix for it soon.

@eric-wieser eric-wieser added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Jun 22, 2020
@seberg
Copy link
Member

seberg commented Jun 22, 2020

Hmm, in general we use the uint-alignment here, but I guess that only really matters for complex. I am not sure it only breaks for mixed datatypes? I would think it breaks for complex specifically, because its alignment is considered to be the alignment of the underlying float type? (not quite sure though).

@eric-wieser
Copy link
Member

@r-devulap, presumably this applies to all vectorized loops?

@r-devulap
Copy link
Member

@seberg I am pretty sure it does affect other datatypes too. Just surprised it was never reported.

@r-devulap
Copy link
Member

@eric-wieser I think so.

@seberg
Copy link
Member

seberg commented Jun 22, 2020

@r-devulap the ufunc machinery should normally ensure aligned data by using buffering IIRC. Most loops are not written in a way that handle unaligned data in general.

@r-devulap
Copy link
Member

r-devulap commented Jun 22, 2020

myval = numpy.array([(0, 0.0),                                                  
                     (1, 1.0),                                                  
                     (2, 2.0),                                                  
                     (3, 3.0),                                                  
                     (4, 4.0),                                                  
                     (5, 5.0)], dtype=[('myint', '|S1'), ('myflt', '<f8')])     
temp = myval['myflt']                                                           
print(temp.__array_interface__)         

{'data': (94163961240769, False), 'strides': (9,), 'descr': [('', '<f8')], 'typestr': '<f8', 'shape': (6,), 'version': 3}

The SIMD code assumes that the spacing between strided elements is a multiple of the element data size, which doesn't seem to be the case in the above example where the float64 elements are spaced 9 bytes apart.

@r-devulap
Copy link
Member

Also, it doesn't look like the float64 elements are 8 byte aligned in the above example.

@seberg
Copy link
Member

seberg commented Jun 22, 2020

@r-devulap yes, but within the ufunc machinery, they will be. Try np.add(myval["myflt"], 0), it will use buffering to ensure aligned access from the inner-loop. The problem should really just be complex, because complex alignment happens to be smaller than the itemsize. myval["myflt"].flags shows aligned=False, but:

In [4]: arr = np.ones(10)                                                                                               

In [5]: arr.view(np.complex128).flags                                                                                    
Out[5]: 
  ALIGNED : True

In [6]: arr[1:-1].view(np.complex128).flags                                                                              
Out[6]: 
  ALIGNED : True

is True in both cases, because arr[1:-1].view(np.complex128).dtype.alignment is 8 bytes, and not 16 bytes.

@mattip
Copy link
Member

mattip commented Jun 22, 2020

matmul uses a is_blasable2d function to make sure the data pointer and strides are appropriate for use. Perhaps there should be a similar avx512 check.

@r-devulap
Copy link
Member

@seberg ok, that makes this little less scary of a bug :) While I work on fixing this for complex types, could you point me to the source code for the ufunc machinery? Lack of that knowledge has bitten me more than once now.

@seberg
Copy link
Member

seberg commented Jun 22, 2020

@r-devulap Its a bit hidden. Here is the fast path:

if (op[1] == NULL &&
(order == NPY_ANYORDER || order == NPY_KEEPORDER) &&
PyArray_TRIVIALLY_ITERABLE(op[0])) {
Py_INCREF(dtypes[1]);

But, typically this goes through the iterator here (EDIT: Ops, that link is too far down, somewhere AdvancedNew is created though):

op_it = NpyIter_GetOperandArray(iter);

Which guarantees buffering for unaligned inputs, e.g. here:

if (IsUintAligned(out)) {

(this is exposed as np.nditer in python, as you are probably aware of, which you could use to try around a bit). Interestingly, I do not think there is a way to flag the iterator that unaligned access is OK.

@seberg
Copy link
Member

seberg commented Jun 22, 2020

I suppose it might be reasonable to add an assert for SIMD code though. Since it is conceivable to relax that requirement if flagged to the iterator. On most typical hardware, unaligned reads are not big issue and probably much faster than buffering, AFAIK.

@charris charris removed this from the 1.19.1 release milestone Jul 8, 2020
duncanmmacleod added a commit to duncanmmacleod/pycbc that referenced this issue Nov 11, 2020
the issue reported in numpy/numpy#16660 was fixed in numpy/numpy#16666 and released as part of numpy 1.19.1, so pycbc should now be able to use those newer versions, and just avoid 1.19.0
duncanmmacleod added a commit to duncanmmacleod/pycbc that referenced this issue Nov 20, 2020
the issue reported in numpy/numpy#16660 was fixed in numpy/numpy#16666 and released as part of numpy 1.19.1, so pycbc should now be able to use those newer versions, and just avoid 1.19.0
ahnitz pushed a commit to gwastro/pycbc that referenced this issue Nov 20, 2020
the issue reported in numpy/numpy#16660 was fixed in numpy/numpy#16666 and released as part of numpy 1.19.1, so pycbc should now be able to use those newer versions, and just avoid 1.19.0
OliverEdy pushed a commit to OliverEdy/pycbc that referenced this issue Apr 3, 2023
the issue reported in numpy/numpy#16660 was fixed in numpy/numpy#16666 and released as part of numpy 1.19.1, so pycbc should now be able to use those newer versions, and just avoid 1.19.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants