ak.sum produces incorrect structure for outer dimension #1266
Comments
I'm not yet familiar enough with how we use these values, but here is a setup that reproduces the kernel call in isolation:

```python
import numpy as np

outstarts = np.zeros(4, dtype=np.int64)
outstops = np.zeros(4, dtype=np.int64)
distincts, distincts_length, gaps, outlength = (
    np.array([0, 0, 0, 2, 2, 2, 1, 1, 1, 3, 3, 3], dtype=np.int64),
    12,
    np.array([1, 2, 1, 140187859207408], dtype=np.int64),
    4,
)
```
```python
def awkward_ListOffsetArray_reduce_nonlocal_outstartsstops_64(
    outstarts, outstops, distincts, lendistincts, gaps, outlength
):
    maxcount = lendistincts if outlength == 0 else lendistincts // outlength
    j = 0
    k = 0
    maxdistinct = -1
    lasti = -1
    for i in range(lendistincts):
        j_0 = j
        if maxdistinct < distincts[i]:
            maxdistinct = distincts[i]
            extra = (i - lasti) // maxcount
            lasti = i
            numgappy = gaps[j]
            if numgappy < extra:
                numgappy = extra
            # Create gaps for blank sublists with start=stop
            for gappy in range(numgappy):
                outstarts[k] = i
                outstops[k] = i
                k = k + 1
            j = j + 1
        if distincts[i] != -1:
            outstops[k - 1] = i + 1
    # Fill the remaining start-stops to produce empty lists
    for k in range(k, outlength):
        outstarts[k] = lendistincts + 1
        outstops[k] = lendistincts + 1


awkward_ListOffsetArray_reduce_nonlocal_outstartsstops_64(
    outstarts, outstops, distincts, distincts_length, gaps, outlength
)
```
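As a side note from transcribing this (not part of the original report): running the transcription with those inputs makes the out-of-bounds write visible, because NumPy bounds-checks where the C kernel does not. A condensed, self-contained version of the same logic (shortened function name is mine):

```python
import numpy as np

def outstartsstops(outstarts, outstops, distincts, lendistincts, gaps, outlength):
    # condensed transcription of the kernel above (same logic, shortened names)
    maxcount = lendistincts if outlength == 0 else lendistincts // outlength
    j = k = 0
    maxdistinct = lasti = -1
    for i in range(lendistincts):
        if maxdistinct < distincts[i]:
            maxdistinct = distincts[i]
            extra = (i - lasti) // maxcount
            lasti = i
            numgappy = max(gaps[j], extra)
            for _ in range(numgappy):  # gaps for blank sublists, start == stop
                outstarts[k] = i
                outstops[k] = i
                k += 1
            j += 1
        if distincts[i] != -1:
            outstops[k - 1] = i + 1
    for k in range(k, outlength):  # remaining start/stops become empty lists
        outstarts[k] = outstops[k] = lendistincts + 1

distincts = np.array([0, 0, 0, 2, 2, 2, 1, 1, 1, 3, 3, 3], dtype=np.int64)
gaps = np.array([1, 2, 1, 140187859207408], dtype=np.int64)
try:
    outstartsstops(np.zeros(4, np.int64), np.zeros(4, np.int64), distincts, 12, gaps, 4)
except IndexError:
    print("out-of-bounds write: k reaches 4 with outlength == 4")
```

With these inputs, the inner gappy loop writes `outstarts[4]` on a length-4 array, which is exactly the out-of-bounds write discussed below.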
I don't think this is actually the cause of this problem, though (because we never read these out-of-bounds values). Instead, something looks funky with the `distincts` themselves.
This does look wrong. Does it work in v1? Cases like this were extensively tested (and I thought I had ported all of those tests to v2). Since it's a structure thing, it shouldn't matter which reducer we use. If it is a v1/v2 difference, the issue wouldn't be in the kernel itself, since that's the same for both; it would be a matter of how it's called and what Indexes it's writing into.
@jpivarski it doesn't work in v1 either. As I understand our reducers, and based upon what you've said, the resulting `parents` are not monotonic, which means that later kernel functions that appear to assume monotonic parents fail (which is why I suspected this kernel).
This simpler example also reproduces it:

```python
import awkward as ak

x = ak._v2.Array([
    [[[1], [4]], [[5], [8]]],
    [[[9], [12]], [[13], [16]]],
])
print(ak._v2.sum(x, axis=1))
print(ak._v2.to_numpy(x).sum(axis=1))
```
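For reference, here is the NumPy side of that comparison on its own (my addition, just spelling out the structure that `ak.sum` should match):

```python
import numpy as np

x = np.array([[[[1], [4]], [[5], [8]]],
              [[[9], [12]], [[13], [16]]]])
print(x.sum(axis=1).tolist())  # [[[6], [12]], [[22], [28]]]
```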
I modified the v2 recursion so that the relevant indices are hard-coded at this depth:

```python
if depth == 3:
    nextparents[:] = [0, 0, 1, 1, 2, 2, 3, 3]
    nextcarry[:] = [0, 2, 1, 3, 4, 6, 5, 7]
```

and this then produces the correct result, making me think it's the aforementioned kernel computing these indices that is at fault.
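As a sanity check (my addition, with plain NumPy standing in for the reduction machinery): applying that hard-coded `nextcarry`/`nextparents` pair to the flattened leaves does give the correct `axis=1` sums:

```python
import numpy as np

data = np.array([1, 4, 5, 8, 9, 12, 13, 16])   # flattened leaves of x
nextcarry = np.array([0, 2, 1, 3, 4, 6, 5, 7])
nextparents = np.array([0, 0, 1, 1, 2, 2, 3, 3])

carried = data[nextcarry]              # [1, 5, 4, 8, 9, 13, 12, 16]
sums = np.zeros(4, dtype=data.dtype)
np.add.at(sums, nextparents, carried)  # segment-sum by parent
print(sums.tolist())                   # [6, 12, 22, 28]
```

Flattened, `[6, 12, 22, 28]` is exactly NumPy's `x.sum(axis=1)` result `[[[6], [12]], [[22], [28]]]`.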
Let me backpedal on that statement: at least, equal integer values are guaranteed to be contiguous. Argmin and argmax need additional positional information, so they are a special case here.

Edit: for the moment, I'm going to be non-committal about whether that guarantee holds in general.
The "below" case is the easier one; the "above" case was motivated by scikit-hep/awkward-0.x#166. The implementation strategy started by just trying to reproduce NumPy for fixed list lengths by reordering the next level down.

Things like "maxcount," "findgaps," and "offsetscopy" are complications for dealing with variable list lengths, beyond NumPy. Since you saw an error in a purely NumPy-like case, those aren't relevant to the problem, and neither are the kernels that manage them.

The testing started in tests/test_0115-generic-reducer-operation.py. Notice that the tests start by dealing with the NumPy-like case before moving on to issues with "gaps" and such. The NumPy-like tests all work on a 3D array of primes,

```python
>>> np.array(primes)[:2*3*5].reshape(2, 3, 5)
array([[[  2,   3,   5,   7,  11],
        [ 13,  17,  19,  23,  29],
        [ 31,  37,  41,  43,  47]],

       [[ 53,  59,  61,  67,  71],
        [ 73,  79,  83,  89,  97],
        [101, 103, 107, 109, 113]]])
```

and test all possible `axis` values. For comparison, the failing cases reported here are:

```python
>>> ak.sum(ak.Array([[[[1], [4]], [[5], [8]]], [[[9], [12]], [[13], [16]]]]), axis=0).tolist()
[[[10], []], [[16, 18], []]]
>>> ak.sum(ak.Array([[[[1], [4]], [[5], [8]]], [[[9], [12]], [[13], [16]]]]), axis=1).tolist()
[[[6], []], [[12, 22], []]]
```

Reduction does not treat RegularArrays any differently from List/ListOffsetArrays, except inasmuch as they preserve regularity in the output (a later correction, #447).

For what it's worth, the reducer algorithms were first developed outside the context of Awkward Arrays, in studies/reducers.py, to be able to make tests rapidly and avoid the formality of defining kernels right away. This file might be useful to see how the kernels work together without the special cases that have accumulated over the years. See, for instance, its straight-through implementation of the reduction.

One last thing: the only kernel that updates a specially prepared Index is awkward_ListOffsetArray_reduce_nonlocal_preparenext, which uses and changes it in place.
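Spot-checking that primes fixture with plain NumPy (my addition; `primes` here is just the first thirty primes, as in the tests):

```python
import numpy as np

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
          31, 37, 41, 43, 47, 53, 59, 61, 67, 71,
          73, 79, 83, 89, 97, 101, 103, 107, 109, 113]
arr = np.array(primes)[:2 * 3 * 5].reshape(2, 3, 5)

print(arr.sum(axis=0))  # shape (3, 5); e.g. entry [0, 0] is 2 + 53 = 55
print(arr.sum(axis=1))  # shape (2, 5); e.g. entry [0, 0] is 2 + 13 + 31 = 46
print(arr.sum(axis=2))  # shape (2, 3); e.g. entry [0, 0] is 2 + 3 + 5 + 7 + 11 = 28
```

Because every prime is distinct, any misplaced value shows up immediately in these sums, which is what makes this a good fixture for structure bugs.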
I've looked into this some more, and I now understand that this bug is a manifestation of `parents` having non-well-defined semantics: some kernels appear to assume the parents are monotonic, while the kernels computing them do not guarantee that ordering. I'd like to just clarify my understanding of what these arrays mean.
As far as the actual reduction is concerned, the order of `parents` doesn't matter: the reducer just reduces and writes to the output. It's only when rebuilding the structure that we care about this order. More concretely, consider this example:

```python
x = ak._v2.Array([
    [[[1], [4]], [[5], [8]], [[6], [7]]],
    [[[9], [12]], [[13], [16]]],
])
```

Assume the leaf data of this layout are

```python
data = [1, 4, 5, 8, 6, 7, 9, 12, 13, 16]
```

At the nonlocal level, we want to produce

```python
carry = [0, 2, 4, 1, 3, 5, 6, 8, 7, 9]
nextparents = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]
```

whereas right now we have

```python
carry = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]
nextparents = [0, 0, 0, 2, 2, 1, 1, 1, 3, 3]
```

If this is the case, I think we need to reorder these indices so that `nextparents` comes out monotonic.
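A quick check of that claim (my addition): a stable argsort of `nextparents` turns the pair we currently produce into the desired one. Stability matters here, so that equal parents keep their relative `carry` order:

```python
import numpy as np

carry = np.array([0, 2, 4, 6, 8, 1, 3, 5, 7, 9])
nextparents = np.array([0, 0, 0, 2, 2, 1, 1, 1, 3, 3])

order = np.argsort(nextparents, kind="stable")  # stable: preserves order within a parent
print(carry[order].tolist())        # [0, 2, 4, 1, 3, 5, 6, 8, 7, 9]
print(nextparents[order].tolist())  # [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]
```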
I don't remember now; it's entirely possible that I had different ideas about the monotonicity of `parents` at different times.

A way to test this idea, and possibly as a long-term fix, is to put a reordering step after the kernel that computes `nextparents`.

The studies script should still run as-is. If you put one of your reproducing examples into it, you should see the same error there, because it's a flaw in the logic, and that script never tested a 4D regular array. Adding the reordering step would be a lot faster and more hackable in that script than it would be in the real thing: you'd get a confirmation that it works before spending the time needed to create a new kernel. What do you think?
This would work as a band-aid; I tested it earlier :)

```python
j = self._nplike.argsort(nextparents)
nextcarry = ak._v2.index.Index64(self._nplike.asarray(nextcarry)[j])
nextparents = ak._v2.index.Index64(self._nplike.asarray(nextparents)[j])
```

I think we probably can avoid this by re-writing the kernel that computes these indices.
There are so many index-rewrites in this code-path that adding one more as a band-aid wouldn't disturb the performance much. If the question is whether that would obfuscate the code more, that's a more serious concern, but I think a kernel that only reorders could be easier to read. It has the advantage that we can guarantee that it is not wrong just by reading it, in isolation, unconnected to anything else. Certainly none of the other kernels are going to break by having monotonic `parents`.

If the band-aid can be applied in short order (within your time constraints), then that would be more valuable than doing a more thorough job later, with the risk of it not happening at all. Things do fall through the cracks, and this is a bug-fix. Bug-fixes are important!

If you can write down, here perhaps, what that kernel should look like, another one of us can implement the kernel (i.e. adding to YAML and cpu-kernels). @ianna and I have experience in that workflow.

Oh! I see that the code you wrote is a complete v2 fix. We might even have an equivalent of that already.
Yes to both! Yeah, we can put a TODO to just fix the kernel properly later on.
Hmm... I managed to crash v1 with the following test:

```python
import numpy as np
import awkward as ak

array = np.array([
    [[[1], [4]], [[5], [8]]],
    [[[9], [12]], [[13], [16]]],
])
nplike = ak.nplike.of(array)
ak_array = ak.from_numpy(array)
assert ak.to_list(ak.sum(ak_array, axis=1)) == ak.to_list(array.sum(axis=1))
```
@ianna did this segfault or raise an Exception?
yep, the first :-(
I think I need some help here. It seems to me I'm missing some extra information about what this kernel calculates and what it can assume about its inputs.
@ianna Yes, I'm pretty sure that's where the problem is. Please stop me if I am telling you things that you already know! This is just from my attempts to understand this logic. However, I made an assumption about the ordering of `parents` that may not actually hold.
I'm looking at this further. Your test case is interesting @ianna , because it seems like the truly jagged case is not fixed by ordering the parents, which is why I missed it with the regular test case. |
How Jagged Reduction Works

This is for my benefit in referencing how we do jagged reduction. There are two types of information running through jagged reduction: the values being reduced, and the index arrays (`carry`, `parents`, and friends) that describe where each value belongs in the output.

Some of the context for this content is passed from the parent. In the case of the root content, the parent does not exist, so this information is given by the assumption of an outer list of length 1. (I'm ignoring some details here.)

From the parent (pertaining to this layout), we have the reduction context, e.g. `outlength` and `parents`.

From this content, we compute the indices for the reduction of our child, e.g. `nextcarry` and `nextparents`.

From this content, we also compute the list structure for our wrapped reduction result, e.g. `outstarts` and `outstops`.
* Reordered nextparents to be monotonic (in v2), but this isn't right.
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* [skip-ci] add a 4d study case
* Fix: use mergesort, which is stable for like elements.
* This fixes the typetracer failures.
* Fix: use `stable` instead of `mergesort`; `stable` is the proper option for this requirement.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ianna Osborne <ianna.osborne@cern.ch>
Co-authored-by: Angus Hollands <goosey15@gmail.com>
Version of Awkward Array
56312ea
This is seen in both v1 and v2 in master
Description and code to reproduce
Given this layout:
Calling `ak.sum(x, axis=0)` does not match (in structure) the NumPy result.
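The layout itself did not survive this capture. Judging from the comments above, the reproducer is presumably the 4D array used throughout the thread; for reference (my addition), the NumPy structure the result should match is:

```python
import numpy as np

x = np.array([[[[1], [4]], [[5], [8]]],
              [[[9], [12]], [[13], [16]]]])
print(x.sum(axis=0).tolist())  # [[[10], [16]], [[18], [24]]]
```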