Fixing #1266 by reordering nextparents (in v2; v1 is NOT FIXED!). #1274

jpivarski · 2022-02-01T17:56:01Z

@agoose77, just reordering the parents to be monotonic fixes the case you found, but it breaks others. I'm going to leave this open until we know what to do about it.

I included @ianna in the branch name, thinking that this PR in v1 might involve a kernel equivalent to np.argmax, but it might not.

codecov · 2022-02-01T18:27:21Z

Codecov Report

Merging #1274 (2d24309) into main (8e68200) will increase coverage by 1.61%.
The diff coverage is 75.86%.

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/awkward/_v2/_connect/numba/arrayview.py | `96.24% <ø> (ø)` | |
| src/awkward/_v2/_connect/numba/layout.py | `87.01% <ø> (ø)` | |
| src/awkward/_v2/forms/emptyform.py | `78.18% <ø> (-0.25%)` | ⬇️ |
| src/awkward/_v2/operations/convert/ak_from_cupy.py | `25.00% <0.00%> (-50.00%)` | ⬇️ |
| src/awkward/_v2/operations/describe/ak_type.py | `44.11% <0.00%> (ø)` | |
| src/awkward/_v2/operations/structure/ak_isclose.py | `100.00% <ø> (ø)` | |
| src/awkward/_v2/operations/convert/ak_to_cupy.py | `8.19% <3.44%> (-66.81%)` | ⬇️ |
| src/awkward/_v2/operations/describe/ak_backend.py | `9.52% <9.52%> (ø)` | |
| src/awkward/_v2/contents/unmaskedarray.py | `54.95% <25.00%> (-1.12%)` | ⬇️ |
| src/awkward/_v2/contents/indexedarray.py | `58.62% <37.14%> (-2.03%)` | ⬇️ |

... and 69 more

agoose77 · 2022-02-04T22:03:11Z

Yes, clearly this isn't as simple a fix as I had first hoped. Maybe I've just missed something, but I'll take another look at some point :)

agoose77 · 2022-02-11T22:37:28Z

This fails on the positional reducers, I believe, because `argsort` isn't guaranteed to preserve the order between equal elements. A quick check with `sorter = np.argsort(nextparents, kind="mergesort")` seems to confirm this.
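As a minimal sketch of what sort stability means here (the array contents are made up for illustration and are not taken from the failing test case), a stable `np.argsort` keeps elements that share a parent in their original relative order:

```python
import numpy as np

# Hypothetical parents array: element i belongs to output group nextparents[i].
nextparents = np.array([1, 0, 1, 0, 1, 0])

# kind="stable" guarantees that equal keys keep their original relative order.
sorter = np.argsort(nextparents, kind="stable")

print(sorter.tolist())               # [1, 3, 5, 0, 2, 4]
print(nextparents[sorter].tolist())  # [0, 0, 0, 1, 1, 1]
```

Within each group the original indices stay ascending (1, 3, 5 and then 0, 2, 4). The default `kind` makes no such promise, even though it may happen to return the same order on small inputs.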

I'd quite like to try re-writing the kernel(s) in listoffsetarray so that we can make use of the guarantee that parents is ordered, but that's a bigger task than I have time for right now.

agoose77 · 2022-02-11T22:57:43Z

Now the test failures are just typetracer failures (I think; I haven't run all the tests locally).

jpivarski · 2022-02-11T23:10:05Z

All the tests pass!

Wait—is it the case that the only thing we needed was to sort the parents with a stable sort? That's all that was wrong?

agoose77 · 2022-02-11T23:12:59Z

> Wait, is it the case that the only thing we needed was to sort the parents with a stable sort? That's all that was wrong?

Yep! I was thinking about it whilst trying to redesign the kernel to compute nextparents and nextcarry directly, and realised that there was probably no guarantee of stability in the sort that we were using. It was the fact that it only failed for positional reducers that made me suspicious though.

agoose77 · 2022-02-11T23:16:24Z

Note to self - if we do merge this, then we should definitely explain why we are using mergesort.

jpivarski · 2022-02-11T23:36:08Z

No kidding—so that somebody doesn't just say, "Oh, here's a faster sorting algorithm; I'll use that" (and maybe doesn't get segfaults because they're not on a Mac...).

This is definitely a facepalm moment. I'm still astonished.

So yes, we should think about merging this. I'm still keeping in mind your initial assessment that it's a band-aid on a system that's grown pretty complicated. I was talking with @ianna earlier today—she has a different solution so we should take a look at that.

Also, all of these intermediate arrays have a noticeable impact on performance (ak.sum is significantly slower than np.sum), and I've been thinking for a while that we should perhaps have a separate code path for the axis=-1 case. In #579, I was thinking that some of these arrays could be skipped without a separate code path, but we could skip a lot more of them with one. The downside is that it looks like more complexity, having two ways to do something, but that all-important axis=-1 case would be less difficult to understand if it's handled in isolation. (If, for instance, someone is looking at it with a profiling tool, trying to see what can be trimmed.)

Then we'd be presenting only the axis != -1 case as something unavoidably complex and slow. As we've learned this week, it takes great effort to understand what it's doing.
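To illustrate the general idea only (this is not Awkward Array's implementation; `sum_last_axis` and its arguments are hypothetical names), an innermost-axis sum over a flat `content` buffer with list `offsets` can walk the offsets directly, without building or sorting any parents array:

```python
import numpy as np

def sum_last_axis(content, offsets):
    # List i is content[offsets[i]:offsets[i + 1]]; empty lists sum to 0.
    out = np.zeros(len(offsets) - 1, dtype=content.dtype)
    for i in range(len(offsets) - 1):
        out[i] = content[offsets[i]:offsets[i + 1]].sum()
    return out

content = np.array([1, 2, 3, 4, 5])
offsets = np.array([0, 3, 3, 5])   # lists: [1, 2, 3], [], [4, 5]
result = sum_last_axis(content, offsets)
print(result.tolist())  # [6, 0, 9]
```

A real kernel would do this in compiled code, but the sketch shows why the axis=-1 case could be easier to read and profile in isolation.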

agoose77 · 2022-02-11T23:43:14Z

> This is definitely a facepalm moment. I'm still astonished.

I don't think this is a facepalm moment! There is a lot going on here, and it's not always clear what our invariants are.

> (and maybe doesn't get segfaults because they're not on a Mac...).

Thankfully I think even if the sorter were changed, this would only break positional reducer results, rather than any allocation-related intermediary arrays. It just so happens that if the sorter isn't stable, the values in the reduction groups are visited in different orders, so positional reducers produce incorrect results. I never expected to care about sort stability!
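A small sketch of that effect, using made-up values and a hypothetical visit order standing in for what an unstable sort might produce: an order-independent reducer such as sum is unaffected, while a positional reducer such as argmax reports a different position.

```python
import numpy as np

values = np.array([3, 9, 1])           # one reduction group
stable_order = np.array([0, 1, 2])     # visit order from a stable sort
shuffled_order = np.array([1, 0, 2])   # hypothetical order from an unstable sort

sum_stable = int(values[stable_order].sum())
sum_shuffled = int(values[shuffled_order].sum())
argmax_stable = int(np.argmax(values[stable_order]))
argmax_shuffled = int(np.argmax(values[shuffled_order]))

print(sum_stable, sum_shuffled)        # 13 13
print(argmax_stable, argmax_shuffled)  # 1 0
```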

I agree about performance. I was thinking about it abstractly that each kernel is doing an allocation and some work, and as we get larger and deeper arrays this will add up. If we can do things in one (or fewer) passes then it means we can claw back some performance.

RE optimising axis=-1 reductions. I suspect these are very common. It would be interesting to see how much faster they are than the slow path. My only worry is more code to debug (e.g. this reduction bug that only shows up at n>=4), but I am also cognizant of how even a 10% perf gain is substantial. This is certainly more your area of experience in the Awkward domain, so I'm not going to try and push my opinion around!

ianna · 2022-02-14T09:13:26Z

Great! Well done @agoose77 ! Thanks!

jpivarski · 2022-02-18T02:21:51Z

I think we're all happy about this fix. There might be a more efficient solution, but this one is correct. I'll be merging it.

agoose77 · 2022-02-19T00:06:15Z

Thanks all! Note that we didn't fix v1 here, but maybe that's another PR.

jpivarski · 2022-02-19T01:02:54Z

I missed that. Now the title is less misleading.

Of course: to do v1, we'd need to run a sorting algorithm in C++. Maybe one of the argsort kernels can be used for that.

jpivarski · 2022-02-19T01:03:04Z

But yes, another PR. Not this one.

Reordered nextparents to be monotonic (in v2), but this isn't right.

47b575d

jpivarski linked an issue Feb 1, 2022 that may be closed by this pull request

ak.sum produces incorrect structure for outer dimension #1266

Closed

[pre-commit.ci] auto fixes from pre-commit.com hooks

15e9e45

for more information, see https://pre-commit.ci

[skip-ci] add a 4d study case

de2d62b

Fix: use mergesort, which is stable for like elements

f4f996f

This fixes the typetracer failures.

b0343a1

Fix: use stable instead of mergesort

92d9f82

`stable` is the proper option for this requirement.

Merge branch 'main' into jpivarski-ianna/fix-4D-reducers

1f480df

jpivarski marked this pull request as ready for review February 18, 2022 02:20

jpivarski enabled auto-merge (squash) February 18, 2022 02:21

pre-commit-ci bot and others added 2 commits February 18, 2022 02:22

[pre-commit.ci] auto fixes from pre-commit.com hooks

0771274

for more information, see https://pre-commit.ci

Fixed botched merge.

2d24309

jpivarski merged commit d0b383f into main Feb 18, 2022

jpivarski deleted the jpivarski-ianna/fix-4D-reducers branch February 18, 2022 03:23

jpivarski changed the title ~~Fixing #1266 (in v1 and v2), possibly by reordering nextparents.~~ Fixing #1266 by reordering nextparents (in v2; v1 is NOT FIXED!). Feb 19, 2022

jpivarski mentioned this pull request Feb 26, 2022

Specialize reducers with axis=-1 #1323

Open

jpivarski mentioned this pull request Mar 29, 2022

If this test passes, #1283 is fixed. #1388

Closed

agoose77 mentioned this pull request Aug 3, 2022

Development: use standardised PR title prefixes? #1577

Closed
