-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing #1266 by reordering nextparents (in v2; v1 is NOT FIXED!). #1274
Conversation
for more information, see https://pre-commit.ci
Codecov Report
|
Yes, clearly this isn't as simple a fix as I had first hoped. Maybe I've just missed something, but I'll take another look at some point :) |
This fails on the positional reducers, I believe, because I'd quite like to try re-writing the kernel(s) in |
Now the test-failures are just typetracer failures (I think, I haven't run all the tests locally). |
All the tests pass! Wait—is it the case that the only thing we needed was to sort the |
Yep! I was thinking about it whilst trying to redesign the kernel to compute nextparents and nextcarry directly, and realised that there was probably no guarantee of stability in the sort that we were using. It was the fact that it only failed for positional reducers that made me suspicious though. |
Note to self - if we do merge this, then we should definitely explain why we are using |
No kidding—so that somebody doesn't just say, "Oh, here's a faster sorting algorithm; I'll use that" (and maybe doesn't get segfaults because they're not on a Mac...). This is definitely a faceplam moment. I'm still astonished. So yes, we should think about merging this. I'm still keeping in mind your initial assessment that it's a band-aid on a system that's grown pretty complicated. I was talking with @ianna earlier today—she has a different solution so we should take a look at that. Also, all of these intermediate arrays have a noticeable impact on performance ( Then we'd be presenting only the |
I don't think this is a facepalm moment! There is a lot going on here, and it's not always clear what our invariants are.
Thankfully I think even if the sorter were changed, this would only break positional reducer results, rather than any allocation-related intermediary arrays. It just so happens that if the sorter isn't stable, the values in the reduction groups are visited in different orders, so positional reducers produce incorrect results. I never expected to care about sort stability! I agree about performance. I was thinking about it abstractly that each kernel is doing an allocation and some work, and as we get larger and deeper arrays this will add up. If we can do things in one (or fewer) passes then it means we can claw back some performance. RE optimising |
`stable` is the proper option for this requirement.
Great! Well done @agoose77 ! Thanks! |
I think we're all happy about this fix. There might be a more efficient solution, but this one is correct. I'll be merging it. |
for more information, see https://pre-commit.ci
Thanks all! Note that we didn't fix v1 here, but maybe that's another PR. |
I missed that. Now the title is less misleading. Of course: to do v1, we'd need to run a sorting algorithm in C++. Maybe one of the |
But yes, another PR. Not this one. |
@agoose77, just reordering the
parents
to be monotonic fixes the case you found, but it breaks others. I'm going to leave this open until we know what to do about it.I included @ianna in the branch name, thinking that this PR in v1 might involve a kernel equivalent to
np.argmax
, but it might not.