
Introduce Apply.TraverseStrategy #3993

Closed
wants to merge 5 commits

Conversation

johnynek
Contributor

This introduces an extension to Apply designed to make traverse and traverse-like functions faster.

The idea is to have a small set of functions, sufficient to implement traverse, that work together to handle the laziness of the type constructor.

This lets strict types with useful short-circuiting behavior, such as Either[E, *], Try, or List, expose that to the Traverse instance via Eval, while types such as Eval, IO, Validated[E, *], NonEmptyList, etc. that don't need that laziness can use map2 directly without the wasteful boxing.
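
For concreteness, here is a hypothetical sketch of the shape being described. The trait name, method names, and signatures below are illustrative assumptions, not the PR's actual API; the real definitions live in the diff.

```scala
import cats.{Apply, Eval}

// Hypothetical sketch only: each Apply picks its own "right-hand side" type
// Rhs[_] -- F itself for eager types, Eval[F[_]] for short-circuiting ones.
trait TraverseStrategy[F[_]] {
  type Rhs[_]
  def toRhs[A](fa: => F[A]): Rhs[A]
  def map2[A, B, C](fa: F[A], rb: Rhs[B])(f: (A, B) => C): Rhs[C]
  def run[A](ra: Rhs[A]): F[A]
}

object TraverseStrategy {
  // Eager: no Eval boxing, map2 is used directly (e.g. Validated, NonEmptyList, IO).
  def direct[F[_]](implicit F: Apply[F]): TraverseStrategy[F] =
    new TraverseStrategy[F] {
      type Rhs[A] = F[A]
      def toRhs[A](fa: => F[A]): F[A] = fa
      def map2[A, B, C](fa: F[A], rb: F[B])(f: (A, B) => C): F[C] = F.map2(fa, rb)(f)
      def run[A](ra: F[A]): F[A] = ra
    }

  // Lazy: thread Eval so short-circuiting types (Either, Option, Try, ...)
  // can stop without forcing the rest of the structure.
  def viaEval[F[_]](implicit F: Apply[F]): TraverseStrategy[F] =
    new TraverseStrategy[F] {
      type Rhs[A] = Eval[F[A]]
      def toRhs[A](fa: => F[A]): Eval[F[A]] = Eval.always(fa)
      def map2[A, B, C](fa: F[A], rb: Eval[F[B]])(f: (A, B) => C): Eval[F[C]] =
        F.map2Eval(fa, rb)(f)
      def run[A](ra: Eval[F[A]]): F[A] = ra.value
    }
}
```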

related to #3790 and #3962

interested in your comments @djspiewak @non

@johnynek
Contributor Author

It would be nice to benchmark this approach for IO and see if it matches a hand written loop.

Since I guess almost all instances will be direct or viaEval, I would hope the JIT won't see a megamorphic call site and will pretty much inline this as though it had been written by hand.

@djspiewak
Member

I'll add some more thoughts shortly, but very quickly… this is indeed going to be megamorphic. All of the call sites on TraverseStrategy itself exist in generic code which will be shared amongst all traverse usages within a runtime. I thought for a moment that we would be saved by the fact that there are only two TraverseStrategy instances which make sense, but the monad transformers are more compositional than that, meaning that the moment anyone uses Kleisli within a given JVM, everything will deoptimize and unwind. That's going to end up being a pretty significant cost since it will interrupt the inlining within the hot path.

Is there any way we can exploit the fact that there are really only two modalities here, lazy and eager? When I was first thinking about this problem space, I actually assumed we would simply write two different implementations of traverseViaChain rather than trying to abstract them together. I'm not sure that addresses the complexity of Kleisli though.
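
For illustration, a minimal sketch of those two modalities as separate loops. These are assumed right-folds over List, not the PR's code and not the actual traverseViaChain implementation.

```scala
import cats.{Applicative, Eval}

object TraverseLoops {
  // Eager modality: use map2 directly, no Eval boxing.
  def traverseEager[G[_], A, B](as: List[A])(f: A => G[B])(implicit G: Applicative[G]): G[List[B]] =
    as.foldRight(G.pure(List.empty[B]))((a, acc) => G.map2(f(a), acc)(_ :: _))

  // Lazy modality: map2Eval lets a short-circuiting G (Either, Option, ...)
  // stop without forcing the deferred tail of the fold.
  def traverseLazy[G[_], A, B](as: List[A])(f: A => G[B])(implicit G: Applicative[G]): G[List[B]] = {
    def loop(rest: List[A]): Eval[G[List[B]]] =
      rest match {
        case Nil     => Eval.now(G.pure(List.empty[B]))
        case a :: tl => G.map2Eval(f(a), Eval.defer(loop(tl)))(_ :: _)
      }
    loop(as).value
  }
}
```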

@johnynek
Contributor Author

I should have been clearer: I think it won't be megamorphic, because I assume most apps will only be using Direct or ViaEval, but that may be incorrect. Also, I guess almost all of our typeclass call sites are megamorphic already, so I don't think this would be a perf hit anyway, but it would be good to see some benchmarks.

I think the problem is that there really aren't just two ways to do this. Take Kleisli, for instance: we actually have a note in the code that its map2Eval implementation can blow the stack. We don't want Eval[Kleisli[F, A, B]]; we want Kleisli[Lambda[x => Eval[F[x]]], A, B].

So, I think ideally we wouldn't have map2Eval at all, and we would just have this TraverseStrategy and let each type define its type Rhs[_].
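
To make the contrast concrete, the two shapes being compared, written as types only (plain Scala 2 type-lambda syntax standing in for kind-projector's Lambda; the alias names are illustrative):

```scala
import cats.Eval
import cats.data.Kleisli

object KleisliShapes {
  // Suspending the whole Kleisli: one Eval wraps the entire function value.
  type OutsideEval[F[_], A, B] = Eval[Kleisli[F, A, B]]

  // Suspending inside the Kleisli: the deferral happens per input A, after the
  // function has been applied; with kind-projector this is
  // Kleisli[Lambda[x => Eval[F[x]]], A, B].
  type InsideEval[F[_], A, B] = Kleisli[({ type L[x] = Eval[F[x]] })#L, A, B]
}
```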

@johnynek
Contributor Author

johnynek commented Nov 3, 2021

@djspiewak I'm going to close this since I actually am not that concerned about the performance, but I think this approach is sound and is a good mechanism to allow types to communicate how their traverse should work.

Feel free to steal any of this code or ideas in any follow up.

It would be nice for traverse on IO to be faster I guess, but that's your department. ;)

@johnynek johnynek closed this Nov 3, 2021
@Daenyth
Contributor

Daenyth commented Jan 27, 2022

Relaying some notes from Discord:

I didn't have time to benchmark it to see who the winners and losers are. I feared we could be taking perf from some common cases to make IO marginally faster. If we could show that Option, Either, and Validated traverse are all just as fast, but Eval, IO, and Free are all significantly faster, and people were happy with the result, I think it would be worth merging.

and

as a carrot for anyone interested… it's definitely not a marginal improvement for IO
like legitimately 2x is about the right estimate
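
For reference, a hedged sketch of the kind of JMH comparison being asked for: check that a strict type (Either here) doesn't regress while IO gets the claimed win. The benchmark class, list size, types measured, and unsafeRunSync-based measurement are assumptions for illustration, not the project's actual bench suite.

```scala
import org.openjdk.jmh.annotations._

import cats.effect.IO
import cats.effect.unsafe.implicits.global
import cats.syntax.all._

// Illustrative JMH state class: traverse a fixed list with an eager type
// (Either) and with IO, so before/after runs can be compared per type.
@State(Scope.Benchmark)
class TraverseBench {
  var xs: List[Int] = _

  @Setup
  def setup(): Unit =
    xs = (1 to 1000).toList

  @Benchmark
  def traverseEither: Either[String, List[Int]] =
    xs.traverse(i => (i + 1).asRight[String])

  @Benchmark
  def traverseIO: List[Int] =
    xs.traverse(i => IO.pure(i + 1)).unsafeRunSync()
}
```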
