
Proposal: Make Shift Easier to Reason About (and a bit more efficient) #58

Closed

clhodapp opened this issue May 14, 2017 · 3 comments

@clhodapp

Hey!

I'm super-thrilled to see that a standard typeclass and datatype for pure IO is under development in the cats ecosystem!

I read @djspiewak's introductory blog post yesterday and I was struck by how unintuitive the existing shift implementation is.

Before I explain why I think shift is confusing, though, let me first try to frame things by explaining the two goals I believe shift is trying to accomplish:

  1. We want to allow the user to control which ExecutionContext runs their synchronous effects.
  2. We want to allow the user to definitively shift execution back to some "desired" ExecutionContext after some possibly-thread-jumping effects complete.

The way that the present shift goes about achieving these goals is to bracket a target effect with jumps onto a target ExecutionContext (jump to the new context, run the effects, jump to the new context again). That way, if the effects are synchronous, we can control which ExecutionContext runs them (yay, first goal accomplished!). Of course, if the effects are fully or partially asynchronous, they could very well end up shifting execution onto a thread outside of our target ExecutionContext. Therefore, in order to achieve our second goal, we need the second jump to our target ExecutionContext after the target effects complete. That way, we control which ExecutionContext is active coming out of our bracket region, rather than the target effect controlling it.

Interestingly, shift actually turns effects asynchronous. By this, I mean that if you call shift several times in a row on a single synchronous effect (eff.shift(a).shift(b).shift...), only the first shift has the power to change which ExecutionContext the synchronous effect runs on. Subsequent calls to shift can only change which context gains control after the effect completes. Exploiting this is actually critical to achieving both of our goals at the same time, by leveraging the "double-shift" pattern described in Daniel's post: if we want to jump over to a special ExecutionContext to run a blocking effect and then come back to the main ExecutionContext, we can do something like eff.shift(ioEC).shift(mainEC), as sketched below.
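To make that concrete, here is a minimal sketch (the pool names are mine, and this assumes the current 0.x API, in which shift is a method on IO that takes an ExecutionContext):

import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext
import cats.effect.IO

val mainEC = ExecutionContext.global
val ioEC   = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())

// A stand-in for a blocking effect such as a file read.
val blockingRead: IO[String] = IO { Thread.sleep(100); "file contents" }

// Net effect: the read runs on ioEC, and control returns to mainEC
// once it completes (at the cost of two extra, wasted jumps).
val program: IO[String] = blockingRead.shift(ioEC).shift(mainEC)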

I've just explained how the current implementation allows us to meet both of our goals (even both at the same time!). Why am I writing this issue, then? There are two reasons:

First, shift is extremely difficult to reason about, especially for people other than the original author of a piece of code. This is because each time shift is used, it actually generates two ExecutionContext jumps, and the user of shift has to keep the effects of both jumps in mind. That isn't too hard once you understand what's going on, but it does mean that you will often get two ExecutionContext jumps when you were only after one. For the reader, though, it's much worse. Any time you come across a call to shift, you generally don't have any clues as to what the author's intentions were. That is, you don't know whether a given call to shift is intended to move a synchronous effect to another ExecutionContext or whether the intention is to shift the execution of subsequent effects (or both). You end up having to go down the rabbit hole of finding out whether the target effect contains blocking IO or is fully or partially asynchronous. And even doing that only gives you heuristics.

Things become worse still for learners, in my opinion, since I think the implementation creates a trap that will drive new users toward using shift improperly. This is because most people will initially have just one of the two problems that shift solves. For example, let's say that a programmer wants to do a file read on their ExecutionContext for blocking operations. They are likely to come across example code like IO(println("Hello!")).shift(ioCtx). This looks really promising! This shift thing seems to allow you to cause some specific effect to run on the ExecutionContext of your choice! At this point, the programmer will probably try using shift this way in their code, find that it seems to work, and move on, unaware that they have introduced a bug into their code. Of course, a programmer who first learns to use shift in order to control the ExecutionContext for subsequent effects is likely to have a similar problem. In both cases, the bug is likely to go unnoticed too, since it probably doesn't affect any computed result. Basically, I believe that in order to use shift correctly, a programmer has to understand the details of shift's implementation and its interactions with IO.async, which I don't think is reasonable to expect of new learners.
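Continuing the earlier sketch, here is what that trap looks like in code (readBlockingFile and cpuHeavyParse are hypothetical stand-ins):

def readBlockingFile(): String = { Thread.sleep(50); "raw" } // blocking stand-in
def cpuHeavyParse(s: String): Int = s.length                 // CPU-bound stand-in

val buggy: IO[Int] = for {
  data   <- IO(readBlockingFile()).shift(ioEC) // both inserted jumps target ioEC...
  result <- IO(cpuHeavyParse(data))            // ...so the parse also starts on ioEC!
} yield result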

Second, shift is likely to cause code to reschedule itself on ExecutionContexts about twice as often as it needs to. This is because the vast majority of the time when you use shift, you are either trying to change which ExecutionContext a synchronous effect runs on (in which case, you only need the first ExecutionContext jump that shift inserts) or you are trying to jump to some target ExecutionContext after an asynchronous computation completes (in which case, you only need the second ExecutionContext jump that shift inserts). In either case, your code will end up yielding its Thread and being rescheduled twice as often as it needs to.

So... What do I propose should be done instead? In short, I propose that shift become an independent, unit-returning effect that injects single ExecutionContext jumps. I've pushed working code showing what I mean to GitHub. You can see the most-relevant section at this link. As you can see, we retain the ability to control which ExecutionContext runs synchronous effects: we just have to shift to our desired ExecutionContext before running them. We also retain the ability to transfer execution back to our desired ExecutionContext after an asynchronous effect: we just have to add a shift after the target effect. There is less ambiguity for readers about the intention behind each call to shift. I believe that it's also night-and-day clearer for code-sample learners. It makes the double-shift pattern something so intuitive that pretty much any user could come up with it on their own on the spot, as opposed to something that you have to be taught as a weird non-intuitive trick. It also completely eliminates the unnecessary reschedulings that the current version of shift creates. In short, I feel this shift is just... better and I propose that it should be adopted in cats-effect.
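For reference, here is a minimal sketch of the shape I have in mind (my reading of the linked code, not a copy of it; it reuses the pools and blockingRead from the first sketch):

import scala.concurrent.ExecutionContext
import cats.effect.IO

// A unit-returning effect whose only job is a single jump onto ec.
def shift(ec: ExecutionContext): IO[Unit] =
  IO.async[Unit] { cb =>
    ec.execute(new Runnable {
      def run(): Unit = cb(Right(()))
    })
  }

// The double-shift pattern becomes self-explanatory: go to ioEC,
// do the work, come back to mainEC. One jump per shift, no waste.
val roundTrip: IO[String] =
  for {
    _ <- shift(ioEC)
    a <- blockingRead
    _ <- shift(mainEC)
  } yield a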

One other thing... Why is this example tagged as proposal-one in my repo? Well... I have always felt that this pattern of sealing ExecutionContext objects inside of our effects is... well... awful. I've long had the idea that it would be better to encode the notion that there should be one pool for blocking IO and one for compute directly into our datatypes and defer the selection of those pools until we actually run an effect. It's less flexible, but I'm not sure that a good design ever needs more thread pools than that (plus maybe one extra usually-sleeping Thread for timers, if that can't be cleanly integrated into the compute pool). And the flexibility is actually still retained by virtue of the existence of e.g. IO.async (it's just a bit harder to access). A rough sketch follows. Anyway, I intend to play around with this some and may write another proposal if the results are promising. I make no commitment on that, though.
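To gesture at what I mean, an entirely hypothetical sketch (none of these names exist anywhere; it reuses the unit-returning shift from above):

import scala.concurrent.ExecutionContext
import cats.effect.IO

// Effects only tag which of the two pools they want; the concrete
// ExecutionContexts are supplied once, at the edge of the program.
sealed trait Pool
case object Compute  extends Pool
case object Blocking extends Pool

final case class Pools(compute: ExecutionContext, blocking: ExecutionContext)

// Defer pool selection until run time.
def on[A](pool: Pool)(io: IO[A])(pools: Pools): IO[A] = {
  val ec = pool match {
    case Compute  => pools.compute
    case Blocking => pools.blocking
  }
  shift(ec).flatMap(_ => io)
}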

Note: This proposal is also available as a separate file in my cats-effect-scheduling-proposals repo, in case it proves to be hard to read as an issue. Here's a link.

@alexandru (Member) commented May 14, 2017

@clhodapp see these previous discussions:

> What do I propose should be done instead? In short, I propose that shift become an independent, unit-returning effect that injects single ExecutionContext jumps. I've pushed working code showing what I mean to GitHub. You can see the most-relevant section at this link.

The idea is intriguing. Note you can achieve the same effect right now:

for {
  _ <- IO.unit.shift(blocking)   // jump to the blocking pool
  a <- ioA                       // ioA starts there
  _ <- IO.unit.shift(computing)  // jump back to the compute pool
  b <- ioB                       // ioB starts there
} yield a + b

Personally I think shift is not intuitive because it's two operations in one. In Scalaz the operation is called fork, and what it does is fork a logical thread before executing your task; in other words, it doesn't say anything about where your task will end up being executed.

Having a shift that is exposed as an effect might be cool, but it doesn't protect the user much:

IO.async { cb =>
  // The producer hard-codes `compute`; the callback always fires there.
  compute.execute(new Runnable {
    def run() = cb(Right(1))
  })
}

In the case above, no matter what combination of shift or flatMap you have, you won't make that code execute on anything other than compute, and because the user only sees IO[Int], he has no idea whether he can control that execution context or not.

In other words, if a shift operation is the difference between correct code and a bug, then the IO reference in question isn't constructed properly ... usually it's the producer's job to ensure correctness.
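One way a producer could take that responsibility (a sketch; callerEC is a hypothetical parameter the producer might expose, and the jump back uses the IO.unit.shift trick from above):

def produceOne(compute: ExecutionContext, callerEC: ExecutionContext): IO[Int] =
  IO.async[Int] { cb =>
    compute.execute(new Runnable {
      def run(): Unit = cb(Right(1))
    })
  }.flatMap(n => IO.unit.shift(callerEC).map(_ => n))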

@djspiewak (Member) commented May 14, 2017

@clhodapp This is very elegant! I like the idea much better than either of the implementations that we've had (and also better than Task.fork). I'm also pretty happy that the implementation becomes simplified (no unsafe functions!) and it's twice as efficient to boot.

Would you be interested in formulating a PR? We should change shift on Effect as well (could put it on Async!), with identical semantics. I think that further discussion will likely materialize on such a pull request.

@alexandru I agree that there's nothing that can save us from a producer giving us an IO constructed with async which does things improperly. The apply analogue of this is handing people an IO which they then shift to an inappropriate pool. It's the same problem, just the other way around and quite a bit less likely to happen (because of how intuitively wrong it is).

In general, I would say that most of the time, improperly shifted IO is just going to result in suboptimal performance and (maybe) thread starvation, rather than outright crashes or misbehaviors. The clearest counter-examples to this are UI frameworks like Swing and SWT. The latter will crash if you improperly thread-shift, while the former will auto-magically convert all of its function calls from synchronous to asynchronous. Obviously, pool affinity in general is something that serious software needs to be very careful about. I'm not sure we can really make this problem easier (I see that as something that falls under the purview of frameworks like Monix and fs2), but we can at least make the APIs less confusing.

The proposal in the OP does a very nice job, imo, of collapsing the dual semantics of shift down to a single semantic which has the same meaning (both conceptually and semantically) regardless of its context. It does this while simultaneously unifying the notions of a prefix- and a suffix-shift, which I was trying to do with the current shift.

You're certainly right that we can achieve this effect now, but IMO this should be the only way to use shift, and obviously emulating the OP's shift semantic by applying it to IO.unit is going to be considerably less efficient. :-)

@clhodapp (Author)

Implemented by @djspiewak in #63
