Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a task object for per-task join, cancel, and result retrieval #410

Closed
Nikratio opened this issue Jan 20, 2018 · 10 comments
Closed

Comments

@Nikratio
Copy link

Nikratio commented Jan 20, 2018

I am trying to write a server with trio. The main loop looks roughly like this:

async def accept_loop(nursery, sock, context):
    '''Accept connections and start tasks to handle them'''

    while True:
        (client_sock, addr) = await sock.accept()
        nursery.start_soon(process_requests, client_sock, context)

Now, I want to be able to shutdown the server cleanly. I could emit a cancel signal for the nursery, but that wouldn't give the worker tasks a chance to finish processing the hcurrent request. Therefore, the workers check context.should_terminate after each request. By setting this flag and waiting a little, I can cleanly shut down the workers.

That leaves me with the problem of the stopping accept_loop(). I can't rely on a flag here, because sock.accept may block for a very long time. So I need to cancel this task. But if I do so while the worker tasks are still running, they will get terminated as well. And if I do it after the workers have shutdown, then I will keep accepting requests when I actually want to shut down.

It would make my life much easier if I could cancel a specific task, rather than the entire thread group. Is there a reason why nursery.start_soon doesn't return something like curio's Task object which I can then use to cancel just this task?

@njsmith
Copy link
Member

njsmith commented Jan 21, 2018

Curio's Task.cancel comes from a model of cancellation where tasks are both the units of concurrency and the units of cancellation. In Trio, we don't conflate these – you might want to cancel anything from a single function call up to a task up to multiple tasks (and in the nursery system means that these are all very similar in general). Or see this essay for way more detail than you need on cancellation system design...

Anyway, the reason nurseries have attached cancel scopes is because they need to to support exception unwinding, and then once we have the scopes we might as well expose them. And it turns out that nurseries are often a convenient unit to work with. But there's nothing particularly special about them.

We could also automatically provide scopes for each task. But each cancel scope does add some small overhead – not a lot, and further optimizations are possible, but they do have to get checked on very checkpoint, and given that it seems silly to create lots and lots of extra cancel scopes of which most will never be used. And, since it's a lot easier to add APIs than remove them, and we're still figuring out what the best idioms are for working with trio's primitives, I'd rather err on the side of keeping things simpler for now. Maybe later it will become clear that being able to target individual tasks for cancellation is super useful and not having it is annoying, but it's hard to say right now. Or maybe we'll implement something like the soft cancellation support discussed in #147, and that'll take care of your use case in a different way.

In the mean time, you can set up a cancel scope around your accept loop and somehow pass a reference out to where the rest of your code can find it:

async def accept_loop(nursery, sock, context):
    with trio.open_cancel_scope() as cscope:
        # Saves the given cancel scope in an internal list; later context.terminate() will call .cancel() on it
        context.cancel_immediately_on_termination(cscope)
        while True:
            (client_sock, addr) = await sock.accept()
            nursery.start_soon(process_requests, client_sock, context)

Or hey, this is a concurrency library; you could concurrently run the accept loop while also waiting for the termination event:

async def accept_loop(nursery, sock, context):
    async with trio.open_nursery() as inner_nursery:
        async def cancel_when_terminated():
            await context.should_terminate.wait()
            inner_nursery.cancel_scope.cancel()

        inner_nursery.start_soon(cancel_when_terminated)

        while True:
            (client_sock, addr) = await sock.accept()
            nursery.start_soon(process_requests, client_sock, context)

Closing because I don't think there's anything we want to do here right now, but lmk if you still have questions or concerns.

@njsmith njsmith closed this as completed Jan 21, 2018
@Nikratio
Copy link
Author

The way I understand the documentation,I don't see how either of these solutions help.

The new cancel scope in method 1 won't protect the tasks from the nursery's cancellation, so I'd end up cancelling not just the task running accept_loop, but all tasks running process_requests as well - which is precisely what I want to avoid..

Method 2 just seems to modify where the cancellation originates, but not which tasks are affected.

@Nikratio
Copy link
Author

The workaround that I'm using at the moment is that instead of cancelling the task running accept loop, I close the listening socket.

@njsmith
Copy link
Member

njsmith commented Jan 21, 2018

The way I understand the documentation,I don't see how either of these solutions help.

They do work, so maybe there's something unclear in the docs?

Did you see this section? https://trio.readthedocs.io/en/latest/reference-core.html#child-tasks-and-cancellation

instead of cancelling the task running accept loop, I close the listening socket.

I agree that this ought to work, but I'm surprised if it does, because we haven't implemented #36 yet...

@Nikratio
Copy link
Author

Yes, I read that. The section says that "child tasks inherit the parent nursery’s cancel scopes" and from the secton on cancellation I would clearly conclude that an inner cancel scope will not protect a task from cancellation by an outer cancel scope unless the "shield" attribute is set (though I can't point at one specific sentence that says that). So in your first example, the worker tasks will inherit the outer cancel scope, and if we cancel that outer scope (to cancel the task running accept_loop), then the worker tasks will get cancelled too. The inner scope doesn't actually have any effect here.

Were am I wrong?

@njsmith
Copy link
Member

njsmith commented Jan 22, 2018

So in your first example, the worker tasks will inherit the outer cancel scope, and if we cancel that outer scope (to cancel the task running accept_loop), then the worker tasks will get cancelled too. The inner scope doesn't actually have any effect here.

...yes, if you cancel the outer scope, then that doesn't work. The point of those examples is that you should cancel the inner scope :-).

@Nikratio
Copy link
Author

But that's the opposite if what I want to achieve. The task calling accept() should be canceled, the worker tasks should not be chancelled but terminate gracefully.

@njsmith
Copy link
Member

njsmith commented Jan 22, 2018

The task calling accept() should be canceled, the worker tasks should not be chancelled but terminate gracefully.

Right. That's what my examples do.

Or more precisely, the accept loop is cancelled, because it's inside the inner cancel scope, and then after it stops the accept task exits. The worker tasks are running in a nursery that's outside the inner cancel scope, so they're not affected by the inner scope being cancelled.

@Nikratio
Copy link
Author

Ouch, alright, got it! I was getting confused by the fact that the task are spawned in the context of the inner with statement, but actually only run in the context of the outer one. Sorry, and thanks for your patience!

That said, while this solves my particular use-case, it still feels odd to me that there is no task object that I can use to cancel, join (and potentially retrieve the return value) of an individual task. I don't currently have a need for it, but I strongly feel that without it, something is lacking in the API.

@Nikratio Nikratio changed the title Allow cancelling individual tasks Introduce a task object for per-task join, cancel, and result retrieval Jan 22, 2018
@njsmith
Copy link
Member

njsmith commented Jan 22, 2018

Ouch, alright, got it! I was getting confused by the fact that the task are spawned in the context of the inner with statement, but actually only run in the context of the outer one. Sorry, and thanks for your patience!

No worries! If you're coming from other libraries then Trio really requires a shift in perspective. I think we've all been there :-).

That said, while this solves my particular use-case, it still feels odd to me that there is no task object that I can use to cancel, join (and potentially retrieve the return value) of an individual task. I don't currently have a need for it, but I strongly feel that without it, something is lacking in the API.

So it's funny: I felt that way too! The first release of Trio had all that stuff. (Check it out.) But what I found over ~a year of working with thing was that... basically what you just said kept happening over and over. "Well, it turns out in this case it's actually simpler not to use that stuff... but surely next time we'll want it!" Except we never did. And it turned out to create awkwardness for really useful changes, like nursery.start [1] and simplified nurseries, which turned out to critical for all kinds of things (e.g. absolutely needed for pytest-trio, trio-asyncio, websocket libraries, ...). So in 0.2.0 it got deprecated, and 0.3.0 removed it. This also removed a ton of API complexity: looking at the 0.2.0 release notes I count 14 classes/methods/etc. that were removed or moved out of the top-level trio namespace because of this change.

It's possible that the problem with the old API was in the details, not the concept; maybe another version of the same basic idea would be more useful. But I'm fine with deferring this until someone comes up with a proposal for what a better version would look like, with more compelling use cases. And I believe you can implement such an API as a library on top of what Trio has now, so there's nothing stopping folks from experimenting with new APIs here if they want to. So I'm going to leave this closed for now.

Thanks for bringing this up though – it's always useful to review these things, and this gives us somewhere to point people who ask about this.

[1] If start_soon returns a Task, that's easy to ignore when you don't want it, but if start returns a value + a Task, then you end up writing (thing_i_care_about, _) = await nursery.start(...) all the time. Cf. sock.accept(), which I always get wrong because I forget that it returns two values, one of which I never care about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants