
Cancelled __aexit__ handlers eat exceptions #455

Open
smurfix opened this issue Feb 22, 2018 · 17 comments

@smurfix
Contributor

smurfix commented Feb 22, 2018

This is what happens when you don't wrap your async context's exit handler in a shielded cancel scope:

import trio

class bar:
    async def __aenter__(self):
        return self
    async def __aexit__(self, *tb):
        # A checkpoint inside a now-cancelled scope: this raises
        # Cancelled, replacing the propagating RuntimeError; the scope
        # then absorbs the Cancelled, and the RuntimeError is gone.
        await trio.sleep(0)

async def foo():
    with trio.open_cancel_scope() as s:
        async with bar():
            s.cancel()
            raise RuntimeError("Duh")

trio.run(foo)

… trick question. Nothing happens – the error gets dropped on the floor. Since this is a fairly common mistake (at least IME), I wonder whether we can do something about it. If not (which is what I suspect), we need to document this pitfall more loudly.
@njsmith
Member

njsmith commented Feb 23, 2018

Sorry this has been biting you! I'd like to understand better what exactly is happening. I guess it must be something like:

  • Some code raises an exception

  • While that's propagating, the surrounding scope gets cancelled

  • While unwinding for the original exception, Python calls __aexit__, which calls some Trio primitive, which raises a Cancelled exception (probably this exception has the original exception stashed as its __context__, following Python's normal exception chaining rules)

  • Then the Cancelled exception gets swallowed by the original cancel scope, so the original exception is lost entirely

Does that sound right?

It would be helpful to hear more concrete details about how you managed to trigger this combination of circumstances -- maybe it would give clues about how it could be handled better.

#285 and #416 touch on related/overlapping issues.
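
A minimal sketch to confirm that chaining (the Probe class here is illustrative, using the trio API of the time):

import trio

class Probe:
    async def __aenter__(self):
        return self
    async def __aexit__(self, *exc_info):
        try:
            await trio.sleep(0)  # checkpoint: Cancelled is raised here
        except trio.Cancelled as exc:
            # The original error is still reachable via implicit chaining...
            print("Cancelled.__context__:", repr(exc.__context__))
            raise  # ...but the cancel scope then absorbs Cancelled, and it's lost

async def main():
    with trio.open_cancel_scope() as scope:
        async with Probe():
            scope.cancel()
            raise RuntimeError("Duh")

trio.run(main)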

@njsmith
Member

njsmith commented Feb 23, 2018

I should also say I'm very wary about making a blanket recommendation that people use shielding in any __aexit__/finally blocks... it might be the best solution for now in your case, I don't know, but in the general case shielding is dangerous as well (what if the blocking code in the __aexit__ block gets stuck?), so hopefully there's a more specific solution we can come up with.
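
One common compromise is a shield with its own deadline, so cleanup is protected from outside cancellation but can't block forever. A rough sketch (the class and the five-second bound are illustrative only, not a recommendation from this thread):

import trio

class guarded:
    async def __aenter__(self):
        return self
    async def __aexit__(self, *exc_info):
        # Shielded from outside cancellation, but bounded by its own
        # deadline, so stuck cleanup can hang for at most five seconds.
        with trio.CancelScope(shield=True,
                              deadline=trio.current_time() + 5):
            await trio.sleep(0)  # stands in for real async cleanup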

@goodboy
Member

goodboy commented Jul 13, 2018

Ok, I just got bit badly by something similar, but incorporating async gens. It makes sense, but it was still a nightmare to debug.

Check it:

import trio

at_exit_entered = False

class bar:
    async def __aenter__(self):
        return self
    async def __aexit__(self, *tb):
        print(tb)
        global at_exit_entered
        at_exit_entered = True
        await trio.sleep(0)


async def nums(seed):
    async with bar():
        for i in range(seed):
            await trio.sleep(0)
            yield i


async def iterate_nums():
    with trio.open_cancel_scope() as cs:
        # When this task is cancelled mid-iteration, the generator is
        # abandoned at its yield, so ``bar.__aexit__()`` is left to
        # (eventual) garbage collection instead of running here.
        async for i in nums(10):
            print(i)
            # cs.shield = True
            await trio.sleep(0.1)
            # cs.shield = False


async def foo():
    async with trio.open_nursery() as n:
        n.start_soon(iterate_nums)
        await trio.sleep(0.8)
        n.cancel_scope.cancel()


trio.run(foo)
# fails since ``bar.__aexit__()`` is never triggered
assert at_exit_entered

Un-commenting the ``cs.shield =`` lines of course makes it work.

@njsmith
Member

njsmith commented Jul 13, 2018

@tgoodlet ah, yeah, what you're hitting is a different frustrating problem. The problem is that neither you nor Python are taking responsibility for making sure your async generator object gets cleaned up properly. There's more discussion of this in PEP 533 (which is stalled, unfortunately). For now the only real workaround is to replace your async for with a clunky async with+async for combination:

from async_generator import aclosing

async def iterate_nums():
    ...
    async with aclosing(nums(10)) as ait:
        async for i in ait:
            ...
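
Filled out against the nums example above (a sketch; on Python 3.10+ the same helper is also available as contextlib.aclosing):

import trio
from async_generator import aclosing

async def nums(seed):
    for i in range(seed):
        await trio.sleep(0)
        yield i

async def iterate_nums():
    # aclosing() guarantees nums(10).aclose() runs right here, inside
    # this task, so cleanup inside the generator happens deterministically
    # instead of (maybe) at garbage collection time.
    async with aclosing(nums(10)) as ait:
        async for i in ait:
            print(i)
            await trio.sleep(0.1)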


@njsmith
Member

njsmith commented Jul 13, 2018

BTW @smurfix, I'd still be interested in seeing how you managed to trigger the original issue in real code, in case you still remember. You say it's a common mistake, and I don't doubt you, but I also can't quite visualize how you're triggering it :-).

@goodboy
Member

goodboy commented Jul 13, 2018

@njsmith ahah! ok I'll try this aclosing() out. Thx!

Also, fwiw, I have the same issue as @smurfix and have to specifically shield in order to get reliable cancellation working for actor nurseries.

@njsmith
Member

njsmith commented Oct 23, 2018

#746 was a duplicate of this.

@oremanj, can you give an example of what you were doing when you ran into this? I'm still having trouble getting an intuition for when it happens in practice :-)

There is a plausible justification for the current behavior: the task was going along, executing some arbitrary python code, which got cancelled. And in this case the arbitrary python code happened to be an exception handler. But of course what makes this counterintuitive is that we often think of finally or __aexit__ blocks as things that exceptions pass through, rather than places where they're caught, some logic executes, and then they're rethrown. I'm not really sure how to reconcile these different intuitions.

Over there @smurfix suggested that when an exception propagates into an __aexit__ method, and it raises a new exception, then maybe the two exceptions should be combined into a MultiError, instead of having the new exception replace the old exception. It's an interesting suggestion! Python's standard behavior here is sort of the only thing you can do in classic Python where MultiError doesn't exist, but now that we have MultiError maybe it should be revisited! There is one major obstacle that jumps to mind though, which is that one of the pillars of the MultiError v2 design (#611) is that each location either always wraps exceptions in MultiError, or never wraps exceptions in MultiError, and it would probably be surprising and annoying to have all exceptions that pass through channel.__aexit__ get wrapped into MultiError?

Though tbh we have this problem for nurseries too, and I'm not sure yet how that will all play out. Probably we should give MultiError v2 some time to shake out before we make decisions about __aexit__.

@smurfix
Contributor Author

smurfix commented Oct 23, 2018

I'm not too fond of the "always-MultiError-or-never" idea. It's trivial to teach MultiError.split to work on non-MultiError exceptions – just package the exception into a MultiError and then return (None, exc) or (exc, None). There's a bit of boilerplate involved, in that you need

except BaseException as exc:
    err, rest = MultiError.split(exc, RuntimeError)
    if err is None: raise
    … don't forget to think about what happens when `rest` is not `None`

instead of a plain

except RuntimeError as exc:

but that can't really be avoided in any case.

@njsmith
Member

njsmith commented Oct 23, 2018

Yeah, it's trivial to teach MultiError.split and MultiError.catch to work on non-MultiError exceptions – in fact they already do, in the prototype – but it's less trivial to find all the places where except won't work, and has to be replaced by something like MultiError.catch.

@smurfix
Contributor Author

smurfix commented Oct 23, 2018

Well, that gets more trivial if everything raises a MultiError, so you can't get away with handling the "old" exceptions.
It'll be interesting to see what kind of patterns emerge from all of this so that eventually the Python core can be extended appropriately. Hopefully.
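
For reference, Python 3.11 eventually grew exactly this shape in the core language: ExceptionGroup and except*. A minimal stdlib sketch of that split-style handling, as an aside (the handle function is illustrative):

def handle(exc: BaseException) -> None:
    # Mirrors MultiError.split(exc, RuntimeError): except* peels off the
    # matching part and re-raises anything left as a (smaller) group.
    try:
        raise exc
    except* RuntimeError as matched:
        print("handled:", matched.exceptions)

handle(ExceptionGroup("demo", [RuntimeError("a")]))  # fully handled
handle(RuntimeError("bare"))  # bare exceptions are matched too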

@vxgmichel
Contributor

While investigating python-trio/pytest-trio#75, I ran into the same issue. I managed to reproduce the exception swallowing with a simpler (and probably more common) example:

import trio

async def main():
    with trio.CancelScope() as scope:
        try:
            raise RuntimeError("Oooops")
        finally:
            scope.cancel()
            # This checkpoint raises Cancelled, which replaces the
            # in-flight RuntimeError; the scope then swallows Cancelled.
            await trio.sleep(0)

Note that the exception doesn't get swallowed if await trio.sleep(0) is commented out, since without a checkpoint there is no place for Cancelled to be raised.

@njsmith
Member

njsmith commented Jun 8, 2019

The discussion in python-trio/pytest-trio#77 managed to isolate a case where this could happen accidentally in real-world code. That example seems to rely on a combination of factors: two nested nurseries inside a context manager, a crashing task in the inner nursery, a self-cancellation, and, crucially, the weird semantics in pytest-trio where a fixture yield can be cancelled without raising Cancelled. So, I'm glad we found it, but I'm not sure this is what anyone else in this issue was running into...

@Zac-HD
Member

Zac-HD commented May 28, 2024

Another case of an exception getting dropped on the floor, here due to a plain old except inside a cancel scope:

    with trio.move_on_after(1):
        try:
            raise ZeroDivisionError()
        except ZeroDivisionError:
            # The one-second deadline fires during this sleep: Cancelled
            # propagates out of the handler before the re-raise below,
            # and move_on_after absorbs it, dropping the ZeroDivisionError.
            await trio.sleep(10)
            if some_condition():
                raise

I'm genuinely unsure what should happen in this case.

@smurfix
Contributor Author

smurfix commented May 28, 2024

I'm genuinely unsure what should happen in this case.

This can also happen the other way 'round, with even more interesting consequences.

Bottom line: you need to shield async code inside exception blocks (in finally: clauses, as well as in except: clauses which might re-raise) from outside cancellations. There's no good way around it.
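
Applied to Zac-HD's example above, that shielding looks roughly like this (the 30-second bound is an illustrative safety valve):

    with trio.move_on_after(1):
        try:
            raise ZeroDivisionError()
        except ZeroDivisionError:
            # Shielded from the outer deadline, so the handler runs to
            # completion and the re-raise below can't be swallowed; its
            # own deadline keeps a stuck handler from hanging forever.
            with trio.CancelScope(shield=True,
                                  deadline=trio.current_time() + 30):
                await trio.sleep(10)
            if some_condition():
                raise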

Ideally, Python code checkers would warn about this, but I don't know whether any of them do.

@Zac-HD
Member

Zac-HD commented May 28, 2024

flake8-async's ASYNC102 warns you about checkpoints inside finally: or except Cancelled: blocks, but not (yet) other except clauses. I guess we should add a warning for any other except-that-raises too.

@belm0
Member

belm0 commented May 28, 2024

    with trio.move_on_after(1):
        try:
            raise ZeroDivisionError()
        except ZeroDivisionError:
            await trio.sleep(10)
            if some_condition():
                raise

It looks similar to the opening example in #1559. How much are these issues related?

I ask because, for #1559, we work around it with a context manager (originally prototyped by oremanj):

class _preserve_current_exception:
    """A context manager which should surround an ``__exit__`` or
    ``__aexit__`` handler or the contents of a ``finally:``
    block. It ensures that any exception that was being handled
    upon entry is not masked by a `trio.Cancelled` raised within
    the body of the context manager.
    """

https://github.com/python-trio/trio-websocket/blob/6a004bbf4749791e89f15f57cb2aced2c442246f/trio_websocket/_impl.py#L49-L77
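
A hypothetical usage sketch (the Resource class and its cleanup coroutine are illustrative):

class Resource:
    async def __aexit__(self, exc_type, exc, tb):
        # Any Cancelled raised during cleanup can no longer mask the
        # exception that was already propagating into __aexit__.
        with _preserve_current_exception():
            await self._do_cleanup()  # hypothetical cleanup coroutine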

@jakkdl
Member

jakkdl commented Aug 31, 2024

flake8-async's ASYNC102 warns you about checkpoints inside finally:, or an except Cancelled:, but not (yet) other except clauses. I guess we should add any other except-that-raises here.

This is now implemented and released as ASYNC120 (await-in-except).
