From c09f7776fc92045c7ac800045c342ef6950fbfdd Mon Sep 17 00:00:00 2001 From: Sean Quah Date: Mon, 16 May 2022 23:30:45 +0100 Subject: [PATCH 1/5] Add documentation for cancellation of request processing Signed-off-by: Sean Quah --- changelog.d/12761.doc | 1 + docs/SUMMARY.md | 1 + docs/cancellation.md | 387 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 389 insertions(+) create mode 100644 changelog.d/12761.doc create mode 100644 docs/cancellation.md diff --git a/changelog.d/12761.doc b/changelog.d/12761.doc new file mode 100644 index 000000000000..2eb2c0976f1b --- /dev/null +++ b/changelog.d/12761.doc @@ -0,0 +1 @@ +Add documentation for cancellation of request processing. diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md index 65570cefbe1b..5ba80d694586 100644 --- a/docs/SUMMARY.md +++ b/docs/SUMMARY.md @@ -89,6 +89,7 @@ - [Database Schemas](development/database_schema.md) - [Experimental features](development/experimental_features.md) - [Synapse Architecture]() + - [Cancellation](cancellation.md) - [Log Contexts](log_contexts.md) - [Replication](replication.md) - [TCP Replication](tcp_replication.md) diff --git a/docs/cancellation.md b/docs/cancellation.md new file mode 100644 index 000000000000..f0265b71ec58 --- /dev/null +++ b/docs/cancellation.md @@ -0,0 +1,387 @@ +# Cancellation +Sometimes, requests take a long time to service and clients disconnect +before Synapse produces a response. To avoid wasting resources, Synapse +can cancel request processing for select endpoints with the `@cancelled` +decorator. + +Synapse makes use of Twisted's `Deferred.cancel()` feature to make +cancellation work. + +## Enabling cancellation for an endpoint +1. Check that the endpoint method, and any `async` functions in its call + tree handle cancellation correctly. See + [Handling cancellation correctly](#handling-cancellation-correctly) + for a list of things to look out for. +2. Apply the `@cancellable` decorator to the `on_GET/POST/PUT/DELETE` + method. It's not recommended to make non-`GET` methods cancellable, + since cancellation midway through some database updates is less + likely to be handled correctly. + +## Mechanics +There are two stages to cancellation: downward propagation of a +`cancel()` call, followed by upwards propagation of a `CancelledError` +out of a blocked `await`. +Both Twisted and asyncio have a cancellation mechanism. + +| | Method | Exception | Exception inherits from | +|---------------|---------------------|-----------------------------------------|-------------------------| +| Twisted | `Deferred.cancel()` | `twisted.internet.defer.CancelledError` | `Exception` (!) | +| asyncio | `Task.cancel()` | `asyncio.CancelledError` | `BaseException` | + +### Deferred.cancel() +When Synapse starts handling a request, it runs the async method +responsible for handling it using `defer.ensureDeferred`, which returns +a `Deferred`. + +```python +def do_something() -> Deferred[None]: + ... + +async def on_GET() -> Tuple[int, JsonDict]: + d = make_deferred_yieldable(do_something()) + await d + return 200, {} + +request = defer.ensureDeferred(on_GET()) +``` + +During cancellation, `Deferred.cancel()` is called on the `Deferred` +from `defer.ensureDeferred`, `request`. Twisted knows which `Deferred` +`request` is waiting on and passes the `cancel()` call on to `d`. + +The `Deferred` being waited on, `d`, may have its own handling for +`cancel()` and pass the call on to other `Deferred`s. + +Eventually, a `Deferred` handles the `cancel()` call by resolving itself +with a `CancelledError`. + +### CancelledError +The `CancelledError` gets raised out of the `await` and bubbles up, as +per normal Python exception handling. + +## Handling cancellation correctly +In general, when writing code that might be subject to cancellation, two +things must be considered: + * The effect of `CancelledError`s raised out of `await`s. + * The effect of `Deferred`s being `cancel()`ed. + +Examples of code that handles cancellation incorrectly include: + * `try-except` blocks which swallow `CancelledError`s. + * Code that shares the same `Deferred`, which may be cancelled, between + multiple requests. + * Code that starts some processing that's exempt from cancellation, but + uses a logging context from cancellable code. The logging context + will be finished upon cancellation, while the uncancelled processing + is still using it. + +Some common patterns are listed below in more detail. + +### `async` function calls +Most functions in Synapse are relatively straightforward from a +cancellation standpoint: they don't do anything with `Deferred`s and +purely call and `await` other `async` functions. + +An `async` function handles cancellation correctly if its own code +handles cancellation correctly and all the async function it calls +handle cancellation correctly. For example: +```python +async def do_two_things() -> None: + check_something() + await do_something() + await do_something_else() +``` +`do_two_things` handles cancellation correctly if `do_something` and +`do_something_else` handle cancellation correctly. + +That is, when checking whether a function handles cancellation +correctly, its implementation and all its `async` function calls need to +be checked, recursively. + +As `check_something` is not `async`, it does not need to be checked. + +### CancelledErrors +Because Twisted's `CancelledError`s are `Exception`s, it's easy to +accidentally catch and suppress them. Care must be taken to ensure that +`CancelledError`s are allowed to propagate upwards. + + + + + + + + + + +
+ +**Bad**: +```python +try: + await do_something() +except Exception: + # `CancelledError` gets swallowed here. + logger.info(...) +``` + + +**Good**: +```python +try: + await do_something() +except CancelledError: + raise +except Exception: + logger.info(...) +``` +
+ +**OK**: +```python +try: + check_something() + # A `CancelledError` won't ever be raised here. +except Exception: + logger.info(...) +``` + + +**Good**: +```python +try: + await do_something() +except ValueError: + logger.info(...) +``` +
+ +#### defer.gatherResults +`defer.gatherResults` produces a `Deferred` which: + * broadcasts `cancel()` calls to every `Deferred` being waited on. + * wraps the first exception it sees in a `FirstError`. + +Together, this means that `CancelledError`s will be wrapped in +a `FirstError` unless unwrapped. Such `FirstError`s are liable to be +swallowed, so they must be unwrapped. + + + + + + +
+ +**Bad**: +```python +async def do_something() -> None: + await make_deferred_yieldable( + defer.gatherResults([...], consumeErrors=True) + ) + +try: + await do_something() +except CancelledError: + raise +except Exception: + # `FirstError(CancelledError)` gets swallowed here. + logger.info(...) +``` + + + +**Good**: +```python +async def do_something() -> None: + await make_deferred_yieldable( + defer.gatherResults([...], consumeErrors=True) + ).addErrback(unwrapFirstError) + +try: + await do_something() +except CancelledError: + raise +except Exception: + logger.info(...) +``` +
+ +### Creation of `Deferred`s +If a function creates a `Deferred`, the effect of cancelling it must be considered. `Deferred`s that get shared are likely to have unintended behaviour when cancelled. + + + + + + + + + +
+ +**Bad**: +```python +cache: Dict[str, Deferred[None]] = {} + +def wait_for_room(room_id: str) -> Deferred[None]: + deferred = cache.get(room_id) + if deferred is None: + deferred = Deferred() + cache[room_id] = deferred + # `deferred` can have multiple waiters. + # All of them will observe a `CancelledError` + # if any one of them is cancelled. + return make_deferred_yieldable(deferred) + +# Request 1 +await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") +# Request 2 +await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") +``` + + +**Good**: +```python +cache: Dict[str, Deferred[None]] = {} + +def wait_for_room(room_id: str) -> Deferred[None]: + deferred = cache.get(room_id) + if deferred is None: + deferred = Deferred() + cache[room_id] = deferred + # `deferred` will never be cancelled now. + # A `CancelledError` will still come out of + # the `await`. + # `delay_cancellation` may also be used. + return make_deferred_yieldable(stop_cancellation(deferred)) + +# Request 1 +await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") +# Request 2 +await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") +``` +
+ + +**Good**: +```python +cache: Dict[str, List[Deferred[None]]] = {} + +def wait_for_room(room_id: str) -> Deferred[None]: + if room_id not in cache: + cache[room_id] = [] + # Each request gets its own `Deferred` to wait on. + deferred = Deferred() + cache[room_id]].append(deferred) + return make_deferred_yieldable(deferred) + +# Request 1 +await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") +# Request 2 +await wait_for_room("!aAAaaAaaaAAAaAaAA:matrix.org") +``` +
+ +### Uncancelled processing +Some `async` functions may kick off some `async` processing which is +intentionally protected from cancellation, by `stop_cancellation` or +other means. If the `async` processing inherits the logcontext of the +request which initiated it, care must be taken to ensure that the +logcontext is not finished before the `async` processing completes. + + + + + + + + + + +
+ +**Bad**: +```python +cache: Optional[ObservableDeferred[None]] = None + +async def do_something_else( + to_resolve: Deferred[None] +) -> None: + await ... + logger.info("done!") + to_resolve.callback(None) + +async def do_something() -> None: + if not cache: + to_resolve = Deferred() + cache = ObservableDeferred(to_resolve) + # `do_something_else` will never be cancelled and + # can outlive the `request-1` logging context. + run_in_background(do_something_else, to_resolve) + + await make_deferred_yieldable(cache.observe()) + +with LoggingContext("request-1"): + await do_something() +``` + + +**Good**: +```python +cache: Optional[ObservableDeferred[None]] = None + +async def do_something_else( + to_resolve: Deferred[None] +) -> None: + await ... + logger.info("done!") + to_resolve.callback(None) + +async def do_something() -> None: + if not cache: + to_resolve = Deferred() + cache = ObservableDeferred(to_resolve) + run_in_background(do_something_else, to_resolve) + # We'll wait until `do_something_else` is + # done before raising a `CancelledError`. + await make_deferred_yieldable( + delay_cancellation(cache.observe()) + ) + else: + await make_deferred_yieldable(cache.observe()) + +with LoggingContext("request-1"): + await do_something() +``` +
+ +**OK**: +```python +cache: Optional[ObservableDeferred[None]] = None + +async def do_something_else( + to_resolve: Deferred[None] +) -> None: + await ... + logger.info("done!") + to_resolve.callback(None) + +async def do_something() -> None: + if not cache: + to_resolve = Deferred() + cache = ObservableDeferred(to_resolve) + # `do_something_else` will get its own independent + # logging context. `request-1` will not count any + # metrics from `do_something_else`. + run_as_background_process( + "do_something_else", + do_something_else, + to_resolve, + ) + + await make_deferred_yieldable(cache.observe()) + +with LoggingContext("request-1"): + await do_something() +``` + +
From 48d91cfba2ae83f65feb024997275b8f892894ce Mon Sep 17 00:00:00 2001 From: Sean Quah Date: Tue, 17 May 2022 14:34:33 +0100 Subject: [PATCH 2/5] Move docs/cancellation.md to docs/development/synapse_architecture/ --- docs/SUMMARY.md | 2 +- docs/{ => development/synapse_architecture}/cancellation.md | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename docs/{ => development/synapse_architecture}/cancellation.md (100%) diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md index 5ba80d694586..8400a6539a4e 100644 --- a/docs/SUMMARY.md +++ b/docs/SUMMARY.md @@ -89,7 +89,7 @@ - [Database Schemas](development/database_schema.md) - [Experimental features](development/experimental_features.md) - [Synapse Architecture]() - - [Cancellation](cancellation.md) + - [Cancellation](development/synapse_architecture/cancellation.md) - [Log Contexts](log_contexts.md) - [Replication](replication.md) - [TCP Replication](tcp_replication.md) diff --git a/docs/cancellation.md b/docs/development/synapse_architecture/cancellation.md similarity index 100% rename from docs/cancellation.md rename to docs/development/synapse_architecture/cancellation.md From 64db88877de829f9e3942d4f6df5862c155581c2 Mon Sep 17 00:00:00 2001 From: Sean Quah Date: Tue, 17 May 2022 14:42:42 +0100 Subject: [PATCH 3/5] Fix typo (thanks anoa) --- docs/development/synapse_architecture/cancellation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/development/synapse_architecture/cancellation.md b/docs/development/synapse_architecture/cancellation.md index f0265b71ec58..b01dfbb171b2 100644 --- a/docs/development/synapse_architecture/cancellation.md +++ b/docs/development/synapse_architecture/cancellation.md @@ -1,8 +1,8 @@ # Cancellation Sometimes, requests take a long time to service and clients disconnect before Synapse produces a response. To avoid wasting resources, Synapse -can cancel request processing for select endpoints with the `@cancelled` -decorator. +can cancel request processing for select endpoints marked with the +`@cancellable` decorator. Synapse makes use of Twisted's `Deferred.cancel()` feature to make cancellation work. From 3a6e68ed8dc37421d5b5e60fd34d1b43188f8f1e Mon Sep 17 00:00:00 2001 From: Sean Quah Date: Tue, 17 May 2022 14:43:10 +0100 Subject: [PATCH 4/5] Expand on the @cancellable decorator --- .../synapse_architecture/cancellation.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/docs/development/synapse_architecture/cancellation.md b/docs/development/synapse_architecture/cancellation.md index b01dfbb171b2..8b0d164a3120 100644 --- a/docs/development/synapse_architecture/cancellation.md +++ b/docs/development/synapse_architecture/cancellation.md @@ -5,14 +5,15 @@ can cancel request processing for select endpoints marked with the `@cancellable` decorator. Synapse makes use of Twisted's `Deferred.cancel()` feature to make -cancellation work. +cancellation work. The `@cancellable` decorator does nothing by itself +and merely acts as a flag. ## Enabling cancellation for an endpoint 1. Check that the endpoint method, and any `async` functions in its call tree handle cancellation correctly. See [Handling cancellation correctly](#handling-cancellation-correctly) for a list of things to look out for. -2. Apply the `@cancellable` decorator to the `on_GET/POST/PUT/DELETE` +2. Add the `@cancellable` decorator to the `on_GET/POST/PUT/DELETE` method. It's not recommended to make non-`GET` methods cancellable, since cancellation midway through some database updates is less likely to be handled correctly. @@ -31,12 +32,13 @@ Both Twisted and asyncio have a cancellation mechanism. ### Deferred.cancel() When Synapse starts handling a request, it runs the async method responsible for handling it using `defer.ensureDeferred`, which returns -a `Deferred`. +a `Deferred`. For example: ```python def do_something() -> Deferred[None]: ... +@cancellable async def on_GET() -> Tuple[int, JsonDict]: d = make_deferred_yieldable(do_something()) await d @@ -45,8 +47,10 @@ async def on_GET() -> Tuple[int, JsonDict]: request = defer.ensureDeferred(on_GET()) ``` -During cancellation, `Deferred.cancel()` is called on the `Deferred` -from `defer.ensureDeferred`, `request`. Twisted knows which `Deferred` +When a client disconnects early, Synapse checks for the presence of the +`@cancellable` decorator on `on_GET`. Since `on_GET` is cancellable, +`Deferred.cancel()` is called on the `Deferred` from +`defer.ensureDeferred`, `request`. Twisted knows which `Deferred` `request` is waiting on and passes the `cancel()` call on to `d`. The `Deferred` being waited on, `d`, may have its own handling for From 482f4dd82d238368c739bd3616aa698109fcce6c Mon Sep 17 00:00:00 2001 From: Sean Quah Date: Wed, 18 May 2022 16:45:17 +0100 Subject: [PATCH 5/5] Apply suggestions from code review --- docs/development/synapse_architecture/cancellation.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/development/synapse_architecture/cancellation.md b/docs/development/synapse_architecture/cancellation.md index 8b0d164a3120..ef9e0226353b 100644 --- a/docs/development/synapse_architecture/cancellation.md +++ b/docs/development/synapse_architecture/cancellation.md @@ -6,7 +6,8 @@ can cancel request processing for select endpoints marked with the Synapse makes use of Twisted's `Deferred.cancel()` feature to make cancellation work. The `@cancellable` decorator does nothing by itself -and merely acts as a flag. +and merely acts as a flag, signalling to developers and other code alike +that a method can be cancelled. ## Enabling cancellation for an endpoint 1. Check that the endpoint method, and any `async` functions in its call @@ -50,7 +51,7 @@ request = defer.ensureDeferred(on_GET()) When a client disconnects early, Synapse checks for the presence of the `@cancellable` decorator on `on_GET`. Since `on_GET` is cancellable, `Deferred.cancel()` is called on the `Deferred` from -`defer.ensureDeferred`, `request`. Twisted knows which `Deferred` +`defer.ensureDeferred`, ie. `request`. Twisted knows which `Deferred` `request` is waiting on and passes the `cancel()` call on to `d`. The `Deferred` being waited on, `d`, may have its own handling for