Add support for return/ThrowBehavior of services
# Motivation

Currently the `ServiceGroup` cancels the whole group when a service returns or throws. This is fine for most server applications, but it doesn't work for CLI tools, which mostly want to use the group to orchestrate services while the user's command is handled and then return cleanly from the group. Furthermore, even in server setups one might want to customize what happens when a service returns or throws, e.g. one might want to shut down the group after the HTTP server throws so that a telemetry service can flush all remaining data.

# Modification

This PR does a few things:
1. It creates a new `ServiceConfiguration` and `TerminationBehavior` in the `ServiceGroupConfiguration`. Those can be used to declare the services that are run and what happens when a service returns or throws.
2. It adds a new `cancellationSignals` option to the `ServiceGroupConfiguration`, which allows cancellation to be triggered by a signal.
3. It makes sure that any given service is only retained as long as necessary, i.e. when a service returns the group no longer retains it. This allows resources to be freed as early as possible.
4. Breaking: Removes the `Hashable` conformance on the configuration structs. This was wrong in the first place and something we should just avoid doing in general.

# Result
We now give the user even more control over the group's behavior.
FranzBusch committed Aug 14, 2023
1 parent de1ef4b commit 38f62c8
Showing 9 changed files with 1,057 additions and 259 deletions.
@@ -1,41 +1,49 @@
# How to adopt ServiceLifecycle in applications

``ServiceLifecycle`` aims to provide a unified API that services should adopt to make orchestrating
them in an application easier. To achieve this ``ServiceLifecycle`` is providing the ``ServiceGroup`` actor.
``ServiceLifecycle`` aims to provide a unified API that services should adopt to
make orchestrating them in an application easier. To achieve this,
``ServiceLifecycle`` provides the ``ServiceGroup`` actor.

## Why do we need this?

When building applications we often have a bunch of services that comprise the internals of the applications.
These services include fundamental needs like logging or metrics. Moreover, they also include
services that comprise the application's business logic such as long-running actors.
Lastly, they might also include HTTP, gRPC, or similar servers that the application is exposing.
One important requirement of the application is to orchestrate the various services correctly during
startup and shutdown. Furthermore, the application also needs to handle a single service failing.

Swift introduced Structured Concurrency which already helps tremendously with running multiple
async services concurrently. This can be achieved with the use of task groups. However, Structured
Concurrency doesn't enforce consistent interfaces between the services, so it becomes hard to orchestrate them.
This is where ``ServiceLifecycle`` comes in. It provides the ``Service`` protocol which enforces
a common API. Additionally, it provides the ``ServiceGroup`` which is responsible for orchestrating
all services in an application.
When building applications we often have a bunch of services that comprise the
internals of the application. These services include fundamental needs like
logging or metrics. Moreover, they also include services that comprise the
application's business logic such as long-running actors. Lastly, they might
also include HTTP, gRPC, or similar servers that the application is exposing.
One important requirement of the application is to orchestrate the various
services during startup and shutdown.

Swift introduced Structured Concurrency which already helps tremendously with
running multiple asynchronous services concurrently. This can be achieved with
the use of task groups. However, Structured Concurrency doesn't enforce
consistent interfaces between the services, so it becomes hard to orchestrate
them. This is where ``ServiceLifecycle`` comes in. It provides the ``Service``
protocol which enforces a common API. Additionally, it provides the
``ServiceGroup`` which is responsible for orchestrating all services in an
application.

## Adopting the ServiceGroup in your application

This article is focusing on how the ``ServiceGroup`` works and how you can adopt it in your application.
If you are interested in how to properly implement a service, go check out the article: <doc:How-to-adopt-ServiceLifecycle-in-libraries>.
This article focuses on how the ``ServiceGroup`` works and how you can adopt it
in your application. If you are interested in how to properly implement a
service, check out the article:
<doc:How-to-adopt-ServiceLifecycle-in-libraries>.

### How does the ServiceGroup work?

The ``ServiceGroup`` is just a slightly complicated task group under the hood that runs each service
in a separate child task. Furthermore, the ``ServiceGroup`` handles individual services exiting
or throwing unexpectedly. Lastly, it also introduces a concept called graceful shutdown which allows
tearing down all services in reverse order safely. Graceful shutdown is often used in server
scenarios i.e. when rolling out a new version and draining traffic from the old version.
The ``ServiceGroup`` is just a complicated task group under the hood that runs
each service in a separate child task. Furthermore, the ``ServiceGroup`` handles
individual services exiting or throwing. Lastly, it also introduces a concept
called graceful shutdown which allows tearing down all services in reverse order
safely. Graceful shutdown is often used in server scenarios, e.g. when rolling
out a new version and draining traffic from the old version (commonly referred
to as quiescing).
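
To make the "task group under the hood" idea more concrete, here is a minimal sketch that uses only the Swift standard library. It is a simplified model and not the actual ``ServiceGroup`` implementation: `ToyService`, `SleepingService`, and `runUntilFirstExit` are hypothetical names introduced for illustration only. The sketch runs each service in its own child task and cancels the remaining services as soon as one of them exits. It assumes Swift 5.7 or later and top-level code in a `main.swift` file.

```swift
// Simplified, hypothetical model of "one child task per service".
// `ToyService`, `SleepingService`, and `runUntilFirstExit` are illustrative
// names and are not part of ServiceLifecycle.
protocol ToyService: Sendable {
    var name: String { get }
    func run() async throws
}

struct SleepingService: ToyService {
    let name: String
    let nanoseconds: UInt64

    func run() async throws {
        // Task.sleep throws a CancellationError if the task is cancelled.
        try await Task.sleep(nanoseconds: nanoseconds)
    }
}

/// Runs every service in its own child task and returns the name of the
/// first service whose `run()` exits, cancelling the remaining services.
func runUntilFirstExit(_ services: [any ToyService]) async -> String? {
    await withTaskGroup(of: String.self) { group in
        for service in services {
            group.addTask {
                // Returning and throwing are treated alike in this sketch.
                _ = try? await service.run()
                return service.name
            }
        }
        let firstExit = await group.next()
        // Cancel the remaining child tasks; the group awaits them before returning.
        group.cancelAll()
        return firstExit
    }
}

let firstExit = await runUntilFirstExit([
    SleepingService(name: "long-running", nanoseconds: 5_000_000_000),
    SleepingService(name: "short-lived", nanoseconds: 10_000_000),
])
print(firstExit ?? "no services")
```

The sketch deliberately leaves out graceful shutdown and signal handling; it only illustrates why a single service exiting can bring down the whole group.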

### How to use the ServiceGroup?

Let's take a look how the ``ServiceGroup`` can be used in an application. First, we define some
fictional services.
Let's take a look at how the ``ServiceGroup`` can be used in an application.
First, we define some fictional services.

```swift
struct FooService: Service {
@@ -53,11 +61,12 @@ public struct BarService: Service {
}
```

The `BarService` is depending in our example on the `FooService`. A dependency between services
is quite common and the ``ServiceGroup`` is inferring the dependencies from the order of the
services passed to the ``ServiceGroup/init(services:configuration:logger:)``. Services with a higher
index can depend on services with a lower index. The following example shows how this can be applied
to our `BarService`.
In our example, the `BarService` depends on the `FooService`. Dependencies
between services are quite common, and the ``ServiceGroup`` infers the
dependencies from the order of the services passed to
``ServiceGroup/init(configuration:)``. Services with a higher index can depend
on services with a lower index. The following example shows how this can be
applied to our `BarService`.

```swift
@main
@@ -68,9 +77,13 @@ struct Application {

let serviceGroup = ServiceGroup(
// We are encoding the dependency hierarchy here by listing the fooService first
services: [fooService, barService],
configuration: .init(gracefulShutdownSignals: []),
logger: logger
configuration: .init(
services: [
.init(service: fooService),
.init(service: barService)
],
logger: logger
)
)

try await serviceGroup.run()
@@ -80,17 +93,26 @@ struct Application {

### Graceful shutdown

The ``ServiceGroup`` supports graceful shutdown by taking an array of `UnixSignal`s that trigger
the shutdown. Commonly `SIGTERM` is used to indicate graceful shutdowns in container environments
such as Docker or Kubernetes. The ``ServiceGroup`` is then gracefully shutting down each service
one by one in the reverse order of the array passed to the init.
Importantly, the ``ServiceGroup`` is going to wait for the ``Service/run()`` method to return
Graceful shutdown is a concept from service lifecycle which aims to be a less
forceful alternative to task cancellation. Graceful shutdown instead lets the
various services opt in to supporting it. A common example of when you might
want to use graceful shutdown is in containerized environments such as Docker
or Kubernetes. In those environments, `SIGTERM` is commonly used to indicate to
the application that it should shut down before a `SIGKILL` is sent.

The ``ServiceGroup`` can be set up to listen to `SIGTERM` and trigger a graceful
shutdown on all its orchestrated services. It will then gracefully shut down
each service one by one in reverse startup order. Importantly, the
``ServiceGroup`` is going to wait for the ``Service/run()`` method to return
before triggering the graceful shutdown on the next service.

Since graceful shutdown is up to the individual services and application it requires explicit support.
We recommend that every service author makes sure their implementation is handling graceful shutdown
correctly. Lastly, application authors also have to make sure they are handling graceful shutdown.
A common example of this is for applications that implement streaming behaviours.
Since graceful shutdown is up to the individual services and application it
requires explicit support. We recommend that every service author makes sure
their implementation is handling graceful shutdown correctly. Lastly,
application authors also have to make sure they are handling graceful shutdown.
A common example of this is for applications that implement streaming
behaviours.

```swift
struct StreamingService: Service {
@@ -126,27 +148,32 @@ struct Application {
})

let serviceGroup = ServiceGroup(
services: [streamingService],
configuration: .init(gracefulShutdownSignals: [.sigterm]),
logger: logger
configuration: .init(
services: [.init(service: streamingService)],
gracefulShutdownSignals: [.sigterm],
logger: logger
)
)

try await serviceGroup.run()
}
}
```

The code above demonstrates a hypothetical `StreamingService` with a configurable handler that
is invoked per stream. Each stream is handled in a separate child task concurrently.
The above code doesn't support graceful shutdown right now. There are two places where we are missing it.
First, the service's `run()` method is iterating the `makeStream()` async sequence. This iteration is
not stopped on graceful shutdown and we are continuing to accept new streams. Furthermore,
the `streamHandler` that we pass in our main method is also not supporting graceful shutdown since it
is iterating over the incoming requests.

Luckily, adding support in both places is trivial with the helpers that ``ServiceLifecycle`` exposes.
In both cases, we are iterating an async sequence and what we want to do is stop the iteration.
To do this we can use the `cancelOnGracefulShutdown()` method that ``ServiceLifecycle`` adds to
The code above demonstrates a hypothetical `StreamingService` with a
configurable handler that is invoked per stream. Each stream is handled in a
separate child task concurrently. The above code doesn't support graceful
shutdown right now. There are two places where we are missing it. First, the
service's `run()` method is iterating the `makeStream()` async sequence. This
iteration is not stopped on graceful shutdown and we are continuing to accept
new streams. Furthermore, the `streamHandler` that we pass in our main method is
also not supporting graceful shutdown since it is iterating over the incoming
requests.

Luckily, adding support in both places is trivial with the helpers that
``ServiceLifecycle`` exposes. In both cases, we are iterating an async sequence
and what we want to do is stop the iteration. To do this we can use the
`cancelOnGracefulShutdown()` method that ``ServiceLifecycle`` adds to
`AsyncSequence`. The updated code looks like this:

```swift
@@ -183,18 +210,64 @@ struct Application {
})

let serviceGroup = ServiceGroup(
services: [streamingService],
configuration: .init(gracefulShutdownSignals: [.sigterm]),
logger: logger
configuration: .init(
services: [.init(service: streamingService)],
gracefulShutdownSignals: [.sigterm],
logger: logger
)
)

try await serviceGroup.run()
}
}
```

Now one could ask - Why aren't we using cancellation in the first place here? The problem is that
cancellation is forceful and doesn't allow users to make a decision if they want to cancel or not.
However, graceful shutdown is very specific to business logic often. In our case, we were fine with just
stopping to handle new requests on a stream. Other applications might want to send a response indicating
to the client that the server is shutting down and waiting for an acknowledgment of that message.
Now one could ask: why aren't we using cancellation in the first place here?
The problem is that cancellation is forceful and doesn't allow users to decide
whether they want to cancel or not. Graceful shutdown, however, is often very
specific to business logic. In our case, we were fine with simply no longer
handling new requests on a stream. Other applications might want to send a
response indicating to the client that the server is shutting down and then
wait for an acknowledgment of that message.

### Customizing the behavior when a service returns or throws

By default the ``ServiceGroup`` cancels the whole group if any one service
returns or throws. However, in some scenarios a service returning is entirely
expected, e.g. when the ``ServiceGroup`` is used in a CLI tool to orchestrate
some services while a command is handled. To customize the behavior you set the
``ServiceGroupConfiguration/ServiceConfiguration/returnBehaviour`` and
``ServiceGroupConfiguration/ServiceConfiguration/throwBehaviour``. Both of them
offer three different options. The default behavior for both is
``ServiceGroupConfiguration/ServiceConfiguration/TerminationBehavior/cancelGroup``.
You can also choose to either ignore if a service returns/throws by setting it
to ``ServiceGroupConfiguration/ServiceConfiguration/TerminationBehavior/ignore``
or trigger a graceful shutdown by setting it to
``ServiceGroupConfiguration/ServiceConfiguration/TerminationBehavior/gracefullyShutdownGroup``.
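
The difference between cancelling the group and ignoring a return can be sketched with a plain task group. This is a simplified model built only on the Swift standard library, not the library's implementation: `ToyTerminationBehavior`, `ToyServiceConfiguration`, and `runGroup` are hypothetical names, each "service" is modelled as a sleep, and the graceful shutdown option is left out to keep the sketch short. It assumes Swift 5.7 or later and top-level code in a `main.swift` file.

```swift
// Hypothetical stand-ins for illustration; not ServiceLifecycle types.
enum ToyTerminationBehavior: Sendable {
    case cancelGroup
    case ignore
}

struct ToyServiceConfiguration: Sendable {
    let name: String
    let sleepNanoseconds: UInt64
    let returnBehavior: ToyTerminationBehavior
}

/// Runs every configured "service" (modelled as a sleep) in a child task and
/// applies the configured behavior whenever one of them returns.
func runGroup(_ configurations: [ToyServiceConfiguration]) async -> [String] {
    await withTaskGroup(of: (Int, Bool).self) { group in
        for (index, configuration) in configurations.enumerated() {
            group.addTask {
                do {
                    try await Task.sleep(nanoseconds: configuration.sleepNanoseconds)
                    return (index, true) // returned normally
                } catch {
                    return (index, false) // cancelled while sleeping
                }
            }
        }

        var log: [String] = []
        while let (index, returnedNormally) = await group.next() {
            let configuration = configurations[index]
            guard returnedNormally else {
                log.append("\(configuration.name) cancelled")
                continue
            }
            switch configuration.returnBehavior {
            case .ignore:
                // Keep the other services running.
                log.append("\(configuration.name) returned, ignored")
            case .cancelGroup:
                // Tear down everything that is still running.
                log.append("\(configuration.name) returned, cancelling group")
                group.cancelAll()
            }
        }
        return log
    }
}

let log = await runGroup([
    .init(name: "telemetry", sleepNanoseconds: 10_000_000_000, returnBehavior: .cancelGroup),
    .init(name: "command", sleepNanoseconds: 10_000_000, returnBehavior: .ignore),
    .init(name: "server", sleepNanoseconds: 200_000_000, returnBehavior: .cancelGroup),
])
log.forEach { print($0) }
```

In this run the return of `command` is ignored, the return of `server` cancels the group, and `telemetry` is reported as cancelled long before its own sleep would have finished.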

Another example where you might want to use this is when you have a service that
should be gracefully shut down when another service exits, e.g. you want to make
sure your telemetry service is gracefully shut down after your HTTP server
unexpectedly threw from its `run()` method. This setup could look like this:

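```swift
import Logging
import ServiceLifecycle

@main
struct Application {
    static func main() async throws {
        let logger = Logger(label: "Application")
        let telemetryService = TelemetryService()
        let httpServer = HTTPServer()

        let serviceGroup = ServiceGroup(
            configuration: .init(
                services: [
                    // Listed first so that it is shut down last.
                    .init(service: telemetryService),
                    // Gracefully shut down the group when the HTTP server returns or throws.
                    .init(
                        service: httpServer,
                        returnBehavior: .gracefullyShutdownGroup,
                        throwBehavior: .gracefullyShutdownGroup
                    )
                ],
                logger: logger
            )
        )

        try await serviceGroup.run()
    }
}
```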