Loki: Add a prepare-shutdown endpoint #8786

Merged
merged 7 commits into from
Mar 17, 2023
Changes from 5 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -6,6 +6,7 @@

##### Enhancements

* [8786](https://github.com/grafana/loki/pull/8786) **DylanGuedes**: Ingester: add new /ingester/prepare_shutdown endpoint.
* [8744](https://github.com/grafana/loki/pull/8744) **dannykopping**: Ruler: remote rule evaluation.
* [8727](https://github.com/grafana/loki/pull/8727) **cstyan** **jeschkies**: Propagate per-request limit header to querier.
* [8682](https://github.com/grafana/loki/pull/8682) **dannykopping**: Add fetched chunk size distribution metric `loki_chunk_fetcher_fetched_size_bytes`.
13 changes: 12 additions & 1 deletion docs/sources/api/_index.md
@@ -621,6 +621,17 @@ backing store. Mainly used for local testing.

In microservices mode, the `/flush` endpoint is exposed by the ingester.

### Tell ingester to release all resources on next SIGTERM

```
POST /ingester/prepare_shutdown
```

`/ingester/prepare_shutdown` will prepare the ingester to release resources on the next SIGTERM signal,
Contributor
I feel like we should define SIGTERM here.

Contributor Author
SIGTERM is a type of signal that a process can receive and is supported by all Unix-based operating systems. Do you think saying SIGTERM signal is enough here?

Contributor
LOL, and I have just exposed my UNIX ignorance. But it's also possible that we might have customers in Windows-only environments who could use the information? What do you think?

Contributor Author
Sorry, with "could use the information" do you mean the definition of SIGTERM? If so: personally, I feel like this isn't the appropriate place for explaining a concept like SIGTERM, although saying "SIGTERM signal" indeed looked better than just saying "SIGTERM".

Contributor
OK, let's go with "SIGTERM signal" then.

where releasing resources means flushing data and unregistering from the ring.
This endpoint supersedes any YAML configuration and isn't necessary if the ingester is already
configured to unregister from the ring or to flush on shutdown.
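
As a rough illustration of how an operator tool might call this endpoint before sending SIGTERM, here is a minimal Go sketch. The helper names `prepareShutdown` and `runDemo` are hypothetical, and the server below is only a local stand-in that answers with the same status and header as the handler in this PR; it is not a real ingester.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// prepareShutdown (hypothetical helper) sends
// POST <baseURL>/ingester/prepare_shutdown and returns the HTTP status code.
func prepareShutdown(client *http.Client, baseURL string) (int, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/ingester/prepare_shutdown", nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// runDemo starts a stand-in server (an assumption for this sketch, not Loki)
// that mimics the endpoint's response, then calls it with prepareShutdown.
func runDemo() (int, error) {
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/ingester/prepare_shutdown" {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
	}))
	defer stub.Close()
	return prepareShutdown(stub.Client(), stub.URL)
}

func main() {
	status, err := runDemo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("status:", status)
}
```

Against a real deployment, `baseURL` would be the ingester's HTTP address; the call only arms the behavior, and the resources are released when the process later receives SIGTERM.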

## Flush in-memory chunks and shut down

```
@@ -1401,4 +1412,4 @@ $ curl -H "Content-Type: application/json" -XPOST -s "https://localhost:3100/api
This is helpful for scaling down WAL-enabled ingesters where we want to ensure old WAL directories are not orphaned,
but instead flushed to our chunk backend.

In microservices mode, the `/ingester/flush_shutdown` endpoint is exposed by the ingester.
In microservices mode, the `/ingester/flush_shutdown` endpoint is exposed by the ingester.
21 changes: 18 additions & 3 deletions pkg/ingester/ingester.go
@@ -181,6 +181,7 @@ type Interface interface {
// deprecated
LegacyShutdownHandler(w http.ResponseWriter, r *http.Request)
ShutdownHandler(w http.ResponseWriter, r *http.Request)
PrepareShutdown(w http.ResponseWriter, r *http.Request)
}

// Ingester builds chunks for incoming log streams.
@@ -502,7 +503,8 @@ func (i *Ingester) running(ctx context.Context) error {
return serviceError
}

// Called after running exits, when Ingester transitions to Stopping state.
// stopping is called when Ingester transitions to Stopping state.
//
// At this point, loop no longer runs, but flushers are still running.
func (i *Ingester) stopping(_ error) error {
i.stopIncomingRequests()
@@ -523,8 +525,8 @@ func (i *Ingester) stopping(_ error) error {

i.streamRateCalculator.Stop()

// In case the flag to terminate on shutdown is set we need to mark the
// ingester service as "failed", so Loki will shut down entirely.
// In case the flag to terminate on shutdown is set or this instance is marked to release its resources,
// we need to mark the ingester service as "failed", so Loki will shut down entirely.
// The module manager logs the failure `modules.ErrStopProcess` in a special way.
if i.terminateOnShutdown && errs.Err() == nil {
return modules.ErrStopProcess
@@ -568,6 +570,19 @@ func (i *Ingester) LegacyShutdownHandler(w http.ResponseWriter, r *http.Request)
w.WriteHeader(http.StatusNoContent)
}

// PrepareShutdown will handle the /ingester/prepare_shutdown endpoint.
//
// Internally, when triggered, this handler configures the ingester service to release its resources the next time a SIGTERM is received.
// Releasing resources means flushing data, deleting tokens, and removing the ingester from the ring.
func (i *Ingester) PrepareShutdown(w http.ResponseWriter, r *http.Request) {
level.Info(util_log.Logger).Log("msg", "preparing full ingester shutdown, resources will be released on SIGTERM")
i.lifecycler.SetFlushOnShutdown(true)
i.lifecycler.SetUnregisterOnShutdown(true)
i.terminateOnShutdown = true
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusOK)
}
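
One way to check the handler's effect in isolation is to drive it with `httptest` against a fake lifecycler. The sketch below is an assumption-laden illustration: `fakeLifecycler`, `sketchIngester`, and `exercise` are hypothetical names, and the fake only records the two flags that the real lifecycler setters receive in this diff.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fakeLifecycler is a hypothetical stand-in for the ring lifecycler.
// It only records the flags set by the handler.
type fakeLifecycler struct {
	flushOnShutdown      bool
	unregisterOnShutdown bool
}

func (l *fakeLifecycler) SetFlushOnShutdown(v bool)      { l.flushOnShutdown = v }
func (l *fakeLifecycler) SetUnregisterOnShutdown(v bool) { l.unregisterOnShutdown = v }

// sketchIngester holds just the fields the handler touches.
type sketchIngester struct {
	lifecycler          *fakeLifecycler
	terminateOnShutdown bool
}

// PrepareShutdown mirrors the handler body from the diff: arm all three
// flags, then answer 200 with a JSON content type.
func (i *sketchIngester) PrepareShutdown(w http.ResponseWriter, _ *http.Request) {
	i.lifecycler.SetFlushOnShutdown(true)
	i.lifecycler.SetUnregisterOnShutdown(true)
	i.terminateOnShutdown = true
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
}

// exercise calls the handler once and reports the status code and whether
// every flag ended up set.
func exercise() (status int, allSet bool) {
	ing := &sketchIngester{lifecycler: &fakeLifecycler{}}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/ingester/prepare_shutdown", nil)
	ing.PrepareShutdown(rec, req)
	allSet = ing.lifecycler.flushOnShutdown &&
		ing.lifecycler.unregisterOnShutdown &&
		ing.terminateOnShutdown
	return rec.Code, allSet
}

func main() {
	status, allSet := exercise()
	fmt.Println(status, allSet) // prints "200 true"
}
```

The handler itself releases nothing; it only changes what the stopping path will do later, which is why the check looks at the recorded flags rather than at any flush activity.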

// ShutdownHandler handles a graceful shutdown of the ingester service and
// termination of the Loki process.
func (i *Ingester) ShutdownHandler(w http.ResponseWriter, r *http.Request) {
2 changes: 2 additions & 0 deletions pkg/loki/loki.go
@@ -372,6 +372,8 @@ type Loki struct {
deleteClientMetrics *deletion.DeleteRequestClientMetrics

HTTPAuthMiddleware middleware.Interface

ingesterRelease *atomic.Bool
Contributor
Sorry for the late review but is this variable set anywhere?

Contributor Author
lol looks like something I was using in an older implementation but forgot to clean, thanks for catching it

Contributor Author
yep I was using it in a former implementation

}

// New makes a new Loki.
3 changes: 3 additions & 0 deletions pkg/loki/modules.go
@@ -471,6 +471,9 @@ func (t *Loki) initIngester() (_ services.Service, err error) {
t.Server.HTTP.Methods("POST").Path("/ingester/flush_shutdown").Handler(
httpMiddleware.Wrap(http.HandlerFunc(t.Ingester.LegacyShutdownHandler)),
)
t.Server.HTTP.Methods("POST").Path("/ingester/prepare_shutdown").Handler(
httpMiddleware.Wrap(http.HandlerFunc(t.Ingester.PrepareShutdown)),
)
t.Server.HTTP.Methods("POST").Path("/ingester/shutdown").Handler(
httpMiddleware.Wrap(http.HandlerFunc(t.Ingester.ShutdownHandler)),
)
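
The registration above restricts the route to POST through the router's method matcher. For readers who want to try the same idea without the project's router, the following standard-library sketch shows one possible equivalent. The names `postOnly`, `newRouter`, and `statusFor` are hypothetical, and answering non-POST requests with 405 is a choice made for this sketch rather than something this diff states.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// postOnly (hypothetical middleware) rejects every method except POST.
func postOnly(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.Header().Set("Allow", http.MethodPost)
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newRouter mounts the given handler at /ingester/prepare_shutdown.
func newRouter(prepare http.HandlerFunc) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/ingester/prepare_shutdown", postOnly(prepare))
	return mux
}

// statusFor sends one request with the given method through the router and
// returns the recorded status code.
func statusFor(method string) int {
	mux := newRouter(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(method, "/ingester/prepare_shutdown", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(http.MethodPost), statusFor(http.MethodGet)) // prints "200 405"
}
```

Keeping the endpoint POST-only matters because the request changes ingester state: a GET from a browser or health checker should not arm a full shutdown.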