compute_ctl: Use 'fast' shutdown for Postgres termination #8289
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
We currently use 'immediate' mode in the most commonly used shutdown path, when the control plane calls a
compute_ctl
API to terminate Postgres inside compute without waiting for the actual pod / VM termination. Yet, 'immediate' shutdown doesn't create a shutdown checkpoint and ROs have bad times figuring out the list of running xacts during next start.Summary of changes
Use 'fast' mode, which creates a shutdown checkpoint that is important for ROs to get a list of running xacts faster instead of going through the CLOG. On the control plane side, we poll this
compute_ctl
termination API for 10s, it should be enough as we don't really write any data at checkpoint time. If it times out, we anyway switch to the slow k8s-based termination.See https://www.postgresql.org/docs/current/server-shutdown.html for the list of modes and signals.
The default VM shutdown hook already uses
fast
mode, see [1][1]
neon/vm-image-spec.yaml
Lines 30 to 31 in c9fd8d7
Related to #6211
Checklist before requesting a review
Checklist before merging