Packing complex logic (and API calls!) into a pre-stop hook is a bit of an anti-pattern. However, the ECK operator currently does not handle k8s node maintenance gracefully: evictions due to node maintenance are not orchestrated by the operator, so none of the pre-shutdown logic that we run on regular scale-downs or ES Pod upgrades is executed.
This becomes a problem on clusters with a lot of data, where Pods are evicted due to node maintenance and shard recovery kicks in almost immediately.
A possible solution would be to add a pre-stop script that queries the ES API to find out whether a node shutdown is currently in progress. If so, it does nothing more. If not, it issues a `_node/shutdown` request of type `restart` to the ES API (which is a guess of course, because we cannot know in the pre-stop hook what kind of shutdown is actually happening).
Downsides of this approach are:
- exposure of additional API credentials (`cluster_admin`) in the script
- implementing solid retry logic and dealing with unavailability of ES (the overall pre-stop hook timeout helps here)
- implementing a loop to wait for the shutdown `complete` condition from the ES side
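For illustration, here is a minimal sketch of what such a pre-stop script could look like. The `_nodes/<node-id>/shutdown` endpoints are the public ES node shutdown API; everything else is an assumption: the credential environment variables, the availability of `curl` and `jq` in the container, the polling interval, and the idea that the pre-stop hook's own deadline bounds the wait loop.

```bash
#!/usr/bin/env bash
# Hypothetical pre-stop hook sketch -- not the operator's actual script.
set -uo pipefail

ES_URL="${ES_URL:-https://localhost:9200}"    # assumed in-Pod ES endpoint
AUTH="pre-stop-user:${PRE_STOP_PASSWORD}"     # hypothetical cluster_admin credentials

# The node shutdown API is keyed by node ID, so resolve the local node's ID first.
NODE_ID=$(curl -sk -u "$AUTH" "$ES_URL/_nodes/_local" | jq -r '.nodes | keys[0]')

# 1. If a shutdown record already exists (e.g. registered by the operator), do nothing more.
if curl -sk -u "$AUTH" "$ES_URL/_nodes/$NODE_ID/shutdown" | jq -e '.nodes | length > 0' >/dev/null; then
  exit 0
fi

# 2. Otherwise register a shutdown of type "restart" ourselves (a guess, as noted above).
curl -sk -u "$AUTH" -X PUT "$ES_URL/_nodes/$NODE_ID/shutdown" \
  -H 'Content-Type: application/json' \
  -d '{"type": "restart", "reason": "k8s node maintenance (pre-stop hook)"}'

# 3. Poll until ES reports the shutdown as COMPLETE; the overall pre-stop hook
#    timeout bounds this loop, so we never wait forever.
for _ in $(seq 1 60); do
  STATUS=$(curl -sk -u "$AUTH" "$ES_URL/_nodes/$NODE_ID/shutdown" | jq -r '.nodes[0].status // "UNKNOWN"')
  [ "$STATUS" = "COMPLETE" ] && exit 0
  sleep 5
done
exit 0   # give up; the eviction proceeds regardless
```

Even in this rough form the downsides listed above are visible: the script needs privileged credentials, and both the retry behaviour and the wait loop have to be hand-rolled against a possibly unavailable ES.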
cc @SpencerLN