
Downscale one master at a time & don't remove last running master #1549

Merged: 3 commits merged into elastic:master on Aug 14, 2019

Conversation

@sebgl (Contributor) commented Aug 12, 2019

Add safety measures to StatefulSet downscales:

  • Make sure we remove one master node at a time
    This plays nicely with zen settings in general (especially zen1).

  • Don't remove a master if it's the last master running in the cluster
    This prevents a situation where we could accidentally remove the last
    running master, losing cluster_state, while waiting for other masters
    to be running during a sset mutation (e.g. sset renamed).

Both are implemented as "invariants" in an "invariants" struct we use
throughout the downscale process.

Relates #1281.
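
For illustration, a minimal Go sketch of what such invariant checks could look like; the names used here (downscaleState, canRemoveMaster) are hypothetical and do not reflect the actual operator code:

// Hedged sketch only: a possible shape for the two downscale invariants
// described above. Names are illustrative, not the ECK implementation.
package downscale

// downscaleState captures the cluster information the two invariants need.
type downscaleState struct {
	masterRemovalInProgress bool // a master is already being removed in this downscale
	runningMasters          int  // number of master nodes currently running
}

// canRemoveMaster returns true if removing one more master node preserves
// both invariants.
func canRemoveMaster(s downscaleState) bool {
	if s.masterRemovalInProgress {
		return false // invariant 1: remove only one master at a time
	}
	if s.runningMasters <= 1 {
		return false // invariant 2: never remove the last running master
	}
	return true
}

Keeping both checks as a simple predicate over the current state makes them cheap to evaluate before each individual pod removal during the downscale loop.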

@sebgl added the justdoit (Continuous improvement not related to a specific feature) label on Aug 12, 2019
@pebrc (Collaborator) left a comment

LGTM!

@anyasabo (Contributor) left a comment

LGTM, literally only naming nits

@barkbay (Contributor) commented Aug 13, 2019

LGTM!

As a side note, I was a little bit surprised that the operator allows a scale-down from 5 to 3 master nodes while masters 0 and 1 are unavailable:

NAME                                    READY   STATUS    RESTARTS   AGE     IP           NODE                                                 NOMINATED NODE   READINESS GATES
pod/elasticsearch-sample-es-default-0   0/1     Pending   0          75s     <none>       <none>                                               <none>           <none>
pod/elasticsearch-sample-es-default-1   0/1     Pending   0          73s     <none>       <none>                                               <none>           <none>
pod/elasticsearch-sample-es-default-2   1/1     Running   0          6m41s   10.60.1.3    gke-michael-dev-cluster-default-pool-f0e38722-3p4q   <none>           <none>
pod/elasticsearch-sample-es-default-3   1/1     Running   0          3m26s   10.60.0.13   gke-michael-dev-cluster-default-pool-cd30e5ea-vvw1   <none>           <none>
pod/elasticsearch-sample-es-default-4   1/1     Running   0          3m25s   10.60.2.3    gke-michael-dev-cluster-default-pool-3098b032-v54c   <none>           <none>

After the scale-down request, the user has only one master alive while their initial request was to keep 3 of them:

NAME                                    READY   STATUS    RESTARTS   AGE   IP          NODE                                                 NOMINATED NODE   READINESS GATES
pod/elasticsearch-sample-es-default-0   0/1     Pending   0          19m   <none>      <none>                                               <none>           <none>
pod/elasticsearch-sample-es-default-1   0/1     Pending   0          19m   <none>      <none>                                               <none>           <none>
pod/elasticsearch-sample-es-default-2   1/1     Running   0          24m   10.60.1.3   gke-michael-dev-cluster-default-pool-f0e38722-3p4q   <none>           <none>

But I guess that's the expected behavior?

@david-kow (Contributor) left a comment

LGTM, one nit.

@sebgl (Contributor, Author) commented Aug 14, 2019

@barkbay

After the scale-down request, the user has only one master alive while their initial request was to keep 3 of them

Interesting, thanks for testing this corner case. I think that's what we'd expect? The operator cannot do much about nodes staying in a Pending state unfortunately. Btw how did you manage to get into that situation? I'd love to easily reproduce these scenarios.

One important thing though: if you wanted to downscale from 5 to 2, the operator would make sure we don't remove pod/elasticsearch-sample-es-default-2 (the last running master) if 0 and 1 are not running.
We would still be in a "dangerous" situation where only 1 of 3 master nodes is running. I don't think we can do better?
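
To make that concrete, here is a small, hypothetical Go simulation of the 5-to-2 scenario with pods 0 and 1 Pending; pod names and logic are illustrative only, not the operator's actual code:

// Hedged sketch: simulates successive reconciliations of a 5 -> 2 master
// downscale where only pods 2, 3 and 4 are running.
package main

import "fmt"

func main() {
	running := map[string]bool{
		"es-default-0": false, // Pending
		"es-default-1": false, // Pending
		"es-default-2": true,  // Running
		"es-default-3": true,  // Running
		"es-default-4": true,  // Running
	}
	// Pods the downscale wants to remove, highest ordinal first.
	toRemove := []string{"es-default-4", "es-default-3", "es-default-2"}

	runningMasters := 0
	for _, up := range running {
		if up {
			runningMasters++
		}
	}

	// Each reconciliation removes at most one master (invariant 1) and never
	// the last running one (invariant 2).
	for reconciliation := 1; len(toRemove) > 0; reconciliation++ {
		pod := toRemove[0]
		if running[pod] && runningMasters <= 1 {
			fmt.Printf("reconciliation %d: keeping %s (last running master)\n", reconciliation, pod)
			break
		}
		fmt.Printf("reconciliation %d: removing %s\n", reconciliation, pod)
		if running[pod] {
			runningMasters--
		}
		toRemove = toRemove[1:]
	}
	// Expected output:
	// reconciliation 1: removing es-default-4
	// reconciliation 2: removing es-default-3
	// reconciliation 3: keeping es-default-2 (last running master)
}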

@sebgl (Contributor, Author) commented Aug 14, 2019

@pebrc or @anyasabo can you double-check the last commit?

@pebrc (Collaborator) left a comment

Re-LGTM

@sebgl merged commit 1d55099 into elastic:master on Aug 14, 2019
Labels: justdoit (Continuous improvement not related to a specific feature)
Projects: None yet
5 participants