
Improve master nodes deletion #668

Closed
barkbay opened this issue Apr 23, 2019 · 2 comments
Labels
discuss We need to figure this out

Comments

@barkbay
Contributor

barkbay commented Apr 23, 2019

When some master nodes need to be deleted, the following algorithm is applied (a sketch of the quorum arithmetic follows the list):

  1. calculateChanges calculates how many master nodes should be removed
  2. CalculatePerformableChanges returns how many master nodes can be removed

Then, for each master node to be removed:

  3. Compute and apply the new quorum size without the master node to be removed
  4. Schedule the master pod deletion
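For reference, a minimal sketch of the Zen1 quorum arithmetic this algorithm relies on, and of the order in which it is applied (the helper name is illustrative, not the operator's actual API):

```go
package main

import "fmt"

// zen1Quorum returns the Zen1 minimum_master_nodes value for a given number
// of master-eligible nodes: a strict majority, i.e. n/2 + 1.
func zen1Quorum(masterCount int) int {
	return masterCount/2 + 1
}

func main() {
	// Removing 2 masters from a 4-master cluster: the target quorum is
	// computed for the remaining masters before any pod is deleted.
	current, toRemove := 4, 2
	fmt.Println("current quorum:", zen1Quorum(current))         // 3
	fmt.Println("target quorum:", zen1Quorum(current-toRemove)) // 2
}
```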

In some rare cases it can lead to a split brain situation, for instance:

  1. Initial situation: 4 masters in a cluster, 2 masters need to be removed
  2. minimum_master_nodes is decreased from 3 to 2
  3. The 2 master pods are scheduled to be deleted at the K8S level

Since minimum_master_nodes is set to 2 while there are still 4 masters running, there is a small chance of a split brain situation between steps 2 and 3.
This situation mostly applies to Zen1; with Zen2, masters are excluded before being deleted.
The algorithm depicted above is the only way to move from two masters to one node; it is a special case which is inherently unsafe.
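To make that window concrete: once minimum_master_nodes is lowered to 2 while 4 masters are still running, two partitions of 2 nodes each can both satisfy the quorum and each elect a master. A minimal sketch of that check (illustrative only, not operator code):

```go
package main

import "fmt"

// splitBrainPossible reports whether two disjoint partitions of the running
// masters could both reach the configured minimum_master_nodes.
func splitBrainPossible(runningMasters, minimumMasterNodes int) bool {
	// The smaller half of an even split has runningMasters/2 nodes; if that
	// already satisfies the quorum setting, both sides can elect a master.
	return runningMasters/2 >= minimumMasterNodes
}

func main() {
	fmt.Println(splitBrainPossible(4, 3)) // false: quorum is still a strict majority
	fmt.Println(splitBrainPossible(4, 2)) // true: the window between steps 2 and 3
}
```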

Some improvements can be done here:

  1. We should never delete more than half of the masters (at least with Zen1, and with the exception of the special two-to-one master case)
  2. If there are dedicated masters, maybe we should treat them more carefully than the other nodes and not blindly apply the maxUnavailable setting
  3. We should never go down to 1 master (single point of failure)

The two latter points are also true for Zen2 and might have a higher priority since we are moving to ES 7.
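As a hedged illustration of points 1 and 3 (the function below is hypothetical, not ECK code), a guard could cap the number of master deletions performed in one step:

```go
package main

import "fmt"

// allowedMasterDeletions caps how many master nodes may be deleted at once,
// encoding points 1 and 3 above.
func allowedMasterDeletions(currentMasters, requested int) int {
	// Special case: 2 -> 1 master is inherently unsafe but is the only way
	// to reach a single-node cluster, so it stays allowed.
	if currentMasters == 2 && requested == 1 {
		return 1
	}
	maxByHalf := currentMasters / 2 // never delete more than half of the masters
	maxBySpof := currentMasters - 2 // never go down to a single master
	allowed := requested
	if allowed > maxByHalf {
		allowed = maxByHalf
	}
	if allowed > maxBySpof {
		allowed = maxBySpof
	}
	if allowed < 0 {
		allowed = 0
	}
	return allowed
}

func main() {
	fmt.Println(allowedMasterDeletions(4, 2)) // 2: at most half of the masters
	fmt.Println(allowedMasterDeletions(3, 2)) // 1: never drop below 2 masters
	fmt.Println(allowedMasterDeletions(2, 1)) // 1: explicit two-to-one special case
}
```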

@pebrc pebrc added this to the Beta milestone May 7, 2019
@pebrc pebrc added the discuss We need to figure this out label May 7, 2019
@pebrc pebrc removed this from the Beta milestone May 10, 2019
@sebgl
Contributor

sebgl commented Jul 18, 2019

Related discussion for handling zen1 correctly in the sset refactoring: #1281

@sebgl
Contributor

sebgl commented Sep 12, 2019

We now add and remove one master node at a time (work in progress for rolling upgrades; other issues have been opened) and wait for our cache of resources to match expectations, in order to properly handle Zen1 and Zen2 settings.
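A minimal sketch of that "one change at a time, gated on cached expectations" approach (types and names are illustrative, not ECK's actual implementation):

```go
package main

import "fmt"

// expectations is a stand-in for the operator's cached-resource expectations:
// record what was last changed and refuse further changes until the local
// cache has observed it.
type expectations struct {
	expectedGeneration int64
}

func (e *expectations) satisfied(cachedGeneration int64) bool {
	return cachedGeneration >= e.expectedGeneration
}

// reconcileMasters removes at most one master per reconciliation, and only
// when the cache has caught up with the previous change.
func reconcileMasters(e *expectations, cachedGeneration int64, pendingRemovals int) string {
	if !e.satisfied(cachedGeneration) {
		return "requeue: cache not up to date with the last change"
	}
	if pendingRemovals == 0 {
		return "nothing to do"
	}
	// Remove a single master, then record the new expectation so the next
	// reconciliation waits until the cache has observed this change.
	e.expectedGeneration++
	return "removed one master node"
}

func main() {
	e := &expectations{expectedGeneration: 1}
	fmt.Println(reconcileMasters(e, 1, 2)) // removed one master node
	fmt.Println(reconcileMasters(e, 1, 1)) // requeue: cache not up to date with the last change
	fmt.Println(reconcileMasters(e, 2, 1)) // removed one master node
}
```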
Closing this issue in favor of keeping open #1710, #1628, #1693.

@sebgl sebgl closed this as completed Sep 12, 2019