
feat(kuma-cp): restore default leader election mechanism #5817


@jakubdyszkiewicz jakubdyszkiewicz commented Jan 24, 2023

Switching to leader-for-life (#3023) brought more problems than it solved.

One problem was observed on GCP and AWS: after a Node migration, the old leader CP Pod got stuck. As a result, no new leader was elected and Pods with sidecar injection never came up.

One reason for moving to leader-for-life was Kube API throttling when the CP was under heavy load. Since then, we have reduced the number of calls to storage. If throttling is still a problem, we can increase the lease duration and the lease renewal time.

Fix #5709

Checklist prior to review

  • Link to docs PR or issue --
  • Link to UI issue or PR --
  • Is the issue worked on linked? --
  • The PR does not hardcode values that might break projects that depend on kuma (e.g. "kumahq" as a image registry) --
  • The PR will work for both Linux and Windows, system specific functions like syscall.Mkfifo have equivalent implementation on the other OS --
  • Unit Tests --
  • E2E Tests --
  • Manual Universal Tests --
  • Manual Kubernetes Tests --
  • Do you need to update UPGRADE.md? --
  • Does it need to be backported according to the backporting policy? --
  • Do you need to explicitly set a > Changelog: entry here or add a ci/ label to run fewer/more tests?

@jakubdyszkiewicz jakubdyszkiewicz added the ci/run-full-matrix PR: Runs all possible e2e test combination (expensive use carefully) label Jan 24, 2023
@lahabana
Contributor

Wrong branch? This should target release-2.1, right?

@jakubdyszkiewicz jakubdyszkiewicz changed the base branch from master to release-2.1 January 24, 2023 11:50
@jakubdyszkiewicz
Contributor Author

ahh, I thought the branch was not cut yet

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
@jakubdyszkiewicz jakubdyszkiewicz force-pushed the feat/restore-default-leader-mechanism branch from f494f2d to 4a2184a Compare January 24, 2023 12:02
@jakubdyszkiewicz jakubdyszkiewicz removed the ci/run-full-matrix PR: Runs all possible e2e test combination (expensive use carefully) label Jan 24, 2023
Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
@jakubdyszkiewicz jakubdyszkiewicz marked this pull request as ready for review January 24, 2023 12:45
@jakubdyszkiewicz jakubdyszkiewicz requested review from a team, bartsmykla and lobkovilya and removed request for a team January 24, 2023 12:45
Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
@jakubdyszkiewicz jakubdyszkiewicz merged commit 161476b into kumahq:release-2.1 Jan 25, 2023
@jakubdyszkiewicz jakubdyszkiewicz deleted the feat/restore-default-leader-mechanism branch January 25, 2023 11:48
@lahabana lahabana linked an issue Jan 25, 2023 that may be closed by this pull request
Automaat added a commit that referenced this pull request Jan 31, 2023
* chore: fix version sh pipe exit (#5816)

Signed-off-by: slonka <slonka@users.noreply.github.com>

* fix: validator for MeshHealthCheck to-targetRef (#5818)

Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>

* chore(deps): bump kumahq/kuma-gui to b82dc41471d7831d734bbc52919cfa7a8b46a65c (#5819)

* chore(release): update readme file (#5813)

* chore(release): update readme and upgrade.md with 2.1 release

Signed-off-by: Marcin Skalski <marcin.skalski@konghq.com>

* test(unit): use mock clock in inspect tests (#5814)

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>

* chore(deps): bump kumahq/kuma-gui to b82dc41471d7831d734bbc52919cfa7a8b46a65c

Bumps kumahq/kuma-gui to version [master@b82dc41471d7831d734bbc52919cfa7a8b46a65c](https://github.com/kumahq/kuma-gui/tree/b82dc41471d7831d734bbc52919cfa7a8b46a65c)

Signed-off-by: GitHub <noreply@github.com>

Signed-off-by: Marcin Skalski <marcin.skalski@konghq.com>
Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: Marcin Skalski <marcin.skalski@konghq.com>
Co-authored-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* fix(kuma-cp): kds deadlock in unit tests (#5821)

* fix(kuma-cp): kds deadlock in unit tests

Signed-off-by: slonka <slonka@users.noreply.github.com>

* fix: update MeshFaultInjection API with mergeable slice (#5811)

Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>

* fix(kuma-cp): fix the way how we set FractionalPercent in envoy (#5820)

Signed-off-by: Lukasz Dziedziak <lukidzi@gmail.com>

* feat(kuma-cp): restore default leader election mechanism (#5817)

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>

* feat(kuma-cp): switch to kube outbounds as vips by default (#5825)

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>

* fix(policy): fix meshtimeout static validation (#5830)

Signed-off-by: Marcin Skalski <marcin.skalski@konghq.com>

* chore: back-ports GUI files (#5832)

* chore: back-ports GUI files
* fix(ci): format files

Signed-off-by: Philipp Rudloff <philipp.rudloff@konghq.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(policy): meshtimeout for GRPC (#5841)

Signed-off-by: Marcin Skalski <marcin.skalski@konghq.com>

* chore: change gateway log to V1 (#5836)

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>

* feat(kuma-cp): disable kube outbounds as vips by default (#5844)

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>

* chore(deps): back-ports GUI files (#5845)

Back-ports GUI files.

Signed-off-by: Philipp Rudloff <philipp.rudloff@konghq.com>

* fix(kuma-cp): should set only once header for MeshRateLimit (#5852)

Signed-off-by: Lukasz Dziedziak <lukidzi@gmail.com>

* chore: back-ports GUI files (#5859)

Back-ports GUI files.

Signed-off-by: Philipp Rudloff <philipp.rudloff@konghq.com>

* fix(kuma-cp): meshratelimit doesn't support MeshGatewayRoute (#5853)

Signed-off-by: Lukasz Dziedziak <lukidzi@gmail.com>

* fix(kuma-cp): configure split clusters for new policies (#5855)

* fix(kuma-cp): configure split clusters

Signed-off-by: Lukasz Dziedziak <lukidzi@gmail.com>

* fix(kuma-cp): revert tests

Signed-off-by: slonka <slonka@users.noreply.github.com>

* fix(kuma-cp): dry up gathered clusters

Signed-off-by: slonka <slonka@users.noreply.github.com>

* fix(kuma-cp): apply review suggestions

Signed-off-by: slonka <slonka@users.noreply.github.com>

---------

Signed-off-by: Lukasz Dziedziak <lukidzi@gmail.com>
Signed-off-by: slonka <slonka@users.noreply.github.com>
Co-authored-by: slonka <slonka@users.noreply.github.com>

* ci: use "ref_name" instead of "refName" in GH action (#5882)

fix: use `ref_name` instead of `refName`

Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>

* remove gui changes

---------

Signed-off-by: slonka <slonka@users.noreply.github.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Marcin Skalski <marcin.skalski@konghq.com>
Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Lukasz Dziedziak <lukidzi@gmail.com>
Signed-off-by: Philipp Rudloff <philipp.rudloff@konghq.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Krzysztof Słonka <slonka@users.noreply.github.com>
Co-authored-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Co-authored-by: kumahq[bot] <110050114+kumahq[bot]@users.noreply.github.com>
Co-authored-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Łukasz Dziedziak <lukidzi@gmail.com>
Co-authored-by: Philipp Rudloff <philipp.rudloff@konghq.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Leader election fails after node termination
3 participants