use topology to achieve HA #2366
Conversation
Thanks for your contribution. Could you provide more detail and some examples to demonstrate the scenario and how it works?
@@ -7,6 +7,7 @@ metadata:
    pingcap.com/pd.{{ template "cluster.name" . }}-pd.sha: {{ include "pd-configmap.data-digest" . | quote }}
    pingcap.com/tikv.{{ template "cluster.name" . }}-tikv.sha: {{ include "tikv-configmap.data-digest" . | quote }}
    pingcap.com/tidb.{{ template "cluster.name" . }}-tidb.sha: {{ include "tidb-configmap.data-digest" . | quote }}
+   topologyKey: {{ .Values.topologyKey }}
What about changing `topologyKey` to `pingcap.com/ha-topology-key`, and defining it as a const like https://github.com/pingcap/tidb-operator/blob/master/pkg/label/label.go#L43 does?
updated
charts/tidb-cluster/values.yaml
@@ -64,6 +64,10 @@ discovery:
  # if the ConfigMap was not changed.
  enableConfigMapRollout: true

+ # The scheduler achieves HA scheduling based on the topologyKey.
+ # You can change it to another node label.
+ topologyKey: kubernetes.io/hostname
Suggested change:
- topologyKey: kubernetes.io/hostname
+ haTopologyKey: kubernetes.io/hostname
updated
tidb-operator's current scheduling rule provides high availability at the node dimension: it guarantees that component instances are scheduled to different nodes. Sometimes, though, the nodes themselves are distributed across zones as follows.
The zones here are assumed to be different IDCs: kind-worker is located in zone1, kind-worker2 in zone2, and kind-worker3, kind-worker4, and kind-worker5 are all in zone3. In this layout, basic-pd-1 and basic-pd-2 end up in the same zone (zone3). We sometimes want to schedule pods to different zones, that is, to be highly available at the zone dimension as well; see the sketch below.
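For illustration, a minimal sketch of the node layout described above, expressed as a Node manifest carrying a custom `zone` label (the node names and the `zone` label key come from the comment; the manifest itself is a reconstruction, not the original screenshot):

```yaml
# Sketch: one of the five nodes described above. The scheduler keys on a
# node label (here the assumed custom label `zone`) to decide topology.
apiVersion: v1
kind: Node
metadata:
  name: kind-worker
  labels:
    kubernetes.io/hostname: kind-worker
    zone: zone1   # kind-worker2 -> zone2; kind-worker3/4/5 -> zone3
```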
This ensures that at least two pods remain available in the event of a zone3 power outage, network outage, etc. The code implements highly available scheduling at the zone dimension; if the topology is not configured (I'm currently changing it to be configured through tidb-cluster.yaml annotations), it defaults to the node, which keeps compatibility with tidb-operator's current scheduling rules.
@PengJi Thanks for clarifying. That's very helpful.
Please gofmt the code.
@PengJi Thanks for the contribution!
@DanielZhangQD One of the requirements is to implement HA at the zone dimension (such as IDC; currently it is the node dimension). If we used podAntiAffinity, a possible configuration is as follows.
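The configuration itself is not preserved in this thread; a minimal sketch of the podAntiAffinity approach being discussed might look like this (the label selector and the `zone` topology key are illustrative assumptions):

```yaml
# Sketch: prefer spreading PD pods across distinct values of the nodes'
# `zone` label using podAntiAffinity (soft/preferred rule).
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app.kubernetes.io/component: pd
        topologyKey: zone
```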
This method does not guarantee that pods are evenly distributed among the three different zones. Or is there another configuration method?
@cofyc PTAL
@PengJi Can you add a release note (user-facing change)?
/run-e2e-tests
This LGTM and is an enhancement to the current HA implementation. It does not support more than one topology, so pods in a particular zone (topology-key=zone) will not be spread evenly across nodes. Do you plan to implement this feature, like Multiple TopologySpreadConstraints?
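For reference, the upstream feature mentioned here lets one pod declare several spread constraints at once; a hedged sketch (field names per the Kubernetes `topologySpreadConstraints` API, selector values illustrative, not part of this PR):

```yaml
# Sketch: spread evenly across zones AND across hosts within each zone.
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app.kubernetes.io/component: tikv
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app.kubernetes.io/component: tikv
```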
LGTM
/run-cherry-picker
cherry pick to release-1.1 in PR #2399
There is no plan to implement this feature for now. The current approach is to use the Kubernetes SelectorSpreadPriority and InterPodAffinityPriority priority strategies and to increase their weights.
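A sketch of what raising those priority weights could look like via the legacy kube-scheduler policy file (illustrative only; a real policy file would also need to declare the predicates it wants to keep):

```yaml
# Sketch: give SelectorSpreadPriority and InterPodAffinityPriority more
# influence than their default weight of 1.
apiVersion: v1
kind: Policy
priorities:
- name: SelectorSpreadPriority
  weight: 10
- name: InterPodAffinityPriority
  weight: 10
```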
What problem does this PR solve?
#2110
What is changed and how does it work?
Labels are set on nodes, for example kubernetes.io/hostname=host1,zone=zone1. One of these labels is then used as the topology key, and the scheduler spreads pods across the distinct values of that topology to implement HA; see the sketch below.
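For illustration, configuring the topology key through a tidb-cluster.yaml annotation, as discussed in the review above, might look like this (the annotation name follows the reviewer's `pingcap.com/ha-topology-key` suggestion and is an assumption, not necessarily the merged name):

```yaml
# Sketch: select `zone` as the HA topology; pods are then spread across
# distinct values of the nodes' `zone` label instead of across hostnames.
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
  annotations:
    pingcap.com/ha-topology-key: zone
```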
Check List
Tests
Code changes
Mainly changed ha.go
Side effects
NONE
Related changes
Does this PR introduce a user-facing change?: