Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extended scheduler to operator #145

Merged
merged 24 commits into from
Nov 2, 2018
Merged

Conversation

weekface
Copy link
Contributor

@weekface weekface commented Oct 25, 2018

This PR adds an extended scheduler to TiDB Operator, and fixes #138:

  • Basic extended scheduler framework
  • HighAvailability predicate
  • Unit tests
  • E2E tests

@tennix @onlymellb @xiaojingchen @gregwebs PTAL

@weekface weekface added this to the v0.4 milestone Oct 26, 2018
@@ -13,6 +13,7 @@ spec:
timezone: {{ .Values.timezone | default "UTC" }}
services:
{{ toYaml .Values.services | indent 4 }}
schedulerName: {{ .Values.schedulerName | default "default-scheduler" }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why add schedulerName on here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continue
}

nodeMap[nodeName] = append(nodeMap[nodeName], podName1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the nodeName is not contain in nodeMap, there will happen panic ,we should add logic here

if nodeMap[nodeName] ==nil{
      continue
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
Done!

}

var min int
var minInitialized bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the minInitialized is not used,what is the minInitialized mean

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used at line 91

command:
{{- if .Values.scheduler.kubeSchedulerImage }}
- kube-scheduler
{{- else }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hyperkube image also has kube-scheduler in $PATH, so no need to set the command separately. So it's unnecessary to define two variables for scheduler image.

{{- end }}
- --port=10261
- --leader-elect=true
- --lock-object-name=tidb-scheduler
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock should be customizable and I think it should be the same as schedulerName.

- --lock-object-namespace={{ .Release.Namespace }}
- --scheduler-name={{ .Values.scheduler.schedulerName }}
- --v={{ .Values.scheduler.logLevel }}
- --policy-configmap=tidb-scheduler-policy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here.

helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
data:
policy.cfg: |-
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This JSON doesn't need to be templated, so using .Files.Get makes it more easy to maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, waiting for: #149 merged.

requests:
cpu: 80m
memory: 50Mi
# pd replicas
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replicas are cluster specific, it should not be defined in the scheduler. And since PD is single raft cluster, its replica is the same as its member count. But TiKV replica is configured in PD configuration, this can be retrieved by PD API.
However, for the first version, use 3 for both PD and TiKV is enough.

glog.Infof("scheduling pod: %s/%s", ns, podName)
var err error
for _, predicate := range s.predicates {
glog.Infof("entering predicate: %s, nodes: %v", predicate.Name(), getNodeNames(kubeNodes))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log should have a higher log level since it's much more verbose.

}

// Filter selects a set of nodes from *schedulerapiv1.ExtenderArgs.Nodes when this is a pd or tikv pod
// else return the original nodes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise, returns the original nodes.

}, nil
}

// We didn't pass `prioritizeVerb` to kubernetes scheduler extender's config file, this method will not be called.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/didn't/don't/

var component string
var exist bool
if component, exist = pod.Labels[label.ComponentLabelKey]; !exist {
return nil, fmt.Errorf("can't find component in pod labels: %s/%s", ns, podName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return the nodes as it is.

}

if ordinal < replicas && min != 0 {
return nil, fmt.Errorf("the first %d pods can't scheduled to the same node", replicas)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't be scheduled

@weekface
Copy link
Contributor Author

/run-e2e-tests

3 similar comments
@weekface
Copy link
Contributor Author

/run-e2e-tests

@weekface
Copy link
Contributor Author

/run-e2e-tests

@weekface
Copy link
Contributor Author

/run-e2e-tests

@weekface
Copy link
Contributor Author

/run-e2e-tests

@weekface
Copy link
Contributor Author

/run-e2e-tests

@weekface
Copy link
Contributor Author

/run-e2e-tests

pdReplicas: 3
# tikv replicas
tikvReplicas: 3
hyperkubeImage: quay.io/coreos/hyperkube:v1.10.4_coreos.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use kubeSchedulerImage instead.

@weekface
Copy link
Contributor Author

weekface commented Nov 1, 2018

/run-e2e-tests

Copy link
Member

@tennix tennix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tennix
Copy link
Member

tennix commented Nov 2, 2018

/run-e2e-tests

1 similar comment
@weekface
Copy link
Contributor Author

weekface commented Nov 2, 2018

/run-e2e-tests

@xiaojingchen xiaojingchen merged commit 115fe19 into pingcap:master Nov 2, 2018
@weekface weekface deleted the scheduler branch November 5, 2018 11:39
queenliuxx pushed a commit to queenliuxx/tidb-operator that referenced this pull request Dec 19, 2018
* scheduler: add extended scheduler to operator
yahonda pushed a commit that referenced this pull request Dec 27, 2021
* README: update doc version and branch info

* readme: update format of branch description and add note

* minor update

* readme: update description

Co-authored-by: Lilian Lee <lilin@pingcap.com>
fgksgf pushed a commit to fgksgf/tidb-operator that referenced this pull request Dec 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend scheduler for better support local persistent volume
3 participants