Add extended scheduler to operator #145

Merged Nov 2, 2018 · 24 commits
Changes from 1 commit
scheduler: add extended scheduler to operator
weekface committed Oct 25, 2018
commit 449176ebcaa90b37be63156eb72a27fee5266a01
5 changes: 4 additions & 1 deletion Makefile
@@ -21,11 +21,14 @@ docker-push: docker
 docker: build
 	docker build --tag "${DOCKER_REGISTRY}/pingcap/tidb-operator:latest" images/tidb-operator
 
-build: controller-manager
+build: controller-manager scheduler
 
 controller-manager:
 	$(GO) build -ldflags '$(LDFLAGS)' -o images/tidb-operator/bin/tidb-controller-manager cmd/controller-manager/main.go
 
+scheduler:
+	$(GO) build -ldflags '$(LDFLAGS)' -o images/tidb-operator/bin/tidb-scheduler cmd/scheduler/main.go
+
 e2e-docker-push: e2e-docker
 	docker push "${DOCKER_REGISTRY}/pingcap/tidb-operator-e2e:latest"
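
For context, the `-ldflags '$(LDFLAGS)'` in these build rules is the usual Go trick for stamping version metadata into a binary at link time. A minimal sketch of the pattern (the variable names and values here are illustrative, not necessarily the operator's actual ones):

```go
package main

import "fmt"

// Defaults below are overridden at link time, e.g.:
//   go build -ldflags "-X main.gitVersion=v0.4.0 -X main.gitSHA=449176e" .
var (
	gitVersion = "unknown"
	gitSHA     = "unknown"
)

func main() {
	fmt.Printf("tidb-scheduler version: %s, git sha: %s\n", gitVersion, gitSHA)
}
```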

1 change: 1 addition & 0 deletions charts/tidb-cluster/templates/tidb-cluster.yaml
@@ -13,6 +13,7 @@ spec:
   timezone: {{ .Values.timezone | default "UTC" }}
   services:
{{ toYaml .Values.services | indent 4 }}
+  schedulerName: {{ .Values.schedulerName | default "default-scheduler" }}

Contributor: Why add schedulerName here?

   pd:
     replicas: {{ .Values.pd.replicas }}
     image: {{ .Values.pd.image }}
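
For readers unfamiliar with the field: spec.schedulerName is how a pod opts out of the default scheduler, which is what this chart change wires up for the cluster's pods. A hedged client-go sketch of the field in question (the pod name and image are made up for illustration):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// examplePod shows the field being set: any pod whose spec.schedulerName
// matches the custom scheduler's --scheduler-name is bound by it; all
// other pods keep using the default scheduler.
func examplePod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "demo-pd-0"},
		Spec: corev1.PodSpec{
			// Must match scheduler.schedulerName in the tidb-operator chart.
			SchedulerName: "tidb-scheduler",
			Containers: []corev1.Container{
				{Name: "pd", Image: "pingcap/pd:v2.0.7"},
			},
		},
	}
}

func main() { _ = examplePod() }
```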
6 changes: 4 additions & 2 deletions charts/tidb-cluster/values.yaml
@@ -11,6 +11,9 @@ rbac:
 # if multiple clusters are deployed in the same namespace.
 clusterName: demo
 
+# schedulerName must be the same as charts/tidb-operator/values#scheduler.schedulerName
+schedulerName: tidb-scheduler
+
 # timezone is the default system timezone for TiDB
 timezone: UTC
@@ -131,11 +134,11 @@ tidb:
   #   cloud.google.com/load-balancer-type: Internal
 
 monitor:
+  create: true
   # Also see rbac.create
   # If you set rbac.create to false, you need to provide a value here.
   # If you set rbac.create to true, you should leave this empty.
   serviceAccount:
-  create: true
   persistent: false
   storageClassName: local-storage
   storage: 10Gi
@@ -179,7 +182,6 @@ monitor:
   #   operator: Equal
   #   value: tidb
   #   effect: "NoSchedule"
-
 # GCP MarketPlace integration
 ubbagent: {}
 
64 changes: 64 additions & 0 deletions charts/tidb-operator/templates/scheduler-deployment.yaml
@@ -0,0 +1,64 @@
{{- $defaultHyperkubeImage := "quay.io/coreos/hyperkube:v1.10.4_coreos.0" -}}
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tidb-scheduler
  labels:
    app.kubernetes.io/name: {{ template "chart.name" . }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: scheduler
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
spec:
  replicas: {{ .Values.scheduler.replicas }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ template "chart.name" . }}
      app.kubernetes.io/instance: {{ .Release.Name }}
      app.kubernetes.io/component: scheduler
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ template "chart.name" . }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: scheduler
    spec:
      {{- if .Values.scheduler.serviceAccount }}
      serviceAccount: {{ .Values.scheduler.serviceAccount }}
      {{- end }}
      containers:
      - name: tidb-scheduler
        image: {{ .Values.operatorImage }}
        resources:
{{ toYaml .Values.scheduler.resources | indent 12 }}
        command:
        - /usr/local/bin/tidb-scheduler
        - -v={{ .Values.scheduler.logLevel }}
        - -port=10262
        - -pd-replicas={{ .Values.scheduler.pdReplicas | default 3 }}
        - -tikv-replicas={{ .Values.scheduler.tikvReplicas | default 3 }}
      - name: kube-scheduler
        {{- if .Values.scheduler.kubeSchedulerImage }}
        image: {{ .Values.scheduler.kubeSchedulerImage }}
        {{- else if .Values.scheduler.hyperkubeImage }}
        image: {{ .Values.scheduler.hyperkubeImage }}
        {{- else }}
        image: {{ $defaultHyperkubeImage }}
        {{- end }}
        resources:
{{ toYaml .Values.scheduler.resources | indent 12 }}
        command:
        {{- if .Values.scheduler.kubeSchedulerImage }}
        - kube-scheduler
        {{- else }}

Member: The hyperkube image also has kube-scheduler in $PATH, so there is no need to set the command separately; it's therefore unnecessary to define two variables for the scheduler image.

        - /hyperkube
        - scheduler
        {{- end }}
        - --port=10261
        - --leader-elect=true
        - --lock-object-name=tidb-scheduler

Member: The lock should be customizable, and I think it should be the same as schedulerName.

        - --lock-object-namespace={{ .Release.Namespace }}
        - --scheduler-name={{ .Values.scheduler.schedulerName }}
        - --v={{ .Values.scheduler.logLevel }}
        - --policy-configmap=tidb-scheduler-policy

Member: Same here.

        - --policy-configmap-namespace={{ .Release.Namespace }}
50 changes: 50 additions & 0 deletions charts/tidb-operator/templates/scheduler-policy-configmap.yaml
@@ -0,0 +1,50 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: tidb-scheduler-policy
  labels:
    app.kubernetes.io/name: {{ template "chart.name" . }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: scheduler
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
data:
  policy.cfg: |-
    {

Member: This JSON doesn't need to be templated, so using .Files.Get would make it easier to maintain.

Contributor (author): OK, waiting for #149 to be merged.

      "kind" : "Policy",
      "apiVersion" : "v1",
      "predicates": [
        {"name": "MatchInterPodAffinity"},
        {"name": "CheckVolumeBinding"},
        {"name": "CheckNodeCondition"},
        {"name": "GeneralPredicates"},
        {"name": "HostName"},
        {"name": "PodFitsHostPorts"},
        {"name": "MatchNodeSelector"},
        {"name": "PodFitsResources"},
        {"name": "NoDiskConflict"},
        {"name": "PodToleratesNodeTaints"},
        {"name": "CheckNodeMemoryPressure"},
        {"name": "CheckNodeDiskPressure"}
      ],
      "priorities": [
        {"name": "EqualPriority", "weight": 1},
        {"name": "ImageLocalityPriority", "weight": 1},
        {"name": "LeastRequestedPriority", "weight": 1},
        {"name": "BalancedResourceAllocation", "weight": 1},
        {"name": "SelectorSpreadPriority", "weight": 1},
        {"name": "NodePreferAvoidPodsPriority", "weight": 1},
        {"name": "NodeAffinityPriority", "weight": 1},
        {"name": "TaintTolerationPriority", "weight": 1},
        {"name": "MostRequestedPriority", "weight": 1}
      ],
      "extenders": [
        {
          "urlPrefix": "http://127.0.0.1:10262/scheduler",
          "filterVerb": "filter",
          "weight": 1,
          "httpTimeout": 30000000000,
          "enableHttps": false
        }
      ]
    }
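
For context on the extenders entry: kube-scheduler POSTs each pod and its candidate nodes to urlPrefix plus the filterVerb path and keeps only the nodes the extender returns; httpTimeout is in nanoseconds, so 30000000000 is 30s. A minimal sketch of such a filter endpoint, assuming the ExtenderArgs/ExtenderFilterResult types from k8s.io/kubernetes/pkg/scheduler/api; the accept-everything policy is illustrative, not the operator's actual placement logic:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"

	schedulerapi "k8s.io/kubernetes/pkg/scheduler/api"
)

// filterHandler implements the "filter" verb of the scheduler extender
// protocol: it receives the pod plus candidate nodes and answers with the
// subset of nodes the pod may be bound to.
func filterHandler(w http.ResponseWriter, r *http.Request) {
	var args schedulerapi.ExtenderArgs
	if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Illustrative policy: accept every candidate node unchanged. A real
	// extender would drop nodes that violate its HA placement rules.
	result := schedulerapi.ExtenderFilterResult{Nodes: args.Nodes}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(&result)
}

func main() {
	// Path and port must match the policy ConfigMap above:
	// urlPrefix http://127.0.0.1:10262/scheduler + filterVerb "filter".
	http.HandleFunc("/scheduler/filter", filterHandler)
	log.Fatal(http.ListenAndServe(":10262", nil))
}
```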
125 changes: 125 additions & 0 deletions charts/tidb-operator/templates/scheduler-rbac.yaml
@@ -0,0 +1,125 @@
{{- if .Values.rbac.create }}
kind: ServiceAccount
apiVersion: v1
metadata:
  name: {{ .Values.scheduler.serviceAccount }}
  labels:
    app.kubernetes.io/name: {{ template "chart.name" . }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: scheduler
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: {{ .Release.Name }}:tidb-scheduler
  labels:
    app.kubernetes.io/name: {{ template "chart.name" . }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: scheduler
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
rules:
{{- if .Values.clusterScoped }}
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "replicationcontrollers", "persistentvolumeclaims", "endpoints"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/binding"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["endpoints", "events"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions"]
  resources: ["replicasets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "watch"]
{{- end }}
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["persistentvolumes"]
  verbs: ["get", "list", "watch", "update"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: {{ .Release.Name }}:tidb-scheduler
  labels:
    app.kubernetes.io/name: {{ template "chart.name" . }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: scheduler
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
subjects:
- kind: ServiceAccount
  name: {{ .Values.scheduler.serviceAccount }}
  namespace: {{ .Release.Namespace }}
roleRef:
  kind: ClusterRole
  name: {{ .Release.Name }}:tidb-scheduler
  apiGroup: rbac.authorization.k8s.io
{{- if (not .Values.clusterScoped) }}
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: {{ .Release.Name }}:tidb-scheduler
  labels:
    app.kubernetes.io/name: {{ template "chart.name" . }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: scheduler
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps", "replicationcontrollers", "persistentvolumeclaims", "endpoints"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/binding"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["endpoints", "events"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["extensions"]
  resources: ["replicasets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "watch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: {{ .Release.Name }}:tidb-scheduler
  labels:
    app.kubernetes.io/name: {{ template "chart.name" . }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: scheduler
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
subjects:
- kind: ServiceAccount
  name: {{ .Values.scheduler.serviceAccount }}
roleRef:
  kind: Role
  name: {{ .Release.Name }}:tidb-scheduler
  apiGroup: rbac.authorization.k8s.io
{{- end }}
{{- end }}
22 changes: 22 additions & 0 deletions charts/tidb-operator/values.yaml
@@ -34,3 +34,25 @@ controllerManager:
   pdFailoverPeriod: 5m
   # tidb failover period default(5m)
   tidbFailoverPeriod: 5m
+
+scheduler:
+  # With rbac.create=false, the user is responsible for creating this account
+  # With rbac.create=true, this service account will be created
+  # Also see rbac.create and clusterScoped
+  serviceAccount: tidb-scheduler
+  logLevel: 2
+  replicas: 1
+  schedulerName: tidb-scheduler
+  resources:
+    limits:
+      cpu: 250m
+      memory: 150Mi
+    requests:
+      cpu: 80m
+      memory: 50Mi
+  # pd replicas

Member: The replicas are cluster specific, so they should not be defined in the scheduler. Since PD is a single raft cluster, its replica count is the same as its member count, while the TiKV replica count is configured in the PD configuration and can be retrieved via the PD API. For the first version, however, using 3 for both PD and TiKV is enough.

+  pdReplicas: 3
+  # tikv replicas
+  tikvReplicas: 3
+  hyperkubeImage: quay.io/coreos/hyperkube:v1.10.4_coreos.0
+  # kubeSchedulerImage:

Member: It's unnecessary to use two image variables.
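
A sketch of the reviewer's suggestion above — reading the TiKV replica count from PD instead of hard-coding it. The /pd/api/v1/config endpoint, the replication.max-replicas field, and the PD service address are assumptions about PD's HTTP API, not code from this PR:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// pdConfig mirrors just the part of PD's config response we care about,
// assuming GET /pd/api/v1/config -> {"replication": {"max-replicas": 3}, ...}.
type pdConfig struct {
	Replication struct {
		MaxReplicas int32 `json:"max-replicas"`
	} `json:"replication"`
}

// tikvReplicasFromPD asks PD how many replicas each region keeps, which is
// the value the scheduler would otherwise take from --tikv-replicas.
func tikvReplicasFromPD(pdAddr string) (int32, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(pdAddr + "/pd/api/v1/config")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	var cfg pdConfig
	if err := json.NewDecoder(resp.Body).Decode(&cfg); err != nil {
		return 0, err
	}
	return cfg.Replication.MaxReplicas, nil
}

func main() {
	replicas, err := tikvReplicasFromPD("http://demo-pd:2379") // hypothetical PD address
	if err != nil {
		fmt.Println("falling back to default of 3:", err)
		replicas = 3
	}
	fmt.Println("tikv replicas:", replicas)
}
```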

7 changes: 5 additions & 2 deletions cmd/scheduler/main.go
@@ -18,7 +18,6 @@ import (
 	"net/http"
 	_ "net/http/pprof"
 	"os"
-
 	"time"
 
 	"github.com/golang/glog"
@@ -33,12 +32,16 @@ import (
 var (
 	printVersion bool
 	port         int
+	pdReplicas   int
+	tikvReplicas int
 )
 
 func init() {
 	flag.BoolVar(&printVersion, "V", false, "Show version and quit")
 	flag.BoolVar(&printVersion, "version", false, "Show version and quit")
 	flag.IntVar(&port, "port", 10262, "The port that the tidb scheduler's http service runs on (default 10262)")
+	flag.IntVar(&pdReplicas, "pd-replicas", 3, "The pd replicas (default 3)")
+	flag.IntVar(&tikvReplicas, "tikv-replicas", 3, "The tikv replicas (default 3)")
 	flag.Parse()
 }
 
@@ -62,7 +65,7 @@ func main() {
 	}
 
 	go wait.Forever(func() {
-		server.StartServer(kubeCli, port)
+		server.StartServer(kubeCli, port, int32(pdReplicas), int32(tikvReplicas))
 	}, 5*time.Second)
 	glog.Fatal(http.ListenAndServe(":6060", nil))
 }
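
Only StartServer's signature appears in this diff. As a hedged illustration of what an implementation behind that signature might look like — an assumption about the operator's internals, not its actual code — note that it should return on error rather than exit, since wait.Forever in main.go restarts it every 5 seconds:

```go
package server

import (
	"fmt"
	"net/http"

	"github.com/golang/glog"
	"k8s.io/client-go/kubernetes"
)

// StartServer is a sketch matching the call site in main.go: it wires the
// extender's HTTP endpoints (e.g. /scheduler/filter, per the policy
// ConfigMap) and serves them on the given port. kubeCli and the expected
// pd/tikv replica counts would feed the real placement decisions.
func StartServer(kubeCli kubernetes.Interface, port int, pdReplicas, tikvReplicas int32) {
	mux := http.NewServeMux()
	mux.HandleFunc("/scheduler/filter", func(w http.ResponseWriter, r *http.Request) {
		// Stand-in handler: real code would decode ExtenderArgs, count how
		// many PD/TiKV pods of the same cluster already sit on each
		// candidate node, and filter out nodes that break the spread.
		w.WriteHeader(http.StatusOK)
	})
	glog.Infof("scheduler extender listening on :%d (pd=%d, tikv=%d)", port, pdReplicas, tikvReplicas)
	if err := http.ListenAndServe(fmt.Sprintf(":%d", port), mux); err != nil {
		// Return instead of exiting; wait.Forever in main.go restarts us.
		glog.Errorf("scheduler extender server exited: %v", err)
	}
}
```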