Add consecutive count check for Auto-scaling #1703

Merged
merged 23 commits into from
Feb 17, 2020
Changes from 9 commits
44 changes: 44 additions & 0 deletions manifests/crd.yaml
@@ -7078,6 +7078,12 @@ spec:
metric will be set to 80% average CPU utilization.
items: {}
type: array
metricsTimeWindowSeconds:
description: MetricsTimeWindowSeconds describes the time window, in seconds,
for the metrics to be queried in Prometheus. If not set, the
default value is 180.
format: int32
type: integer
minReplicas:
description: minReplicas is the lower limit for the number of replicas
to which the autoscaler can scale down. It defaults to 1 pod.
@@ -7090,12 +7096,28 @@ spec:
will be set to 500
format: int32
type: integer
scaleInThreshold:
description: ScaleInThreshold describes the consecutive threshold
for auto-scaling. If the consecutive count of scale-in
results reaches this number, the auto-scaling is
performed. If not set, the default value is 1, which means
no consecutive check is applied.
format: int32
type: integer
scaleOutIntervalSeconds:
description: ScaleOutIntervalSeconds represents the duration, in seconds,
between consecutive scale-out operations. If not set, the default ScaleOutIntervalSeconds
will be set to 300.
format: int32
type: integer
scaleOutThreshold:
description: ScaleOutThreshold describes the consecutive threshold
for auto-scaling. If the consecutive count of scale-out
results reaches this number, the auto-scaling is
performed. If not set, the default value is 1, which means
no consecutive check is applied.
format: int32
type: integer
required:
- maxReplicas
type: object
@@ -7120,6 +7142,12 @@ spec:
metric will be set to 80% average CPU utilization.
items: {}
type: array
metricsTimeWindowSeconds:
description: MetricsTimeWindowSeconds describes the time window, in seconds,
for the metrics to be queried in Prometheus. If not set, the
default value is 180.
format: int32
type: integer
minReplicas:
description: minReplicas is the lower limit for the number of replicas
to which the autoscaler can scale down. It defaults to 1 pod.
@@ -7132,12 +7160,28 @@ spec:
will be set to 500
format: int32
type: integer
scaleInThreshold:
description: ScaleInThreshold describes the consecutive threshold
for auto-scaling. If the consecutive count of scale-in
results reaches this number, the auto-scaling is
performed. If not set, the default value is 1, which means
no consecutive check is applied.
format: int32
type: integer
scaleOutIntervalSeconds:
description: ScaleOutIntervalSeconds represents the duration, in seconds,
between consecutive scale-out operations. If not set, the default ScaleOutIntervalSeconds
will be set to 300.
format: int32
type: integer
scaleOutThreshold:
description: ScaleOutThreshold describes the consecutive threshold
for auto-scaling. If the consecutive count of scale-out
results reaches this number, the auto-scaling is
performed. If not set, the default value is 1, which means
no consecutive check is applied.
format: int32
type: integer
required:
- maxReplicas
type: object
63 changes: 63 additions & 0 deletions pkg/apis/pingcap/v1alpha1/openapi_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 20 additions & 0 deletions pkg/apis/pingcap/v1alpha1/tidbclusterautoscaler_types.go
@@ -112,6 +112,26 @@ type BasicAutoScalerSpec struct {
// If not set, the default metric will be set to 80% average CPU utilization.
// +optional
Metrics []v2beta2.MetricSpec `json:"metrics,omitempty"`

// MetricsTimeWindowSeconds describes the time window, in seconds, for the metrics
// to be queried in Prometheus.
// If not set, the default value is 180.
// +optional
MetricsTimeWindowSeconds *int32 `json:"metricsTimeWindowSeconds,omitempty"`

// ScaleOutThreshold describes the consecutive threshold for auto-scaling.
// If the consecutive count of scale-out results reaches this number,
// the auto-scaling is performed.
// If not set, the default value is 3.
// +optional
ScaleOutThreshold *int32 `json:"scaleOutThreshold,omitempty"`

// ScaleInThreshold describes the consecutive threshold for auto-scaling.
// If the consecutive count of scale-in results reaches this number,
// the auto-scaling would be performed.
// If not set, the default value is 3.
// +optional
ScaleInThreshold *int32 `json:"scaleInThreshold,omitempty"`
}

// TODO: sync status
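
The three new fields are optional `*int32` pointers, so a nil value means "use the default". As a rough illustration only, the sketch below shows one way such defaults could be resolved. The `autoScalerSpec` type and the `int32OrDefault` helper are hypothetical and not part of this PR; the defaults of 180 and 3 are taken from the field comments above (the CRD description at this commit states 1 for the two thresholds).

```go
package main

import "fmt"

// autoScalerSpec is a hypothetical, trimmed-down stand-in for
// BasicAutoScalerSpec that keeps only the three fields added in this diff.
type autoScalerSpec struct {
	MetricsTimeWindowSeconds *int32
	ScaleOutThreshold        *int32
	ScaleInThreshold         *int32
}

// int32OrDefault returns *v when the optional field is set and def otherwise.
// The helper name is an assumption made for this sketch.
func int32OrDefault(v *int32, def int32) int32 {
	if v == nil {
		return def
	}
	return *v
}

func main() {
	custom := int32(5)
	spec := autoScalerSpec{ScaleOutThreshold: &custom}

	// Defaults follow the field comments: 180 seconds and a threshold of 3.
	fmt.Println("metricsTimeWindowSeconds:", int32OrDefault(spec.MetricsTimeWindowSeconds, 180))
	fmt.Println("scaleOutThreshold:", int32OrDefault(spec.ScaleOutThreshold, 3))
	fmt.Println("scaleInThreshold:", int32OrDefault(spec.ScaleInThreshold, 3))
}
```

Using pointers rather than plain `int32` lets the code tell "unset" apart from an explicit zero, which is why the fields carry `omitempty`.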
15 changes: 15 additions & 0 deletions pkg/apis/pingcap/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/autoscaler/autoscaler/autoscaler_manager.go
@@ -94,7 +94,7 @@ func (am *autoScalerManager) syncTidbClusterReplicas(tc *v1alpha1.TidbCluster, o
}

//TODO: sync tac status
-func (am *autoScalerManager) syncAutoScalingStatus(tc *v1alpha1.TidbCluster, oldTCSpec *v1alpha1.TidbClusterSpec,
+func (am *autoScalerManager) syncAutoScalingStatus(tc *v1alpha1.TidbCluster, oldTc *v1alpha1.TidbClusterSpec,
tac *v1alpha1.TidbClusterAutoScaler) error {
return nil
}
43 changes: 32 additions & 11 deletions pkg/autoscaler/autoscaler/tidb_autoscaler.go
@@ -24,38 +24,59 @@ import (

func (am *autoScalerManager) syncTiDB(tc *v1alpha1.TidbCluster, tac *v1alpha1.TidbClusterAutoScaler, client promClient.Client) error {
if tac.Spec.TiDB == nil {
emptyConsecutiveCount(tc, v1alpha1.TiDBMemberType)
return nil
}
sts, err := am.stsLister.StatefulSets(tc.Namespace).Get(operatorUtils.GetStatefulSetName(tc, v1alpha1.TiDBMemberType))
if err != nil {
return err
}
if !checkAutoScalingPrerequisites(tc, sts, v1alpha1.TiDBMemberType) {
emptyConsecutiveCount(tc, v1alpha1.TiDBMemberType)
return nil
}
-targetReplicas := tc.Spec.TiDB.Replicas
-
-// TODO: sync tidb.metrics from prometheus
-// rate(process_cpu_seconds_total{cluster="tidb",job="tidb"}[threshold Minute])
-//for _, _ = range tac.Spec.TiDB.Metrics {
-// // revive:disable:empty-block
-//}
+currentReplicas := tc.Spec.TiDB.Replicas
+targetReplicas := calculateRecommendedReplicas(tac, v1alpha1.TiDBMemberType, client)
+targetReplicas = limitTargetReplicas(targetReplicas, tac, v1alpha1.TiDBMemberType)
if targetReplicas == tc.Spec.TiDB.Replicas {
emptyConsecutiveCount(tc, v1alpha1.TiDBMemberType)
return nil
}
return syncTiDBAfterCalculated(tc, tac, currentReplicas, targetReplicas)
}

// syncTiDBAfterCalculated checks the consecutive count to avoid jitter, and it also checks the interval
// between auto-scaling operations. If either condition is not met, the auto-scaling is rejected.
// If the auto-scaling is permitted, the timestamp is recorded and the consecutive count is reset to zero.
func syncTiDBAfterCalculated(tc *v1alpha1.TidbCluster, tac *v1alpha1.TidbClusterAutoScaler, currentReplicas, recommendedReplicas int32) error {
if err := updateConsecutiveCount(tc, tac, v1alpha1.TiDBMemberType, currentReplicas, recommendedReplicas); err != nil {
return err
}

ableToScale, err := checkConsecutiveCount(tc, tac, v1alpha1.TiDBMemberType, currentReplicas, recommendedReplicas)
if err != nil {
return err
}
if !ableToScale {
return nil
}
intervalSeconds := tac.Spec.TiDB.ScaleInIntervalSeconds
-if targetReplicas > tc.Spec.TiDB.Replicas {
+if recommendedReplicas > currentReplicas {
intervalSeconds = tac.Spec.TiDB.ScaleOutIntervalSeconds
}
-ableToScale, err := checkStsAutoScalingInterval(tc, *intervalSeconds, v1alpha1.TiDBMemberType)
+ableToScale, err = checkStsAutoScalingInterval(tc, *intervalSeconds, v1alpha1.TiDBMemberType)
if err != nil {
return err
}
if !ableToScale {
return nil
}
-tc.Spec.Annotations[label.AnnTiDBLastAutoScalingTimestamp] = time.Now().String()
-tc.Spec.TiDB.Replicas = targetReplicas
+updateTcTiDBAnnIfScale(tc)
+tc.Spec.TiDB.Replicas = recommendedReplicas
return nil
}
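
The bodies of `updateConsecutiveCount` and `checkConsecutiveCount` are not part of this diff. As a minimal sketch of the idea described in the comment on `syncTiDBAfterCalculated`, the example below keeps one streak per direction, resets the opposite streak whenever the direction flips, and permits scaling only once the streak reaches the threshold. The `consecutiveCounter` type and its methods are hypothetical; how the real helpers store their state is not shown here.

```go
package main

import "fmt"

// consecutiveCounter is a hypothetical in-memory counter used only to
// illustrate the anti-jitter check; it is not the PR's implementation.
type consecutiveCounter struct {
	scaleOut int32
	scaleIn  int32
}

// reset clears both streaks, mirroring what emptyConsecutiveCount is
// described as doing after a scale or when no change is recommended.
func (c *consecutiveCounter) reset() { c.scaleOut, c.scaleIn = 0, 0 }

// observe records one recommendation. A scale-out result extends the
// scale-out streak and clears the scale-in streak, and vice versa.
func (c *consecutiveCounter) observe(current, recommended int32) {
	switch {
	case recommended > current:
		c.scaleOut++
		c.scaleIn = 0
	case recommended < current:
		c.scaleIn++
		c.scaleOut = 0
	default:
		c.reset()
	}
}

// ableToScale reports whether the streak in the recommended direction has
// reached its threshold.
func (c *consecutiveCounter) ableToScale(current, recommended, outThreshold, inThreshold int32) bool {
	switch {
	case recommended > current:
		return c.scaleOut >= outThreshold
	case recommended < current:
		return c.scaleIn >= inThreshold
	}
	return false
}

func main() {
	c := &consecutiveCounter{}
	current := int32(3)
	// One sync loop iteration per recommendation; thresholds are both 3.
	for i, recommended := range []int32{4, 4, 3, 4, 4, 4} {
		c.observe(current, recommended)
		ok := c.ableToScale(current, recommended, 3, 3)
		fmt.Printf("sync %d: recommended=%d ableToScale=%v\n", i+1, recommended, ok)
		if ok {
			current = recommended
			c.reset()
		}
	}
}
```

In this run the third recommendation equals the current replica count and breaks the streak, so scaling is permitted only on the sixth sync, after three uninterrupted scale-out results.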

func updateTcTiDBAnnIfScale(tc *v1alpha1.TidbCluster) {
tc.Annotations[label.AnnTiDBLastAutoScalingTimestamp] = time.Now().String()
emptyConsecutiveCount(tc, v1alpha1.TiDBMemberType)
}