Expose Horizontal Pod Autoscaler Behavior and add hpa scaledown test #1077

Merged
merged 17 commits into from
Sep 15, 2022
Changes from 6 commits
22 changes: 22 additions & 0 deletions apis/v1alpha1/opentelemetrycollector_types.go
@@ -40,6 +40,13 @@ type OpenTelemetryCollectorSpec struct {
	// MaxReplicas sets an upper bound to the autoscaling feature. If MaxReplicas is set autoscaling is enabled.
	// +optional
	MaxReplicas *int32 `json:"maxReplicas,omitempty"`

	// Autoscaler specifies the pod autoscaling configuration to use
	// for the OpenTelemetryCollector workload.
	//
	// +optional
	Autoscaler *AutoscalerSpec `json:"autoscaler,omitempty"`

Contributor

question: did we consider embedding the HPA spec in here? Or at least embed the autoscaling behavior spec here?

Member Author

I was just adding enough code to be able to get an e2e test to work within the time allocated, which means we need to scale down much quicker than the default 300 seconds.

@pavolloffay what do you think? Do I need to add the other values of HPA scaling rules here?

Contributor

For my use case, I know that I would like to be able to specify policies and not just StabilizationWindowSeconds. Embedding only StabilizationWindowSeconds is going to make it so we need to add in each feature on request, making a code change for each one.
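
One way to picture the suggestion above is a spec that carries a whole behavior object instead of individual fields. The sketch below is only an illustration: `ScalingPolicy`, `ScalingRules`, `Behavior`, the `behavior` field name, and `parseAutoscaler` are hypothetical local stand-ins that copy a subset of the JSON shape of the Kubernetes HPA behavior types. They are not the operator's API and not what this PR implements.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ScalingPolicy, ScalingRules and Behavior are local stand-ins (assumptions)
// that mirror a subset of the JSON shape of the Kubernetes HPA behavior types.
type ScalingPolicy struct {
	Type          string `json:"type"`
	Value         int32  `json:"value"`
	PeriodSeconds int32  `json:"periodSeconds"`
}

type ScalingRules struct {
	StabilizationWindowSeconds *int32          `json:"stabilizationWindowSeconds,omitempty"`
	Policies                   []ScalingPolicy `json:"policies,omitempty"`
}

type Behavior struct {
	ScaleUp   *ScalingRules `json:"scaleUp,omitempty"`
	ScaleDown *ScalingRules `json:"scaleDown,omitempty"`
}

// AutoscalerSpec is a hypothetical variant that embeds the whole behavior,
// so new HPA options would not need a new field each time.
type AutoscalerSpec struct {
	Behavior *Behavior `json:"behavior,omitempty"`
}

// parseAutoscaler is a hypothetical helper that decodes the spec from JSON.
func parseAutoscaler(raw string) (AutoscalerSpec, error) {
	var spec AutoscalerSpec
	err := json.Unmarshal([]byte(raw), &spec)
	return spec, err
}

func main() {
	raw := `{"behavior":{"scaleDown":{"stabilizationWindowSeconds":15,
		"policies":[{"type":"Pods","value":1,"periodSeconds":60}]}}}`
	spec, err := parseAutoscaler(raw)
	if err != nil {
		panic(err)
	}
	down := spec.Behavior.ScaleDown
	fmt.Println(*down.StabilizationWindowSeconds, down.Policies[0].Type)
}
```

With this shape, policies and the stabilization window travel together, which is the trade-off the comment describes: a larger API surface up front in exchange for fewer follow-up code changes.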


	// SecurityContext will be set as the container security context.
	// +optional
	SecurityContext *v1.SecurityContext `json:"securityContext,omitempty"`
@@ -195,6 +202,21 @@ type OpenTelemetryCollectorList struct {
	Items []OpenTelemetryCollector `json:"items"`
}

// AutoscalerSpec defines the OpenTelemetryCollector's pod autoscaling specification.
type AutoscalerSpec struct {
	// ScaleUp is the minimum number of seconds to wait before scaling up
	//
	// +optional
	// +kubebuilder:validation:Minimum=0
	ScaleUp *int32 `json:"scaleUp,omitempty"`

	// ScaleDown is the minimum number of seconds to wait before scaling down
	//
	// +optional
	// +kubebuilder:validation:Minimum=0
	ScaleDown *int32 `json:"scaleDown,omitempty"`
}

func init() {
	SchemeBuilder.Register(&OpenTelemetryCollector{}, &OpenTelemetryCollectorList{})
}
9 changes: 9 additions & 0 deletions apis/v1alpha1/opentelemetrycollector_webhook.go
@@ -129,6 +129,15 @@ func (r *OpenTelemetryCollector) validateCRDSpec() error {
			return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, minReplicas should be one or more")
		}

		if r.Spec.Autoscaler != nil {

Contributor

We should also check that min or max replicas is set in order to use this feature

Member Author

That's inside an if statement that's checking whether maxReplicas is set.

			if r.Spec.Autoscaler.ScaleDown != nil && *r.Spec.Autoscaler.ScaleDown < int32(1) {
				return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, scaleDown should be one or more")
			}

			if r.Spec.Autoscaler.ScaleUp != nil && *r.Spec.Autoscaler.ScaleUp < int32(1) {
				return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, scaleUp should be one or more")
			}
		}
	}

	return nil
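
The nesting discussed in the thread above can be sketched in isolation. This is a simplified stand-in rather than the webhook itself: `validateAutoscaler` and the local `AutoscalerSpec` are hypothetical, and the error text is shortened. It assumes, as the author's reply states, that the autoscaler checks only run when `maxReplicas` is set. Note that the webhook rejects a value of 0, while the generated CRD schema in this PR declares a minimum of 0.

```go
package main

import (
	"errors"
	"fmt"
)

// AutoscalerSpec is a local stand-in for the type added in this PR.
type AutoscalerSpec struct {
	ScaleUp   *int32
	ScaleDown *int32
}

// validateAutoscaler is a hypothetical helper mirroring the webhook checks:
// nothing is validated unless maxReplicas is set, and each window must be
// at least one second when it is present.
func validateAutoscaler(maxReplicas *int32, a *AutoscalerSpec) error {
	if maxReplicas == nil || a == nil {
		return nil
	}
	if a.ScaleDown != nil && *a.ScaleDown < 1 {
		return errors.New("scaleDown should be one or more")
	}
	if a.ScaleUp != nil && *a.ScaleUp < 1 {
		return errors.New("scaleUp should be one or more")
	}
	return nil
}

func main() {
	max, zero, ten := int32(2), int32(0), int32(10)

	// Rejected: scaleDown is present but below one.
	fmt.Println(validateAutoscaler(&max, &AutoscalerSpec{ScaleDown: &zero}))
	// Accepted: scaleUp is valid and scaleDown is simply unset.
	fmt.Println(validateAutoscaler(&max, &AutoscalerSpec{ScaleUp: &ten}))
	// Skipped entirely: autoscaling is off without maxReplicas.
	fmt.Println(validateAutoscaler(nil, &AutoscalerSpec{ScaleDown: &zero}))
}
```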
30 changes: 30 additions & 0 deletions apis/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

17 changes: 17 additions & 0 deletions bundle/manifests/opentelemetry.io_opentelemetrycollectors.yaml
@@ -59,6 +59,23 @@ spec:
                description: Args is the set of arguments to pass to the OpenTelemetry
                  Collector binary
                type: object
              autoscaler:
                description: Autoscaler specifies the pod autoscaling configuration
                  to use for the OpenTelemetryCollector workload.
                properties:
                  scaleDown:
                    description: ScaleDown is the minimum number of seconds to wait
                      before scaling down
                    format: int32
                    minimum: 0
                    type: integer
                  scaleUp:
                    description: ScaleUp is the minimum number of seconds to wait
                      before scaling up
                    format: int32
                    minimum: 0
                    type: integer
                type: object
              config:
                description: Config is the raw JSON to be used as the collector's
                  configuration. Refer to the OpenTelemetry Collector documentation
17 changes: 17 additions & 0 deletions config/crd/bases/opentelemetry.io_opentelemetrycollectors.yaml
@@ -57,6 +57,23 @@ spec:
                description: Args is the set of arguments to pass to the OpenTelemetry
                  Collector binary
                type: object
              autoscaler:
                description: Autoscaler specifies the pod autoscaling configuration
                  to use for the OpenTelemetryCollector workload.
                properties:
                  scaleDown:
                    description: ScaleDown is the minimum number of seconds to wait
                      before scaling down
                    format: int32
                    minimum: 0
                    type: integer
                  scaleUp:
                    description: ScaleUp is the minimum number of seconds to wait
                      before scaling up
                    format: int32
                    minimum: 0
                    type: integer
                type: object
              config:
                description: Config is the raw JSON to be used as the collector's
                  configuration. Refer to the OpenTelemetry Collector documentation
47 changes: 47 additions & 0 deletions docs/api.md
@@ -1691,6 +1691,13 @@ OpenTelemetryCollectorSpec defines the desired state of OpenTelemetryCollector.
Args is the set of arguments to pass to the OpenTelemetry Collector binary<br/>
</td>
<td>false</td>
</tr><tr>
<td><b><a href="#opentelemetrycollectorspecautoscaler">autoscaler</a></b></td>
<td>object</td>
<td>
Autoscaler specifies the pod autoscaling configuration to use for the OpenTelemetryCollector workload.<br/>
</td>
<td>false</td>
</tr><tr>
<td><b>config</b></td>
<td>string</td>
@@ -1866,6 +1873,46 @@ OpenTelemetryCollectorSpec defines the desired state of OpenTelemetryCollector.
</table>


### OpenTelemetryCollector.spec.autoscaler
<sup><sup>[↩ Parent](#opentelemetrycollectorspec)</sup></sup>



Autoscaler specifies the pod autoscaling configuration to use for the OpenTelemetryCollector workload.

<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Required</th>
</tr>
</thead>
<tbody><tr>
<td><b>scaleDown</b></td>
<td>integer</td>
<td>
ScaleDown is the minimum number of seconds to wait before scaling down<br/>
<br/>
<i>Format</i>: int32<br/>
<i>Minimum</i>: 0<br/>
</td>
<td>false</td>
</tr><tr>
<td><b>scaleUp</b></td>
<td>integer</td>
<td>
ScaleUp is the minimum number of seconds to wait before scaling up<br/>
<br/>
<i>Format</i>: int32<br/>
<i>Minimum</i>: 0<br/>
</td>
<td>false</td>
</tr></tbody>
</table>


### OpenTelemetryCollector.spec.env[index]
<sup><sup>[↩ Parent](#opentelemetrycollectorspec)</sup></sup>

39 changes: 39 additions & 0 deletions pkg/collector/horizontalpodautoscaler.go
@@ -29,6 +29,8 @@ import (
)

const defaultCPUTarget int32 = 90
const defaultScaleUpTime int32 = 60
const defaultScaleDownTime int32 = 300

func HorizontalPodAutoscaler(cfg config.Config, logger logr.Logger, otelcol v1alpha1.OpenTelemetryCollector) client.Object {
	autoscalingVersion := cfg.AutoscalingVersion()
@@ -47,6 +49,15 @@ func HorizontalPodAutoscaler(cfg config.Config, logger logr.Logger, otelcol v1al
		Annotations: annotations,
	}

	// Each window falls back to its default unless its own field is set.
	scaleUpTime := defaultScaleUpTime
	if otelcol.Spec.Autoscaler != nil && otelcol.Spec.Autoscaler.ScaleUp != nil {
		scaleUpTime = *otelcol.Spec.Autoscaler.ScaleUp
	}
	scaleDownTime := defaultScaleDownTime
	if otelcol.Spec.Autoscaler != nil && otelcol.Spec.Autoscaler.ScaleDown != nil {
		scaleDownTime = *otelcol.Spec.Autoscaler.ScaleDown
	}

	if autoscalingVersion == autodetect.AutoscalingVersionV2Beta2 {
		targetCPUUtilization := autoscalingv2beta2.MetricSpec{
			Type: autoscalingv2beta2.ResourceMetricSourceType,
@@ -73,6 +84,20 @@ func HorizontalPodAutoscaler(cfg config.Config, logger logr.Logger, otelcol v1al
				Metrics: metrics,
			},
		}

		if otelcol.Spec.Autoscaler != nil {
			scaleUpRules := &autoscalingv2beta2.HPAScalingRules{
				StabilizationWindowSeconds: &scaleUpTime,
			}
			scaleDownRules := &autoscalingv2beta2.HPAScalingRules{
				StabilizationWindowSeconds: &scaleDownTime,
			}
			behavior := &autoscalingv2beta2.HorizontalPodAutoscalerBehavior{
				ScaleUp:   scaleUpRules,
				ScaleDown: scaleDownRules,
			}
			autoscaler.Spec.Behavior = behavior
		}
		result = &autoscaler
	} else {
		targetCPUUtilization := autoscalingv2.MetricSpec{
@@ -100,6 +125,20 @@ func HorizontalPodAutoscaler(cfg config.Config, logger logr.Logger, otelcol v1al
				Metrics: metrics,
			},
		}

		if otelcol.Spec.Autoscaler != nil {
			scaleUpRules := &autoscalingv2.HPAScalingRules{
				StabilizationWindowSeconds: &scaleUpTime,
			}
			scaleDownRules := &autoscalingv2.HPAScalingRules{
				StabilizationWindowSeconds: &scaleDownTime,
			}
			behavior := &autoscalingv2.HorizontalPodAutoscalerBehavior{
				ScaleUp:   scaleUpRules,
				ScaleDown: scaleDownRules,
			}
			autoscaler.Spec.Behavior = behavior
		}
		result = &autoscaler
	}

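
The defaulting and behavior construction in this file can be sketched without the Kubernetes dependencies. The types `scalingRules` and `behavior` below are local stand-ins for `HPAScalingRules` and `HorizontalPodAutoscalerBehavior`, and `windowOrDefault` and `buildBehavior` are hypothetical helpers, not functions from the operator. The sketch assumes each window should fall back to its own default independently.

```go
package main

import "fmt"

const (
	defaultScaleUpTime   int32 = 60
	defaultScaleDownTime int32 = 300
)

// AutoscalerSpec is a local stand-in for the type added in this PR.
type AutoscalerSpec struct {
	ScaleUp   *int32
	ScaleDown *int32
}

// scalingRules and behavior are local stand-ins (assumptions) for the
// Kubernetes HPAScalingRules and HorizontalPodAutoscalerBehavior types.
type scalingRules struct {
	StabilizationWindowSeconds *int32
}

type behavior struct {
	ScaleUp   *scalingRules
	ScaleDown *scalingRules
}

// windowOrDefault returns the configured value, or def when v is nil.
func windowOrDefault(v *int32, def int32) int32 {
	if v != nil {
		return *v
	}
	return def
}

// buildBehavior returns nil when no autoscaler is configured, so the HPA
// keeps the cluster defaults; otherwise it fills both stabilization windows.
func buildBehavior(a *AutoscalerSpec) *behavior {
	if a == nil {
		return nil
	}
	up := windowOrDefault(a.ScaleUp, defaultScaleUpTime)
	down := windowOrDefault(a.ScaleDown, defaultScaleDownTime)
	return &behavior{
		ScaleUp:   &scalingRules{StabilizationWindowSeconds: &up},
		ScaleDown: &scalingRules{StabilizationWindowSeconds: &down},
	}
}

func main() {
	down := int32(15)
	// Only scaleDown is set; scaleUp falls back to its default.
	b := buildBehavior(&AutoscalerSpec{ScaleDown: &down})
	fmt.Println(*b.ScaleUp.StabilizationWindowSeconds, *b.ScaleDown.StabilizationWindowSeconds) // prints: 60 15
}
```

Routing both fields through one helper keeps each nil check next to the pointer it protects, so a spec that sets only one window cannot trigger a nil dereference on the other.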
3 changes: 3 additions & 0 deletions tests/e2e/autoscale/00-install.yaml
@@ -5,6 +5,9 @@ metadata:
spec:
  minReplicas: 1
  maxReplicas: 2
  autoscaler:
    scaleUp: 10
    scaleDown: 15
  resources:
    limits:
      cpu: 500m