Pub/Sub Scaler: Inappropriate alignment #6052

Open
karotchykau opened this issue Aug 8, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@karotchykau

Report

With the gcp-pubsub scaler, some metrics will, in most cases, fail to trigger upscaling when there are no changes in traffic.

For instance, consider NumUndeliveredMessages. If someone publishes a large number of messages and then stops, the resource starts scaling down to 0 once timeHorizon has passed, even though many messages are still unprocessed. The reason is that the scaler always assigns a static DELTA alignment (https://github.com/kedacore/keda/blob/v2.15.0/pkg/scalers/gcp/gcp_stackdriver_client.go#L370). So if 1,000,000 messages were published and only 100,000 of them could be processed during timeHorizon, the remaining 900,000 stay unacknowledged: the cluster has been scaled down to 0, and it will not scale up again until someone publishes a new message.

P.S. DELTA will actually yield negative numbers here, but that doesn't change the outcome.
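To make the effect concrete, here is a minimal sketch of the arithmetic. It is a simplified model and not KEDA's or Cloud Monitoring's actual code: `delta_align` is a hypothetical stand-in for a DELTA aligner, and the backlog samples and drain rate are made-up numbers.

```python
from statistics import mean


def delta_align(samples):
    """Hypothetical stand-in for a DELTA aligner: change between consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


# Made-up backlog sampled once a minute. Publishing has already stopped and
# the consumers drain 20,000 messages per minute.
backlog = [1_000_000, 980_000, 960_000, 940_000, 920_000, 900_000]
target = 100  # `value` from the ScaledObject

deltas = delta_align(backlog)      # every entry is negative while draining
delta_metric = mean(deltas)        # what a mean over DELTA-aligned points gives
gauge_metric = mean(backlog[1:])   # what a mean over the raw gauge would give

print(f"mean of deltas: {delta_metric}, target: {target}")
print(f"mean of gauge:  {gauge_metric}, target: {target}")
```

Under these assumptions the delta-based value stays below the target (it is negative), so nothing asks for more replicas, while the gauge-based value is still far above the target.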

Expected Behavior

You should be able to somehow configure the alignment type (as well as the interval).

Actual Behavior

You cannot configure the alignment type (as well as the interval); therefore, old messages will not be taken into account.

Steps to Reproduce the Problem

  1. Create ScaledObject for gcp-pubsub. Here is an example of parameters that I used before:
      mode: "NumUndeliveredMessages"
      aggregation: "mean"
      value: "100"
      timeHorizon: "5m"
  2. In your scaleTargetRef (Deployment/etc.) configure some kind of delay so it processes messages slower.
  3. Publish a decent number of messages that you know won't be fully processed during timeHorizon (e.g. 100,000).
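As a rough aid for choosing the numbers in the steps above, the following sketch estimates an upper bound on how many messages can be handled within timeHorizon. The helper `max_processed`, the 0.5 s per-message delay and the 10 replicas are all hypothetical values chosen for illustration.

```python
def max_processed(time_horizon_s, delay_s, replicas):
    """Upper bound on messages handled in the window, assuming each replica
    processes one message at a time and spends `delay_s` seconds on each."""
    per_replica = int(time_horizon_s / delay_s)
    return per_replica * replicas


time_horizon_s = 5 * 60   # timeHorizon: "5m"
delay_s = 0.5             # hypothetical artificial delay per message
replicas = 10             # hypothetical replica count while scaled up
published = 100_000       # number of messages published in the last step

capacity = max_processed(time_horizon_s, delay_s, replicas)
print(f"capacity within timeHorizon: {capacity}, published: {published}")
print("backlog outlives timeHorizon:", published > capacity)
```

If `published` exceeds `capacity`, part of the backlog should still be there when timeHorizon has passed, which is the condition the reproduction needs.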

Logs from KEDA operator

N/A

The logs are clean: they contain no errors and only show the scaling behavior described above.

KEDA Version

2.15.0

Kubernetes Version

1.29

Platform

Google Cloud

Scaler Details

Pub/Sub

Anything else?

For those who encounter the same issue: you can fall back to gcp-stackdriver instead:

      projectId: PROJECT
      filter: 'resource.type="pubsub_subscription" AND resource.labels.subscription_id="SUBSCRIPTION_ID" AND metric.type="pubsub.googleapis.com/subscription/num_undelivered_messages"'
      ...
@karotchykau karotchykau added the bug Something isn't working label Aug 8, 2024
@karotchykau
Author

N.B.

Without any aggregation (""), it will just skip the alignment configuration. Here is the result from BuildMQLQuery:

fetch pubsub_subscription
| metric 'pubsub.googleapis.com/subscription/num_undelivered_messages'
| filter (resource.project_id == 'PROJECT' && resource.subscription_id == 'SUBSCRIPTION_ID')
| within 5m

In other words, the alignment is simply skipped and nothing replaces it. It's also unclear in this case what kind of reduction is applied after the within 5m above.
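For illustration, here is a sketch of what a configurable alignment could look like at the query-building level. This is not KEDA's BuildMQLQuery: `build_query` and its `aligner` parameter are hypothetical, the position of the align clause is an assumption, and `next_older(1m)` is used only as an example of an MQL aligner that carries the most recent gauge value forward.

```python
def build_query(project_id, subscription_id, time_horizon, aligner=None):
    """Hypothetical builder: reproduces the query above and optionally adds
    an explicit alignment clause chosen by the user."""
    lines = [
        "fetch pubsub_subscription",
        "| metric 'pubsub.googleapis.com/subscription/num_undelivered_messages'",
        f"| filter (resource.project_id == '{project_id}'"
        f" && resource.subscription_id == '{subscription_id}')",
    ]
    if aligner:
        # Assumption: the aligner expression is passed through unchanged.
        lines.append(f"| align {aligner}")
    lines.append(f"| within {time_horizon}")
    return "\n".join(lines)


# Same output as the query shown above (no alignment configured).
print(build_query("PROJECT", "SUBSCRIPTION_ID", "5m"))
print()
# With a user-selected alignment type and interval.
print(build_query("PROJECT", "SUBSCRIPTION_ID", "5m", aligner="next_older(1m)"))
```

The point of the sketch is only that the alignment type and interval become inputs instead of a hard-coded DELTA.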

Projects
Status: To Triage