[v2] Keda operator seems to have a memory leak when handling ScaledJob (with a pubsub scaler) #1257

Closed
OrenLederman opened this issue Oct 15, 2020 · 2 comments

OrenLederman commented Oct 15, 2020

Keda operator consumes increasing amount of memory. In a GKE installation with a single ScaledJob, it used 1.6gb of memory after 12 hours.

Expected Behavior

Operator memory usage should remain roughly constant.

Actual Behavior

I observed this behavior on clean cluster installations, on both GKE and Minikube. I tried the following configurations (a memory-sampling sketch follows this list):

  • KEDA only: minor memory increase.
  • KEDA with a single ScaledJob: memory increased by about 35 MB per hour.
  • KEDA with two ScaledJobs: memory increased by about 72 MB per hour.
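
A minimal sketch of one way to collect per-hour memory figures like the ones above, using the Kubernetes metrics API. This is not part of the original report; it assumes metrics-server is installed, that the operator runs in the keda namespace, and an arbitrary 5-minute sampling interval:

// watch_keda_memory.go - prints operator pod memory usage periodically so a
// growth rate (MB per hour) can be read off the log over a long run.
package main

import (
	"context"
	"fmt"
	"path/filepath"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
	metrics "k8s.io/metrics/pkg/client/clientset/versioned"
)

func main() {
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	mc, err := metrics.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Sample every 5 minutes; the "keda" namespace matches the Helm install below.
	for {
		pods, err := mc.MetricsV1beta1().PodMetricses("keda").List(context.TODO(), metav1.ListOptions{})
		if err != nil {
			panic(err)
		}
		for _, pm := range pods.Items {
			for _, c := range pm.Containers {
				fmt.Printf("%s %s/%s memory=%dMi\n",
					time.Now().Format(time.RFC3339), pm.Name, c.Name,
					c.Usage.Memory().Value()/(1024*1024))
			}
		}
		time.Sleep(5 * time.Minute)
	}
}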

Steps to Reproduce the Problem

  1. Create a new minikube cluster: minikube start --memory=4096 --cpus=4 --kubernetes-version=v1.16.13 --disk-size=20
  2. Install Keda using helm:
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --version 2.0.0-rc --namespace keda 
  3. Deploy the ScaledJob. See the YAML file at the end of this report. Note that the job runs code that reads messages in a loop, but that doesn't matter for this experiment (I hit the same issue with code that reads a single message); a rough consumer sketch follows these steps:
kubectl create ns dev
kubectl apply -n dev -f ~/temp/scaledjob_v2.yaml
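
For context, a rough Go sketch of the kind of consumer such an image could run. This is not the source of patnaikshekhar/keda-pubsub-sample, just an illustration that mirrors the environment variables wired into the ScaledJob spec in the appendix:

// consumer.go - pulls messages from the Pub/Sub subscription in a loop.
package main

import (
	"context"
	"log"
	"os"

	"cloud.google.com/go/pubsub"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()

	// PROJECT_ID, SUBSCRIPTION_NAME and GOOGLE_APPLICATION_CREDENTIALS_JSON
	// correspond to the env vars set in the ScaledJob spec.
	client, err := pubsub.NewClient(ctx, os.Getenv("PROJECT_ID"),
		option.WithCredentialsJSON([]byte(os.Getenv("GOOGLE_APPLICATION_CREDENTIALS_JSON"))))
	if err != nil {
		log.Fatalf("pubsub.NewClient: %v", err)
	}
	defer client.Close()

	sub := client.Subscription(os.Getenv("SUBSCRIPTION_NAME"))

	// Receive blocks and keeps pulling messages until the context is
	// cancelled; each message is acked after being logged.
	if err := sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		log.Printf("received: %s", string(m.Data))
		m.Ack()
	}); err != nil {
		log.Fatalf("Receive: %v", err)
	}
}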

Specifications

  • **KEDA Version:** 2.0.0-rc
  • **Platform & Version:** minikube v1.14.0
  • **Kubernetes Version:** 1.16.13
  • **Scaler(s):** GCP Pub/Sub

Appendix

# scaledjob_v2.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
 name: test-keda-pubsub-scaledjob
spec:
 jobTargetRef:
   parallelism: 1
   completions: 1
   activeDeadlineSeconds: 600 # Specifies the duration in seconds relative to the startTime that the job may be active before the system tries to terminate it; value must be positive integer
   backoffLimit: 6 # Specifies the number of retries before marking this job failed. Defaults to 6
   template:
     # describes the [job template](https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/)
     spec:
       containers:
       - name: consumer
         image:  patnaikshekhar/keda-pubsub-sample:1
         env:
           - name: SUBSCRIPTION_NAME
             value: "test-ol-keda-sub"
           - name: GOOGLE_APPLICATION_CREDENTIALS_JSON
             valueFrom:
               secretKeyRef:
                 name: gcp-secret
                 key: GOOGLE_APPLICATION_CREDENTIALS_JSON
           - name: PROJECT_ID
             valueFrom:
               secretKeyRef:
                 name: gcp-secret
                 key: PROJECT_ID
 pollingInterval: 30  # Optional. Default: 30 seconds
 maxReplicaCount: 100 # Optional. Default: 100
 successfulJobsHistoryLimit: 5
 failedJobsHistoryLimit: 5
 triggers:
 - type: gcp-pubsub
   metadata:
     subscriptionSize: "1"
     subscriptionName: "test-ol-keda-sub" # Required
     credentialsFromEnv: GOOGLE_APPLICATION_CREDENTIALS_JSON # Required
OrenLederman added the bug label on Oct 15, 2020

OrenLederman commented Oct 15, 2020

Another interesting piece of information: the CPU load and memory usage graphs appear to be correlated. If I had to guess, this is not exactly a memory leak. Instead, the operator keeps adding data to some data structure (which explains the memory growth) and traverses that structure periodically (which explains the rising CPU load).
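
To make that hypothesis concrete, here is a Go sketch (purely illustrative, not KEDA's actual code) of the kind of pattern being described: state accumulates every polling interval and the whole accumulated collection is traversed each poll, so memory and CPU rise together:

// Illustration only: an ever-growing, periodically traversed collection
// produces correlated CPU and memory growth like the graphs described above.
package main

import "time"

type pollRecord struct {
	scaledJob string
	payload   [64 * 1024]byte // stand-in for whatever state might be retained per poll
}

func main() {
	var history []pollRecord // never pruned -> unbounded memory growth

	ticker := time.NewTicker(30 * time.Second) // pollingInterval from the spec
	defer ticker.Stop()

	for range ticker.C {
		// Each poll adds another record...
		history = append(history, pollRecord{scaledJob: "test-keda-pubsub-scaledjob"})

		// ...and each poll walks the whole accumulated history, so CPU
		// usage rises in step with memory usage.
		active := 0
		for range history {
			active++
		}
		_ = active
	}
}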


Fixed by #1284
