This repository has been archived by the owner on Feb 14, 2023. It is now read-only.

Adjust default memory/cpu requests and limits #115

Merged: 1 commit merged into master from adjust-request-limits-177468691 on Apr 12, 2021

Conversation


@tcdowney tcdowney commented Apr 8, 2021

There's been evidence that the Cloud Controller API and registry-buddy components have been getting OOMKilled under typical usage in production environments. This change adjusts the resource limits to be more in line with what our monit limits historically were in the capi-release BOSH release. E.g. https://github.com/cloudfoundry/capi-release/blob/6f9b4d5ec3ba46adf39d7de2d60e1b58a4511784/jobs/cloud_controller_ng/spec#L893

Some memory requests were adjusted up and down to better account for typical/min usage of these components.

This change also adjusts some of the CPU limits to be more generous for
critical processes such as nginx and the cf-api-controllers reconciler.

Overall there is a +210Mi increase in memory requests (only +10 if you ignore the short-lived ccdb migration Job).

If operators want to tweak these values further they can follow this approach in our vertical scaling docs.

Tracker story: #177468691
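For context, all of the knobs tuned here follow the standard Kubernetes container resources pattern; a representative snippet (values illustrative, not the exact ones in this PR) looks like:

```yaml
resources:
  requests:        # what the scheduler reserves for the container
    cpu: 500m
    memory: 500Mi
  limits:          # hard caps; exceeding the memory limit gets the container OOMKilled
    cpu: 1000m
    memory: 1Gi
```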

Commit message:

There's been evidence that the Cloud Controller API and registry-buddy
components have been getting OOMKilled under typical usage in production
environments. This change adjusts the resource limits to be more in line
with what our monit limits historically were in the capi-release BOSH
release. E.g. https://github.com/cloudfoundry/capi-release/blob/6f9b4d5ec3ba46adf39d7de2d60e1b58a4511784/jobs/cloud_controller_ng/spec#L893

This change also adjusts some of the CPU limits to be more generous for
critical processes such as nginx and the cf-api-controllers reconciler.

[#177468691](https://www.pivotaltracker.com/story/show/177468691)

Authored-by: Tim Downey <tdowney@vmware.com>
```diff
@@ -40,10 +40,10 @@ spec:
       resources:
         requests:
           cpu: 500m
-          memory: 300Mi
+          memory: 500Mi
```
tcdowney (Member Author):

In an environment with no load Cloud Controller is using ~400Mi of RAM. For decent quality of service in light environments it should have at least 500Mi. In practice I've seen it reach a steady state of ~3.5Gi of RAM on CF for VMs environments with occasional surges past 4Gi, but these were typically anomalous cases in very large scale environments.
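To sanity-check observations like the ~400Mi idle figure against a request, it helps to normalize Kubernetes quantity strings to bytes. A small helper (illustrative only, not part of this PR):

```python
# Illustrative helper: convert Kubernetes memory quantity strings like
# "500Mi" or "3.5Gi" into bytes so observed usage can be compared
# against requests/limits numerically.
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
         "K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}

def quantity_to_bytes(q: str) -> int:
    # Try longer suffixes ("Mi") before shorter ones ("M")
    for suffix, factor in sorted(UNITS.items(), key=lambda kv: -len(kv[0])):
        if q.endswith(suffix):
            return int(float(q[:-len(suffix)]) * factor)
    return int(q)  # plain bytes

# ~400Mi observed at idle vs. a 500Mi request leaves ~100Mi of headroom
print(quantity_to_bytes("500Mi") - quantity_to_bytes("400Mi"))  # 104857600
```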

Reviewer:

Do we have any recommendations somewhere about increasing requests for larger environments where we can mention the ~3.5GB need?

tcdowney (Member Author):

We vaguely hint at frequent restarts being due to "memory pressure" on this doc: https://docs.cloudfoundry.org/running/managing-cf/scaling-cloud-controller-k8s.html#cf-api-server

But AFAIK there's no "production cluster sizing" guidance apart from what's mentioned in https://github.com/cloudfoundry/cf-for-k8s/blob/819bc1c981153df6801d07e747b6ef4809189d24/docs/platform_operators/scaling.md#scaling-cf-for-k8s.

It's also a pretty hand-wavy number based on what I've seen in typical CF for VMs environments with 4GB (swap enabled) VMs given to the Cloud Controllers. It's possible the number might bloat higher on a larger VM, or it's possible that the usage patterns are different enough on CF for K8s that it doesn't reach those levels often.

Does anyone know the minimum node memory size we recommend? As long as it's 8-16GB I feel like there should be enough headroom for this to be OK.

Reviewer:

Ahh neat, okay, the docs are accessible from cf-for-k8s 👍. According to our deploy docs, we require a minimum of 15GB of memory (https://github.com/cloudfoundry/cf-for-k8s/blob/819bc1c981153df6801d07e747b6ef4809189d24/docs/deploy.md).

```diff
@@ -68,10 +68,10 @@ spec:
       resources:
         requests:
           cpu: 100m
-          memory: 300Mi
+          memory: 250Mi
```
tcdowney (Member Author):

The local worker doesn't do much in cf-for-k8s. On CF for VMs it is ferrying large files to the blobstore, but in cf-for-k8s it's just moving a file on disk to a mount that the registry-buddy has access to. I've lowered both the limit and requests to account for this.

Comment on lines +92 to +93:

```yaml
cpu: 1500m
memory: 1Gi
```
tcdowney (Member Author):

I think this CPU limit must have been a typo before. I've adjusted it to a more reasonable multi-core limit so that it can get some concurrency if it needs to. I've also adjusted the memory limit to 1Gi.

The registry-buddy converts a zip file containing package source code into a tar file/single-layer OCI image and sends it to a container registry. For apps with large individual files (e.g. a Java app with jars) we've seen memory surge past the 150Mi this had originally and the cf-api-servers Pod would get OOMKilled. I'm considering 2Gi to account for the default max package size, though I do not think it should actually use that much memory based on my understanding of the code:

```go
func WriteZipToTar(tw TarWriter, srcZip, basePath string, uid, gid int, mode int64, normalizeModTime bool, fileFilter func(string) bool) error {
```

Thoughts on this?
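For illustration, the zip-to-tar conversion under discussion can be done in a streaming fashion so that no single file is held fully in memory. A Python sketch of that approach (the real code is Go's `WriteZipToTar`; this is not the actual implementation):

```python
# Streaming zip -> tar conversion sketch: each entry is copied from the
# zip reader into the tar writer in fixed-size chunks, so peak memory
# stays bounded regardless of individual file sizes.
import tarfile
import zipfile

def zip_to_tar(zip_path: str, tar_path: str) -> None:
    with zipfile.ZipFile(zip_path) as zf, tarfile.open(tar_path, "w") as tf:
        for entry in zf.infolist():
            if entry.is_dir():
                continue
            info = tarfile.TarInfo(name=entry.filename)
            info.size = entry.file_size
            with zf.open(entry) as src:
                tf.addfile(info, src)  # chunked copy, not a full read
```

If memory still surges with a streaming copy like this, the buffering is likely happening downstream (e.g. in the image construction), not in the archive conversion itself.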

```diff
         limits:
-          cpu: 500m
+          cpu: 1000m
```
tcdowney (Member Author):

Increased this since nginx is in the critical path for API server requests. It's a concurrent web server so this could technically be increased further if we'd like.

```diff
@@ -29,7 +29,7 @@ spec:
           memory: 300Mi
         limits:
           cpu: 1000m
-          memory: 1Gi
+          memory: 4Gi
```
tcdowney (Member Author):

I think it's unlikely that the database migrations use more than 4Gi -- my gut tells me it would probably be ~2Gi max on a large env -- but it would be awful if migrations got OOMKilled. Increased this to 4Gi since that is around what it would use on BOSH (kinda -- monit would have allowed it to temporarily burst higher for a few cycles).

```diff
         limits:
           cpu: 1000m
-          memory: 1Gi
+          memory: 4Gi
```
tcdowney (Member Author):

The Clock performs a periodic bulk sync of apps and CF tasks that are running on the platform. While doing this it loads information about all the running LRPs into memory to compare against what's in CCDB. This can be pretty memory intensive on large environments so we should be more generous here.

```diff
@@ -43,11 +43,11 @@ spec:
             value: #@ data.values.workloads_namespace
         resources:
           limits:
-            cpu: 100m
+            cpu: 1000m
             memory: 200Mi
```
tcdowney (Member Author):

This controller is responsible for staging (it watches kpack build/images) and some route reconciliation. We should definitely let it use at least a full CPU if not more. I think this must have been a typo for it to be so low.

```diff
-            cpu: 100m
-            memory: 200Mi
+            cpu: 1000m
+            memory: 1Gi
```
tcdowney (Member Author):

The SharedInformer cache for this controller needs memory to load information about all the CRDs it's watching. For larger environments with a lot of routes and/or kpack builds/images, it's very likely to get OOMKilled with a 200Mi limit.

@cloudfoundry cloudfoundry deleted a comment from cf-gitbot Apr 8, 2021

tcdowney commented Apr 8, 2021

I added some comments explaining some of the choices I made when tweaking the numbers. There are some containers that may benefit from even higher CPU limits, so I left some comments for discussion on those as well.


heycait commented Apr 8, 2021

Seems reasonable to me. I don't have any comments about that code you linked because I wasn't really seeing how that was mentioning anything about size.


tcdowney commented Apr 8, 2021

> Seems reasonable to me. I don't have any comments about that code you linked because I wasn't really seeing how that was mentioning anything about size.

The registry-buddy code I linked is only relevant because it is reading (potentially large) files into memory and writing them into a tar file. My understanding is that it should be a buffered copy, though, and shouldn't actually use that much memory.

Maybe https://github.com/cloudfoundry/capi-k8s-release/blob/5878960c0ad229fbd5832c728e3addf686e2f18f/src/registry-buddy/package_upload/package_upload.go is the culprit. I'm not that familiar with how go-containerregistry actually works and what all is in memory when it's constructing that image and writing it to the registry. 🤔

EDIT
Actually I've gone through more code and I think while we're not reading the zip into memory all at once, we are converting it to a tar in memory and directly making a container image from it (also in memory) that's then written to the remote registry:

```go
layer, err := tarball.LayerFromReader(archive.ReadZipAsTar(zipPath, "/", 0, 0, -1, true, noopFilter))
```

Maybe that's the source of the registry-buddy's mysterious memory usage. 🤔 💭


heycait commented Apr 9, 2021

> EDIT
> Actually I've gone through more code and I think while we're not reading the zip into memory all at once, we are converting it to a tar in memory and directly making a container image from it (also in memory) that's then written to the remote registry:

Oh interesting! I wonder why the conversion between zip and tar needs to be done. I guess the question has more to do with compression vs. archiving though.

UPDATE
After looking into tar vs zip, I see now that we're uncompressing it into a single archived file to be read for container building.


tcdowney commented Apr 9, 2021

> After looking into tar vs zip, I see now that we're uncompressing it into a single archived file to be read for container building.

Yeah, for more context on why it's a zip to begin with: app source Packages in CF are zip files, and the CLI uploads them to Cloud Controller as a zip (via this endpoint). The tar conversion is because OCI image layers are usually just tar archives (this is an interesting blog post I read once about this).

I think potentially in this code we could have written the tar to disk first, but the container image creation itself would ultimately still read it all into memory so I don't think that would actually help things (may just make it slower). I don't think we need to change anything about it right now, but this may be helpful stuff to think about if we reimplement parts of this flow in the future (along with using the public version of the zip->tar conversion code from buildpacks/pack#814).

It looks like in cf-for-k8s the max package size is hardcoded to be ~1GB so I think this limit should be fine. We've got one local worker container so the registry-buddy will be handling at most one upload at a time with this configuration.
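The "write the tar to disk first" alternative mentioned above could be sketched like this (hedged Python illustration; `tar_producer` and `upload_layer` are hypothetical stand-ins for the real Go code and go-containerregistry, not actual APIs):

```python
# Sketch: spool the layer tarball to disk instead of building it in memory.
# tar_producer and upload_layer are hypothetical stand-ins for illustration.
import tempfile

def build_layer_spooled(tar_producer, upload_layer, mem_cap=32 * 2**20):
    # Small layers stay in memory; anything over mem_cap spills to a temp file.
    with tempfile.SpooledTemporaryFile(max_size=mem_cap) as spool:
        tar_producer(spool)          # stream tar bytes into the spool
        spool.seek(0)
        return upload_layer(spool)   # registry client reads from the spool
```

With a ~1GB max package size and only one upload in flight at a time, spilling to disk like this would bound memory usage regardless of package size, at the cost of some extra I/O.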

@tcdowney tcdowney merged commit 1cab1d5 into master Apr 12, 2021
@tcdowney tcdowney deleted the adjust-request-limits-177468691 branch April 12, 2021 17:01