This repository has been archived by the owner on Feb 14, 2023. It is now read-only.

Adjust default memory/cpu requests and limits #115

Merged: 1 commit merged into master from adjust-request-limits-177468691 on Apr 12, 2021

Conversation


@tcdowney tcdowney commented Apr 8, 2021

There's been evidence that the Cloud Controller API and registry-buddy components have been getting OOMKilled under typical usage in production environments. This change adjusts the resource limits to be more in line with what our monit limits historically were in the capi-release BOSH release. E.g. https://github.com/cloudfoundry/capi-release/blob/6f9b4d5ec3ba46adf39d7de2d60e1b58a4511784/jobs/cloud_controller_ng/spec#L893

Some memory requests were adjusted up and down to better account for typical/min usage of these components.

This change also adjusts some of the CPU limits to be more generous for
critical processes such as nginx and the cf-api-controllers reconciler.

Overall there is a +210Mi increase in memory requests (only +10 if you ignore the short-lived ccdb migration Job).

If operators want to tweak these values further they can follow this approach in our vertical scaling docs.

Tracker story: #177468691
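For context, all of the knobs tuned here follow the standard Kubernetes container resources pattern; a representative snippet (values illustrative, not the exact ones in this PR) looks like:

```yaml
resources:
  requests:        # what the scheduler reserves for the container
    cpu: 500m
    memory: 500Mi
  limits:          # hard caps; exceeding the memory limit gets the container OOMKilled
    cpu: 1000m
    memory: 1Gi
```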

Commit message:

There's been evidence that the Cloud Controller API and registry-buddy
components have been getting OOMKilled under typical usage in production
environments. This change adjusts the resource limits to be more in line
with what our monit limits historically were in the capi-release BOSH
release. E.g. https://github.com/cloudfoundry/capi-release/blob/6f9b4d5ec3ba46adf39d7de2d60e1b58a4511784/jobs/cloud_controller_ng/spec#L893

This change also adjusts some of the CPU limits to be more generous for
critical processes such as nginx and the cf-api-controllers reconciler.

[#177468691](https://www.pivotaltracker.com/story/show/177468691)

Authored-by: Tim Downey <tdowney@vmware.com>
```diff
@@ -40,10 +40,10 @@ spec:
       resources:
         requests:
           cpu: 500m
-          memory: 300Mi
+          memory: 500Mi
```
tcdowney (Member Author):

In an environment with no load Cloud Controller is using ~400Mi of RAM. For decent quality of service in light environments it should have at least 500Mi. In practice I've seen it reach a steady state of ~3.5Gi of RAM on CF for VMs environments with occasional surges past 4Gi, but these were typically anomalous cases in very large scale environments.
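To sanity-check observations like the ~400Mi idle figure against a request, it helps to normalize Kubernetes quantity strings to bytes. A small helper (illustrative only, not part of this PR):

```python
# Illustrative helper: convert Kubernetes memory quantity strings like
# "500Mi" or "3.5Gi" into bytes so observed usage can be compared
# against requests/limits numerically.
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
         "K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}

def quantity_to_bytes(q: str) -> int:
    # Try longer suffixes ("Mi") before shorter ones ("M")
    for suffix, factor in sorted(UNITS.items(), key=lambda kv: -len(kv[0])):
        if q.endswith(suffix):
            return int(float(q[:-len(suffix)]) * factor)
    return int(q)  # plain bytes

# ~400Mi observed at idle vs. a 500Mi request leaves ~100Mi of headroom
print(quantity_to_bytes("500Mi") - quantity_to_bytes("400Mi"))  # 104857600
```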

Reviewer:

Do we have any recommendations somewhere about increasing requests for larger environments where we can mention the ~3.5GB need?

tcdowney (Member Author):

We vaguely hint at frequent restarts being due to "memory pressure" on this doc: https://docs.cloudfoundry.org/running/managing-cf/scaling-cloud-controller-k8s.html#cf-api-server

But AFAIK there's no "production cluster sizing" guidance apart from what's mentioned in https://github.com/cloudfoundry/cf-for-k8s/blob/819bc1c981153df6801d07e747b6ef4809189d24/docs/platform_operators/scaling.md#scaling-cf-for-k8s.

It's also a pretty hand-wavy number based on what I've seen in typical CF for VMs environments with 4GB (swap enabled) VMs given to the Cloud Controllers. It's possible the number might bloat higher on a larger VM, or it's possible that the usage patterns are different enough on CF for K8s that it doesn't reach those levels often.

Does anyone know the minimum node memory size we recommend? As long as it's 8-16GB I feel like there should be enough headroom for this to be OK.

Reviewer:

Ahh neat, okay, the docs are accessible from cf-for-k8s 👍. According to our deploy docs, we require a minimum of 15GB of memory (https://github.com/cloudfoundry/cf-for-k8s/blob/819bc1c981153df6801d07e747b6ef4809189d24/docs/deploy.md).

```diff
@@ -68,10 +68,10 @@ spec:
       resources:
         requests:
           cpu: 100m
-          memory: 300Mi
+          memory: 250Mi
```
tcdowney (Member Author):

The local worker doesn't do much in cf-for-k8s. On CF for VMs it is ferrying large files to the blobstore, but in cf-for-k8s it's just moving a file on disk to a mount that the registry-buddy has access to. I've lowered both the limit and requests to account for this.

Comment on lines +92 to +93:

```yaml
cpu: 1500m
memory: 1Gi
```
tcdowney (Member Author):

I think this CPU limit must have been a typo before. I've adjusted it to a more reasonable multi-core limit so that it can get some concurrency if it needs to. I've also adjusted the memory limit to 1Gi.

The registry-buddy converts a zip file containing package source code into a tar file/single-layer OCI image and sends it to a container registry. For apps with large individual files (e.g. a Java app with jars) we've seen memory surge past the 150Mi this had originally and the cf-api-servers Pod would get OOMKilled. I'm considering 2Gi to account for the default max package size, though I do not think it should actually use that much memory based on my understanding of the code:

```go
func WriteZipToTar(tw TarWriter, srcZip, basePath string, uid, gid int, mode int64, normalizeModTime bool, fileFilter func(string) bool) error {
```

Thoughts on this?
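For illustration, the zip-to-tar conversion under discussion can be done in a streaming fashion so that no single file is held fully in memory. A Python sketch of that approach (the real code is Go's `WriteZipToTar`; this is not the actual implementation):

```python
# Streaming zip -> tar conversion sketch: each entry is copied from the
# zip reader into the tar writer in fixed-size chunks, so peak memory
# stays bounded regardless of individual file sizes.
import tarfile
import zipfile

def zip_to_tar(zip_path: str, tar_path: str) -> None:
    with zipfile.ZipFile(zip_path) as zf, tarfile.open(tar_path, "w") as tf:
        for entry in zf.infolist():
            if entry.is_dir():
                continue
            info = tarfile.TarInfo(name=entry.filename)
            info.size = entry.file_size
            with zf.open(entry) as src:
                tf.addfile(info, src)  # chunked copy, not a full read
```

If memory still surges with a streaming copy like this, the buffering is likely happening downstream (e.g. in the image construction), not in the archive conversion itself.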

```diff
         limits:
-          cpu: 500m
+          cpu: 1000m
```
tcdowney (Member Author):

Increased this since nginx is in the critical path for API server requests. It's a concurrent web server so this could technically be increased further if we'd like.

```diff
@@ -29,7 +29,7 @@ spec:
           memory: 300Mi
         limits:
           cpu: 1000m
-          memory: 1Gi
+          memory: 4Gi
```
tcdowney (Member Author):

I think it's unlikely that the database migrations use more than 4Gi -- my gut tells me it would probably be ~2Gi max on a large env -- but it would be awful if migrations got OOMKilled. Increased this to 4Gi since that is around what it would use on BOSH (kinda -- monit would have allowed it to temporarily burst higher for a few cycles).

```diff
         limits:
           cpu: 1000m
-          memory: 1Gi
+          memory: 4Gi
```
tcdowney (Member Author):

The Clock performs a periodic bulk sync of apps and CF tasks that are running on the platform. While doing this it loads information about all the running LRPs into memory to compare against what's in CCDB. This can be pretty memory intensive on large environments so we should be more generous here.

```diff
@@ -43,11 +43,11 @@ spec:
             value: #@ data.values.workloads_namespace
         resources:
           limits:
-            cpu: 100m
+            cpu: 1000m
             memory: 200Mi
```
tcdowney (Member Author):

This controller is responsible for staging (it watches kpack build/images) and some route reconciliation. We should definitely let it use at least a full CPU if not more. I think this must have been a typo for it to be so low.

```diff
-            cpu: 100m
-            memory: 200Mi
+            cpu: 1000m
+            memory: 1Gi
```
tcdowney (Member Author):

The SharedInformer cache for this controller needs memory to load information about all the CRDs it's watching. For larger environments with a lot of routes and/or kpack builds/images, it's very likely to get OOMKilled with a 200Mi limit.

@cloudfoundry cloudfoundry deleted a comment from cf-gitbot Apr 8, 2021

tcdowney commented Apr 8, 2021

I added some comments explaining some of the choices I made when tweaking the numbers. There are some containers that may benefit from even higher CPU limits, so I left some comments for discussion on those as well.


heycait commented Apr 8, 2021

Seems reasonable to me. I don't have any comments about that code you linked because I wasn't really seeing how that was mentioning anything about size.


tcdowney commented Apr 8, 2021

> Seems reasonable to me. I don't have any comments about that code you linked because I wasn't really seeing how that was mentioning anything about size.

The registry-buddy code I linked is only relevant because it is reading (potentially large) files into memory and writing them into a tar file. My understanding is that it should be a buffered copy, though, and shouldn't actually use that much memory.

Maybe https://github.com/cloudfoundry/capi-k8s-release/blob/5878960c0ad229fbd5832c728e3addf686e2f18f/src/registry-buddy/package_upload/package_upload.go is the culprit. I'm not that familiar with how go-containerregistry actually works and what all is in memory when it's constructing that image and writing it to the registry. 🤔

EDIT
Actually I've gone through more code and I think while we're not reading the zip into memory all at once, we are converting it to a tar in memory and directly making a container image from it (also in memory) that's then written to the remote registry:

```go
layer, err := tarball.LayerFromReader(archive.ReadZipAsTar(zipPath, "/", 0, 0, -1, true, noopFilter))
```

Maybe that's the source of the registry-buddy's mysterious memory usage. 🤔 💭


heycait commented Apr 9, 2021

> EDIT
> Actually I've gone through more code and I think while we're not reading the zip into memory all at once, we are converting it to a tar in memory and directly making a container image from it (also in memory) that's then written to the remote registry:

Oh interesting! I wonder why the conversion between zip and tar needs to be done. I guess the question has more to do with compression vs. archiving though.

UPDATE
After looking into tar vs zip, I see now that we're uncompressing it into a single archived file to be read for container building.


tcdowney commented Apr 9, 2021

> After looking into tar vs zip, I see now that we're uncompressing it into a single archived file to be read for container building.

Yeah, for more context on why it's a zip to begin with: app source Packages in CF are zip files, and the CLI uploads them to Cloud Controller as a zip (via this endpoint). The tar conversion is because OCI image layers are usually just tar archives (this is an interesting blog post I read once about this).

I think potentially in this code we could have written the tar to disk first, but the container image creation itself would ultimately still read it all into memory so I don't think that would actually help things (may just make it slower). I don't think we need to change anything about it right now, but this may be helpful stuff to think about if we reimplement parts of this flow in the future (along with using the public version of the zip->tar conversion code from buildpacks/pack#814).

It looks like in cf-for-k8s the max package size is hardcoded to be ~1GB so I think this limit should be fine. We've got one local worker container so the registry-buddy will be handling at most one upload at a time with this configuration.
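The "write the tar to disk first" alternative mentioned above could be sketched like this (hedged Python illustration; `tar_producer` and `upload_layer` are hypothetical stand-ins for the real Go code and go-containerregistry, not actual APIs):

```python
# Sketch: spool the layer tarball to disk instead of building it in memory.
# tar_producer and upload_layer are hypothetical stand-ins for illustration.
import tempfile

def build_layer_spooled(tar_producer, upload_layer, mem_cap=32 * 2**20):
    # Small layers stay in memory; anything over mem_cap spills to a temp file.
    with tempfile.SpooledTemporaryFile(max_size=mem_cap) as spool:
        tar_producer(spool)          # stream tar bytes into the spool
        spool.seek(0)
        return upload_layer(spool)   # registry client reads from the spool
```

With a ~1GB max package size and only one upload in flight at a time, spilling to disk like this would bound memory usage regardless of package size, at the cost of some extra I/O.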

@tcdowney tcdowney merged commit 1cab1d5 into master Apr 12, 2021
@tcdowney tcdowney deleted the adjust-request-limits-177468691 branch April 12, 2021 17:01