CRT client not respecting pod memory limits #4034
Question, as I'm not very familiar with Kubernetes: how is this limit set? Is it a "virtual" limit?
I am also getting a similar issue.
@debora-ito I'm not a Kubernetes expert either, but AFAIK resource allocation is done using cgroups. Our infrastructure team theorizes that the CRT client may not respect cgroups correctly.

@SriDeepa-s3 We've managed to circumvent this by using a normal AsyncClient for S3 uploads instead, which has solved the issue.
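For reference, the workaround described above (the default Netty-based S3AsyncClient instead of the CRT-based one) might look like the following sketch. Bucket name, key, and file path are placeholders, not values from this issue:

```java
import java.nio.file.Paths;

import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class NettyUploadExample {
    public static void main(String[] args) {
        // The plain builder() uses the Netty HTTP client; its buffers live
        // on the JVM heap, so -Xmx bounds most of its memory use.
        try (S3AsyncClient s3 = S3AsyncClient.builder().build()) {
            s3.putObject(PutObjectRequest.builder()
                            .bucket("my-bucket")        // placeholder
                            .key("large-file.bin")      // placeholder
                            .build(),
                    AsyncRequestBody.fromFile(Paths.get("/tmp/large-file.bin")))
              .join();
        }
    }
}
```

Note the trade-off: the non-CRT client does not do the CRT's automatic parallel multipart transfers, so single-stream throughput may be lower.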
@debora-ito: is there any reason why the CRT client is not bound by the memory limit?
@Lunkers noted. We'll investigate.
@debora-ito: any update on the above issue?
I started investigating this last week. I can reproduce this issue on my laptop when my internet speed is much lower than the CRT S3 client's targetThroughputInGbps setting (it's 10 Gbps by default, which is much higher than my home internet). The CRT S3 client does use memory outside the JVM, so it is able to exceed Java's normal Runtime.maxMemory(), aka -XX:MaxHeapSize. The CRT S3 client's only real tuning knob is targetThroughputInGbps.

Something weird is definitely going on. The CRT's memory usage climbs well over 1 GiB in the first 60 seconds of my upload, before coming back down and settling well below 1 GiB for the remainder of the upload. I'm still investigating; this is unacceptable.
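For context, the targetThroughputInGbps setting discussed above is configured on the CRT client builder. A minimal sketch (the 1.0 Gbps value is an illustrative assumption to match a slower link, not a recommendation from this thread):

```java
import software.amazon.awssdk.services.s3.S3AsyncClient;

public class CrtThroughputExample {
    public static void main(String[] args) {
        // Lowering targetThroughputInGbps toward the real network speed
        // reduces how many parts the CRT keeps buffered in native memory.
        try (S3AsyncClient s3 = S3AsyncClient.crtBuilder()
                .targetThroughputInGbps(1.0) // default is 10.0
                .build()) {
            // use the client for uploads/downloads here
        }
    }
}
```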
Actually, I'm not reproducing this. When my memory usage climbed over 1 GiB, I had been messing around and had changed the targetThroughputInGbps setting. With the default targetThroughputInGbps, I don't see the problem.
I guess we'll need a more reliable repro case... Are you certain there's only 1 instance of the CRT S3 client running? You can see how much native memory the CRT S3 client is using by running with the CRT's native memory tracing enabled. FWIW, you can see the JVM's memory usage by querying java.lang.Runtime.
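For the JVM-side numbers, one option is java.lang.Runtime (note this covers only the Java heap, not the CRT's native allocations, which is exactly why the two can diverge in this issue):

```java
public class JvmMemoryProbe {
    /** Currently used heap in bytes, per the JVM's own accounting. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx / -XX:MaxHeapSize; native CRT buffers
        // are invisible to all three of these counters.
        System.out.printf("usedHeap=%d totalHeap=%d maxHeap=%d%n",
                usedHeapBytes(), rt.totalMemory(), rt.maxMemory());
    }
}
```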
Our scenario is this: we are running a consumer that listens to one topic and uploads the input stream to S3. Sample code is in #4094. Is there anything I am missing here on close?
If possible, I would revamp your code to only have 1 instance of the S3Client. The S3Client is built with the intention of being a singleton, handling multiple requests at once in an intelligent way that doesn't use too many resources. But if you have N instances, they will use N times the system resources.
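A sketch of the singleton pattern being suggested; the class and method names here are placeholders, not from the SDK:

```java
import software.amazon.awssdk.services.s3.S3AsyncClient;

public final class S3Clients {
    // One shared CRT client for the whole process. The CRT pools
    // connections and native buffers internally, so concurrent requests
    // share one bounded set of resources instead of N separate ones.
    private static final S3AsyncClient CRT = S3AsyncClient.crtBuilder().build();

    private S3Clients() {}

    public static S3AsyncClient crt() {
        return CRT;
    }
}
```

Callers would use `S3Clients.crt()` everywhere instead of building a client per request, and close it once at process shutdown if needed.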
In our scenario every request comes with different credentials and bucket details, so we ended up creating a new client for every request. Is there a way to create the client once and override the credentials? Also, when we close the client in a finally block after all operations are done, doesn't that release the resources?
Hello, we are also facing similar issues, but with downloading. Can you tell me what you mean by a normal AsyncClient?
Darn, it's not currently possible in aws-sdk-java-v2. It's possible in the underlying native code to use different buckets and credentials per request, but that configuration still needs to be exposed at this level. Sorry. The team has this task in their backlog...
This definitely seems like a bug in the CRT async client, because I'm using a single instance shared across the process, and the native memory use (non-JVM heap) soars to 11 or even 14 GB during single-file uploads. I'm going to switch to the non-CRT async client until this gets resolved. I have the same problem where I'm running inside Kubernetes with a 6 GB limit, and the pod is getting terminated for far exceeding that limit, even though the JVM heap is hovering around 2 GB.
For me, switching to the S3AsyncClient.builder().build() client did not resolve the issue. I'm using SDK version 2.20.103, and regardless of which client I use, many gigabytes of native (non-JVM heap) memory get used; the process exceeds its limit and is terminated. I thought the S3AsyncClient.builder().build() client was implemented in Java and therefore used the JVM heap, but that doesn't seem to be the case, because the leak I'm experiencing is not in the JVM heap.
Upon further review of my situation, I do not believe I was experiencing a problem with S3TransferManager or the CRT async client. I wanted to point that out so that nobody wastes time researching a problem based on my last two comments.
You can configure credentials per request using the request's override configuration. To handle buckets in regions different from that of the S3Client, you can enable cross-region access on the client builder.
Any updates on this issue?
I am also wondering if there were any updates on this, since I was also experiencing some memory leaks with crtBuilder.
I experienced the same issue of a pod getting killed due to excessive memory usage.
@jassuncao Thanks for the information. We're working on additional configuration options to make this easier to manage and will update this issue when that update is released.
I had a memory leak too with the CRT implementation (I was unable to control the Java native memory). The only viable solution I've found to continue using high-level interfaces for multipart operations, especially for uploading files larger than 5 gigabytes, is to use SDK v1.
We apologize for the long silence; we have some updates to share.

The CRT team made some changes in the CRT client core, released in a newer CRT version.

We are also exposing a new attribute in the S3CrtAsyncClient, maxNativeMemoryLimitInBytes, that will provide more control over the utilized memory at the client level.
Edit: updated with the version of the SDK that includes the new maxNativeMemoryLimitInBytes attribute. |
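Based on the attribute named above, configuring the native-memory cap might look like this sketch (the 2 GiB figure is an arbitrary example; size it for your own pod limits):

```java
import software.amazon.awssdk.services.s3.S3AsyncClient;

public class CrtMemoryCapExample {
    public static void main(String[] args) {
        long twoGiB = 2L * 1024 * 1024 * 1024;
        // Caps the CRT's off-heap buffer usage. Choose a value below the
        // pod's memory limit minus the JVM heap (-Xmx) and other overhead,
        // so heap + native memory stays under the cgroup limit.
        try (S3AsyncClient s3 = S3AsyncClient.crtBuilder()
                .maxNativeMemoryLimitInBytes(twoGiB)
                .build()) {
            // use the client for uploads/downloads here
        }
    }
}
```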
Describe the bug
When uploading large files using the CRT S3 client, Kubernetes memory restrictions are not respected.
When uploading a large file in a Kubernetes pod using the code snippet in #4033, a number of anonymous file-read processes are spawned that do not respect the memory limits on the pod, trying to use more memory than the pod is allowed.
Expected Behavior
The SDK upload processes should respect Kubernetes limits and not consume all available RAM in the pod.
Current Behavior
The pod consumes more and more memory with each upload, rarely freeing any. Sooner or later, Kubernetes kills the pod for using too many resources. The provided screenshot shows a pod with an 8 GB memory limit:
Reproduction Steps
Use the transfer manager created in the snippet provided in #4033 to upload a large file (roughly 80-100 GB) in a Kubernetes pod.
Possible Solution
No response
Additional Information/Context
No response
AWS Java SDK version used
2.20.67
JDK version used
11
Operating System and version
Ubuntu 22.04 (Jammy Jellyfish)