Coldstart Performance/Bundle Size Regression in 3.577.0 #6144
Comments
The size difference in the following packages has increased our Lambda deployment package size to over the hard limit of 262144000 bytes:
|
@mymattcarroll Although there was a bump in size, I was seeing it on the order of kilobytes, not megabytes. I limited my repro to one package, though. Were you close to the 250 MB limit prior? Do you have size numbers for 3.575 versus 3.577 or later? If you can drop some numbers in here, that may help determine whether it is the same cause or a different one. |
You should mark …. Another 10 kB, which we the SDK team should fix, is the inclusion of …. After these fixes the difference should only be 134.6 kB -> 140.3 kB. Test code:
|
When we publish the fix in #6149 you shouldn't have to do anything different anymore. |
Turns out our problem was …. Credit to @divporter for teaching me about the …. |
I just tried 3.588. I'm seeing better coldstart times, but still 50K of bloat. One thing I noticed was that @smithy/core was bundled as CJS instead of ESM. Here's the breakdown of where the size increases are coming from, if it helps: Delta 3.575 -> 3.588
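A per-package breakdown like this can be produced from esbuild's metafile; a minimal sketch, assuming a handler.mjs entry point that imports the DynamoDB client (the entry point and output path are assumptions, not part of the original report):

```js
// analyze-bundle.mjs — sketch: summarize which packages contribute to the
// tree-shaken output size using esbuild's metafile.
import { build, analyzeMetafile } from "esbuild";

const result = await build({
  entryPoints: ["handler.mjs"], // assumed entry point
  bundle: true,
  platform: "node",
  format: "esm",
  outfile: "out/handler.mjs",
  metafile: true, // record per-input size metadata
});

// Prints a size breakdown grouped by input file/package.
console.log(await analyzeMetafile(result.metafile));
```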
|
@siddsriv is making the same change in …. |
Numbers for 3.590 are much more in line with 3.575. I think we can close now. Thanks for all your help here!
|
Actually, hold off on closing here. Although I saw better performance and size on the repro, I still saw poor performance when I updated my prod code to 3.590.0. I had to manually set …. |
3.592 updates the …. Here's some data from my production API. The best performance was from 3.568 for me; I'm seeing about a 50 ms coldstart delta still, and I'm still digging into it.
|
I've updated the repro to include more clients
|
50 ms on these numbers is 25% slower; I'm not sure I agree with the negligible part. It would be great to understand what the trade-off was for adding this, and whether there's anything at all we can do to reverse it besides pinning to 3.575. |
Hey @dreamorosi, the negligible difference is between 3.575 and 3.592. However, I then compared 3.568 to 3.592 and saw the 50 ms difference when I replicated what I saw in my production code. I agree that 50 ms is a big regression, and you should be able to replicate my results with the code I've currently checked into the repo. I looked at a few things to see if I could identify the source and found some tiny things (shared-ini-loader and one other package I'm blanking on), but nothing that explains the 50 ms. The source of the latency no longer seems to be related to code size, because I tried minification and didn't see much difference. |
Alright, here are some more benchmarks with my repro. Looks like the performance bombed in 3.575. I'd recommend taking a close look at #6088
|
Could you create a repro outside of Lambda that isolates the issue? I have run the CDK example, and the initDuration for v3.592.0 is in the 210-220 ms range. If there is an issue with the SDK, it should be demonstrable in Node.js outside of Lambda. Add a …. |
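A standalone repro of that shape might look like the following sketch, which times SDK import and client construction in plain Node.js (the file name and region are assumptions; the client matches the one used in this issue):

```js
// repro.mjs — sketch: measure SDK initialization cost outside of Lambda to
// remove fleet-provisioning variance. Run with: node repro.mjs
const t0 = performance.now();
const { DynamoDBClient } = await import("@aws-sdk/client-dynamodb");
const t1 = performance.now();

const client = new DynamoDBClient({ region: "us-east-1" }); // region is assumed
const t2 = performance.now();

console.log("import (ms):", (t1 - t0).toFixed(1));
console.log("client construction (ms):", (t2 - t1).toFixed(1));
```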
I'll see what I can do. Just to confirm, you are comparing 3.574 to 3.592 and initializing multiple clients (like the repro) and you don't see much difference? |
Yes, I used the provided repository and ran both of those versions many times using the provided execution command examples in the readme. I also tested the generated bundles themselves with additional timing reports, edited by hand, but those consisted merely of:

```js
const A = performance.now(); // top of file
// ...
const B = performance.now(); // top of handler function
// ...
console.log("init time", B - A); // before end of handler function
```

The contributing factors to initialization time in AWS Lambda are not entirely related to the user code, and they fluctuate over time due to fleet provisioning. I don't know the details, so I'd like to remove that variance. |
Not sure if I'm running into the same issue, but I saw something similar when upgrading to a newer SDK version and found this issue. The size of my Lambda function increased from 3.8 MB to 6.1 MB. I also use the built-in CDK …. When analyzing the bundle, it seems that …. |
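For context, the built-in CDK bundling referred to here is typically NodejsFunction from aws-cdk-lib/aws-lambda-nodejs, which drives esbuild under the hood. A minimal sketch of the bundling knobs one might check when the bundle size jumps (all values are assumptions, not the commenter's actual configuration):

```js
// sketch of NodejsFunction bundling options (CDK v2); values are illustrative.
import { Stack } from "aws-cdk-lib";
import { Runtime } from "aws-cdk-lib/aws-lambda";
import { NodejsFunction, OutputFormat } from "aws-cdk-lib/aws-lambda-nodejs";

export class ApiStack extends Stack {
  constructor(scope, id, props) {
    super(scope, id, props);

    new NodejsFunction(this, "Handler", {
      entry: "src/handler.ts", // assumed entry point
      runtime: Runtime.NODEJS_20_X,
      bundling: {
        format: OutputFormat.ESM, // bundle as ESM rather than CJS
        minify: true,
        metafile: true, // keep esbuild metadata around for size analysis
      },
    });
  }
}
```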
@jaapvanblaaderen It could potentially be related. I was seeing something similar with ESM modules until they made a change to the packaging. @kuhe WRT the ….
|
I've narrowed it down to the exact line that got added in …. If I patch that line from:

```js
export const keepAliveSupport = {
  supported: Boolean(typeof Request !== "undefined" && "keepalive" in new Request("https://[::1]")),
};
```

to:

```js
export const keepAliveSupport = {
  supported: true,
};
```

then the 50 ms latency hit disappears in 3.575. Wild guess: maybe DNS is getting involved and there is a 50 ms timeout? |
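One way to try the same experiment without hand-editing node_modules is an esbuild plugin that rewrites that expression at bundle time. This is a sketch, not an official workaround; the path filter and regex are assumptions based on the snippet above:

```js
// esbuild plugin sketch: hardcode the keepalive detection to `true` while
// bundling, to test whether the Request-based check is the source of the delay.
import { readFile } from "node:fs/promises";

export const hardcodeKeepAlivePlugin = {
  name: "hardcode-keepalive-support",
  setup(build) {
    // Only touch @smithy/fetch-http-handler sources (path filter is assumed).
    build.onLoad({ filter: /@smithy[\\/]fetch-http-handler[\\/].*\.js$/ }, async (args) => {
      const source = await readFile(args.path, "utf8");
      return {
        // Replace the detection expression quoted above with a constant.
        contents: source.replace(/Boolean\(typeof Request[^)]*\)\)/, "true"),
        loader: "js",
      };
    });
  },
};
```

It would be passed to esbuild via the `plugins` build option, e.g. `plugins: [hardcodeKeepAlivePlugin]`.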
Hi @perpil. How did you measure/narrow it down to that specific line? This is only relevant in Node 16, as Node 18+ has ….

```js
export const handler = async (event) => {
  const response = {
    statusCode: 200,
    body: JSON.stringify({
      "keepAlive": Boolean(typeof Request !== "undefined" && "keepalive" in new Request("https://[::1]"))
    }),
  };
  return response;
};
```

You also need to test …. |
Thanks @perpil, that's a great help finding the most likely location of the problem. I will investigate how to fix it. |
@richarddd the smithy dependencies were upgraded in ….

```js
// node_modules/@smithy/fetch-http-handler/dist-es/fetch-http-handler.js
var keepAliveSupport = {
  supported: Boolean(typeof Request !== "undefined" && "keepalive" in new Request("https://[::1]"))
};
```

You'll also note there are no references to that variable in the generated output, so it isn't necessary in the Node environment. When I remove that code, or hardcode it to true, I no longer see the 50 ms hit to ….

@kuhe Thanks for looking into it. I think there are a couple of ways of solving it, so reach out if you need help. One thing you mentioned was to try and use ….

```js
var keepAliveSupport = ...
var start = performance.now();
// ...other code
var end = performance.now()
```
|
If you update your install lockfile to pick up …. The reason fetch-http-handler appears now is that we recently added preliminary support for …. |
@perpil sorry for the misunderstanding; I can now reproduce the issue. Great finding! The reason for this regression is that when …:

```js
const a = Request;

export const handler = async (event) => {
  const response = {
    statusCode: 200,
    body: JSON.stringify("slow"),
  };
  return response;
};
```

Comment out the first line and we're back to normal. Request further pulls in more lazily evaluated classes such as URL/Stream etc. when instantiated. Source: https://github.com/nodejs/node/blob/f2381bc7cb120e43bb97ffeccd19dfdc946c8b2c/lib/internal/bootstrap/web/exposed-window-or-worker.js#L86 |
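A quick way to see that lazy initialization cost outside of Lambda is to time the first and second access to the global; this is a sketch, and the numbers will vary by machine and Node version:

```js
// measure-request-init.mjs — sketch: time first access to the lazily
// initialized global Request, which loads undici and related web classes.
const t0 = performance.now();
const first = Request; // first touch triggers lazy initialization
const t1 = performance.now();
const second = Request; // subsequent access is effectively free
const t2 = performance.now();

console.log("first access (ms):", (t1 - t0).toFixed(2));
console.log("second access (ms):", (t2 - t1).toFixed(2));
console.log("same constructor:", first === second);
```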
@kuhe I can confirm SDK …. |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread. |
Checkboxes for prior research
Describe the bug
I'm seeing a larger-than-normal regression in coldstart performance and tree-shaken bundle size after 3.575. This chart shows cold start latency and esbuild-built bundle size by SDK version. It's a small sample size (3 runs), but I'm seeing +24 ms in coldstart and +146K in bundle size.
SDK version number
@aws-sdk/client-dynamodb@3.577.0
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
v20.13.1
Reproduction Steps
Read the README in this Repro
Observed Behavior
Increased coldstart times and bundle size.
Expected Behavior
Minimal impact to coldstart and bundle size on tree-shaken code.
Possible Solution
Looking at the bundle analyzer, the biggest size impacts are from the @smithy/* packages, but it also seems like fast-xml-parser appeared too; I haven't dug much beyond that.
Additional Information/Context
In my app that I upgraded, I saw higher impacts to coldstart performance (>90 ms) between 3.563 and 3.577. I was using more clients, however: client-sts, client-dynamodb, client-iam, and lib-dynamodb. To keep things simple, I did the repro with just client-dynamodb.