expose and run GC for nodejs to avoid memory intensive actions to run… #1826
Conversation
Force-pushed from 33f1656 to a0a591b
@jeremiaswerner I think "Advanced proposal 2" is a much better way to solve this. My reasoning is that, from the NodeJS action writer's perspective, they can and should assume that each time their action runs, they get a nice clean state of memory. You and I know that is not necessarily true due to warm container reuse that is specific to OpenWhisk, but this really shouldn't, in my opinion, be a concern for the action writer. I think the logical end to making the action writer responsible for handling garbage collection is that they will end up calling global.gc() in every action anyway.

I believe that we can make the NodeJS runtime start garbage collection automatically on behalf of the user without penalizing their action invocation time. This could be done by calling global.gc() after the action's result has been returned. Perhaps something like this: https://github.com/bjustin-ibm/openwhisk/blob/991e2955d7c82cce5c0042a068ecc7125d8f47fb/core/nodejsActionBase/src/service.js#L115
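A minimal sketch of that idea, deferring collection until after the result has been sent. This is not the linked service.js code; theAction, doRun, and the response handling here are hypothetical, and global.gc only exists when node is started with --expose-gc.

// Sketch only: run the user's action, reply, then schedule a GC outside the measured run.
function runAction(req, res) {
    theAction.main(req.body)                      // hypothetical user action returning a Promise
        .then(function (result) {
            res.json(result);                     // the result goes back to the invoker first
            if (typeof global.gc === 'function') {
                setImmediate(function () { global.gc(); });   // GC happens after the response
            }
        });
}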
We'd need to benchmark the overhead of that. Alternative: what if we run the GC in a Promise and fuse both the GC-Promise and the action-returned Promise via Promise.all?
@markusthoemmes I'm not sure I follow. Are you suggesting something like:

var runPromise = doRun(req);
var gcPromise = runGCWithPromise();
return Promise.all([runPromise, gcPromise])
    .then(function(resolvedPromises) {
        var result = resolvedPromises[0];
        // process result as before...
    });

If so, I'm not sure how you ensure that the GC does not run concurrently with the action itself and end up being charged to the invocation anyway.
wow, this is pretty horrible behavior on the part of nodejs. apparently, Buffers are allocated off-heap, and heap size limits (e.g. --max-old-space-size) are not obeyed when allocating off-heap structures.
Ok, so here's a little benchmarking experiment I ran. I started with the current state of @jeremiaswerner's code in this PR. I created an action called "eatManualGC" which is exactly the same as his memory.js test action. I then created another action called "eatNoGC" which is the same action, but with the call to global.gc() removed. I invoked "eatNoGC" 10 times within 30 seconds, then "eatManualGC" 10 times within 30 seconds, and recorded the invocation times for each. I then added automatic garbage collection into the NodeJS runtime as proposed above (the service.js change linked earlier) and repeated the measurements, ignoring the outlier 822ms eatManualGC run. Furthermore, Jeremias' new action limits test passes with automatic garbage collection and the "eatNoGC" version of the test action.
@starpit The infamous LOS of course, which really is part of the heap but is often ignored both algorithmically (by ignoring fragmentation) or, even worse as here, left out of the flag space. I know you cannot be surprised by this. ;-)

@jeremiaswerner As Nick indirectly alluded to, one can try to set a heap size limit and allow node's scheduling to limit memory consumption. This would be the "correct" approach if V8 exposed the right flags. GCs (and their authors) tend to be lazy, so they will not garbage collect until they hit the limit. For V8, I believe the default is 1.7GB, so an action that allocates far less than that gives the collector no pressure to run at all. As for the observed 300ms: NodeJS uses a copying generational collector, so the time it spends is a fixed overhead plus the time to process the thread stacks and then trace through the live objects. In your program (and in ones with no memory leaks), this amount will be small, and that's why it's "only" 300ms. Actually, 300ms is pretty horrible for a generational system with a nearly empty heap. (I used to write GCs.) I would use the heap-size-limit flag to mitigate non-large objects. This would require a bit of invoker work though, as we have no flag differentiation mechanism at the moment.
@perryibm i was surprised that v8 had no flag to control max-large-object-space. max-old-space-size does nothing for this use case (large Buffers). it's also hard to believe (but true) that v8 pays no attention either to ulimit or to system memory capacity when sizing its large object area. as far as i can tell, the only solutions are the ones proposed above, i.e. explicit gc, gc-per-invoke... there is one more not proposed, which would be to retry once on process failure. the best solution seems to be explicit gc -- if you are a large-object user, then you just have to play along with v8's suck.
Better than time-based would be a way to get V8's instantaneous LOS consumption. I'll look but it might not have any such hook.
Yep. I was hoping for something internal to V8, but I don't trust it, so I would go with ps-level info like you said. It's a bit crude, as we want just the LOS, but this can work as an approximation. Also, we don't want to go into a GC death spiral, so it should be rate limited.
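A rough sketch of what such a rate-limited, threshold-based trigger could look like. The 256MB threshold and 10-second minimum interval are made-up values, and RSS from process.memoryUsage() stands in for the LOS figure V8 does not expose.

// Sketch: collect only when resident memory is high, and at most once per interval.
var GC_RSS_THRESHOLD = 256 * 1024 * 1024;   // hypothetical threshold
var GC_MIN_INTERVAL_MS = 10 * 1000;         // rate limit to avoid a GC death spiral
var lastGC = 0;

function maybeCollect() {
    if (typeof global.gc !== 'function') return;      // requires node --expose-gc
    var now = Date.now();
    var rss = process.memoryUsage().rss;               // ps-level view, includes off-heap Buffers
    if (rss > GC_RSS_THRESHOLD && now - lastGC > GC_MIN_INTERVAL_MS) {
        global.gc();
        lastGC = now;
    }
}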
@bjustin-ibm the goal of this is not to help the developer while eating lots of memory, but to have a remedy for our very short-running invocations, where the GC otherwise never gets a chance to run. If the action itself eats lots of memory, and is taking some time to run, GC will be fine and behave the same as in a local environment. This of course seems to be a quite naive implementation looking at the discussion between @starpit and @perryibm. That being said, I'm not sure we can guarantee a GC run (and thus prevent the problem) if we don't control when and how the GC is run at all. Even with limits set on V8, we might freeze the container before GC kicks in?
Does the node vm sandbox help here?
indeed the problem is that garbage is not being collected. however, i believe that the problem is not so much that we're somehow "starving" the collector. rather, the v8 allocator apparently does not consider native memory constraints when allocating native memory. you can see this by setting the max old space to 100MB and observing process.memoryUsage() in each iteration of @jeremiaswerner's test, with --trace-gc: the off-heap Buffer allocations keep piling up, for reasons i don't understand yet, at least on mac. what should be happening: the collector should observe a "large object/native" max memory constraint. even better: it should pay attention to ulimit and native available memory. it does none of this.
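A hypothetical reconstruction of that observation loop; the exact test code is not reproduced in this thread, and the flags and sizes below are illustrative (they also assume a Node version with Buffer.alloc).

// Run with: node --max-old-space-size=100 --trace-gc repro.js
// Sketch: allocate a 128MB Buffer per iteration and watch memory usage.
for (var i = 0; i < 10; i++) {
    var buf = Buffer.alloc(128 * 1024 * 1024, 'x');   // off-heap allocation
    console.log(i, process.memoryUsage());            // heapUsed stays small while rss keeps climbing
    buf = null;                                       // dropping the reference does not force collection
}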
@perryibm @starpit very nice explanations and insights on garbage collection. not sure I understood everything in detail. could you suggest something actionable in addition to the options below?

@bjustin-ibm @mt thanks for the hint. I've adopted the change and run a single action 1000 times eating 128MB. The warm executions differ by about 6ms on average, with higher outliers but a similar standard deviation.
from here I see three options for this PR. I left out the conditional execution of GC based on memory consumption, because that would make things non-deterministic and harder to debug.
Force-pushed from 6e29108 to 5a6a26d
@jeremiaswerner The current changes LGTM
@jeremiaswerner PG?
Force-pushed from 5a6a26d to b04ebdc
PG1 1178
@csantanapr @bjustin-ibm PG is green
LGTM
@bjustin-ibm @jeremiaswerner LGTM
Force-pushed from b04ebdc to 85d8a5a
PG4/274 🔵 @bjustin-ibm @rabbah can one of you please merge?
Hmm. So the resolution is to gc even if a container doesn't get reused... because the common case is reuse? Did you consider running the gc on startup when the container is warm, instead of on exit? This doesn't penalize the transient containers and allows for some slack between the docker unpause and /run to cover some of the latency. We do make it known that it's a feature that a warm container can be detected with user code by leaving "state" around; I'd consider the gc an advanced feature and let users know they should run it if needed rather than always forcing it. I think this option was dismissed too quickly. In the common case this isn't necessary at all. @perryibm also has some new findings in this space - may be relevant.
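A sketch of that variant, assuming a module-level flag inside the runtime process to detect reuse; the handler names are hypothetical.

// Sketch: collect at the start of a run only when the container is being reused,
// so fresh (cold) containers pay nothing and warm starts absorb the GC in their slack.
var warm = false;

function onRun(req, res) {
    if (warm && typeof global.gc === 'function') {
        global.gc();                    // clean up leftovers from the previous invocation
    }
    warm = true;                        // any later /run call sees a reused container
    return doRun(req, res);             // hypothetical handler that actually runs the action
}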
i hadn't realized that the proposed solution was to force a GC on exit from every node invocation. this will penalize everyone, for the perhaps rare case of nodejs actions that manipulate large Buffers?
Two cents
yes, I think it's possible to perform the check only at some point, e.g. every second invocation, or if the memory is above X, or only when reused the first time, ... my initial thought was to avoid any logic, to reduce the probability of intermittent failures; therefore we discussed between (1) do it always and (2) just expose it and let the user run it. The current implementation is (1), but I would be ok with (2) and letting users who have issues (I know one by name) try it out and see if it helps. Would you feel better if we chose (2)? We could switch to (1) anytime later once we get good indications from users that this feature is helpful at all. what do you think?
for now, i would just expose gc to the actions, and document that those using large Buffers may need to use explicit gc.
I agree with this. However, for completeness: another problem with the current implementation (running global.gc() synchronously as part of the action invocation) is that the GC time is charged to the action run itself. If we believe that GC should run automatically, I think it should be queued up to run after the result has been returned, as implemented here: master...bjustin-ibm:node_gc#diff-4ca0b607ab76fb60662c0e52545ee30bR114 In this way, the action run is not penalized to also include the automatic GC.
do we set --max_old_space_size at all? it could be that node would do the right thing, if we set the heap size to be the container's memory limit. the v8 source seems to indicate that the default max old space size is fixed. also, c.f. cloudfoundry/nodejs-buildpack#82
if you update the limits, you get a new container.
yes, that makes sense. so: the default initial old space size is 1.6GB. the original tests didn't even get to that point, in terms of Buffer allocation. So the v8 VM probably feels no urgency to run a GC. i wonder if this is a general problem. yeah. this fails even with in-heap objects.
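A hypothetical reconstruction of the kind of in-heap test being described; the variable big and all sizes are illustrative, not the original snippet.

// Sketch: keep creating distinct multi-MB in-heap strings; with the default 1.6GB old space,
// V8 feels little pressure to collect, so a small container limit can be blown anyway.
var big = new Array(16 * 1024 * 1024).join('x');   // roughly 16MB string on the V8 heap
var latest;
for (var i = 0; i < 100; i++) {
    latest = big + i;       // each concatenation makes a fresh copy; older copies become garbage
}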
it seems important to match the heap size to the container limit. i have no idea why v8 isn't doing this for us.
To be reliable, we would have to be more conservative and set old space to be the memory limit less young space, large space, and node size itself. Probably even more. An alternative is to allow a little bit of slack.
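A sketch of how a runtime could derive the flag from the container limit; the 64MB reserve for young space, large objects, and node itself is a made-up placeholder, and the spawned script name is hypothetical.

// Sketch: compute a conservative old-space limit from the container's memory limit (in MB)
// and pass it when spawning the action's node process.
var spawn = require('child_process').spawn;

function startActionProcess(containerLimitMB) {
    var reserveMB = 64;                                  // hypothetical slack for young space, LOS, node itself
    var oldSpaceMB = Math.max(containerLimitMB - reserveMB, 32);
    return spawn('node', ['--max_old_space_size=' + oldSpaceMB, '--expose-gc', 'service.js']);
}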
Force-pushed from 9ef1cc3 to c3de588
… in memory limit run garbage collector automatically
LGTM
@jeremiaswerner I've merged this to get it started moving through the pipeline, but I think this may warrant a doc update somewhere to let users know explicit GC is available for them to use.
… in memory limit (apache#1826) Expose garbage collection so that it is explicitly available for use in actions
… in memory limit
Problem Statement
While debugging the system I found an issue with memory-intensive nodejs actions that got killed because they ran into the memory limit, even when the memory had been freed up in the action code. Since the action containers run for a very short period of time (ms), I suspect the nodejs garbage collector has no chance to free up the memory resources, because it runs in "idle" time that by definition does not exist here. To reproduce, choose a 512MB action limit and allocate a buffer of 128MB. (Disclaimer: the action is a very synthetic action doing nothing else, which might reduce the time for the GC to jump in, compared to e.g. downloading a video/image, ...)
action code:
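The original snippet is not reproduced here; the following is a hypothetical sketch of an action along the lines described above, allocating a 128MB Buffer inside a 512MB-limit action.

// Sketch: memory-intensive action that drops its Buffer before returning.
function main(params) {
    var size = 128 * 1024 * 1024;                 // 128MB, against a 512MB action limit
    var buf = Buffer.alloc(size, 'x');            // large off-heap allocation
    var sample = buf[0];                          // pretend to use the buffer
    buf = null;                                   // memory is "freed" from the action's point of view
    return { ok: true, sample: sample };
}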
result:
Expected
A user would expect that the memory is either freed up automatically or the user is able to free up the memory by enforcing the gc.
Resulting behaviour
As a result, the GC cleans up the memory and the issue goes away.
Drawbacks
global.gc() is running in the user's action time.

Impact on performance
Testing locally with a node process for 100 iterations, allocating 128MB vs. 0MB of memory and forcing/not forcing the garbage collector, it seems that global.gc() results in a constant overhead of about 30ms, whether or not there is memory to free up.

results:
allocate 128MB and force global.gc()
allocate 128MB and do not force global.gc()
allocate 0MB and force global.gc()
allocate 0MB and do not force global.gc()
Implemented MVP
This PR exposes the garbage collector in our nodejs runtime containers by starting the node process with --expose-gc. The user would be able to run the GC on their own. This should be documented either in a blog or in our documentation as a "hint"/"side-mark".
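For illustration, an action that opts in to explicit collection might look like the following sketch; it assumes the runtime's node process was started with --expose-gc, and the work inside is hypothetical.

// Sketch: an action that frees a large Buffer and then asks V8 to collect it explicitly.
function main(params) {
    var buf = Buffer.alloc(128 * 1024 * 1024);    // memory-intensive part of the action
    // ... use buf ...
    buf = null;                                   // drop the reference
    if (typeof global.gc === 'function') {
        global.gc();                              // only available when node runs with --expose-gc
    }
    return { done: true };
}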
Advanced proposal 1
We could run the garbage collector (global.gc()) anytime after the user code runs, which would result in a significant overhead (see above) but would make it transparent for the user.
Advanced proposal 2
We could run the garbage collector automatically if we notice that the node process is consuming a lot of memory.
Advanced proposal 3
Provide a switch in the CLI that allows the user to decide whether the garbage collector should run for that action.