expose and run GC for nodejs to avoid memory intensive actions to run… #1826

Merged
merged 1 commit into apache:master on May 19, 2017

Conversation

@jeremiaswerner (Contributor) commented Feb 8, 2017

… in memory limit

Problem Statement

While debugging the system I found an issue with memory-intensive nodejs actions that get killed because they run into the memory limit, even though the memory is freed up in the action code. Since the action containers only run for a very short period of time (milliseconds), I suspect the nodejs garbage collector has no chance to free up the memory: it runs during "idle" time, which by definition does not exist here. To reproduce, choose a 512MB action limit and allocate a buffer of 128MB. (Disclaimer: the action is very synthetic and does nothing else, which might reduce the opportunity for the GC to jump in compared to, say, downloading a video or image.)

action code:

function eat(memoryMB) {
  var bytes = 1 * 1024 * 1024 * memoryMB;
  var buffer = Buffer.alloc(bytes, 'a'); // Buffer.alloc is a factory, not a constructor; it allocates off-heap memory
  buffer = null;                         // drop the only reference so the GC could reclaim it
  console.log('done.');
}

function main(msg) {
    console.log('helloEatMemory', 'memory ' + msg.payload + 'MB');
    eat(msg.payload);
    return {msg: 'OK, buffer of size ' + msg.payload + ' MB has been filled.'};
}

result:

> while (true); do wsk action invoke eatMemory -p payload 128 -rb; echo ""; sleep 2; done
{
    "msg": "OK, buffer of size 128 MB has been filled."
}

{
    "msg": "OK, buffer of size 128 MB has been filled."
}

{
    "msg": "OK, buffer of size 128 MB has been filled."
}

{
    "msg": "OK, buffer of size 128 MB has been filled."
}

{
    "error": "The action did not produce a valid response and exited unexpectedly."
}

Expected

A user would expect that the memory is either freed up automatically, or that they are able to free it up themselves by forcing the GC.

Resulting behaviour

With this change the GC cleans up the memory and the issue goes away.

Drawbacks

  • the garbage collector blocks the node process; no other request is executed while it runs
  • global.gc() runs during, and therefore counts against, the user's action time

Impact on performance

I tested a local node process for 100 iterations, allocating 128MB vs. 0MB of memory and forcing/not forcing the garbage collector. It seems that global.gc() results in a constant overhead of about 30ms, regardless of whether there is no memory or a lot of memory to free up.

function eat(memoryMB) {
  var bytes = 1 * 1024 * 1024 * memoryMB;
  var buffer = Buffer.alloc(bytes, 'a'); // factory call, no `new`
  buffer = null;
  console.log('eat ' + memoryMB + ' memory.');
}

function recEat(mb, timeout, count) {
  eat(mb);
  global.gc(); // requires starting node with --expose-gc
  setTimeout(function() {
    if (count > 0) {
        recEat(mb, timeout, count - 1);
    }
  }, timeout);
}

recEat(128, 0, 100);

results:

allocate 128MB and force global.gc();

node --expose-gc ~/playground/dosEatMemoryLocal.js  
2.95s user 4.45s system 96% cpu 7.642 total

allocate 128MB and do not force global.gc();

node --expose-gc ~/playground/dosEatMemoryLocal.js  
2.69s user 4.41s system 96% cpu 7.323 total

allocate 0 MB and force global.gc()

node --expose-gc ~/playground/dosEatMemoryLocal.js  
0.30s user 0.03s system 82% cpu 0.404 total

allocate 0 MB and do not force global.gc()

node --expose-gc ~/playground/dosEatMemoryLocal.js 
0.06s user 0.02s system 39% cpu 0.199 total

Implemented MVP

This PR exposes the garbage collector in our nodejs runtime containers by starting the node process with --expose-gc. The user is then able to run the GC on their own. This should be documented either in a blog post or in our documentation as a hint/side note.
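As an illustration, here is a minimal sketch of what such an action could look like (this is an editor's example, not code from the PR; it assumes the runtime starts node with --expose-gc so that global.gc is defined):

function main(msg) {
    var bytes = 1 * 1024 * 1024 * (msg.payload || 128);
    var buffer = Buffer.alloc(bytes, 'a');   // large off-heap allocation
    buffer = null;                           // drop the only reference

    if (typeof global.gc === 'function') {   // only defined with --expose-gc
        global.gc();                         // return the memory before the container is paused
    }
    return { msg: 'buffer of ' + (msg.payload || 128) + ' MB has been released' };
}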

Advanced proposal 1

We could run the garbage collector automatically every time after the user code has run, which would add a constant overhead (see above) but would make it transparent to the user.

Advanced proposal 2

We could run the garbage collector automatically if we notice that the node process is consuming a lot of memory.

Advanced proposal 3

Provide a switch in the CLI that allows the user to decide whether the garbage collector should run for that action

@jeremiaswerner added the review label on Feb 8, 2017
@jberstler commented Feb 8, 2017

@jeremiaswerner I think "Advanced proposal 2" is a much better way to solve this. My reasoning is that, from the NodeJS action writer's perspective, they can and should assume that each time their action runs, they get a nice clean state of memory. You and I know that is not necessarily true, due to warm container reuse that is specific to OpenWhisk, but in my opinion this really shouldn't be a concern for the action writer.

I think the logical end of making the action writer responsible for garbage collection is that they will end up calling global.gc() "just in case", which incurs the time penalty during their action invocation even when it may not really be needed - for example, when they didn't use enough memory to warrant it, or they aren't invoking the action often enough to benefit from warm container reuse.

I believe that we can make the NodeJS runtime start garbage collection automatically on behalf of the user without penalizing their action invocation time. This could be done by calling the gc() function in a setTimeout callback just before completing the /run request. This should allow the /run route to complete while queuing the garbage collection for later execution on the event loop.

Perhaps something like this: https://github.com/bjustin-ibm/openwhisk/blob/991e2955d7c82cce5c0042a068ecc7125d8f47fb/core/nodejsActionBase/src/service.js#L115
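A rough sketch of that setTimeout idea (an editor's illustration with hypothetical names; doRun and the Express-style res stand in for the runtime's actual request handling):

function handleRun(req, res) {
    doRun(req).then(function (result) {
        res.status(200).json(result);            // complete the /run request first
        if (typeof global.gc === 'function') {   // only defined with --expose-gc
            setTimeout(function () {
                global.gc();                     // collect afterwards, off the request path
            }, 0);
        }
    });
}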

@markusthoemmes (Contributor) commented Feb 8, 2017

We'd need to benchmark the setTimeout proposal. It's a great idea, but I fear that due to node's single-threadedness it won't gain us any performance (given that the gc call is synchronous and blocking).

Alternative: what if we run the gc in a Promise and fuse both the GC promise and the action-returned promise via a sequence? That'd give the scheduler the possibility to do its work properly (like running a GET request and running the GC while waiting for it) but would still guarantee that the GC ran at least once (which I think is missing from the setTimeout idea).

@jberstler commented Feb 8, 2017

@markusthoemmes I'm not sure I follow. Are you suggesting something like:

var runPromise = doRun(req);
var gcPromise = runGCWithPromise();

return Promise.all([runPromise, gcPromise])
    .then(function(resolvedPromises) {
        var result = resolvedPromises[0];

        // process result as before...
    });

If so, I'm not sure how you ensure that gc doesn't run while the action is still working (and potentially eating lots of memory). For example, suppose the first line of the action is an async call and the rest of the action uses up a lot of memory. With the Promise approach above, it seems likely that the gc will be interleaved while waiting for the action's async call to return - which is unfortunately before the action chews up a lot of memory.
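To illustrate the concern, consider a hypothetical action whose first step is asynchronous (fetchSomething is made up): with the Promise.all approach, the gc promise would likely run during that wait, before the large allocation happens.

function main(params) {
    return fetchSomething()                              // async I/O; the event loop is free here,
        .then(function () {                              // so an eager gc promise would likely run now
            var big = Buffer.alloc(256 * 1024 * 1024);   // ...but the memory is only eaten here
            return { length: big.length };
        });
}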

@starpit (Contributor) commented Feb 8, 2017

wow, this is pretty horrible behavior on the part of nodejs. apparently, Buffers are allocated off-heap, and heap size limits (e.g. --max-old-space-size) are not obeyed when allocating off-heap structures.

@jberstler commented Feb 8, 2017

Ok, so here's a little benchmarking experiment I ran. I started with the current state of @jeremiaswerner's code in this PR. I created an action called "eatManualGC" which is exactly the same as his memory.js test action. I then created another action called "eatNoGC" which is the same action, but with the call to global.gc() commented out.

I invoked "eatNoGC" 10 times within 30 seconds, then "eatManualGC" 10 times within 30 seconds and recorded the invocation times for each. I then added automatic garbage collection into the NodeJS runtime as proposed above (the setTimeout approach), and then ran "eatNoGC" 10 times within 30 seconds, again recording the invocation times.

[screenshot: invocation time results for the three runs]

And then ignoring the outlier 822ms eatManualGC run:
[screenshot: invocation time results, excluding the 822ms outlier]

Furthermore, Jeremias' new action limits test passes with automatic garbage collection and the "eatNoGC" version of memory.js.

master...bjustin-ibm:node_gc

@perryibm (Contributor) commented Feb 8, 2017

@starpit The infamous LOS, of course, which really is part of the heap but is often ignored, either algorithmically (by ignoring fragmentation) or, even worse as here, by being left out of the flag space. I know you cannot be surprised by this. ;-) I was using node --v8-options.

@jeremiaswerner As Nick indirectly alluded to, one can try to set a heap size limit and let node's scheduling limit memory consumption. This would be the "correct" approach if V8 exposed the right flags. GCs (and their authors) tend to be lazy, so they will not collect until they hit the limit; for V8, I believe the default is 1.7GB. So if we use --max_old_space_size, we can try to lower the limit and cause node to schedule the GC itself. This is not foolproof because V8 uses at least four spaces: code space, the semi-spaces (often referred to as the nursery), old space (sometimes called tenured space), and the LOS (large-object space). I don't know the threshold (probably around 1-4MB), but arrays over that size are allocated directly in the LOS. The unfortunate part is that V8 currently provides no flag to control it. I speculate that if you change your test program to allocate smaller objects - say 1000 1KB arrays instead of a single 1MB one - then the memory limit would work.

As for the observed 300ms: NodeJS uses a copying generational collector, so the time it spends consists of a fixed overhead, the time to process the thread stacks, and the time to trace through the live objects. In your program (and in programs with no memory leaks) that amount will be small, which is why it's "only" 300ms. Actually, 300ms is pretty horrible for a generational system with a nearly empty heap. (I used to write GCs.)

I would use the flag to mitigate non-large objects. This would require a bit of invoker work though as we have no flag differentiation mechanism at the moment.
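For what it's worth, a hypothetical variant of the test along the lines of the "many small arrays" speculation above, keeping allocations out of the large-object space so that --max_old_space_size can take effect (sizes are rough estimates):

function eatSmall(memoryMB) {
    var chunks = [];
    var perChunk = 128;              // ~1KB per chunk (128 doubles at 8 bytes each, roughly)
    var count = memoryMB * 1024;     // memoryMB worth of ~1KB chunks
    for (var i = 0; i < count; i++) {
        chunks.push(new Array(perChunk).fill(0));
    }
    chunks = null;                   // drop all references so a regular GC can reclaim them
}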

@starpit (Contributor) commented Feb 8, 2017

@perryibm i was surprised that v8 had no flag to control max-large-object-space. max-old-space-size does nothing for this use case (large Buffers). it's also hard to believe (but true) that v8 pays no attention either to ulimit or to system memory capacity when sizing its large object area.

as far as i can tell, the only solutions are the ones proposed above. i.e. explicit gc, gc-per-invoke... there is one more not proposed, which would be to retry once on process failure. the best solution seems to be explicit gc -- if you are a large-object user, then you just have to play along with v8's suck.

@perryibm (Contributor) commented Feb 8, 2017

Better than time-based would be a way to get V8's instantaneous LOS consumption. I'll look but it might not have any such hook.

@starpit (Contributor) commented Feb 8, 2017

process.memoryUsage() gives an rss field. how would you leverage this? force a gc if it's too high?

@perryibm (Contributor) commented Feb 8, 2017

Yep. I was hoping for something internal to V8, but I don't trust it, so I would go with ps-level info like you said. It's a bit crude, since we want just the LOS, but it can work as an approximation. Also, we don't want to go into a GC death spiral, so it should be rate-limited.
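A rough sketch of that rate-limited, rss-based check (an editor's illustration; the threshold and interval values are made up, and global.gc again requires --expose-gc):

var lastGc = 0;

function maybeCollect(thresholdBytes, minIntervalMs) {
    var rss = process.memoryUsage().rss;        // ps-level view of the process, includes the LOS
    var now = Date.now();
    if (rss > thresholdBytes &&
        now - lastGc > minIntervalMs &&
        typeof global.gc === 'function') {
        global.gc();                            // force a collection, but not too often
        lastGc = now;
    }
}

// e.g. the runtime could call maybeCollect(256 * 1024 * 1024, 1000) after each /run.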

@markusthoemmes (Contributor) commented Feb 8, 2017

@bjustin-ibm the goal of this is not to help the developer while they are eating lots of memory, but to have a remedy for our pause/unpause behavior. We immediately freeze the container after the code is done, thus potentially preventing the GC from doing its work properly. The Promise-based approach would guarantee that the GC ran, but leave it to Node to decide when (potentially during a REST call, for example, which is the best case since it wouldn't hit performance then).

If the action itself eats lots of memory and takes some time to run, GC will be fine and behave the same as in a local environment.

This of course seems to be quite a naive implementation, looking at the discussion between @starpit and @perryibm. That being said, I'm not sure we can guarantee a GC run (and thus prevent the problem) if we don't control when and how the GC is run at all. Even with limits set on V8, we might freeze the container before the GC kicks in?

@rabbah (Member) commented Feb 8, 2017

Does the node vm sandbox help here?

@starpit (Contributor) commented Feb 8, 2017

indeed the problem is that garbage is not being collected. however, i believe that the problem is not so much that we're somehow "starving" the collector. rather, the v8 allocator apparently does not consider native memory constraints when allocating native memory.

you can see this by setting the max old space to 100MB and observing process.memoryUsage() in each iteration of @jeremiaswerner 's test, with --trace-gc. the rss continues to grow, up to about 1.3-1.7GB, completely independent of the mandated max old space size.

for reasons i don't understand yet, at least on mac, ulimit -m 256*1024 doesn't result in the process being killed. however, docker run -m 256m does. v8 thinks it has plenty of spare native memory, so it just plows on, without any cleanup (until it reaches about 1.3GB or so, in the absence of a docker memory limit)

what should be happening: the collector should observe a "large object/native" max memory constraint. even better: it should pay attention to ulimit and native available memory. it does none of this.

@jeremiaswerner (Contributor, Author) commented Feb 9, 2017

@perryibm @starpit very nice explanations and insights on garbage collection. I'm not sure I understood everything in detail. Could you suggest something actionable in addition to the options below?

@bjustin-ibm @mt thanks for the hint about running the setTimeout in a promise. That seems to work like a charm.

I've adopted the change and ran a single action 1000 times eating 128MB. The warm executions differ by about 6ms on average, with higher outliers but a similar standard deviation.

warm times for the action with automated GC running, as in this PR

marker controller_blockingActivation        : mean=  114.39, std=   69.87, p99.5=  586.28, p99=  459.28, p95=  175.80, p90=  141.20, p50=   98.00, min=      67, max=     968, #samples=  869

warm times with no GC at all (note: fewer samples because every 3rd/4th invocation failed)

marker controller_blockingActivation        : mean=  108.53, std=   70.49, p99.5=  508.34, p99=  427.53, p95=  183.00, p90=  160.20, p50=   87.50, min=      50, max=     714, #samples=  750

from here I see three options for this PR:
a) discard
b) expose GC but let users run it manually (commit 1)
c) expose GC and run it automatically every time (commit 1 + 2)

I left out the conditional execution of GC based on memory consumption because that would make things non-deterministic and harder to debug.

@perryibm @bjustin-ibm @starpit @mt what should we do?

@jeremiaswerner force-pushed the node_gc branch 2 times, most recently from 6e29108 to 5a6a26d on February 9, 2017
@jberstler

@jeremiaswerner The current changes LGTM

@csantanapr
Member

@jeremiaswerner PG?

@jeremiaswerner
Contributor Author

PG1 1178

@jeremiaswerner added the pgapproved label on Feb 21, 2017
@jeremiaswerner
Contributor Author

@csantanapr @bjustin-ibm PG is green

@jberstler left a comment

LGTM

@csantanapr (Member) left a comment

@bjustin-ibm @jeremiaswerner LGTM

@jeremiaswerner
Contributor Author

PG4/274 🔵

@bjustin-ibm @rabbah can one of you pls merge?

@rabbah (Member) commented May 18, 2017

Hmm. So the resolution is to gc even if a container doesn't get reused... because the common case is reuse? Did you consider running the gc on startup, when the container is warm, instead of on exit? That wouldn't penalize transient containers, and it allows for some slack between the docker unpause and /run to cover some of the latency.

We do make it known that it's a feature that a warm container can be detected by user code by leaving "state" around; I'd consider the gc an advanced feature and let users know they should run it if needed, rather than always forcing it. I think this option was dismissed too quickly. In the common case this isn't necessary at all.

@perryibm also has some new findings in this space - may be relevant.

@starpit (Contributor) commented May 18, 2017

i hadn't realized that the proposed solution was to force a GC on exit from every node invocation. this will penalize everyone, for the perhaps rare case of nodejs actions that manipulate large Buffers?

also, use of Promise.all means that if, for some reason, global.gc() fails, then the activation will fail. is this what was intended?

@csantanapr
Member

Two cents

  1. We no longer have node 0.12; we are going to upgrade to node 6.10.2. Will the newer V8 in 6.10.2 give us something new, so we can revisit this problem and solve it in other ways?
  2. Can this be an opt-in, instead of penalizing every single invocation on the critical path by default?

@jeremiaswerner (Contributor, Author) commented May 18, 2017

@rabbah @starpit

yes, I think it's possible to perform the check only at certain points, e.g. every second invocation, or if the memory is above X, or only when the container is reused for the first time, ...

my initial thought was to avoid any such logic in order to reduce the probability of intermittent failures; therefore we discussed (1) doing it always versus (2) just exposing it and letting the user run it.

The current implementation is (1), but I would be ok with (2), letting users who have issues (I know one by name) try it out and see if it helps.

Would you feel better if we chose (2)? We could switch to (1) anytime later, once we get good indications from users that this feature is helpful at all.

what do you think?

@starpit (Contributor) commented May 18, 2017

for now, i would just expose gc to the actions, and document that those using large Buffers may need to use explicit gc.

@jberstler

for now, i would just expose gc to the actions, and document that those using large Buffers may need to use explicit gc.

I agree with this.

However, for completeness: another problem with the current implementation (using Promise.all) is that it does not guarantee when the GC will actually run - perhaps before, during, or after the action code runs. Additionally, there is no reason to wait for GC to complete successfully (as @starpit also points out) before returning the result.

If we believe that GC should run automatically, I think it should be queued up to run after the result has been returned, as implemented here: master...bjustin-ibm:node_gc#diff-4ca0b607ab76fb60662c0e52545ee30bR114 In this way, the action run is not penalized by also including the automatic GC.

@starpit (Contributor) commented May 18, 2017

do we set --max-old-space-size based on the action's limit? this might be tricky, in the face of updates to an action's limit (which would require restarting the container in order to use the updated max-old-space-size).

but, it could be that node would do the right thing, if we set the heap size to be the container's memory limit. the v8 source seems to indicate that the default max old space size is fixed.

also, c.f. cloudfoundry/nodejs-buildpack#82

@rabbah (Member) commented May 18, 2017

if you update the limits, you get a new container.

@starpit (Contributor) commented May 18, 2017

if you update the limits, you get a new container.

yes, that makes sense.

so: the default initial old space size is 1.6GB. the original tests didn't even get to that point, in terms of Buffer allocation. So the v8 VM probably feels no urgency to run a GC. i wonder if this is a general problem.

yeah. this fails, even with in-heap objects:

% wsk action update big big.js -m 128
% wsk action invoke big -p mb 10 -br
{
    "length": 10485760
}
% wsk action invoke big -p mb 10 -br
{
    "error": "The action did not produce a valid response and exited unexpectedly."
}

where big is

const main = params => ({ length: new Array(params.mb * 1024 * 1024).fill(0).length })

it seems important to match the heap size to the container limit. i have no idea why v8 isn't doing this for us.

@perryibm
Contributor

To be reliable, we would have to be more conservative and set old space to the memory limit minus young space, large space, and the footprint of node itself - probably minus even more than that. An alternative is to allow a little bit of slack by using --memory-swap to create a tiny bit of elasticity. We don't want much swapping, but occasional use of it for an infrequent major GC is not terrible. This also has other use cases, such as when there are binaries/images that are large but actually unused. A possible choice could be something like --memory=512m --memory-swap=640m. Of course, this should be done in conjunction with the heap parameter adjustments.

@jeremiaswerner force-pushed the node_gc branch 3 times, most recently from 9ef1cc3 to c3de588 on May 19, 2017
… in memory limit

run garbage collector automatically
@jeremiaswerner
Contributor Author

@rabbah @starpit @bjustin-ibm as discussed, I switched to "expose GC only and give users a chance to run it in the action". Travis and PG are ok.

PG3/627 🔵

@jberstler left a comment

LGTM

@jberstler jberstler merged commit 5a7467a into apache:master May 19, 2017
@jberstler

@jeremiaswerner I've merged this to get it started moving through the pipeline, but I think this may warrant a doc update somewhere to let users know explicit GC is available for them to use.

houshengbo pushed a commit to houshengbo/openwhisk that referenced this pull request Jun 23, 2017
jonpspri pushed a commit to jonpspri/openwhisk that referenced this pull request Jul 4, 2017
houshengbo pushed commits to houshengbo/openwhisk that referenced this pull request Nov 13 and Nov 14, 2017

… in memory limit (apache#1826)

Expose garbage collection so that it is explicitly available for use in actions