
Heap autosize #82

Closed
wants to merge 1 commit

Conversation

CAFxX
Contributor

@CAFxX CAFxX commented Apr 18, 2017

Proposed change

As previously discussed in #34 (and tangentially in #29), it would be nice to have the buildpack do something "smart" in the default case (à la the Java buildpack) by setting the --max-old-space-size v8 option to something conservative (75% of the container memory quota).

The autotuned default can be overridden in two ways:

  1. If the application provides a Procfile, the heap autotuning is disabled (this means that apps currently setting --max-old-space-size manually are not affected by this change)
  2. If the application does not provide a Procfile, it is possible to provide additional arguments to npm by setting the NPM_CLI_OPTIONS environment variable. The value of this variable is appended to the npm invocation, allowing the autotuned value to be overridden.
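As a sketch of how the proposed override would compose with the npm invocation (the variable value and the command shape here are illustrative, not the exact buildpack code):

```shell
# Hypothetical: the buildpack appends NPM_CLI_OPTIONS to its npm
# invocation, so a user-supplied flag replaces the autotuned default.
NPM_CLI_OPTIONS="--max-old-space-size=512"
start_cmd="npm start ${NPM_CLI_OPTIONS}"
echo "$start_cmd"
```
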

The rationale for choosing 75% is twofold:

  1. because the old space is definitely the heaviest, but not the only, user of memory: besides v8 itself, JITted code, stacks, I/O buffers, and the new space all require some memory
  2. because it's close to what we see done often by our users (running with default node configuration [~1.4GB] in a 2GB container, i.e. ~75%)

Use cases

By default node runs with a max "old space" size of 1.4GB. This means that in practice warm apps use 1.6~1.7GB (as reported by cf app).
For containers smaller than 1.7GB, this means the application is likely to crash due to OOM. For containers larger than 1.7GB, it means the extra memory won't be used.
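The arithmetic behind the proposal can be sketched as follows (the container sizes are illustrative examples):

```shell
# Illustrative: 75% of the container memory quota (in MB) becomes
# the value passed to v8's --max-old-space-size.
for quota_mb in 512 1024 2048; do
  echo "${quota_mb}MB container -> --max-old-space-size=$(( quota_mb * 75 / 100 ))"
done
# 512MB  -> 384
# 1024MB -> 768
# 2048MB -> 1536
```

For a 512MB container this yields 384MB, versus node's fixed ~1.4GB default, which the container could never satisfy.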

  • I have viewed, signed, and submitted the Contributor License Agreement

  • I have made this pull request to the develop branch

  • I have added an integration test (two actually)

@cfdreddbot

Hey CAFxX!

Thanks for submitting this pull request! I'm here to inform the recipients of the pull request that you and the commit authors have already signed the CLA.

@cf-gitbot

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/143873083

The labels on this github issue will be updated when the story is started.

@CAFxX CAFxX changed the base branch from master to develop April 18, 2017 13:18
Also expose $NPM_CLI_OPTIONS for customizing npm
@CAFxX
Contributor Author

CAFxX commented Apr 18, 2017

As a side note: the added integration tests are passing but, at least on my machine, some unrelated tests are failing.

@sclevine
Contributor

@dgodd thoughts about this?

@CAFxX do you have links to any data or documentation that suggest 75% of desired memory usage is a reasonable value for --max-old-space-size? For whatever reason, I haven't observed many users complain about this.

@CAFxX
Contributor Author

CAFxX commented Apr 20, 2017

@sclevine:

do you have links to any data or documentation that suggest 75% of desired memory usage is a reasonable value for --max-old-space-size?

  • https://blog.heroku.com/node-habits-2016 kinda suggests using 90%
  • our users, when they run with sufficiently large memory quota so that OOM isn't triggered, see their memory usage peak at 1.7GB: this gives an empirical ratio of 1.4GB/1.7GB = 82%
  • 75% is meant to be a conservative number, meaning that it should be suboptimal but safer (since it leaves a larger headroom): if users want to squeeze the last bit of performance out of their app they should likely tune v8 manually

For whatever reason, I haven't observed many users complain about this.

We had two different teams run into problems because of no autotuning in the last month alone (and apparently we aren't the only ones, see e.g. nodejs/node#2738). Also, the current behavior is inconsistent with what the java buildpack does out-of-the-box.

@CAFxX
Contributor Author

CAFxX commented Apr 20, 2017

Pinging @adrai, @afeld and @flavorjones because they were discussing something similar in #34.

@adrai

adrai commented Apr 21, 2017

Nice.... exactly what I wanted since 2015

@dgodd
Contributor

dgodd commented Apr 21, 2017

@sclevine I think this makes a lot of sense. @CAFxX Thanks, this looks good

@sclevine
Contributor

@flavorjones you seemed reluctant to make this change in the past. I'm not too worried about it, especially given Heroku's suggestions.

Anyone opposed to a quick timeboxed investigation, and then setting --max-old-space-size to 75% by default? I would like to introduce an env var to configure or disable it as well.

@afeld
Contributor

afeld commented Apr 22, 2017

Looks good to me!

@sclevine
Contributor

Prioritized an investigative story here: https://www.pivotaltracker.com/story/show/144219147

@dgodd
Contributor

dgodd commented May 1, 2017

@CAFxX Having played with these settings for a while, I am convinced by "--gc_interval=100" "--optimize_for_size" "--always_compact" for low memory settings (although feel that low memory is app specific, and thus a good example for a pro tip on push, rather than a default).

In a similar vein, a pro-tip for high memory settings to have the user set "--max_old_space_size" seems good.

It is less clear to me why setting "--max_old_space_size" for low memory settings is helpful; in my experimentation the only effect was to change how the app crashed, at which point adding a node setting to force a crash seems less useful than just letting the CF limits force one. (Admittedly the node crash output had more data in it, but it was not immediately more useful than a simple "out of memory" error.)

Is your experience different than the above?

@CAFxX
Contributor Author

CAFxX commented May 1, 2017

It is less clear to me why setting "--max_old_space_size" for low memory settings is helpful

Two reasons:

  1. because "low memory settings" would require a second heuristic (how do you define low memory? <1G just because 1G is the default container size? what about deployments with different default container sizes? what about non-production environments? what about applications that have big base working sets?)
  2. because without it an application with, say, 0.5G of "live" memory in a 1G container (without going into too many details about what is considered "live memory", let's define it as the amount of data that survives an ideal full GC) can still crash due to OOM[1] even with the other flags set. OTOH with max-old-space-size set to 0.75G such a crash is going to be very unlikely. I'm not advocating that max_old_space_size is a silver bullet: if the "live memory" is higher than the max_old_space_size the application will still crash, but at least it very likely won't crash (as it does now) if the "live memory" is lower than max_old_space_size

I don't rule out that we could additionally make use of --gc_interval=100[2] --optimize_for_size --always_compact in certain cases (after the "low memory" heuristic has been devised, obviously). But I believe in incremental improvements, and setting just max-old-space-size felt like the lowest-hanging (and potentially least controversial, especially given the links I included above) fruit.


[1] as mentioned previously, two different teams got bitten by lack of autotuning; one of them had a very small "live memory" set but their app still managed to OOM-crash due to node's lazy GC
[2] I don't really see the need for tweaking gc_interval if max_old_space_size is correctly set, but maybe I'm missing something.

@adrai

adrai commented May 2, 2017

fyi
Most of our apps have no more than 256MB.
And it seems that newer node versions have a more aggressive GC by default.

@sclevine
Contributor

@dgodd Could we add an environment variable that enables these optimizations (aggressively, but with conservative space sizes) based on the container limits? Then if we find most people set the environment variable and don't have issues, we can enable it by default.

@dgodd
Contributor

dgodd commented May 10, 2017 via email

bobzoller added a commit to goodeggs/ranch-baseimage-nodejs that referenced this pull request May 13, 2017
set it to 75% of the container's memory limit
should improve GC behavior in low memory environments
see:
- cloudfoundry/nodejs-buildpack#82
- nodejs/node#2738
@sclevine
Contributor

Prioritized adding the env var as described.

@sclevine
Contributor

On develop here: fd5c9ee

Note that we went with:

npm start --optimize_for_size --always_compact --max_old_space_size=$(( $(jq -r '.limits.mem' <<<"$VCAP_APPLICATION") * 75 / 100 ))
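The heap computation in that command can be checked in isolation with a hypothetical VCAP_APPLICATION payload (the 512MB quota is an example value):

```shell
# Example: a container with a 512MB memory quota yields a 384MB old space.
VCAP_APPLICATION='{"limits":{"mem":512}}'
quota=$(jq -r '.limits.mem' <<<"$VCAP_APPLICATION")
echo "--max_old_space_size=$(( quota * 75 / 100 ))"   # prints --max_old_space_size=384
```
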

Don't hesitate to open an issue if you have feedback about these settings.

We didn't introduce NPM_CLI_OPTIONS, because we felt that overriding the default start command (via a Procfile, manifest.yml, or cf push -c) covers that functionality.
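As a sketch of that override path, a Procfile replacing the default start command might look like this (the file contents are illustrative; server.js and the heap value are examples):

```shell
# Hypothetical Procfile: the app takes over the start command and
# tunes the heap manually, bypassing the buildpack default.
cat > /tmp/Procfile <<'EOF'
web: node --max_old_space_size=512 server.js
EOF
cat /tmp/Procfile
```
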

@CAFxX
Contributor Author

CAFxX commented May 21, 2017

@sclevine @nikitarathi23 @dgodd something's funny in that commit: fd5c9ee#diff-00ca2312048696e35dce358a85e315b9R86 (it's just a test description but...)

Also the commit doesn't seem to include any documentation change. How are users supposed to discover this?

@sclevine
Contributor

Hi @CAFxX,

I rejected the story with a request for both of those changes yesterday, see: https://www.pivotaltracker.com/n/projects/1042066/stories/143873083

The functionality is present and appears to work, so I resolved this issue without waiting for them.


7 participants