Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mitigate memory leaks in jest-environment-node #15215

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

eyalroth
Copy link
Contributor

@eyalroth eyalroth commented Jul 26, 2024

Summary

This PR attempts to drastically mitigate memory leaks in jest-environment-node, raised by multiple issues:

The leaks are likely caused by the Jest/NodeJS sandbox (vm context) not being fully cleaned up between test suites.

This PR introduces 3 orthogonal mitigations, all made during env teardown.

The mitigations do not eliminate the leak entirely, but they drastically reduce it (at least in my testing).

The mitigations (each in its own commit):

1. Global cleanup

In my testing, the biggest memory leak was caused by custom objects on global that keep references to source-code strings.

This change deletes all properties of all the objects in the environment's global; i.e:

// pseudo code
for (value in global) {
  for (key in value) {
    delete value[key];
  }
}

The exceptions are:

  • Objects set on the global in the environment constructor
  • Objects protected by a new "shreder" API in jest-utils (setNotShreddable)
    • For example, jest-circus's state is leaking, so it should be deleted; however, some of the properties are still used after teardown -- for instance, unhandledErrors and unhandledRejectionErrorByPromise.
  • The performance property (hack)
    • react-native comes with a preset that overrides this property on env-setup and never reverts it. Then, react-test-renderer attempts to access performance (via scheduler) after the env has already been torn down. A fix needs to be made to react-native to either (a) revert the performance property at teardown or (b) protect the custom performance object with the new shredder API.

The PR requires an update to the official docs, so users would know about this behavior, and how to avoid it if the need arises.

2. Event listeners cleanup

Another memory leak came from event listeners on the env process and node:cluster module, so these are removed at env teardown.

3. source-map-support cleanup

Although some cleanup of source-map-support exists in Jest, it's not enough. This module overrides Error.prepareStackTrace with an implementation that holds a cache that contains (memory-heavy) source code.

The mitigation here is to override Error.prepareStackTrace back to a no-op method at env teardown.

Stacked of nested (cause) errors were harmed by this change, so a better assurance of stack generation was added to jest-runner.

Test plan

I have been using one of our private repositories that are suffering from heavy memory leaks.

All of them use some version of NodeJS (18.x), Jest (27/29) and Typescript (4.x). They also make use of sequelize, lodash, moment, log4js, AWS SDK (v3) and many more packages.

Method

  1. Clone Jest.
  2. Modify runTest.ts in jest-runner to run with the local jest environment (/path/to/jest/packages/jest-environment-node/build/index.js) instead of the test repo.
  3. Build the project.
  4. Add .nvmrc with 20.14.0.
  5. Run jest with this script in jest-cli:
// package.json
"scripts": {
  "jest": "NODE_OPTIONS='--no-experimental-fetch' node --inpsect-brk ./bin/jest.js -c /path/to/repo/test/unit/jest.config.js --detectOpenHandles --runInBand --forceExit --logHeapUsage"
}

(we use --no-experimental-fetch due to our use of nock for mocking HTTP)

  1. Attach to the process via Chrome devtools, and let it run.

  2. Capture a full heap snapshot soon after letting the process run.

  3. Capture a second full heap snapshot after letting the process run a while and the heap grow.

  4. Examined the objects that were allocated between the snapshots, targeting objects with highest shallow size.

Steps

1. Original code

With Jest's original code, we can see the following:
memory1

A huge chunk of memory is lodash's source code replicated by references to it from a custom matcher:

import _ from 'lodash';

expect.extend({ satisfies(...) { 
  // code using lodash
}});

However, the vast majority of memory is probably held by jest-circus state (symbol JEST_STATE_SYMBOL):
image

In this snapshot we can see that just one state (of one test file) is leaking ~25MB in memory.

2. With global cleanup

This is after applying the first mitigation:
memory2

The leak has been reduced dramatically.

What we see now is (again) references to lodash's source code, but from a custom log4js we use:

import log4js from 'log4js';
import _ from 'lodash';

log4js.addLayout('json', () => {
  // code using lodash
});

Turns out that log4js adds the layout function to node:cluster module as an event listener.

3. With global cleanup + event listeners cleanup

Next, after adding the event-listeners cleanup as well, we see this:
memory3

This time, it seems that sequelize's source (map) code is being duplicated in memory.

In our version of seqeulize, it creates a new Error() on each query (this was replaced in sequelize/sequelize#15559). This error is somehow captured and persisted in memory, keeping a custom prepareStackTrace (from source-map-support) that keeps the source code.

4. With all mitigations

After all mitigations in this PR applied:
memory4

We see that there's still some leaking source code, but that's for another PR :)

Copy link

netlify bot commented Jul 26, 2024

Deploy Preview for jestjs ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit 05592c7
🔍 Latest deploy log https://app.netlify.com/sites/jestjs/deploys/66acca0c2ebedb000861cf93
😎 Deploy Preview https://deploy-preview-15215--jestjs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@eyalroth eyalroth force-pushed the fix-memory-leak branch 10 times, most recently from 2c95e2d to 7423314 Compare July 27, 2024 16:32
eyalroth added a commit to eyalroth/jest that referenced this pull request Jul 27, 2024
eyalroth added a commit to eyalroth/jest that referenced this pull request Jul 27, 2024
eyalroth added a commit to eyalroth/jest that referenced this pull request Jul 27, 2024
@eyalroth eyalroth changed the title Fix memory leak (all solutions combined) Mitigate memory leaks in jest-environment-node Jul 27, 2024
@eyalroth eyalroth marked this pull request as ready for review July 27, 2024 18:48
@SimenB
Copy link
Member

SimenB commented Jul 29, 2024

Thank you so much for working on this!

My main feedback would be to split up the changes here (already in atomic commits, which is nice!) into separate PRs so they can be merged independently of each other.

Some quick feedback on each of these

  1. This seems a bit much to me at first glance, but I'll need to dig deeper
  2. Unless those listeners are added by Jest, I don't think we should clean them up. All the node core modules share a cache (it's impossible to load node core modules within vm: Evaluate node core modules within a vm.Context nodejs/node#31852). This needs to be done by users (if log4js has global side effects it should also expose a way of cleaning that up).
  3. This seems reasonable, but could we do this in the teardown of the test env rather than the environment? I.e. after
    sourcemapSupport.resetRetrieveHandlers();
    (which I thought was supposed to solve exactly the issue you highlight 🤔)

On the log4js thing - essentially if this adds more and more listeners, you'll need to clean up yourself.

const log4jsPath = require.resolve('log4js');

require(log4jsPath);
delete require.cache[log4jsPath];
require(log4jsPath);
delete require.cache[log4jsPath];
require(log4jsPath);
// etc.

@eyalroth
Copy link
Contributor Author

eyalroth commented Aug 2, 2024

@SimenB Thanks for the reply!

My main feedback would be to split up the changes here

Agreed. I wanted 1 PR mostly for the tests and the "dev journey". I'll keep the first method (global cleanup) here and move the others to separate PRs.

  1. This seems a bit much to me at first glance, but I'll need to dig deeper

Waiting on your thoughts :) This is the bulk of the PR and the thing that reduces memory leak the most (by far).

  1. All the node core modules share a cache

Yep. I missed that. At first I tried to clean listeners directly on node:cluster (without VM) and it didn't remove log4js's listeners; however, that was likely due to webpack replacing require.

This code illustrates how the listeners are indeed shared:

import vm from 'node:vm';
import cluster from 'node:cluster';

console.log(cluster.listenerCount('foo')); // 0
cluster.on('foo', () => {});
console.log(cluster.listenerCount('foo')); // 1

const context = vm.createContext();
vm.runInContext(`
  const cluster = require('node:cluster');
  cluster.removeAllListeners();
`, Object.assign(context, { require, console }));

console.log(cluster.listenerCount('foo')); // 0

I could potentially use the same technique I did with global but with cluster's listeners, but I feel like it's not worth it.

  1. could we do this in the teardown of the test env rather than the environment?

Sure, I'll do it in a separate PR.

(which I thought was supposed to solve exactly the issue you highlight 🤔)

It only cleans custom handlers, but it doesn't clean fileContentsCache.

eyalroth added a commit to eyalroth/jest that referenced this pull request Aug 2, 2024
eyalroth added a commit to eyalroth/jest that referenced this pull request Aug 2, 2024
@SimenB
Copy link
Member

SimenB commented Aug 2, 2024

(which I thought was supposed to solve exactly the issue you highlight 🤔)

It only cleans custom handlers, but it doesn't clean fileContentsCache.

Mind opening a PR upstream with that? It was my PR some years ago that added the cleanup, seems I missed some 😀

evanw/node-source-map-support#215

eyalroth added a commit to eyalroth/jest that referenced this pull request Aug 3, 2024
@eyalroth
Copy link
Contributor Author

eyalroth commented Aug 3, 2024

@SimenB

Mind opening a PR upstream with that? It was my PR some years ago that added the cleanup, seems I missed some 😀

Tried my best (without tests):
evanw/node-source-map-support#339

@sibelius
Copy link

sibelius commented Oct 1, 2024

any progress on this?

does this also improves CPU ?

Copy link

This PR is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the Stale label Dec 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants