Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout leaks when converted into a primitive and not cleared #53335

Closed
mitsuhiko opened this issue Jun 4, 2024 · 1 comment · Fixed by #53337
Closed

Timeout leaks when converted into a primitive and not cleared #53335

mitsuhiko opened this issue Jun 4, 2024 · 1 comment · Fixed by #53337
Labels
timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.

Comments

@mitsuhiko
Copy link

mitsuhiko commented Jun 4, 2024

Version

21.2.0

Platform

Darwin cheetah.local 23.5.0 Darwin Kernel Version 23.5.0: Wed May 1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64

Subsystem

timers.js

What steps will reproduce the bug?

On node calling setTimeout returns a Timeout object. That object is tracked in an internal list of timers and that list is maintained in two places. On the one hand in unenroll which is used by clearTimeout (and clearInterval) and one when the timer runs.

However only the unenroll path also removes a timer from the internal knownTimersById map. This map is updated whenever the Timeout is converted into a primitive. From that moment onwards a timer can be cleared by it's internal async id.

So to get a setTimeout to leak you just need to call +setTimeout(...) and it wait for it to complete. The entry from the knownTimersById map is not removed and we leak.

The repro case is trivial:

// leaks
for (i = 0; i < 500000; i++) {
  +setTimeout(() => {}, 0);
}

This will create 500000 un-collectable Timeouts that can be found in the knownTimersById map in timers.js. Removing the + fixes it.

Timer is removed here from the list but not from knownTimersById:

node/lib/internal/timers.js

Lines 544 to 545 in 7d14d1f

// The actual logic for when a timeout happens.
L.remove(timer);

Compare this to how unenroll clears:

node/lib/timers.js

Lines 86 to 93 in 7d14d1f

if (item[kHasPrimitive])
delete knownTimersById[item[async_id_symbol]];
// Fewer checks may be possible, but these cover everything.
if (destroyHooksExist() && item[async_id_symbol] !== undefined)
emitDestroy(item[async_id_symbol]);
L.remove(item);

How often does it reproduce? Is there a required condition?

Always

What is the expected behavior? Why is that the expected behavior?

Not leak memory

What do you see instead?

Leaks memory

Additional information

We ran into this with the Sentry SDK though it's not entirely clear yet what actually converts the value there into a primitive. Might be some monkey patching going on somewhere.

The code looks the same on the latest version but I did not try to repro it there yet.

Refs getsentry/sentry-javascript#12317 (comment)

@anonrig
Copy link
Member

anonrig commented Jun 4, 2024

cc @nodejs/timers

@anonrig anonrig added the timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. label Jun 4, 2024
@theanarkh theanarkh mentioned this issue Jun 4, 2024
4 tasks
nodejs-github-bot pushed a commit that referenced this issue Jun 7, 2024
PR-URL: #53337
Fixes: #53335
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Reviewed-By: Feng Yu <F3n67u@outlook.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
targos pushed a commit that referenced this issue Jun 20, 2024
PR-URL: #53337
Fixes: #53335
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Reviewed-By: Feng Yu <F3n67u@outlook.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
EliphazBouye pushed a commit to EliphazBouye/node that referenced this issue Jun 20, 2024
PR-URL: nodejs#53337
Fixes: nodejs#53335
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Reviewed-By: Feng Yu <F3n67u@outlook.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
bmeck pushed a commit to bmeck/node that referenced this issue Jun 22, 2024
PR-URL: nodejs#53337
Fixes: nodejs#53335
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Reviewed-By: Feng Yu <F3n67u@outlook.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
marco-ippolito pushed a commit that referenced this issue Jul 19, 2024
PR-URL: #53337
Fixes: #53335
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Reviewed-By: Feng Yu <F3n67u@outlook.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
marco-ippolito pushed a commit that referenced this issue Jul 19, 2024
PR-URL: #53337
Fixes: #53335
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Reviewed-By: Feng Yu <F3n67u@outlook.com>
Reviewed-By: Yagiz Nizipli <yagiz.nizipli@sentry.io>
lubieowoce added a commit to vercel/next.js that referenced this issue Feb 6, 2025
Potential fix for a leak reported in #74855 on older node versions (see
[comment](#74855 (comment))).

### Background

When running middleware (or other edge functions) in `next start`, we
wrap them in an edge runtime sandbox. this includes polyfills of
`setTimeout` and `setInterval` which return `number` instead of
`NodeJS.Timeout`.

Unfortunately, on some older node versions, converting a
`NodeJS.Timeout` to a number will cause that timeout to leak:
nodejs/node#53335
The leaked timeout will also hold onto the callback, thus also leaking
anything that was closed over (which can be a lot of things!)

### Solution

Ideally, users just upgrade to a Node version that includes the fix:
- [node v20.16.0](nodejs/node#53945)
- [node v22.4.0](nodejs/node#53583)
- node v23.0.0

But we're currently still supporting node 18, so we can't necessarily
rely on that. Luckily, as noted in the description of the nodejs issue,
calling `clearTimeout` seems to unleak the timeout, so we can just do
that after the callback runs!

### Unrelated stuff I did

While i was at it, I also fixed a (very niche) discrepancy from how
`setTimeout` and `setInterval` behave on the web. when running the
callback, node sets `this` to the Timeout instance:
```js
> void setTimeout(function () {console.log('this in setTimeout:', this) } )
undefined
> this in setTimeout: Timeout { ... }
```
but on the web, `this` is always set to `globalThis`. Our wrapper now
correctly does this.

### Testing

<details>
<summary>Collapsed because it's long</summary>

Verifying this is kinda tricky, so bear with me...

Here's a script that can verify whether calling `clearTimeout` fixes the
leak by using a FinalizationRegistry and triggering GC to observe
whether memory leaked or not.
`setTimeoutWithFix` is a simplified version of `webSetTimeoutPolyfill`
from the PR.

```js
// setTimeout-test.js

if (typeof gc !== 'function') {
  console.log('this test must be run with --expose-gc')
  process.exit(1)
}

function setTimeoutWithFix(callback, ms, ...args) {
  const wrappedCallback = function () {
    try {
      return callback.apply(this, args)
    } finally {
      clearTimeout(timeout)
    }
  }
  const timeout = setTimeout(wrappedCallback, ms)
  return timeout
}

const didFinalize = {}
const registry = new FinalizationRegistry((id) => {
  didFinalize[id] = true
})

{
  const id = 'node setTimeout'.padEnd(26, ' ')

  const timeout = setTimeout(() => {}, 0)

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'node setTimeout as number'.padEnd(26, ' ')

  const timeout = setTimeout(() => {}, 0)
  timeout[Symbol.toPrimitive]()

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'fixed setTimeout'.padEnd(26, ' ')

  const timeout = setTimeoutWithFix(() => {}, 0)

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'fixed setTimeout as number'.padEnd(26, ' ')

  const timeout = setTimeoutWithFix(() => {}, 0)
  timeout[Symbol.toPrimitive]()

  registry.register(timeout, id)
  didFinalize[id] = false
}

// wait for the timeouts to run
void setTimeout(() => {
  gc() // trigger garbage collection
  void registry // ...but make sure we keep the registry alive

  // wait a task so that finalization callbacks can run
  setTimeout(() =>
    console.log('did the Timeout get released after GC?', didFinalize)
  )
}, 10)
```

To run it, Install the required node versions:
```bash
for ver in v20.15.0 v20.16.0 v22.3.0 v22.4.0 v23.0.0; do ( nvm install "$ver" ); done
```

And run the test:
```bash
for ver in v20.15.0 v20.16.0 v22.3.0 v22.4.0 v23.0.0; do
  (
    echo '-------------------'
    nvm use "$ver" && node --expose-gc setTimeout-test.js
    echo
  );
done
```

The output on my machine is as follows. Note that the `node setTimeout
as number` case comes up as false on the older versions (because it
leaks and doesn't get finalized), but `fixed setTimeout as number`
(which calls `clearTimeout`) gets released fine, which is exactly what
we want.

```terminal
-------------------
Now using node v20.15.0 (npm v10.7.0)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': false,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v20.16.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v22.3.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': false,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v22.4.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v23.0.0 (npm v10.9.0)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}
```
</details>
ztanner pushed a commit to vercel/next.js that referenced this issue Feb 10, 2025
Potential fix for a leak reported in #74855 on older node versions (see
[comment](#74855 (comment))).

### Background

When running middleware (or other edge functions) in `next start`, we
wrap them in an edge runtime sandbox. this includes polyfills of
`setTimeout` and `setInterval` which return `number` instead of
`NodeJS.Timeout`.

Unfortunately, on some older node versions, converting a
`NodeJS.Timeout` to a number will cause that timeout to leak:
nodejs/node#53335
The leaked timeout will also hold onto the callback, thus also leaking
anything that was closed over (which can be a lot of things!)

### Solution

Ideally, users just upgrade to a Node version that includes the fix:
- [node v20.16.0](nodejs/node#53945)
- [node v22.4.0](nodejs/node#53583)
- node v23.0.0

But we're currently still supporting node 18, so we can't necessarily
rely on that. Luckily, as noted in the description of the nodejs issue,
calling `clearTimeout` seems to unleak the timeout, so we can just do
that after the callback runs!

### Unrelated stuff I did

While i was at it, I also fixed a (very niche) discrepancy from how
`setTimeout` and `setInterval` behave on the web. when running the
callback, node sets `this` to the Timeout instance:
```js
> void setTimeout(function () {console.log('this in setTimeout:', this) } )
undefined
> this in setTimeout: Timeout { ... }
```
but on the web, `this` is always set to `globalThis`. Our wrapper now
correctly does this.

### Testing

<details>
<summary>Collapsed because it's long</summary>

Verifying this is kinda tricky, so bear with me...

Here's a script that can verify whether calling `clearTimeout` fixes the
leak by using a FinalizationRegistry and triggering GC to observe
whether memory leaked or not.
`setTimeoutWithFix` is a simplified version of `webSetTimeoutPolyfill`
from the PR.

```js
// setTimeout-test.js

if (typeof gc !== 'function') {
  console.log('this test must be run with --expose-gc')
  process.exit(1)
}

function setTimeoutWithFix(callback, ms, ...args) {
  const wrappedCallback = function () {
    try {
      return callback.apply(this, args)
    } finally {
      clearTimeout(timeout)
    }
  }
  const timeout = setTimeout(wrappedCallback, ms)
  return timeout
}

const didFinalize = {}
const registry = new FinalizationRegistry((id) => {
  didFinalize[id] = true
})

{
  const id = 'node setTimeout'.padEnd(26, ' ')

  const timeout = setTimeout(() => {}, 0)

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'node setTimeout as number'.padEnd(26, ' ')

  const timeout = setTimeout(() => {}, 0)
  timeout[Symbol.toPrimitive]()

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'fixed setTimeout'.padEnd(26, ' ')

  const timeout = setTimeoutWithFix(() => {}, 0)

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'fixed setTimeout as number'.padEnd(26, ' ')

  const timeout = setTimeoutWithFix(() => {}, 0)
  timeout[Symbol.toPrimitive]()

  registry.register(timeout, id)
  didFinalize[id] = false
}

// wait for the timeouts to run
void setTimeout(() => {
  gc() // trigger garbage collection
  void registry // ...but make sure we keep the registry alive

  // wait a task so that finalization callbacks can run
  setTimeout(() =>
    console.log('did the Timeout get released after GC?', didFinalize)
  )
}, 10)
```

To run it, Install the required node versions:
```bash
for ver in v20.15.0 v20.16.0 v22.3.0 v22.4.0 v23.0.0; do ( nvm install "$ver" ); done
```

And run the test:
```bash
for ver in v20.15.0 v20.16.0 v22.3.0 v22.4.0 v23.0.0; do
  (
    echo '-------------------'
    nvm use "$ver" && node --expose-gc setTimeout-test.js
    echo
  );
done
```

The output on my machine is as follows. Note that the `node setTimeout
as number` case comes up as false on the older versions (because it
leaks and doesn't get finalized), but `fixed setTimeout as number`
(which calls `clearTimeout`) gets released fine, which is exactly what
we want.

```terminal
-------------------
Now using node v20.15.0 (npm v10.7.0)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': false,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v20.16.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v22.3.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': false,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v22.4.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v23.0.0 (npm v10.9.0)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}
```
</details>
ztanner pushed a commit to vercel/next.js that referenced this issue Feb 10, 2025
Potential fix for a leak reported in #74855 on older node versions (see
[comment](#74855 (comment))).

### Background

When running middleware (or other edge functions) in `next start`, we
wrap them in an edge runtime sandbox. this includes polyfills of
`setTimeout` and `setInterval` which return `number` instead of
`NodeJS.Timeout`.

Unfortunately, on some older node versions, converting a
`NodeJS.Timeout` to a number will cause that timeout to leak:
nodejs/node#53335
The leaked timeout will also hold onto the callback, thus also leaking
anything that was closed over (which can be a lot of things!)

### Solution

Ideally, users just upgrade to a Node version that includes the fix:
- [node v20.16.0](nodejs/node#53945)
- [node v22.4.0](nodejs/node#53583)
- node v23.0.0

But we're currently still supporting node 18, so we can't necessarily
rely on that. Luckily, as noted in the description of the nodejs issue,
calling `clearTimeout` seems to unleak the timeout, so we can just do
that after the callback runs!

### Unrelated stuff I did

While i was at it, I also fixed a (very niche) discrepancy from how
`setTimeout` and `setInterval` behave on the web. when running the
callback, node sets `this` to the Timeout instance:
```js
> void setTimeout(function () {console.log('this in setTimeout:', this) } )
undefined
> this in setTimeout: Timeout { ... }
```
but on the web, `this` is always set to `globalThis`. Our wrapper now
correctly does this.

### Testing

<details>
<summary>Collapsed because it's long</summary>

Verifying this is kinda tricky, so bear with me...

Here's a script that can verify whether calling `clearTimeout` fixes the
leak by using a FinalizationRegistry and triggering GC to observe
whether memory leaked or not.
`setTimeoutWithFix` is a simplified version of `webSetTimeoutPolyfill`
from the PR.

```js
// setTimeout-test.js

if (typeof gc !== 'function') {
  console.log('this test must be run with --expose-gc')
  process.exit(1)
}

function setTimeoutWithFix(callback, ms, ...args) {
  const wrappedCallback = function () {
    try {
      return callback.apply(this, args)
    } finally {
      clearTimeout(timeout)
    }
  }
  const timeout = setTimeout(wrappedCallback, ms)
  return timeout
}

const didFinalize = {}
const registry = new FinalizationRegistry((id) => {
  didFinalize[id] = true
})

{
  const id = 'node setTimeout'.padEnd(26, ' ')

  const timeout = setTimeout(() => {}, 0)

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'node setTimeout as number'.padEnd(26, ' ')

  const timeout = setTimeout(() => {}, 0)
  timeout[Symbol.toPrimitive]()

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'fixed setTimeout'.padEnd(26, ' ')

  const timeout = setTimeoutWithFix(() => {}, 0)

  registry.register(timeout, id)
  didFinalize[id] = false
}

{
  const id = 'fixed setTimeout as number'.padEnd(26, ' ')

  const timeout = setTimeoutWithFix(() => {}, 0)
  timeout[Symbol.toPrimitive]()

  registry.register(timeout, id)
  didFinalize[id] = false
}

// wait for the timeouts to run
void setTimeout(() => {
  gc() // trigger garbage collection
  void registry // ...but make sure we keep the registry alive

  // wait a task so that finalization callbacks can run
  setTimeout(() =>
    console.log('did the Timeout get released after GC?', didFinalize)
  )
}, 10)
```

To run it, Install the required node versions:
```bash
for ver in v20.15.0 v20.16.0 v22.3.0 v22.4.0 v23.0.0; do ( nvm install "$ver" ); done
```

And run the test:
```bash
for ver in v20.15.0 v20.16.0 v22.3.0 v22.4.0 v23.0.0; do
  (
    echo '-------------------'
    nvm use "$ver" && node --expose-gc setTimeout-test.js
    echo
  );
done
```

The output on my machine is as follows. Note that the `node setTimeout
as number` case comes up as false on the older versions (because it
leaks and doesn't get finalized), but `fixed setTimeout as number`
(which calls `clearTimeout`) gets released fine, which is exactly what
we want.

```terminal
-------------------
Now using node v20.15.0 (npm v10.7.0)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': false,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v20.16.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v22.3.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': false,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v22.4.0 (npm v10.8.1)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}

-------------------
Now using node v23.0.0 (npm v10.9.0)
did the Timeout get released after GC? {
  'node setTimeout           ': true,
  'node setTimeout as number ': true,
  'fixed setTimeout          ': true,
  'fixed setTimeout as number': true
}
```
</details>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants