Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle out-of-memory errors in the worker thread #4273

Merged
merged 10 commits into from
Oct 16, 2023

Conversation

yassin-kammoun-sonarsource
Copy link
Contributor

@yassin-kammoun-sonarsource yassin-kammoun-sonarsource commented Oct 13, 2023

Fixes #4254

When the worker thread runs out of memory, here is the new behavior to expect:

  1. The worker thread will terminate due to running out of memory
  2. The main thread will "catch" that error
  3. A guide to troubleshoot memory issues will be logged
    • Alternatively, a generic message will be printed in case of an unexpected non-memory related error
  4. The bridge server will forcefully close all existing connections
    • Requests on the plugin side won't need to timeout for the analysis to stop anymore
    • Instead, an IO exception will be thrown immediately due to connection ending
    • A runtime error hinting that the bridge is unrespsonsive will be thrown consequently (as before)
  5. The bridge server will shutdown

Since it's kinda tricky to write a unit test or an integration for that, I checked manually this behaviour.
To do so, I analyzed TypeScript with sonar.javascript.node.maxspace=512 to make Node.js run out of memory.

Below are comparisons of the logs we print before and after the changes of this pull request.

@yassin-kammoun-sonarsource
Copy link
Contributor Author

Non-verbose logs

Before

INFO: Sensor JavaScript/TypeScript analysis [javascript]
INFO: Detected os: Mac OS X arch: x86_64 alpine: false. Platform: UNSUPPORTED
INFO: Using Node.js executable: '/tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node'.
...
INFO: 376/566 files analyzed, current file: /tmp/typescript/src/compiler/transformers/esDecorators.ts
ERROR: 
ERROR: <--- Last few GCs --->
ERROR: 
ERROR: [45397:0x7f8f38040000]   132311 ms: Mark-sweep (reduce) 505.5 (521.3) -> 504.4 (521.3) MB, 166.3 / 0.0 ms  (average mu = 0.237, current mu = 0.148) allocation failure; scavenge might not succeed
ERROR: [45397:0x7f8f38040000]   132509 ms: Mark-sweep (reduce) 505.5 (521.3) -> 504.5 (521.3) MB, 165.5 / 0.0 ms  (average mu = 0.202, current mu = 0.166) allocation failure; scavenge might not succeed
ERROR: 
ERROR: 
ERROR: <--- JS stacktrace --->
ERROR: 
ERROR: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
ERROR:  1: 0x104868dd5 node::Abort() [/usr/local/bin/node]
ERROR:  2: 0x104868fb5 node::OOMErrorHandler(char const*, bool) [/usr/local/bin/node]
ERROR:  3: 0x1049e9acc v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
ERROR:  4: 0x104bacde5 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
ERROR:  5: 0x104bb10c2 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/usr/local/bin/node]
ERROR:  6: 0x104badb2c v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*, v8::GCCallbackFlags) [/usr/local/bin/node]
ERROR:  7: 0x104baa924 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
ERROR:  8: 0x104b9cffc v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
ERROR:  9: 0x104b9d9d5 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
ERROR: 10: 0x104b80bfe v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/bin/node]
ERROR: 11: 0x104faebb0 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
ERROR: 12: 0x1053a12f9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/usr/local/bin/node]
ERROR: 13: 0x1053232c7 Builtins_FastNewClosure [/usr/local/bin/node]
ERROR: 14: 0x1161743a7 
ERROR: 15: 0x1053b67ba Builtins_ArrayPrototypeFind [/usr/local/bin/node]
ERROR: /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node: line 57: 45397 Abort trap: 6           node "$@"
ERROR: the bridge Node.js process is unresponsive. This is most likely caused by process running out of memory. Consider setting sonar.javascript.node.maxspace to higher value (e.g. 4096).
ERROR: Failure during analysis, Node.js command to start the bridge was: /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node --max-old-space-size=512 /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/bin/server 49549 127.0.0.1 /tmp/typescript/.scannerwork true false 
java.lang.IllegalStateException: the bridge is unresponsive
	at org.sonar.plugins.javascript.eslint.BridgeServerImpl.request(BridgeServerImpl.java:392)
	...
...
INFO: Sensor JavaScript/TypeScript analysis [javascript] (done) | time=136511ms

After

INFO: Sensor JavaScript/TypeScript analysis [javascript]
INFO: Detected os: Mac OS X arch: x86_64 alpine: false. Platform: UNSUPPORTED
INFO: Configured Node.js --max-old-space-size=512.
INFO: Using Node.js executable: '/Users/yassin.kammoun/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node'.
INFO: OS memory 17179 MB. Node.js heap size limit: 587 MB.
...
INFO: 387/566 files analyzed, current file: /tmp/typescript/src/compiler/transformers/generators.ts
ERROR: The analysis will stop due to the Node.js process running out of memory (heap size limit 560 MB)
ERROR: You can see how Node.js heap usage evolves during analysis with "sonar.javascript.node.FIXME=true"
ERROR: Try setting "sonar.javascript.node.maxspace" to a higher value to increase Node.js heap size limit
ERROR: If the problem persists, please report the issue at https://community.sonarsource.com
ERROR: Failure during analysis
java.lang.IllegalStateException: The bridge server is unresponsive
	at org.sonar.plugins.javascript.bridge.BridgeServerImpl.request(BridgeServerImpl.java:395)
	...
...
INFO: Sensor JavaScript/TypeScript analysis [javascript] (done) | time=137947ms

@yassin-kammoun-sonarsource
Copy link
Contributor Author

Verbose logs

Before

16:57:57:27.111 INFO: Sensor JavaScript/TypeScript analysis [javascript]
16:57:27.120 DEBUG: Deploying bundle
16:57:27.121 DEBUG: Deploying the bridge into /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle
16:57:30.437 DEBUG: Deploying bundle (done) | time=3316ms
16:57:30.893 INFO: Detected os: Mac OS X arch: x86_64 alpine: false. Platform: UNSUPPORTED
16:57:30.441 DEBUG: Starting server
16:57:30.903 DEBUG: Looking for Node.js in the PATH using run-node (macOS)
16:57:30.906 INFO: Using Node.js executable: '/tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node'.
16:57:30.906 DEBUG: Checking Node.js version
16:57:30.907 DEBUG: Launching command /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node -v
16:57:31.227 DEBUG: Using Node.js v18.17.1.
16:57:31.749 DEBUG: Starting Node.js process to start the bridge server at port 49803
16:57:31.750 DEBUG: Launching command /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node --max-old-space-size=512 /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/bin/server 49803 127.0.0.1 /tmp/typescript/.scannerwork true false 
16:57:32.748 DEBUG: starting the bridge server at port 49803
16:57:32.757 DEBUG: the bridge server is running at port 49803
16:57:32.854 DEBUG: Starting server (done) | time=2414ms
...
16:59:44.084 INFO: 385/566 files analyzed, current file: /tmp/typescript/src/compiler/transformers/es2015.ts
16:59:44.552 ERROR: 
16:59:44.553 ERROR: <--- Last few GCs --->
16:59:44.553 ERROR: 
16:59:44.553 ERROR: [45774:0x7f7ec8040000]   133531 ms: Mark-sweep 504.1 (522.3) -> 502.8 (522.3) MB, 180.8 / 0.0 ms  (average mu = 0.099, current mu = 0.024) allocation failure; scavenge might not succeed
16:59:44.553 ERROR: 
16:59:44.553 ERROR: 
16:59:44.553 ERROR: <--- JS stacktrace --->
16:59:44.553 ERROR: 
16:59:44.554 ERROR: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
16:59:44.554 ERROR:  1: 0x1020dddd5 node::Abort() [/usr/local/bin/node]
16:59:44.554 ERROR:  2: 0x1020ddfb5 node::OOMErrorHandler(char const*, bool) [/usr/local/bin/node]
16:59:44.554 ERROR:  3: 0x10225eacc v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
16:59:44.554 ERROR:  4: 0x102421de5 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
16:59:44.554 ERROR:  5: 0x1024260c2 v8::internal::Heap::RecomputeLimits(v8::internal::GarbageCollector) [/usr/local/bin/node]
16:59:44.555 ERROR:  6: 0x102422b2c v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::internal::GarbageCollectionReason, char const*, v8::GCCallbackFlags) [/usr/local/bin/node]
16:59:44.555 ERROR:  7: 0x10241f924 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
16:59:44.556 ERROR:  8: 0x102411ffc v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
16:59:44.556 ERROR:  9: 0x1024129d5 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/usr/local/bin/node]
16:59:44.557 ERROR: 10: 0x1023f5bfe v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/usr/local/bin/node]
16:59:44.557 ERROR: 11: 0x102823bb0 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/usr/local/bin/node]
16:59:44.558 ERROR: 12: 0x102c162f9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit [/usr/local/bin/node]
16:59:44.558 ERROR: 13: 0x11392f5d6 
16:59:44.558 ERROR: 14: 0x113999a83 
16:59:44.558 ERROR: /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node: line 57: 45774 Abort trap: 6           node "$@"
16:59:44.566 ERROR: the bridge Node.js process is unresponsive. This is most likely caused by process running out of memory. Consider setting sonar.javascript.node.maxspace to higher value (e.g. 4096).
16:59:44.582 ERROR: Failure during analysis, Node.js command to start the bridge was: /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node --max-old-space-size=512 /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/bin/server 49803 127.0.0.1 /tmp/typescript/.scannerwork true false 
java.lang.IllegalStateException: the bridge is unresponsive
	at org.sonar.plugins.javascript.eslint.BridgeServerImpl.request(BridgeServerImpl.java:392)
	...
...
16:59:44.592 INFO: Sensor JavaScript/TypeScript analysis [javascript] (done) | time=137481ms

After

14:21:37.444 INFO: Sensor JavaScript/TypeScript analysis [javascript]
14:21:37.451 DEBUG: Deploying bundle
14:21:37.451 DEBUG: Deploying the bridge server into /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle
14:21:40.891 DEBUG: Deploying bundle (done) | time=3439ms
14:21:40.893 INFO: Detected os: Mac OS X arch: x86_64 alpine: false. Platform: UNSUPPORTED
14:21:40.896 DEBUG: Starting server
14:21:40.903 INFO: Configured Node.js --max-old-space-size=512.
14:21:40.903 DEBUG: Looking for Node.js in the PATH using run-node (macOS)
14:21:40.906 INFO: Using Node.js executable: '/tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node'.
14:21:40.906 DEBUG: Checking Node.js version
14:21:40.907 DEBUG: Launching command /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node -v
14:21:41.227 DEBUG: Using Node.js v18.17.1.
14:21:41.229 DEBUG: Creating Node.js process to start the bridge server on port 57373
14:21:41.229 DEBUG: Launching command /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node --max-old-space-size=512 /tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/bin/server 57373 127.0.0.1 /tmp/typescript/.scannerwork true false 
14:21:43.343 INFO: OS memory 17179 MB. Node.js heap size limit: 587 MB.
14:21:43.344 DEBUG: Starting the bridge server
14:21:43.361 DEBUG: The bridge server is listening on port 57373
14:21:43.413 DEBUG: The worker thread is running
14:21:43.428 DEBUG: Starting server (done) | time=2532ms
...
14:23:57.293 DEBUG: Analyzing file "/tmp/typescript/src/compiler/transformers/generators.ts" with linterId "default"
14:23:57.990 DEBUG: The worker thread failed: Error [ERR_WORKER_OUT_OF_MEMORY]: Worker terminated due to reaching memory limit: JS heap out of memory
14:23:57.997 DEBUG: Shutting down the bridge server due to failure
14:23:57.999 DEBUG: The worker thread exited with code 1
14:23:57.999 DEBUG: The bridge server shutted down
14:23:58.000 ERROR: The analysis will stop due to the Node.js process running out of memory (heap size limit 560 MB)
14:23:58.000 ERROR: You can see how Node.js heap usage evolves during analysis with "sonar.javascript.node.FIXME=true"
14:23:58.001 ERROR: Try setting "sonar.javascript.node.maxspace" to a higher value to increase Node.js heap size limit
14:23:58.001 ERROR: If the problem persists, please report the issue at https://community.sonarsource.com
14:23:58.037 ERROR: Failure during analysis
java.lang.IllegalStateException: The bridge server is unresponsive
	at org.sonar.plugins.javascript.bridge.BridgeServerImpl.request(BridgeServerImpl.java:395)
	...
...
14:23:58.047 INFO: Sensor JavaScript/TypeScript analysis [javascript] (done) | time=140603ms

@yassin-kammoun-sonarsource
Copy link
Contributor Author

yassin-kammoun-sonarsource commented Oct 13, 2023

Unrelated, but I wonder whether those logs should be printed at debug-level.

14:21:40.893 INFO: Detected os: Mac OS X arch: x86_64 alpine: false. Platform: UNSUPPORTED
...
14:21:40.906 INFO: Using Node.js executable: '/tmp/typescript/.scannerwork/.sonartmp/bridge-bundle/package/node_modules/run-node/run-node'.

What do you think? I mean we are mixing log levels about Node.js, and it's hard to tell what should be at info-level and what should be at debug-level..

@yassin-kammoun-sonarsource
Copy link
Contributor Author

yassin-kammoun-sonarsource commented Oct 13, 2023

Also unrelated, but I would like to change these logs from:

14:21:40.903 INFO: Configured Node.js --max-old-space-size=512.
...
14:21:43.343 INFO: OS memory 17179 MB. Docker mem: 1234 MB. Node.js heap size limit: 587 MB.

to

14:21:40.903 INFO: Configured Node.js heap size limit (--max-old-space-size=512).
...
14:21:43.343 INFO: Memory configuration: OS (17179 MB), Docker (1234 MB), Node.js (587 MB).

Would that be any better? Am I just nitpicking?

@saberduck
Copy link
Contributor

Also unrelated, but I would like to change these logs from:

14:21:40.903 INFO: Configured Node.js --max-old-space-size=512.
...
14:21:43.343 INFO: OS memory 17179 MB. Docker mem: 1234 MB. Node.js heap size limit: 587 MB.

to

14:21:40.903 INFO: Configured Node.js heap size limit (--max-old-space-size=512).
...
14:21:43.343 INFO: Memory configuration: OS (17179 MB), Docker (1234 MB), Node.js (587 MB).

Would that be any better? Am I just nitpicking?

it's bit of a detail, I can create PR later today to improve this logs

Copy link
Contributor

@saberduck saberduck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job on cleaning up and rewording the logs

@sonarqube-next
Copy link

SonarQube Quality Gate

Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@yassin-kammoun-sonarsource yassin-kammoun-sonarsource merged commit ffc53a9 into master Oct 16, 2023
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle out-of-memory errors in the worker thread
2 participants