Provide better error messages for OOM browser scenarios #605
Pinging @elastic/uptime (Team:Uptime) |
As a note: today, when the memory limit is exceeded and the OOM killer kicks in, heartbeat logs the following for the killed Node.js process (tested with -m 80m):
{"log.level":"warn","@timestamp":"2022-09-20T00:45:21.398Z","log.origin":{"file.name":"synthexec/synthexec.go","file.line":280},"message":"Error executing command '/usr/share/heartbeat/.node/node/bin/elastic-synthetics elastic-synthetics --screenshots on --inline --rich-events' (-1): signal: killed","service.name":"heartbeat","ecs.version":"1.6.0"}
Boosting the memory slightly to |
The error event looks like:
{
  "_index": ".ds-synthetics-browser-default-2022.08.22-000003",
  "_id": "0apiWIMBLxoT0iGx4bJq",
  "_score": null,
  "_source": {
    "summary": {
      "up": 0,
      "down": 1
    },
    "agent": {
      "name": "docker-desktop",
      "id": "959ebb20-beca-45db-a143-7acb7d0c299e",
      "type": "heartbeat",
      "ephemeral_id": "281781e7-650f-40e8-bf37-2ff1a0bc7bc4",
      "version": "8.4.1"
    },
    "@timestamp": "2022-09-20T00:53:42.146Z",
    "ecs": {
      "version": "8.0.0"
    },
    "data_stream": {
      "namespace": "default",
      "type": "synthetics",
      "dataset": "browser"
    },
    "synthetics": {
      "journey": {
        "name": "inline",
        "id": "inline",
        "tags": null
      },
      "type": "heartbeat/summary"
    },
    "monitor": {
      "duration": {
        "us": 268403
      },
      "name": "No Mem",
      "id": "no-mem",
      "timespan": {
        "lt": "2022-09-20T00:54:42.191Z",
        "gte": "2022-09-20T00:53:42.191Z"
      },
      "check_group": "ab348eef-387e-11ed-a397-f64628dbf41f",
      "type": "browser",
      "status": "down"
    },
    "error": {
      "code": "",
      "stack_trace": """page.goto: Navigation failed because page crashed!
=========================== logs ===========================
navigating to "https://www.nytimes.com/", waiting until "load"
============================================================
    at Step.eval [as callback] (eval at loadInlineScript (/usr/share/heartbeat/.node/node/lib/node_modules/@elastic/synthetics/src/loader.ts:89:20), <anonymous>:3:48)
    at Runner.runStep (/usr/share/heartbeat/.node/node/lib/node_modules/@elastic/synthetics/src/core/runner.ts:211:18)
    at async Runner.runSteps (/usr/share/heartbeat/.node/node/lib/node_modules/@elastic/synthetics/src/core/runner.ts:261:16)
    at async Runner.runJourney (/usr/share/heartbeat/.node/node/lib/node_modules/@elastic/synthetics/src/core/runner.ts:351:27)
    at async Runner.run (/usr/share/heartbeat/.node/node/lib/node_modules/@elastic/synthetics/src/core/runner.ts:447:11)
    at async Command.<anonymous> (/usr/share/heartbeat/.node/node/lib/node_modules/@elastic/synthetics/src/cli.ts:132:23)""",
      "message": "error executing step: page.goto: Navigation failed because page crashed!",
      "type": "Error"
    },
    "event": {
      "agent_id_status": "auth_metadata_missing",
      "ingested": "2022-09-20T00:53:38Z",
      "type": "heartbeat/summary",
      "dataset": "browser"
    },
    "url": {
      "path": "/",
      "scheme": "https",
      "port": 443,
      "domain": "www.nytimes.com",
      "full": "https://www.nytimes.com/"
    }
  },
  "sort": [
    1663635222146
  ]
} |
I think it makes the most sense for the synthetics lib, rather than heartbeat, to categorize this error and give it a proper code, so I'm moving the issue to that repo. |
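To make the categorization idea concrete, here is a minimal sketch of how an error message could be mapped to a code. Everything in it is an assumption for illustration: the `BROWSER_PAGE_CRASHED` code, the `categorizeError` helper, and the marker list are hypothetical and are not part of the actual @elastic/synthetics API. The only inputs taken from this thread are the two strings observed above ("page crashed" and "signal: killed").

```typescript
// Hypothetical error code, invented for this sketch; not an identifier
// that @elastic/synthetics or heartbeat actually defines.
const BROWSER_CRASH_CODE = "BROWSER_PAGE_CRASHED";

interface CategorizedError {
  code: string; // empty string when no category applies, as in the event above
  message: string;
}

// Substrings taken from the error event and log line quoted in this issue.
const CRASH_MARKERS = ["page crashed", "signal: killed"];

// Assign a code to messages that look like a browser or process crash.
function categorizeError(message: string): CategorizedError {
  const lower = message.toLowerCase();
  const crashed = CRASH_MARKERS.some((marker) => lower.includes(marker));
  return { code: crashed ? BROWSER_CRASH_CODE : "", message };
}
```

Matching on message substrings is fragile, so a real implementation would probably prefer a structured crash signal from the browser driver if one is available.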
We should remove this from the 1.0 MVP scope and instead document the danger of setting browser concurrency levels too high on-prem. |
When using browser-based monitors, it's easy to provision a Docker container with insufficient memory. In these scenarios Chrome frequently crashes after being killed by the OOM killer. These errors are hard for users to understand. We should enhance the error messages to provide specific guidance that lack of memory is a likely culprit. While we have other issues, such as elastic/beats#32317 and elastic/beats#23687, that aim to be more proactive about memory issues, we should still provide specific guidance when failures do occur.
This issue proposes that we add a note about memory utilization to any errors related to Chrome crashes.
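As a rough illustration of the proposal, the sketch below appends a memory note to errors whose message indicates a page crash. The `withMemoryHint` helper and the wording of `MEMORY_HINT` are assumptions made up for this example, not existing synthetics code; the "page crashed" match is based on the error message shown in the event above.

```typescript
// Hypothetical guidance text; the final wording would be up to the maintainers.
const MEMORY_HINT =
  "The browser may have run out of memory. Consider raising the container " +
  "memory limit or lowering browser concurrency.";

// Return a new error with the memory note appended when the message looks
// like a page crash; return the original error unchanged otherwise.
function withMemoryHint(err: Error): Error {
  if (!/page crashed/i.test(err.message)) {
    return err;
  }
  const enriched = new Error(`${err.message} ${MEMORY_HINT}`);
  enriched.name = err.name;
  enriched.stack = err.stack; // keep the original stack for debugging
  return enriched;
}
```

A caller could wrap the step error before it is reported, for example `throw withMemoryHint(e)` inside a catch block, so non-crash errors pass through untouched.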