[Heartbeat] Improve browser heap memory usage #32317

Merged: andrewvc merged 4 commits into elastic:main from the fix-mem branch on Jul 18, 2022

@andrewvc andrewvc commented Jul 12, 2022

This is a draft in WIP until I can confirm the leak.

This patch does not fix a potential memory leak as mentioned on the forum, but it does dramatically increase heartbeat's efficiency WRT parsing the JSON output of the synthetics agent.

It does appear that we don't close all the readers used by scanToSynthEvents, which this patch fixes. I did observe heap growth w/o this patch over a short timescale, but that's almost certainly not a resource leak; those readers should have been auto-closed when exec ended anyway.

We do, however, waste a lot of memory / allocations buffering lines of JSON, which can be quite large. The json decoder can actually parse ndjson very efficiently itself. This also lets us use a smaller buffer for stdout/stderr buffering. We had previously allocated a larger one to handle the base64 image data passed by the agent.

Switching to it totally altered the heap profile, removing the top user of memory, scanToSynthEvents.

Heap profile before

[image: profile-scan-to-synth]

Heap profile after

[image]

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Run with some browser monitors of your choice. To profile I'm using the following commands:

# Run heartbeat inside x-pack/heartbeat with pprof enabled
mage build && env ELASTIC_SYNTHETICS_CAPABLE=true ./heartbeat -e --httpprof localhost:4567 2>&1 | jq .message

# Dump memory profile
curl http://localhost:4567/debug/pprof/heap -o dump.out

# Show web UI for memory profile
go tool pprof -http=:8081 dump.out

@andrewvc andrewvc added enhancement Team:obs-ds-hosted-services Label for the Observability Hosted Services team v8.4.0 labels Jul 12, 2022
@andrewvc andrewvc self-assigned this Jul 12, 2022
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jul 12, 2022
@elasticmachine

elasticmachine commented Jul 12, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-07-14T13:29:41.442+0000

  • Duration: 42 min 18 sec

Test stats 🧪

Test Results
Failed 0
Passed 142
Skipped 0
Total 142

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@andrewvc andrewvc requested a review from emilioalvap July 13, 2022 19:05
@andrewvc andrewvc marked this pull request as ready for review July 13, 2022 21:03
@andrewvc andrewvc requested a review from a team as a code owner July 13, 2022 21:03
@elasticmachine

Pinging @elastic/uptime (Team:Uptime)

@emilioalvap emilioalvap left a comment

LGTM, tested in parallel with 8.4.0-SNAPSHOT and saw lower memory allocation per monitor.

@andrewvc andrewvc merged commit e6db9a5 into elastic:main Jul 18, 2022
@andrewvc andrewvc deleted the fix-mem branch July 18, 2022 16:51
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
Labels
enhancement Team:obs-ds-hosted-services Label for the Observability Hosted Services team v8.4.0