
Skip copying JFR recording into a temporary file #1242

Merged: 2 commits merged into signalfx:main from the tmp-jrf-file branch on May 8, 2023

Conversation

@laurit (Collaborator) commented May 2, 2023:

No description provided.

@laurit laurit requested review from a team as code owners May 2, 2023 08:43
Comment on lines -108 to -115
### What about this escape hatch?

If the escape hatch becomes active, it will log with `com.splunk.opentelemetry.profiler.RecordingEscapeHatch`
(you can grep for this in the logs). You may also look for `"** THIS WILL RESULT IN LOSS OF PROFILING DATA **"`
as a big hint that things are not well.

You may need to free up some disk space and/or give the JVM more resources.

Contributor commented:
@laurit @breedx-splk So this is gone? If that's the case, I'll remove it from the official troubleshooting.

Contributor replied:
Current text:

Loss of profiling data or gaps in profiling data
-------------------------------------------------------------

If there are less than 100 megabytes of space available for the Java Virtual Machine, AlwaysOn Profiling activates the recording escape hatch, which appears in the logs as ``com.splunk.opentelemetry.profiler.RecordingEscapeHatch``. The escape hatch drops all logs with profiling data until more space is available.

To avoid the loss of profiling data due to the escape hatch, provide enough resources to the JVM.
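For reference, the check described above boils down to comparing the usable space on the JFR output location against a threshold. A minimal sketch of that idea, assuming made-up class, method, and constant names (this is not the actual RecordingEscapeHatch code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of the free-space check idea; names here are invented.
class FreeSpaceCheck {
  private static final long MIN_FREE_BYTES = 100L * 1024 * 1024; // the ~100 MB threshold from the docs

  static boolean shouldDropRecording(Path outputDir) {
    try {
      // Usable space on the file store that backs the JFR output directory.
      return Files.getFileStore(outputDir).getUsableSpace() < MIN_FREE_BYTES;
    } catch (IOException e) {
      // If free space cannot be determined, err on the side of keeping the data.
      return false;
    }
  }
}
```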

@mateuszrzeszutek (Contributor) left a comment:

Nice!

@breedx-splk (Contributor) commented:

Oh I think I had missed that we were doing that copy in the first place. Seems like a good idea to prevent that if we can.

I'm not sure why this change also removes support for the escape hatch. Shouldn't we still use that? We don't want to fill up the disk.

@laurit (Collaborator, Author) commented May 5, 2023:

> I'm not sure why this change also removes support for the escape hatch. Shouldn't we still use that? We don't want to fill up the disk.

As we are not copying the recording to a file, there is no need to check for free space. As for the backlogged check, I suspect it didn't work, since these files were processed one at a time: the previous one was deleted before the next was written to disk.
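Roughly speaking, the change replaces a dump-to-temporary-file step with direct streaming. A sketch of the idea only, with made-up helper names (`Recording.dump` and `Recording.getStream` are standard `jdk.jfr` APIs, but this is not the PR's actual code):

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import jdk.jfr.Recording;

class RecordingRead {

  // Old shape of the flow (roughly): dump the recording to a temporary file,
  // then read that file back. The extra disk write is what the free-space check guarded.
  static InputStream viaTempFile(Recording recording) throws Exception {
    Path tmp = Files.createTempFile("profiler-", ".jfr");
    recording.dump(tmp);
    return Files.newInputStream(tmp);
  }

  // New shape of the flow (roughly): read the recorded data as a stream directly,
  // so no temporary copy ever touches the disk.
  static InputStream viaStream(Recording recording, Instant start, Instant end) throws Exception {
    return recording.getStream(start, end);
  }
}
```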

@breedx-splk (Contributor) replied:

> > I'm not sure why this change also removes support for the escape hatch. Shouldn't we still use that? We don't want to fill up the disk.
>
> As we are not copying the recording to a file, there is no need to check for free space. As for the backlogged check, I suspect it didn't work, since these files were processed one at a time: the previous one was deleted before the next was written to disk.

Ah ok, I hadn't realized that we're not writing to disk at ALL now in the normal flow. I had assumed that the recordings were still written to disk, but I do see that we `setToDisk(false)` in `start()`. Seems good to me.

I suppose there is a chance that we could fill up the disk with the keepfiles setting, but I'm also ok with that since it's only really for development/testing purposes anyway.
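For context, `setToDisk(false)` is part of the standard `jdk.jfr.Recording` API; a minimal, self-contained example of an in-memory recording (not the profiler's actual `start()` code) could look like:

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

class InMemoryRecordingExample {
  public static void main(String[] args) throws Exception {
    // Keep recorded data in memory buffers instead of spooling it to the JFR disk repository.
    Recording recording = new Recording(Configuration.getConfiguration("default"));
    recording.setToDisk(false);
    recording.start();

    Thread.sleep(1_000); // let it record for a moment

    recording.stop();
    // The data can now be read without dumping to a temporary file, e.g. via
    // recording.getStream(recording.getStartTime(), recording.getStopTime()).
    recording.close();
  }
}
```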

@theletterf (Contributor) commented May 6, 2023 via email

@laurit laurit merged commit eba92af into signalfx:main May 8, 2023
@laurit laurit deleted the tmp-jrf-file branch May 8, 2023 17:29