elastic-agent pprof enablement defaults #29108

Closed
michel-laterman opened this issue Nov 23, 2021 · 6 comments · Fixed by #29155
Labels
discuss Issue needs further discussion. Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team v7.16.0 v8.0.0 v8.1.0

Comments

@michel-laterman
Contributor

The current behaviour for the elastic-agent and its underlying beats is to have all of them expose /debug/pprof/ endpoints on the monitoring http handler.

These handlers are available on the beats and the agent even when the policy disables monitoring.
For example, given an agent (v7.15.2) enrolled via fleet where monitoring is disabled in the policy:

$ sudo elastic-agent inspect
agent:
  monitoring:
    enabled: false
    logs: false
    metrics: false
...

An operator is still able to access the http handler bound to a socket for the beats as well as the agent:

$ sudo curl --unix-socket /Library/Elastic/Agent/data/tmp/default/filebeat/filebeat.sock http://socket/
{"beat":"filebeat","hostname":"Michels-MacBook-Pro.local","name":"Michels-MacBook-Pro.local","uuid":"e3511b3c-efe3-4d91-837c-ea5bb358ca6c","version":"7.15.2"}

$ sudo curl --unix-socket /Library/Elastic/Agent/data/tmp/elastic-agent.sock http://socket/stats
{"beat":{"cpu":{"system":{"ticks":2450,"time":{"ms":2451}},"total":{"ticks":7256,"time":{"ms":7257},"value":7256},"user":{"ticks":4806,"time":{"ms":4806}}},"info":{"ephemeral_id":"bcc20dff-f2f8-47e4-af6d-7515eca1e1d7","uptime":{"ms":293981},"version":"7.15.2"},"memstats":{"gc_next":9362480,"memory_alloc":4846208,"memory_sys":78201864,"memory_total":131217912,"rss":44634112},"runtime":{"goroutines":46}},"system":{"cpu":{"cores":16},"load":{"1":3.0078,"15":3.4629,"5":3.6128,"norm":{"1":0.188,"15":0.2164,"5":0.2258}}}}

The http handlers are bound to local unix sockets or Windows named pipes by default (for the beats and the agent).
The agent's handler/endpoint can be configured with the agent.monitoring.* options, and pprof can be enabled/disabled with the agent.monitoring.pprof option (these options are not passed to the underlying beats).

Having the http handlers available in all cases should not be an issue as the (non-pprof) endpoints do not expose any confidential information, and the endpoints are also used by the diagnostics commands.
However, we need to decide whether adding the /debug/pprof/ endpoints to these handlers (by default) is acceptable.

@michel-laterman michel-laterman added discuss Issue needs further discussion. v8.0.0 v7.16.0 Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team v8.1.0 labels Nov 23, 2021
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@michel-laterman
Contributor Author

@ruflin

@ruflin
Contributor

ruflin commented Nov 24, 2021

My personal preference is to turn pprof endpoints off by default if monitoring is enabled and let the user decide if these should be enabled too.

If these are enabled by default, will the diagnostics command be able to enable these on demand, or will this information then not be available?

@michel-laterman
Contributor Author

michel-laterman commented Nov 24, 2021

If these are enabled by default, the diagnostics command will be able to collect pprof information from the agent/beats on request.
If they are disabled by default, the diagnostics command will not be able to gather any pprof data.

@ruflin
Contributor

ruflin commented Nov 25, 2021

It is unfortunate that we need to pick one or the other, but as the pprof output could potentially be a security issue, I'm leaning towards disabling it by default. If pprof is run as part of diagnostics and it was enabled, we likely need to tell the user to restart.

@michel-laterman What is your recommendation?

@michel-laterman
Contributor Author

We can disable it by default.
