Improving event loop delay representation #119069

Closed
mshustov opened this issue Nov 18, 2021 · 4 comments · Fixed by #120451
Assignees
Labels
Feature:Logging impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. loe:small Small Level of Effort Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@mshustov
Contributor

Starting from v7.16, Kibana collects the event loop delay distribution that Node.js provides natively in the form of a histogram.
The mean property of the histogram is converted to milliseconds and emitted to the Kibana logs:

memory: 690.3MB uptime: 0:00:25 load: [4.16,7.93,7.34] delay: 16.996

The same value is also exposed via the /api/status API, which in turn is used by the Kibana Monitoring UI.
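For readers unfamiliar with the native histogram, here is a minimal sketch of how a mean delay in milliseconds can be derived from it. It assumes the histogram comes from `monitorEventLoopDelay` in Node's `perf_hooks` module, which reports values in nanoseconds. The helper names `nsToMs` and `sampleMeanDelayMs` are hypothetical and are not taken from Kibana's code.

```typescript
import { monitorEventLoopDelay } from 'perf_hooks';

// The native histogram reports delays in nanoseconds.
const NS_PER_MS = 1e6;

// Hypothetical helper: convert nanoseconds to milliseconds.
function nsToMs(ns: number): number {
  return ns / NS_PER_MS;
}

// Hypothetical helper: sample the event loop delay for `durationMs`
// and resolve with the mean delay in milliseconds.
function sampleMeanDelayMs(durationMs: number): Promise<number> {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  return new Promise((resolve) => {
    setTimeout(() => {
      histogram.disable();
      resolve(nsToMs(histogram.mean));
    }, durationMs);
  });
}
```

Calling `sampleMeanDelayMs(5000)` would resolve with a single number comparable to the `delay` field in the log line above.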

Using the mean as the primary metric has at least one disadvantage: it's hard to tell how many event loop cycles were above a certain limit.
To simplify the investigation of performance problems, Kibana could

  • include in the logs an extended subset of the event loop delay histogram: { 50th: number; 95th: number; 99th: number }.
  • provide the event loop delay histogram via the /api/status API to allow the Monitoring UI to show more detailed metrics.
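A rough sketch of what the first option could look like. The native histogram exposes a `percentile(n)` method that returns nanoseconds; everything else here is an assumption for illustration: the `p50`/`p95`/`p99` field names, the helper names, and the log format are hypothetical and not the format Kibana adopted.

```typescript
// Structural type covering the one histogram method this sketch needs.
// The histogram returned by perf_hooks.monitorEventLoopDelay() satisfies it.
interface PercentileSource {
  percentile(p: number): number; // value in nanoseconds
}

// Hypothetical shape for the extended subset, in milliseconds.
interface DelayPercentilesMs {
  p50: number;
  p95: number;
  p99: number;
}

// Read the three percentiles and convert nanoseconds to milliseconds.
function summarizeDelay(histogram: PercentileSource): DelayPercentilesMs {
  const toMs = (ns: number): number => ns / 1e6;
  return {
    p50: toMs(histogram.percentile(50)),
    p95: toMs(histogram.percentile(95)),
    p99: toMs(histogram.percentile(99)),
  };
}

// Hypothetical log fragment; the real log layout may differ.
function formatDelayForLog(s: DelayPercentilesMs): string {
  return `delay: p50=${s.p50.toFixed(3)} p95=${s.p95.toFixed(3)} p99=${s.p99.toFixed(3)}`;
}
```

Because `summarizeDelay` only depends on `percentile`, the same object could also be serialized for the /api/status response in the second option.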

cc @pmuellr @kobelb regarding the question of improving the log format for investigating Kibana performance problems.

@mshustov mshustov added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:Logging labels Nov 18, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@pmuellr
Member

pmuellr commented Nov 18, 2021

It would be awesome to have the times the EL block was "maxed out", but since the native APIs don't give us that, oh well. TBH, I want the stack trace at the time as well, but I think I'll be waiting a bit on that :-)

Since the native APIs do let us get p numbers, I think p50/p90/p99 would be a nice addition. Just a bit more insight, and more importantly, basically the max.

@Bamieh
Member

Bamieh commented Nov 30, 2021

@pmuellr I was hoping the exceeds metric would give that detail, but I was unable to figure out what it actually does, even after digging into the Node.js code.
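For reference, the Node.js documentation describes `exceeds` as the number of times the event loop delay exceeded the maximum 1 hour threshold, which would explain why it is not useful for spotting ordinary blocks. A small sketch for inspecting it next to `max`, assuming the `perf_hooks` histogram; the `busyWait` and `toSnapshot` helpers are hypothetical:

```typescript
import { monitorEventLoopDelay } from 'perf_hooks';

// Hypothetical helper: pick the two fields of interest, with max in ms.
function toSnapshot(h: { max: number; exceeds: number }): { maxMs: number; exceeds: number } {
  return { maxMs: h.max / 1e6, exceeds: h.exceeds };
}

// Hypothetical helper: block the event loop synchronously for roughly `ms`.
function busyWait(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin
  }
}

const histogram = monitorEventLoopDelay({ resolution: 10 });
histogram.enable();

setTimeout(() => {
  busyWait(200); // simulate a blocked event loop
  setTimeout(() => {
    histogram.disable();
    // Print the largest recorded delay and the exceeds counter.
    console.log(toSnapshot(histogram));
  }, 50);
}, 50);
```

Running this and comparing `maxMs` with `exceeds` is one way to check how the counter behaves for a block of this size.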

@exalate-issue-sync exalate-issue-sync bot added impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. loe:small Small Level of Effort labels Dec 1, 2021
@TinaHeiligers TinaHeiligers self-assigned this Dec 2, 2021
@TinaHeiligers
Contributor

I'll be pushing up a PR soon.

5 participants