Proposed new strategic initiative: revamping tracing/metrics collection #853

Closed
jasnell opened this issue Apr 23, 2020 · 6 comments

Comments

@jasnell
Member

jasnell commented Apr 23, 2020

Node.js currently uses a number of different mechanisms for tracking performance metrics internally.

  • Trace events
  • DTrace / ETW probe points
  • perf_hooks
  • process.memoryUsage()
  • process.cpuUsage()
  • process.resourceUsage()
  • v8.getHeapStatistics()
  • v8.getHeapCodeStatistics()
    and so on.

These use a number of divergent mechanisms internally with very little consistency, making it complicated and cumbersome for someone to take a complete system-wide view of the metrics.
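As a rough illustration of that inconsistency, here is a minimal sketch that assembles one snapshot by calling several of the APIs listed above. `collectSnapshot` is a hypothetical helper, not part of Node.js; the unit notes in the comments are taken from the documentation of each API.

```javascript
'use strict';
const v8 = require('v8');
const { performance } = require('perf_hooks');

// collectSnapshot is a hypothetical helper, not a Node.js API.
// Each field comes from a different module with its own shape and units.
function collectSnapshot() {
  return {
    memory: process.memoryUsage(),        // bytes
    cpu: process.cpuUsage(),              // microseconds (user, system)
    resources: process.resourceUsage(),   // mixed units, e.g. maxRSS in kilobytes
    heap: v8.getHeapStatistics(),         // bytes, V8 heap statistics
    heapCode: v8.getHeapCodeStatistics(), // bytes, code and bytecode metadata
    eventLoop: performance.eventLoopUtilization(), // idle/active in ms plus a ratio
  };
}

const snapshot = collectSnapshot();
console.log(Object.keys(snapshot));
```

Even in this small sketch the caller has to know which module each value lives in and which unit it uses, which is the kind of friction described above.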

Further, Worker Threads make it even more difficult because some metrics become thread-specific (e.g. process.memoryUsage()) while others are process-wide.
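A minimal sketch of that split, assuming the behavior described in the Node.js documentation for `process.memoryUsage()` (inside a worker, `rss` covers the whole process while the heap fields refer to the current thread). `memoryUsageInWorker` is a hypothetical helper name.

```javascript
'use strict';
const { Worker } = require('worker_threads');

// memoryUsageInWorker is a hypothetical helper, not a Node.js API.
// It runs process.memoryUsage() inside a worker and sends the result back.
function memoryUsageInWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(
      `const { parentPort } = require('worker_threads');
       parentPort.postMessage(process.memoryUsage());`,
      { eval: true }
    );
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Pair the worker's numbers with the main thread's for comparison.
const comparison = memoryUsageInWorker().then((workerUsage) => {
  const mainUsage = process.memoryUsage();
  return {
    main: { rss: mainUsage.rss, heapUsed: mainUsage.heapUsed },
    worker: { rss: workerUsage.rss, heapUsed: workerUsage.heapUsed },
  };
});

comparison.then((result) => console.log(result));
```

The two `heapUsed` values should typically differ because each thread has its own V8 heap, whereas `rss` describes the process as a whole in both cases, so a tool that sums the per-thread results would count it twice.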

Lastly, some of the mechanisms (DTrace, ETW, and the trace events implementation) are under-supported and problematic. The trace events implementation, for instance, will often abort under load when running worker threads because it has not yet been made fully thread-safe. The team at Google that had been working on the implementation is no longer engaged and has moved on to other things, so the code has largely sat unfinished.

I have started investigating a top-down overhaul of the metrics collection mechanisms in Node.js with the intent of providing a single, clear, coherent subsystem for per-process and per-isolate metrics tracking and reporting that will support multiple targets and use cases with a much cleaner implementation. A key goal will be to make it easier and more reliable to attach various analytics tools on top of Node.js (e.g. clinic.js, N|Solid, APMs, etc.) without having to rely on hacks or building custom versions of the runtime. I also want to increase the visibility/observability of various key components of the platform and modernize metrics collection and reporting for tools such as Prometheus.

This will be a large effort that will take some time to get right and will require input from a number of folks. I'm still working through some work plan details now but I wanted to at least provide some notification that I was starting this effort.

/cc @nodejs/diagnostics @mmarchini @addaleax @sam-github @mcollina

@mhdawson
Member

I'm definitely supportive of an effort on the Diagnostics side. Do you want to submit a PR to add it to the strategic initiatives list? It would be great to have a top-level issue that can be used to hold references to the ongoing/completed work.

One other thing to consider is how the current metrics feed into reporting through Prometheus, and whether there is anything else we should be making available that can be exposed through modules like prom-client.
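For context, a minimal sketch of what feeding an existing metric into Prometheus involves: rendering `process.memoryUsage()` as gauges in the Prometheus text exposition format. `toPrometheusText` and the `nodejs_memory_*` metric names are hypothetical and chosen only for illustration; in practice a module like prom-client handles this step.

```javascript
'use strict';

// toPrometheusText is a hypothetical helper, not part of Node.js or prom-client.
// It renders an object of numeric byte values as Prometheus gauges.
function toPrometheusText(prefix, values, help) {
  const lines = [];
  for (const [key, value] of Object.entries(values)) {
    // Convert camelCase keys to snake_case, e.g. heapUsed -> heap_used.
    const snake = key.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`);
    const name = `${prefix}_${snake}_bytes`;
    lines.push(`# HELP ${name} ${help} (${key})`);
    lines.push(`# TYPE ${name} gauge`);
    lines.push(`${name} ${value}`);
  }
  return lines.join('\n') + '\n';
}

// The metric prefix below is an assumption for this sketch.
const text = toPrometheusText(
  'nodejs_memory',
  process.memoryUsage(),
  'process.memoryUsage() field'
);
process.stdout.write(text);
```

Each built-in API would need its own mapping like this today, since the names, units, and shapes differ from one API to the next.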

@sam-github
Contributor

Would make a good collab summit topic, too.

@legendecas
Member

legendecas commented Jun 24, 2021

Since this proposal has been open for over a year, I'd like to ask whether there has been any further discussion on it. Also, I'm wondering whether the diagnostics team could get involved in or lead the discussion and follow-up actions on this topic, since most of the areas it covers are related to diagnostic tools.

@mhdawson
Member

This has been open for almost a year since the last comment. I think we should likely close it unless we can find a champion for the initiative. Otherwise, related discussion can take place in the diagnostics WG.

@jasnell unless you are still planning to work on this as announced in the original post, is it OK if I close this issue?

@Jamlee

Jamlee commented Jun 29, 2022

Add http, https, and http2 performance metrics. 😄

@mhdawson
Member

mhdawson commented Apr 5, 2023

@jasnell I think this was meant as an FYI, and since it's been almost a year and a half since the FYI, it can be closed. Please let me know if you think that was not the right thing to do.

@mhdawson mhdawson closed this as completed Apr 5, 2023