Prometheus metrics blocks tornado main thread #123
Comments
Poor mans fix for jupyter-server#123
Posted PR #124, which obviously doesn't solve the underlying issue but at least lets users who aren't using Prometheus avoid it. A good solution is to also run psutil on a separate thread for Prometheus. The best solution is to merge the two code paths for getting metrics. For example, the API handler could use the most recent entry in the Prometheus list of metrics.
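To make the "merge the code" idea concrete, here is a minimal sketch of one way it could look: a single background thread refreshes the latest sample, and every consumer (API handler or Prometheus callback) reads the cached value without blocking. `LatestSampleCache` and the `collect` callable are hypothetical names for illustration, not part of jupyter-resource-usage; in practice `collect` would wrap the psutil calls.

```python
import threading


class LatestSampleCache:
    """Hypothetical helper: one worker thread refreshes a sample; readers never block on it."""

    def __init__(self, collect, interval=1.0):
        self._collect = collect          # expensive callable, e.g. a psutil wrapper (assumption)
        self._interval = interval        # seconds between refreshes
        self._lock = threading.Lock()
        self._latest = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            sample = self._collect()     # runs on the worker thread, not the main thread
            with self._lock:
                self._latest = sample
            self._stop.wait(self._interval)

    def latest(self):
        """Return the most recent sample, or None if none has been collected yet."""
        with self._lock:
            return self._latest
```

With something like this, both code paths would call `cache.latest()` and the expensive measurement would happen in exactly one place.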
This turned out to be an issue for my deployments. To remedy the problem we ended up completely removing the Prometheus callback by commenting out the appropriate section. The periodic Prometheus handler introduced "skipping" and "lag" when using jupyter-server-proxy to connect to a VNC server via a proxied websocket, making it completely unusable.
We've encountered this issue as well. Maybe we can get @dleen's PR merged for now, until there is a better solution.
* Allow users to opt out of prometheus metrics
  Poor mans fix for #123
* Update README.md
  Co-authored-by: Kevin Bates <kbates4@gmail.com>
* Lint README.md
  Co-authored-by: Kevin Bates <kbates4@gmail.com>
Co-authored-by: Jeremy Tuloup <jeremy.tuloup@gmail.com>
Description
A bug was reported in jtpio/jupyterlab-system-monitor#87 about the UI lagging with several kernels running. The issue was traced to the system monitor extension as disabling that extension while keeping the same load on the system made the UI issue go away.
Reproduce
Create multiple notebooks with contents:
Open a terminal and (hopefully your key repeat speed is high enough) hold down a character e.g. "x" to get continuous input into the terminal. This should be very smooth, you should see characters appearing rapidly and without pause.
Now relaunch the server with --ResourceUseDisplay.track_cpu_percent=True. Repeat the process. While holding down a key in the terminal you will notice frequent lags and pauses.
Expected behavior
The UI does not lag with the extension enabled.
Problem
The API handler does the right thing by running the call to psutil on a separate thread: https://github.com/jupyter-server/jupyter-resource-usage/blob/master/jupyter_resource_usage/api.py#L66
However, the Prometheus metrics code uses a different implementation (why?) and performs the same expensive operation on the main tornado thread, which blocks other calls: https://github.com/jupyter-server/jupyter-resource-usage/blob/master/jupyter_resource_usage/metrics.py#L40
You can prove this is the root cause by simply disabling this and the following lines: https://github.com/jupyter-server/jupyter-resource-usage/blob/master/jupyter_resource_usage/server_extension.py#L22
When this callback is removed the UI no longer lags every second.
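The difference between the two code paths can be illustrated with a small, self-contained sketch. It uses plain asyncio (which tornado's IOLoop is built on) and a hypothetical `sample_cpu_percent` stand-in that just sleeps, assuming the real psutil call blocks for a comparable interval. It is not the extension's actual code; it only measures how late a concurrent task is woken up when the sampling runs on the loop thread versus in a thread pool.

```python
import asyncio
import time


def sample_cpu_percent(interval=0.2):
    """Hypothetical stand-in for a blocking psutil measurement (assumption)."""
    time.sleep(interval)  # blocks the calling thread for `interval` seconds
    return 0.0


async def max_loop_delay(run_sample, ticks=20, tick=0.01):
    """Run `run_sample` next to a ticker task and return the ticker's worst wake-up delay."""
    worst = 0.0

    async def ticker():
        nonlocal worst
        for _ in range(ticks):
            start = time.perf_counter()
            await asyncio.sleep(tick)
            worst = max(worst, time.perf_counter() - start - tick)

    await asyncio.gather(ticker(), run_sample())
    return worst


async def blocking():
    # Runs on the event-loop thread: no other task can be scheduled meanwhile.
    sample_cpu_percent()


async def offloaded():
    # Runs in the default thread pool: the loop keeps serving other tasks.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, sample_cpu_percent)


def measure():
    """Return (worst delay when blocking, worst delay when offloaded), in seconds."""
    return asyncio.run(max_loop_delay(blocking)), asyncio.run(max_loop_delay(offloaded))
```

Calling `measure()` should show the blocking variant delaying the ticker by roughly the full sampling interval, which matches the once-per-second lag seen in the terminal, while the offloaded variant leaves the loop responsive.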