feat: telemetry - loop duration & delay #256

Merged: bajtos merged 5 commits into main from monitor-loop-duration on Nov 27, 2024

bajtos (Member) commented Nov 21, 2024

Monitor how long each loop takes to finish and what's the delay before the next
iteration, so that we can alert when things become too slow.
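As a rough illustration of the idea, the sketch below measures each iteration's duration and the delay before the next one. It is not the code from this PR: `loopTelemetry`, `runLoop`, `record`, and the assumption that the delay equals the interval minus the duration are all hypothetical.

```javascript
// Hypothetical sketch: names and the delay formula are assumptions,
// not taken from observer/bin/spark-observer.js.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Compute how long an iteration took and how long to wait before the next one.
function loopTelemetry (startedAt, finishedAt, intervalMs) {
  const durationMs = finishedAt - startedAt
  // If the iteration overran the interval, start the next one immediately.
  const delayMs = Math.max(0, intervalMs - durationMs)
  return { durationMs, delayMs }
}

// Run `iteration` repeatedly and report duration and delay after each run.
// `record` is a caller-supplied callback, e.g. one that writes a telemetry point.
async function runLoop ({ name, iteration, intervalMs, record, signal }) {
  while (!signal?.aborted) {
    const startedAt = Date.now()
    try {
      await iteration()
    } catch (err) {
      console.error(`Loop "${name}" failed:`, err)
    }
    const { durationMs, delayMs } = loopTelemetry(startedAt, Date.now(), intervalMs)
    record({ name, durationMs, delayMs })
    await sleep(delayMs)
  }
}
```

Keeping the arithmetic in a pure function makes it easy to test without waiting on real timers; a shrinking `delayMs` is the early signal that a loop is getting close to its interval.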

bajtos requested a review from juliangruber November 21, 2024 13:18
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
bajtos force-pushed the monitor-loop-duration branch from 345d41f to 44b259a November 21, 2024 13:19
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
bajtos requested a review from juliangruber November 21, 2024 13:35
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
Signed-off-by: Miroslav Bajtoš <oss@bajtos.net>
bajtos requested a review from juliangruber November 21, 2024 13:54
bajtos enabled auto-merge (squash) November 27, 2024 13:07
bajtos merged commit ae308f1 into main Nov 27, 2024
9 checks passed
bajtos deleted the monitor-loop-duration branch November 27, 2024 13:07
bajtos (Member, Author) commented Nov 27, 2024

I created a new Grafana panel, configured an alert that fires when an iteration takes longer than 60% of the interval between iterations, and disabled the CPU throttling alert.
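To make the 60% rule concrete, here is the same condition written as plain JavaScript. The real alert is a Grafana rule, not application code, and the function name and the interval used in the usage note are hypothetical.

```javascript
// Illustration only: the actual alert is configured in Grafana.
const SLOW_ITERATION_RATIO = 0.6

// True when an iteration took longer than 60% of the interval between iterations.
function isIterationTooSlow (durationMs, intervalMs) {
  return durationMs > SLOW_ITERATION_RATIO * intervalMs
}
```

For example, with an assumed 10-minute interval (600,000 ms), an iteration lasting 400,000 ms would trip the condition, while one lasting 300,000 ms would not.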

[Image: Screenshot 2024-11-27 at 15 11 08]

bajtos (Member, Author) commented Nov 27, 2024

Unfortunately, the alert does not work 😠

The composer UI shows that all is good. But when the Grafana alerting system runs, it reports the following error:

Something went wrong when evaluating this alert rule
invalid format of evaluation results for the alert definition : frame cannot uniquely be identified by its labels: has duplicate results with labels {}

I searched the internet but could not find any useful answer. This is such a waste of time. I decided to re-enable the CPU throttling alert and pause the new one.
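One possible reading of the error, offered as an assumption and not a confirmed diagnosis: the rule's query returned more than one series with the same (empty) label set, so the alerting engine could not tell the results apart. The sketch below shows that kind of uniqueness check in plain JavaScript. It is not Grafana's implementation, and the `frames` shape is hypothetical.

```javascript
// Illustration only: NOT Grafana code. `frames` is a hypothetical shape:
// [{ labels: { key: 'value' }, value: 1 }, ...]
function findDuplicateLabelSets (frames) {
  const seen = new Set()
  const duplicates = new Set()
  for (const { labels } of frames) {
    // Sort entries by key so that equal label sets serialize identically.
    const entries = Object.entries(labels)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    const key = JSON.stringify(entries)
    if (seen.has(key)) duplicates.add(key)
    seen.add(key)
  }
  return [...duplicates]
}
```

If that reading applies, two unlabeled series would both serialize to the same empty label set, which matches the `{}` in the message. Giving each series a distinguishing label, or reducing the result to a single series, might then be worth trying.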
