-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(stdlib): sort statuses also by _time #3996
fix(stdlib): sort statuses also by _time #3996
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix looks good and the changes make sense. Thanks for fixing this!
Can I trouble you to adapt your test to our newer format which uses a Flux keyword testcase
? We want to be using it moving forward. Here is an example:
flux/stdlib/universe/window_test.flux
Line 7 in 955bc03
testcase window_period_gaps { |
104b1af
to
462093c
Compare
@wolffcm Done. Nice feature! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look great. Thanks again!
@alespour Unfortunately we had to revert this commit because the new test doesn't pass in the OSS version of The culprit seems to be that the data being passed to
The first option is the easiest and that's what I recommend. It should be as simple as just deleting the Unfortunately, I'm not sure of the right steps to check if a new Flux test like this will pass in |
This PR fixes issue in alerting where, depending on check and rule scheduling and timing configuration, state changes are sometimes not detected and therefore notifications not sent.
The problem is that status of data points may be evaluated multiple times with certain scheduling, for example as reported in #3807, deadman check evaluates status based on data point time and not a metric value. That results in existence of multiple statuses with the same source timestamp but different time (and level) in the monitoring bucket. Then, when an alert rule attempts to detect state change, it gets incorrect sequence of statuses, because sorting by only (the same) source timestamp is insufficient.
Affected functions are
stateChangesOnly
and_stateChanges
functions inmonitor
package, where beforesort
,_level
is dropped from the group key which results in a new series with possibly unpredictable order (by_time
).The issue is a side effect of the changes in #2725. Mea culpa.
Done checklist