-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RUM-1633 Drop batch telemetry where duration or age have negative values #1850
RUM-1633 Drop batch telemetry where duration or age have negative values #1850
Conversation
@@ -89,7 +90,7 @@ internal class BatchMetricsDispatcher( | |||
deletionReason: RemovalReason | |||
): Map<String, Any?>? { | |||
val fileCreationTimestamp = file.nameAsTimestampSafe(internalLogger) ?: return null | |||
val fileAgeInMillis = dateTimeProvider.getDeviceTimestamp() - fileCreationTimestamp | |||
val fileAgeInMillis = max(0, (dateTimeProvider.getDeviceTimestamp() - fileCreationTimestamp)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it would still skew, but towards the zero. maybe it is better to ignore than this metric if it is negative?
Or maybe we don't need to use dateTimeProvider
here? Because file is created using newFileName = System.currentTimeMillis().toString()
, which is not affected at all by NTP synchronization. Meaning that to calculate file age property, we still need to use System.currentTimeMillis()
directly, this will be reliable, because we want duration, and not instant.
wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iOS has the same and we wanted to align Android also to make sure we have the same charts. I am ready to discuss this into the parking lot today during our daily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is that file created using System.currentTimeMillis()
, without NTP-adjustment (well, maybe there is adjustment done by the device). It means that for file age at least we need also to use System.currentTimeMillis()
without NTP-adjustment, this should guarantee that time behind these calls is monotonic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly why we are using that here and not the NTP adjustment one. But no one can guarantee that by the time we send the metric the System time was not skewed or the time zone was not changed. This will give a negative age. We've already noticed this behaviour in our data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that getDeviceTimestamp
is affected by our NTP adjustment, apparently it is not and it is plain System.currentTimeMillis()
under the hood. Then yes, not so much we can do (apart of ignoring the metric is value is negative).
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## develop #1850 +/- ##
===========================================
- Coverage 83.45% 83.36% -0.09%
===========================================
Files 470 470
Lines 16594 16593 -1
Branches 2492 2492
===========================================
- Hits 13847 13832 -15
- Misses 2064 2072 +8
- Partials 683 689 +6
|
( | ||
fakeMetadata.lastTimeWasUsedInMs + | ||
forge.aLong(min = 100, max = 1000) | ||
).toString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just extract this into a variable? formatting looks strange here.
3a2681d
to
02a77de
Compare
02a77de
to
fddada4
Compare
What does this PR do?
In order to prevent the graphs skewing in the telemetry dashboards we are dropping the metrics for which the batch age and batch duration values are negatives due to the device time manual modification or time zone change.
Motivation
What inspired you to submit this pull request?
Additional Notes
Anything else we should know when reviewing?
Review checklist (to be filled by reviewers)