Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shutdown Kronos clock only after executors #1127

Merged

Conversation

0xnm
Copy link
Member

@0xnm 0xnm commented Nov 4, 2022

What does this PR do?

It is possible that Kronos is accessed from tasks still running on executors (like from telemetry caught during some I/O operation) and it will lead to the IllegalStateException, like here:

11-04 04:04:42.160: E/AndroidRuntime(7085): Process: com.datadog.android.sdk.integration, PID: 7085
11-04 04:04:42.160: E/AndroidRuntime(7085): java.lang.IllegalStateException: Service already shutdown
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.lyft.kronos.internal.ntp.SntpServiceImpl.ensureServiceIsRunning(SntpService.kt:142)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.lyft.kronos.internal.ntp.SntpServiceImpl.currentTime(SntpService.kt:98)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.lyft.kronos.internal.KronosClockImpl.getCurrentTime(KronosClockImpl.kt:19)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.lyft.kronos.KronosClock$DefaultImpls.getCurrentTimeMs(Clock.kt:39)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.lyft.kronos.internal.KronosClockImpl.getCurrentTimeMs(KronosClockImpl.kt:8)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.core.internal.time.KronosTimeProvider.getServerOffsetMillis(KronosTimeProvider.kt:25)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.telemetry.internal.TelemetryEventHandler.handleEvent(TelemetryEventHandler.kt:40)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.rum.internal.monitor.DatadogRumMonitor.handleEvent$dd_sdk_android_debug(DatadogRumMonitor.kt:385)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.rum.internal.monitor.DatadogRumMonitor.sendErrorTelemetryEvent(DatadogRumMonitor.kt:346)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.telemetry.internal.Telemetry.error(Telemetry.kt:21)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.log.internal.logger.TelemetryLogHandler.handleLog(TelemetryLogHandler.kt:22)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.log.internal.logger.InternalLogHandler.handleLog(InternalLogHandler.kt:40)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.log.Logger.internalLog$dd_sdk_android_debug(Logger.kt:566)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.log.Logger.internalLog$dd_sdk_android_debug$default(Logger.kt:556)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.log.Logger.log(Logger.kt:169)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.log.internal.utils.LogUtilsKt.errorWithTelemetry(LogUtils.kt:45)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.log.internal.utils.LogUtilsKt.errorWithTelemetry$default(LogUtils.kt:40)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.core.internal.persistence.file.batch.PlainBatchFileReaderWriter.writeData(PlainBatchFileReaderWriter.kt:50)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.v2.core.internal.storage.FileEventBatchWriter.write(FileEventBatchWriter.kt:48)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.rum.internal.domain.RumDataWriter.write(RumDataWriter.kt:43)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.rum.internal.domain.scope.RumViewScope$sendViewUpdate$1.invoke(RumViewScope.kt:726)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.rum.internal.domain.scope.RumViewScope$sendViewUpdate$1.invoke(RumViewScope.kt:660)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.core.internal.SdkFeature$withWriteContext$1.invoke(SdkFeature.kt:135)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.core.internal.SdkFeature$withWriteContext$1.invoke(SdkFeature.kt:134)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.v2.core.internal.storage.ConsentAwareStorage.writeCurrentBatch$lambda-0(ConsentAwareStorage.kt:74)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.v2.core.internal.storage.ConsentAwareStorage.$r8$lambda$VSlkAQX2O5ir-BQmjPcJjBdEv4c(Unknown Source:0)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at com.datadog.android.v2.core.internal.storage.ConsentAwareStorage$$ExternalSyntheticLambda0.run(Unknown Source:6)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
11-04 04:04:42.160: E/AndroidRuntime(7085): 	at java.lang.Thread.run(Thread.java:764)

This change modifies SDK shutdown sequence, so that Kronos is shutdown only after all executors are shut down, meaning there is no tasks running left.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Make sure you discussed the feature or bugfix with the maintaining team in an Issue
  • Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

@0xnm 0xnm requested a review from a team as a code owner November 4, 2022 11:38
@codecov-commenter
Copy link

Codecov Report

Merging #1127 (1aa8b50) into develop (a669a1f) will decrease coverage by 0.05%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop    #1127      +/-   ##
===========================================
- Coverage    83.30%   83.25%   -0.05%     
===========================================
  Files          273      273              
  Lines         9372     9372              
  Branches      1504     1504              
===========================================
- Hits          7807     7802       -5     
+ Misses        1149     1148       -1     
- Partials       416      422       +6     
Impacted Files Coverage Δ
...n/com/datadog/android/core/internal/CoreFeature.kt 91.60% <100.00%> (ø)
...android/rum/internal/ndk/DatadogNdkCrashHandler.kt 84.86% <0.00%> (-4.32%) ⬇️
.../android/rum/internal/monitor/DatadogRumMonitor.kt 92.05% <0.00%> (-0.57%) ⬇️
...ndroid/core/internal/persistence/file/EventMeta.kt 80.00% <0.00%> (ø)
...ain/java/com/datadog/opentracing/PendingTrace.java 58.62% <0.00%> (+0.86%) ⬆️
...nternal/persistence/file/batch/BatchFileHandler.kt 81.13% <0.00%> (+2.83%) ⬆️

@0xnm 0xnm merged commit c54a0ba into develop Nov 7, 2022
@0xnm 0xnm deleted the nogorodnikov/shutdown-kronos-clock-after-the-executors branch November 7, 2022 12:42
@xgouchet xgouchet added this to the 1.15.0 milestone Dec 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants