[PLAT-7848] Improve handling of concurrent crashes #1286

nickdowell · 2022-01-20T17:07:18Z

Goal

Successfully report crashes in the event that multiple threads crash at the same time.

In the current release, a secondary crash that occurs while the first crash-handling thread is still working (and presumably has not yet suspended other threads) is incorrectly identified as a "crash in the crash reporter" and triggers an internal error report via a recrash report.

There are also other race conditions that could cause corruption of the crash reporting process.

Changeset

Crash reporting is now explicitly one-shot, so only a single crash report will ever be attempted. In practical terms it already was: a single crash report path is configured per process lifetime.

A recrash report is now written only for a secondary crash that occurred in the original crash reporting thread.

Both of these checks are performed in bsg_kscrashsentry_beginHandlingCrash() which now takes the offending (crashed) thread as its argument. An atomic compare-exchange operation is used to ensure only a single thread can win the race to be reported.
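The two checks described above can be sketched as follows. This is a minimal illustration, not the code from this changeset: every `sketch_` name, the integer thread ID type, and the three-way result enum are assumptions made for the example, and the real function's signature and return type may differ. It shows how a single C11 compare-exchange can both pick the winning thread and tell a recrash (same thread again) apart from a concurrent crash (a different thread).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the Bugsnag source. */
typedef uintptr_t sketch_thread_id;
#define SKETCH_NO_THREAD ((sketch_thread_id)0)

typedef enum {
    SKETCH_FIRST_CRASH,      /* caller won the race and should write the report */
    SKETCH_RECRASH,          /* the reporting thread crashed again */
    SKETCH_CONCURRENT_CRASH  /* a different thread crashed in the meantime */
} sketch_crash_kind;

/* Holds the thread that won the race, or SKETCH_NO_THREAD before any crash. */
static _Atomic sketch_thread_id sketch_crashed_thread = SKETCH_NO_THREAD;

static sketch_crash_kind sketch_begin_handling_crash(sketch_thread_id offender) {
    /* The sentinel value is reserved to mean "no crash yet". */
    assert(offender != SKETCH_NO_THREAD);

    sketch_thread_id expected = SKETCH_NO_THREAD;
    if (atomic_compare_exchange_strong(&sketch_crashed_thread, &expected, offender)) {
        /* Only one caller per process lifetime can get here (one-shot). */
        return SKETCH_FIRST_CRASH;
    }
    /* On failure, `expected` now holds the thread that won the race. */
    return expected == offender ? SKETCH_RECRASH : SKETCH_CONCURRENT_CRASH;
}
```

In this sketch, calling the function first for thread 1, then thread 2, then thread 1 again yields a first crash, a concurrent crash, and a recrash respectively. Only the recrash case would justify writing a recrash report.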

Secondary threads that crash now wait in bsg_kscrashsentry_beginHandlingCrash() until crash handling has finished, which prevents them from immediately killing the process.
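One way to picture the waiting behaviour is a flag that the reporting thread sets once it is done, with losing threads yielding until they see it. The sketch below is an assumption-laden illustration rather than the mechanism used in this changeset: the `sketch_` names, the atomic flag, and the use of `sched_yield()` are all hypothetical choices, and a real crash handler has stricter constraints on what it may call.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag; set once the winning thread has finished reporting. */
static atomic_bool sketch_handling_finished = false;

/* Called by the thread that won the race, after the report is written. */
static void sketch_end_handling_crash(void) {
    atomic_store(&sketch_handling_finished, true);
}

/* Called by a thread that lost the race. Returning from the crash handler
 * straight away could let this thread's crash terminate the process before
 * the report is complete, so it parks here instead. */
static void sketch_wait_for_handling(void) {
    while (!atomic_load(&sketch_handling_finished)) {
        sched_yield(); /* give the reporting thread a chance to run */
    }
}
```

If the reporting thread suspends all other threads partway through, a waiter parked in this loop is simply suspended along with them, so the loop only matters during the window before suspension.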

Testing

An E2E scenario that reliably reproduced the problem has been added, and the fix has been verified over multiple runs. Attempts to include checks of the stack frame contents failed because C++ exception stacktraces are not currently recorded when Bugsnag is linked dynamically, which it is for the Mac fixture.

Existing tests for recrash reports have also been run successfully.

Bugsnag/KSCrash/Source/KSCrash/Recording/Sentry/BSG_KSCrashSentry.c

github-actions · 2022-01-20T17:22:51Z

Infer: No issues found 🎉

OCLint: No issues found 🎉

Bugsnag.framework binary size increased by 984 bytes from 1,299,736 to 1,300,720

Generated by 🚫 Danger

github-actions bot reviewed Jan 20, 2022

Bugsnag/KSCrash/Source/KSCrash/Recording/Sentry/BSG_KSCrashSentry.c (outdated, resolved)

nickdowell force-pushed the nickdowell/multiple-crashing-threads branch from f39da20 to 0605f83 January 21, 2022 09:23

Improve handling of concurrent crashes

a5800f4

nickdowell force-pushed the nickdowell/multiple-crashing-threads branch from 0605f83 to a5800f4 January 21, 2022 15:10

nickdowell requested review from kattrali and kstenerud January 24, 2022 08:59

kstenerud approved these changes Jan 24, 2022

nickdowell merged commit 9f4f8f2 into next Jan 25, 2022

nickdowell deleted the nickdowell/multiple-crashing-threads branch January 25, 2022 13:17

nickdowell mentioned this pull request Jan 26, 2022

Release v6.16.2 #1288

Merged
