Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent overlapped decompression of embedded assemblies #7732

Merged
merged 6 commits into from
Jan 26, 2023

Conversation

grendello
Copy link
Contributor

@grendello grendello commented Jan 24, 2023

Fixes: #7335

Commit d236af5 introduced embedded assembly compression,
using the lz4 algorithm, which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-allocated buffers for
the decompressed data are allocated in libxamarin-app.so. The buffers are then passed
to the LZ4 APIs, all threads using the same output buffer. The assumption was that we can
do fine without locking as even if overlapped decompression happens, the output data will be
the same and so even if two threads do the same thing at the same time, the data will be valid
at all times, so long as at least one thread completes the decompression.

This assumption proved to be largely true, but it appears that in high concurrency cases it
is possible that the data in decompression buffer differs. My guess is that LZ4 either uses the
output buffer as a scratchpad area when decompressing or that it initializes/modifies the buffer
before writing actual data in it. With overlapped decompression, it may lead to one thread
overwriting valid data previously written by another thread, so that when the latter returns the
buffer it thought to have had valid data may contain certain bytes temporarily overwritten by the
decompression session in the other, still running, thread. It may happen that MonoVM reads the
corrupted data just when it is still invalid (before the still running decompression session actually
writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-aware mutex. Mutex will be held
only after the initial startup phase is completed, so there should not be much loss of startup
performance.

Comment on lines -119 to -120
internal_timing->end_event (decompress_time_index, true /* uses_more_info */);
internal_timing->add_more_info (decompress_time_index, name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this timing message not that helpful?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this timing message not that helpful?

Yeah, I used it initially to see how much decompression itself costs us, but right now it's just wasted CPU cycles.

@@ -81,31 +89,32 @@ EmbeddedAssemblies::get_assembly_data (uint8_t *data, uint32_t data_size, [[mayb
if (header->magic == COMPRESSED_DATA_MAGIC) {
if (XA_UNLIKELY (compressed_assemblies.descriptors == nullptr)) {
log_fatal (LOG_ASSEMBLY, "Compressed assembly found but no descriptor defined");
exit (FATAL_EXIT_MISSING_ASSEMBLY);
abort ();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why abort() and not exit()? I forget what abort() does on Android…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exit() causes the application to restart, abort() kills it with a native SIGABRT stack trace

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of places that users complain about a "white screen", and it ended up their app was crashing and restarting.

Should we audit all of the exit() calls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should, I've been meaning to do it for quite a while now. I'll open a separate PR for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #7734 opened

@grendello
Copy link
Contributor Author

Let's wait for PR #7734 to be merged first.

@grendello grendello added the do-not-merge PR should not be merged. label Jan 25, 2023
* main:
  [monodroid] Replace `exit()` with `abort()` in native code (dotnet#7734)
  Bump to xamarin/Java.Interop/main@8a1ae57 (dotnet#7738)
  [build] bump `$(AndroidNet7Version)` (dotnet#7737)
  Bump to xamarin/Java.Interop/main@1366d99 (dotnet#7718)
  [Xamarin.Android.Build.Tasks] fix AndroidGenerateResourceDesigner (dotnet#7721)
  Bump to xamarin/monodroid@50faac94 (dotnet#7725)
@grendello grendello removed the do-not-merge PR should not be merged. label Jan 25, 2023
@jonpryor
Copy link
Member

jonpryor commented Jan 26, 2023

[monodroid] Prevent overlapped decompression of embedded assemblies (#7732)

Fixes: https://github.com/xamarin/xamarin-android/issues/7335

Context: d236af54538ef1fe62d0125798cdde78e46dcd8e

Commit d236af54 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #01 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #02 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #03 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #04 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.

@jonpryor jonpryor merged commit 2ff72d7 into dotnet:main Jan 26, 2023
jonpryor pushed a commit to jonpryor/xamarin-android that referenced this pull request Jan 26, 2023
…otnet#7732)

Fixes: dotnet#7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
jonpryor pushed a commit to jonpryor/xamarin-android that referenced this pull request Jan 27, 2023
…otnet#7732)

Fixes: dotnet#7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
grendello added a commit to grendello/xamarin-android that referenced this pull request Jan 27, 2023
* main:
  Bump r8 from 3.3.75 to 4.0.48 (dotnet#7700)
  [monodroid] Prevent overlapped decompression of embedded assemblies (dotnet#7732)
  [xaprepare] Support arm64 emulator components (dotnet#7743)
jonpryor pushed a commit to jonpryor/xamarin-android that referenced this pull request Jan 27, 2023
…otnet#7732)

Fixes: dotnet#7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
jonpryor pushed a commit that referenced this pull request Jan 28, 2023
…7732)

Fixes: #7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
grendello added a commit to grendello/xamarin-android that referenced this pull request Jan 30, 2023
* main:
  [Xamarin.Android.Build.Tasks] Fix issue where app will not install (dotnet#7719)
  Bump to dotnet/installer@779a644 8.0.100-alpha.1.23070.23 (dotnet#7728)
  LEGO: Merge pull request 7751
  [Mono.Android] Wrap connection exceptions in HttpRequestException (dotnet#7661)
  [Mono.Android] Fix View.SystemUiVisibility enumification (dotnet#7730)
  Bump r8 from 3.3.75 to 4.0.48 (dotnet#7700)
  [monodroid] Prevent overlapped decompression of embedded assemblies (dotnet#7732)
  [xaprepare] Support arm64 emulator components (dotnet#7743)
  Bump SQLite to 3.40.1 (dotnet#7733)
  Bump to xamarin/xamarin-android-binutils/L_15.0.7-5.0.3@6721af4b (dotnet#7742)
  [monodroid] Replace `exit()` with `abort()` in native code (dotnet#7734)
  Bump to xamarin/Java.Interop/main@8a1ae57 (dotnet#7738)
  [build] bump `$(AndroidNet7Version)` (dotnet#7737)
  Bump to xamarin/Java.Interop/main@1366d99 (dotnet#7718)
  [Xamarin.Android.Build.Tasks] fix AndroidGenerateResourceDesigner (dotnet#7721)
  Bump to xamarin/monodroid@50faac94 (dotnet#7725)
grendello added a commit to grendello/xamarin-android that referenced this pull request Feb 9, 2023
* main: (112 commits)
  [ci] Remove classic Mono.Android-Tests runs (dotnet#7778)
  $(AndroidPackVersionSuffix)=preview.2; net8 is 34.0.0-preview.2 (dotnet#7761)
  Localized file check-in by OneLocBuild Task (dotnet#7758)
  Bump to dotnet/installer@dec1209 8.0.100-alpha.1.23080.11 (dotnet#7755)
  LEGO: Merge pull request 7756
  Localized file check-in by OneLocBuild (dotnet#7752)
  [Xamarin.Android.Build.Tasks] Fix issue where app will not install (dotnet#7719)
  Bump to dotnet/installer@779a644 8.0.100-alpha.1.23070.23 (dotnet#7728)
  LEGO: Merge pull request 7751
  [Mono.Android] Wrap connection exceptions in HttpRequestException (dotnet#7661)
  [Mono.Android] Fix View.SystemUiVisibility enumification (dotnet#7730)
  Bump r8 from 3.3.75 to 4.0.48 (dotnet#7700)
  [monodroid] Prevent overlapped decompression of embedded assemblies (dotnet#7732)
  [xaprepare] Support arm64 emulator components (dotnet#7743)
  Bump SQLite to 3.40.1 (dotnet#7733)
  Bump to xamarin/xamarin-android-binutils/L_15.0.7-5.0.3@6721af4b (dotnet#7742)
  [monodroid] Replace `exit()` with `abort()` in native code (dotnet#7734)
  Bump to xamarin/Java.Interop/main@8a1ae57 (dotnet#7738)
  [build] bump `$(AndroidNet7Version)` (dotnet#7737)
  Bump to xamarin/Java.Interop/main@1366d99 (dotnet#7718)
  ...
grendello added a commit to grendello/xamarin-android that referenced this pull request Feb 21, 2023
…otnet#7732)

Fixes: dotnet#7335

Context: d236af5

Commit d236af5 introduced embedded assembly compression, using LZ4,
which speeds up startup and reduces final package size.

Assemblies are compressed at build time and, at the same time, pre-
allocated buffers for the **decompressed** data are allocated in
`libxamarin-app.so`.  The buffers are then passed to the LZ4 APIs,
all threads using the same output buffer.  The assumption was that we
can do fine without locking as even if overlapped decompression
happens, the output data will be the same and so even if two threads
do the same thing at the same time, the data will be valid at all
times, so long as at least one thread completes the decompression.

This assumption proved to be **largely** true, but it appears that in
high concurrency cases it is possible that the data in the
decompression buffer differs.  This can result in app crashes:

	A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 3092 (.NET ThreadPool), pid 2727 (myapp.name)
	A/DEBUG: pid: 2727, tid: 3092, name: .NET ThreadPool  >>> myapp.name <<<
	A/DEBUG:       #1 pc 0000000000029b1c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmono-android.release.so (offset 0x103d000) (xamarin::android::internal::MonodroidRuntime::mono_log_handler(char const*, char const*, char const*, int, void*)+144) (BuildId: 29c5a3805a0bedee1eede9b6668d7c676aa63371)
	A/DEBUG:       #2 pc 00000000002680bc  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       #3 pc 00000000002681e8  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	A/DEBUG:       dotnet#4 pc 000000000008555c  /data/app/myapp.name-B9t_3dF9i8mDxJEKodZw5w==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (offset 0x109b000) (mono_metadata_string_heap+188) (BuildId: 4a5dd4396e8816b7f69881838bd549285213d53b)
	…

My guess is that LZ4 either uses the output buffer as a scratchpad
area when decompressing or that it initializes/modifies the buffer
before writing actual data in it.  With overlapped decompression, it
may lead to one thread overwriting valid data previously written by
another thread, so that when the latter returns the buffer it thought
to have had valid data may contain certain bytes temporarily
overwritten by the decompression session in the other, still running,
thread.  It may happen that MonoVM reads the corrupted data just when
it is still invalid (before the still running decompression session
actually writes the valid data), a classic race condition.

To fix this, the decompression block is now protected with a startup-
aware mutex.  Mutex will be held only after the initial startup phase
is completed, so there should not be much loss of startup performance.
@github-actions github-actions bot locked and limited conversation to collaborators Jan 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crashes on startup significantly increased after migration to .NET6
4 participants