heap corruption/crash when pulseaudio audio device is removed #39

Closed
ProMarbler14 opened this issue Aug 25, 2023 · 10 comments · Fixed by #40

Comments

@ProMarbler14
Contributor

Expected behaviour

Using a USB audio device, if the audio device disconnects, MATE software should handle the device removal gracefully.

Actual behaviour

If the audio device vanishes, applications which integrate libmatemixer will error or crash. In mate-settings-daemon, the corruption is detected by malloc:
free(): corrupted unsorted chunks
corrupted double-linked list
malloc_consolidate(): unaligned fastbin chunk detected
In mate-volume-control-status-icon, it presents as so:
malloc(): unaligned tcache chunk detected
../glib/glib/gmem.c:207: failed to allocate 103079215112 bytes

Steps to reproduce the behaviour

  1. plug in USB audio adapter
  2. unplug USB audio adapter

The mate-settings-daemon backtrace shows the "device-removed" signal being fired and a device name string being removed from a hash table, which I think comes from on_connection_card_removed. In mate-volume-control-status-icon, the error seems to occur in pulse_ext_stream_update->g_object_freeze_notify for some reason.

crashes.tar.gz

MATE general version

1.26.0

Package version

1.26.0 (no patches)

Linux Distribution

Arch Linux (rolling)

@raveit65
Member

I tried to reproduce this with a simple USB C-Media 2.0 channel device, but I couldn't trigger a crash.
Also, I have used a USB headset for Zoom for the last 2 years and can't remember any crash.
The same goes for connecting and reconnecting a Bluetooth headset.
So, is this crash clearly reproducible for you, or was it a one-off?
And what hardware are you using?

@lukefromdc
Member

Looking at the code in pulse-backend.c, we have these functions:

static void
on_connection_card_removed (PulseConnection *connection,
                            guint            index,
                            PulseBackend    *pulse)
{
    PulseDevice *device;
    gchar       *name;

    device = g_hash_table_lookup (pulse->priv->devices, GUINT_TO_POINTER (index));
    if (G_UNLIKELY (device == NULL))
        return;

    name = g_strdup (mate_mixer_device_get_name (MATE_MIXER_DEVICE (device)));

    g_hash_table_remove (pulse->priv->devices, GUINT_TO_POINTER (index));

    free_list_devices (pulse);
    g_signal_emit_by_name (G_OBJECT (pulse),
                           "device-removed",
                           name);
    g_free (name);
}

This calls free_list_devices (pulse), which is also called when a device is added and is defined as follows:

static void
free_list_devices (PulseBackend *pulse)
{
    if (pulse->priv->devices_list == NULL)
        return;

    g_list_free_full (pulse->priv->devices_list, g_object_unref);

    pulse->priv->devices_list = NULL;
}
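
To make the ownership in the two functions above easier to follow, here is a minimal stand-alone sketch in plain C. It is only a toy model: ToyDevice and the toy_* functions are hypothetical names, and it assumes (not confirmed in this thread) that the devices hash table drops one reference on removal while the cached devices_list holds a second one per device.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GObject: just a reference count. */
typedef struct {
    int  ref_count;
    int *finalized;   /* set to 1 when the last reference is dropped */
} ToyDevice;

static ToyDevice *toy_device_new(int *finalized)
{
    ToyDevice *d = malloc(sizeof *d);
    if (d == NULL)
        abort();
    d->ref_count = 1;
    d->finalized = finalized;
    *finalized = 0;
    return d;
}

static ToyDevice *toy_device_ref(ToyDevice *d)
{
    d->ref_count++;
    return d;
}

static void toy_device_unref(ToyDevice *d)
{
    if (--d->ref_count == 0) {
        *d->finalized = 1;
        free(d);
    }
}

/* Returns 0 if the device is finalized only after both owners let go. */
static int toy_removal_demo(void)
{
    int finalized = 0;
    ToyDevice *table_entry = toy_device_new(&finalized);  /* ref held by the "hash table" (assumption) */
    ToyDevice *list_entry  = toy_device_ref(table_entry); /* ref held by the "cached list" */

    toy_device_unref(table_entry);  /* models g_hash_table_remove() with an unref destroy func */
    if (finalized)
        return 1;                   /* must still be alive: the cached list holds a ref */

    toy_device_unref(list_entry);   /* models free_list_devices() */
    return finalized ? 0 : 2;
}
```

Calling toy_removal_demo() from a main() returns 0 when the object survives the first unref and is released by the second. Under these assumptions the two functions quoted above look balanced on their own, which would suggest looking elsewhere for the corruption.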

@lukefromdc
Member

lukefromdc commented Aug 29, 2023

https://www.unix.com/red-hat/233292-free-corrupted-unsorted-chunks.html

"The most common causes for this type of corruption are (1) using an uninitialized pointer and (2) writing more data into memory than was allocated for the buffer into which the data is being written."

Also in
https://stackoverflow.com/questions/25767566/corrupted-unsorted-chunks-while-calling-free
we have a suggestion to run under valgrind (how would this be attached to a library?) while invoking the crash

@ProMarbler14
Contributor Author

The crash is pretty consistent, I think it happens every time.

valgrind gives the exact same stacktrace for both mate-settings-daemon and mate-volume-control-status-icon. I compiled and installed libmatemixer 1.27 with debug symbols, and the error prints whenever I unplug the device.

It's not specific to the usb audio device; I have two, and removing either one triggers a crash.

==2608089== 40 errors in context 1 of 1:
==2608089== Invalid write of size 8
==2608089==    at 0x55CED19: g_nullify_pointer (gutils.c:2860)
==2608089==    by 0x54F4CE6: weak_refs_notify (gobject.c:3286)
==2608089==    by 0x5572BD4: g_data_set_internal (gdataset.c:410)
==2608089==    by 0x54F6697: g_object_real_dispose.lto_priv.0 (gobject.c:1364)
==2608089==    by 0x54F97E2: UnknownInlinedFun (gobject.c:3891)
==2608089==    by 0x54F97E2: g_object_unref (gobject.c:3802)
==2608089==    by 0x54EB6BF: g_closure_invoke (gclosure.c:832)
==2608089==    by 0x5519937: signal_emit_unlocked_R.isra.0 (gsignal.c:3812)
==2608089==    by 0x550AAA6: g_signal_emit_valist (gsignal.c:3565)
==2608089==    by 0x550AD33: g_signal_emit (gsignal.c:3622)
==2608089==    by 0xC90EA05: pa_command_subscribe_event (subscribe.c:53)
==2608089==    by 0xC96F03B: pa_pdispatch_run (pdispatch.c:349)
==2608089==    by 0xC8EA243: pstream_packet_callback (context.c:364)
==2608089==  Address 0x9409c00 is 352 bytes inside a block of size 432 free'd
==2608089==    at 0x484412F: free (vg_replace_malloc.c:974)
==2608089==    by 0x5510DAC: g_type_free_instance (gtype.c:2055)
==2608089==    by 0x49DA822: mate_mixer_stream_dispose (matemixer-stream.c:284)
==2608089==    by 0x54F97E2: UnknownInlinedFun (gobject.c:3891)
==2608089==    by 0x54F97E2: g_object_unref (gobject.c:3802)
==2608089==    by 0x54EB6BF: g_closure_invoke (gclosure.c:832)
==2608089==    by 0x5519937: signal_emit_unlocked_R.isra.0 (gsignal.c:3812)
==2608089==    by 0x550AAA6: g_signal_emit_valist (gsignal.c:3565)
==2608089==    by 0x550AD33: g_signal_emit (gsignal.c:3622)
==2608089==    by 0xC90EA05: pa_command_subscribe_event (subscribe.c:53)
==2608089==    by 0xC96F03B: pa_pdispatch_run (pdispatch.c:349)
==2608089==    by 0xC8EA243: pstream_packet_callback (context.c:364)
==2608089==    by 0xC973F94: do_read (pstream.c:1023)
==2608089==  Block was alloc'd at
==2608089==    at 0x48469B3: calloc (vg_replace_malloc.c:1554)
==2608089==    by 0x559D00A: g_malloc0 (gmem.c:163)
==2608089==    by 0x55161EF: g_type_create_instance (gtype.c:1955)
==2608089==    by 0x54FBD90: g_object_new_internal.part.0 (gobject.c:2246)
==2608089==    by 0x54FDF0A: UnknownInlinedFun (gobject.c:2563)
==2608089==    by 0x54FDF0A: g_object_new_valist (gobject.c:2585)
==2608089==    by 0x54FE29D: g_object_new (gobject.c:2058)
==2608089==    by 0xC887A1D: pulse_source_control_new (pulse-source-control.c:81)
==2608089==    by 0xC887343: pulse_source_new (pulse-source.c:135)
==2608089==    by 0xC87A58E: on_connection_source_info (pulse-backend.c:950)
==2608089==    by 0x54EB6BF: g_closure_invoke (gclosure.c:832)
==2608089==    by 0x5519937: signal_emit_unlocked_R.isra.0 (gsignal.c:3812)
==2608089==    by 0x550AAA6: g_signal_emit_valist (gsignal.c:3565)
==2608089== 
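
One possible reading of the trace above (an interpretation, not something confirmed in this thread): while a stream is disposed, GLib's weak-pointer notification (g_nullify_pointer) writes NULL into a location that lives inside a block allocated by pulse_source_control_new and already freed. With GLib, a location registered through g_object_add_weak_pointer() has to be unregistered with g_object_remove_weak_pointer() before the memory containing it goes away. The sketch below models that rule in plain C without GLib. ToyStream, ToyControl and the toy_* functions are hypothetical names, not libmatemixer or GLib API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_WEAK 4

/* Toy object that remembers pointer locations to NULL out when it is disposed. */
typedef struct {
    void  **weak_locations[MAX_WEAK];
    size_t  n_weak;
} ToyStream;

/* Toy control holding a weak back-pointer to its stream. */
typedef struct {
    void *stream;
} ToyControl;

static void toy_add_weak_pointer(ToyStream *s, void **location)
{
    assert(s->n_weak < MAX_WEAK);
    s->weak_locations[s->n_weak++] = location;
}

static void toy_remove_weak_pointer(ToyStream *s, void **location)
{
    for (size_t i = 0; i < s->n_weak; i++) {
        if (s->weak_locations[i] == location) {
            /* Fill the gap with the last entry and shrink the list. */
            s->weak_locations[i] = s->weak_locations[--s->n_weak];
            return;
        }
    }
}

/* Nullifies every registered location; returns how many were written. */
static size_t toy_stream_dispose(ToyStream *s)
{
    size_t written = s->n_weak;
    for (size_t i = 0; i < s->n_weak; i++)
        *s->weak_locations[i] = NULL;
    s->n_weak = 0;
    return written;
}

static ToyControl *toy_control_new(ToyStream *s)
{
    ToyControl *c = malloc(sizeof *c);
    if (c == NULL)
        abort();
    c->stream = s;
    toy_add_weak_pointer(s, &c->stream);
    return c;
}

static void toy_control_free(ToyControl *c)
{
    /* The important step: unregister the location before its memory is freed. */
    if (c->stream != NULL)
        toy_remove_weak_pointer((ToyStream *) c->stream, &c->stream);
    free(c);
}

/* Returns 0 if disposal only writes to memory that is still alive. */
static int toy_weak_pointer_demo(void)
{
    ToyStream   stream = {0};
    ToyControl *freed_early = toy_control_new(&stream);
    ToyControl *survivor    = toy_control_new(&stream);

    toy_control_free(freed_early);                 /* unregisters its location first */
    size_t written = toy_stream_dispose(&stream);  /* only the survivor is touched */

    int ok = (written == 1 && survivor->stream == NULL);
    toy_control_free(survivor);                    /* stream already NULL, nothing to unregister */
    return ok ? 0 : 1;
}
```

If toy_control_free() skipped the unregister step, toy_stream_dispose() would write into freed memory, which is the kind of "Invalid write of size 8" that valgrind reports above.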

@raveit65
Member

This kind of issue is reported very often in Fedora.
https://bugzilla.redhat.com/show_bug.cgi?id=2236383

Truncated backtrace:
Thread no. 1 (19 frames)
 #8 g_malloc at ../glib/gmem.c:130
 #9 g_strdup at ../glib/gstrfuncs.c:363
 #10 g_strdup_inline at ../glib/gstrfuncs.h:321
 #11 mate_mixer_device_set_property at /usr/src/debug/libmatemixer-1.26.0-4.fc38.x86_64/libmatemixer/matemixer-device.c:271
 #12 object_set_property at ../gobject/gobject.c:1812
 #13 g_object_new_internal at ../gobject/gobject.c:2291
 #15 g_object_new_valist at ../gobject/gobject.c:2585
 #17 pulse_device_new at /usr/src/debug/libmatemixer-1.26.0-4.fc38.x86_64/backends/pulse/pulse-device.c:236
 #18 on_connection_card_info at /usr/src/debug/libmatemixer-1.26.0-4.fc38.x86_64/backends/pulse/pulse-backend.c:762
 #21 signal_emit_unlocked_R.isra.0 at ../gobject/gsignal.c:3812
 #24 context_get_card_info_callback at ../src/pulse/introspect.c:990
 #25 run_action at ../src/pulsecore/pdispatch.c:291
 #26 pa_pdispatch_run at ../src/pulsecore/pdispatch.c:344
 #27 pstream_packet_callback at ../src/pulse/context.c:364
 #28 do_read at ../src/pulsecore/pstream.c:1023
 #29 do_pstream_read_write at ../src/pulsecore/pstream.c:261
 #30 dispatch_func at ../src/pulse/glib-mainloop.c:581
 #33 g_main_context_iterate.isra.0 at ../glib/gmain.c:4276
 #35 gtk_main at ../gtk/gtkmain.c:1329

Full stacktrace:
https://bugzilla.redhat.com/attachment.cgi?id=1986248

@lukefromdc
Member

Please test #40

@ProMarbler14
Contributor Author

ProMarbler14 commented Sep 2, 2023

As the author of the PR, I have tested against mate-volume-control-status-icon many times (verifying under valgrind to be sure) and have tested with mate-settings-daemon as well.

I'm not sure which local factors are triggering the crash exactly, but to help with reproduction, you could try this:

  • Have a device with a monitor and both output/input. Nothing unusual there. Presumably, the more streams the greater opportunity for corruption.
  • Create multiple sink inputs (applications playing data). mpv will gladly keep the sink input open, even when paused. For me, pacmd list-sink-inputs | grep -cE 'media\.name' shows I have 6 sink inputs.
  • Have a very large stream-restore database. This requires "load-module module-stream-restore" in your default pulse config (should be standard). Look for a recent "somehash-stream-volumes.tdb" in ~/.config/pulse, and count the entries with tdbdump HASH-stream-volumes.tdb | grep -c ^key. Mine gives 226. Apps like Steam will fill this pretty easily. Do note the program names aren't anonymous. (tdb should be installed as a part of Samba.)

The more stuff that is recreated or loaded, the more it should disturb the heap enough to trigger an assertion.

@lukefromdc
Member

lukefromdc commented Sep 2, 2023 via email

@raveit65
Member

raveit65 commented Sep 3, 2023

I am using Fedora 38 with PipeWire; as a result, PulseAudio commands like pacmd no longer show any sinks.

[rave@mother ~]$ pacmd list-sink-inputs | grep -cE 'media\.name'
No PulseAudio daemon running, or not running as session daemon.
0

@lukefromdc
Member

Do the inputs show up in mate-volume-control? I don't know much about this but maybe it only shows up in use?
I don't know jack about Pipewire so to speak, but do have it installed by default

On my setup with this installed I have

luke@ubuntu:~$ pacmd list-sink-inputs
0 sink input(s) available.
luke@ubuntu:~$ 

but recording works fine.

If I have something recording sound, I get

luke@ubuntu:~$  pacmd list-sink-inputs | grep -cE 'media\.name'
1

If not recording sound I get

luke@ubuntu:~$  pacmd list-sink-inputs | grep -cE 'media\.name'
0

3 participants