Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connectd: avoid use-after-free upon multiple reconnections by a peer #5300

Merged

Commits on Jun 1, 2022

  1. connectd: avoid use-after-free upon multiple reconnections by a peer

    `peer_reconnected` was freeing a `struct peer_reconnected` instance
    while a pointer to that instance was registered to be passed as an
    argument to the `retry_peer_connected` callback function. This caused a
    use-after-free crash when `retry_peer_connected` attempted to reparent
    the instance to the temporary context.
    
    Instead, never have `peer_reconnected` free a `struct peer_reconnected`
    instance, and only ever allow such an instance to be freed after the
    `retry_peer_connected` callback has finished with it. To ensure that the
    instance is freed even if the connection is closed before the callback
    can be invoked, parent the instance to the connection rather than to the
    daemon.
    
    Absent the need to free `struct peer_reconnected` instances outside of
    the `retry_peer_connected` callback, there is no use for the
    `reconnected` hashtable, so remove it as well.
    
    See: ElementsProject#5282 (comment)
    Fixes: ElementsProject#5282
    Fixes: ElementsProject#5284
    Changelog-Fixed: connectd no longer crashes when peers reconnect.
    whitslack committed Jun 1, 2022
    Configuration menu
    Copy the full SHA
    15a1d8b View commit details
    Browse the repository at this point in the history

Commits on Jun 19, 2022

  1. lightningd: fix crash on rapid reconnect.

    Happens occasionally when running
    `tests/test_connection.py::test_mutual_reconnect_race` (which is too
    flaky to add, without more fixes):
    
    
    ```
    lightningd: lightningd/peer_control.c:1252: peer_active: Assertion `!channel->owner' failed.
    lightningd: FATAL SIGNAL 6 (version v0.11.0.1-38-g4f167da)
    0x5594a41f8f45 send_backtrace
    	common/daemon.c:33
    0x5594a41f8fef crashdump
    	common/daemon.c:46
    0x7f7cb585c08f ???
    	/build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
    0x7f7cb585c00b __GI_raise
    	../sysdeps/unix/sysv/linux/raise.c:51
    0x7f7cb583b858 __GI_abort
    	/build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:79
    0x7f7cb583b728 __assert_fail_base
    	/build/glibc-SzIz7B/glibc-2.31/assert/assert.c:92
    0x7f7cb584cfd5 __GI___assert_fail
    	/build/glibc-SzIz7B/glibc-2.31/assert/assert.c:101
    0x5594a41b45ca peer_active
    	lightningd/peer_control.c:1252
    0x5594a418794c connectd_msg
    	lightningd/connect_control.c:457
    0x5594a41cd457 sd_msg_read
    	lightningd/subd.c:556
    0x5594a41ccbe5 read_fds
    	lightningd/subd.c:357
    0x5594a4269fc2 next_plan
    	ccan/ccan/io/io.c:59
    0x5594a426abca do_plan
    	ccan/ccan/io/io.c:407
    0x5594a426ac0c io_ready
    	ccan/ccan/io/io.c:417
    0x5594a426ceff io_loop
    	ccan/ccan/io/poll.c:453
    0x5594a41930d9 io_loop_with_timers
    	lightningd/io_loop_with_timers.c:22
    0x5594a4199293 main
    	lightningd/lightningd.c:1181
    0x7f7cb583d082 __libc_start_main
    	../csu/libc-start.c:308
    0x5594a416e15d ???
    	???:0
    0xffffffffffffffff ???
    	???:0
    ```
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Jun 19, 2022
    Configuration menu
    Copy the full SHA
    41d67d5 View commit details
    Browse the repository at this point in the history
  2. connectd: don't keep around more than one old connection.

    This was fixed in 1c495ca ("connectd:
    fix accidental handling of old reconnections.") and then reverted by
    the rework in "connectd: avoid use-after-free upon multiple
    reconnections by a peer".
    
    The latter made the race much less likely, since we cleaned up the
    reconnecting struct once the connection was hung up by the remote
    node, but it's still theoretically possible.
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    rustyrussell committed Jun 19, 2022
    Configuration menu
    Copy the full SHA
    debd45c View commit details
    Browse the repository at this point in the history