Skip to content

Commit

Permalink
fix rare List::remove(&target) segfault (envoyproxy#4244)
Browse files Browse the repository at this point in the history
This fixes a rare but repeatedly observed segfault, described in detail in the comments of InstanceImpl::~InstanceImpl().

TL;DR; the InstanceImpl's listener manager instance can potentially contain a target with a callback which, at destruction time (see RdsRouteConfigSubscription::~RdsRouteConfigSubscription) goes to unregister it from the top-level InitManager, which has already been freed.

In practice, I see this segfault a few times a week, and only ever at teardown (inside ~MainCommonBase), which could be why it hasn't been crashing anyone.

After a medium amount of trying, I couldn't figure out how to reproduce this in a unit or integration test. A unit test for //test/server:server_test would need to stand up a real or mocked ListenerManager which owns some target with a callback to server.initManager(), which seems very complex (unless I'm overlooking something and there's in-repo precedent for this sort of thing).

Risk Level: Medium -- this passes all tests, but what fixes one uncaught bug might cause another.
Testing: Al tests pass, no tests changed.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: James Buckland <jbuckland@google.com>
  • Loading branch information
ambuc authored and htuch committed Aug 30, 2018
1 parent 89e0f23 commit aa4481e
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions source/server/server.cc
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,17 @@ InstanceImpl::~InstanceImpl() {
// Stop logging to file before all the AccessLogManager and its dependencies are
// destructed to avoid crashing at shutdown.
file_logger_.reset();

// Destruct the ListenerManager explicitly, before InstanceImpl's local init_manager_ is
// destructed.
//
// The ListenerManager's DestinationPortsMap contains FilterChainSharedPtrs. There is a rare race
// condition where one of these FilterChains contains an HttpConnectionManager, which contains an
// RdsRouteConfigProvider, which contains an RdsRouteConfigSubscriptionSharedPtr. Since
// RdsRouteConfigSubscription is an Init::Target, ~RdsRouteConfigSubscription triggers a callback
// set at initialization, which goes to unregister it from the top-level InitManager, which has
// already been destructed (use-after-free) causing a segfault.
listener_manager_.reset();
}

Upstream::ClusterManager& InstanceImpl::clusterManager() { return *config_->clusterManager(); }
Expand Down

0 comments on commit aa4481e

Please sign in to comment.