Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix envoy crash during early shutdown #4937

Merged
merged 9 commits into from
Nov 2, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions include/envoy/server/instance.h
Original file line number Diff line number Diff line change
Expand Up @@ -152,6 +152,11 @@ class Instance {
*/
virtual void shutdown() PURE;

/**
* @return whether the shutdown method has been called.
*/
virtual bool isShutdown() PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if the "don't call virtual functions in destructor" (e.g. https://www.artima.com/cppsource/nevercall.html) rule applies here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer. I think the way I'm using it here is safe though - here is the relevant bit "Once a derived class destructor has run, the object's derived class data members assume undefined values, so C++ treats them as if they no longer exist. Upon entry to the base class destructor, the object becomes a base class object".

In this case, we're calling the virtual method (isShutdown) from the derived class destructor (InstanceImpl) not that of the base class (Instance) so the object should still be treated as an InstanceImpl and call the correct method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch Actually clang-analyzer-optin.cplusplus.VirtualCall check will warn about this (though, not an error now so it will pass) in clang-tidy CI. It doesn't seems the case?

In this case, we're calling the virtual method (isShutdown) from the derived class destructor (InstanceImpl) not that of the base class (Instance) so the object should still be treated as an InstanceImpl and call the correct method.

@eziskind if you make the virtual method called by the derived class destructor final, then clang-analyzer won't complain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


/**
* Shutdown the server's admin processing. This includes the admin API, stat flushing, etc.
*/
Expand Down
1 change: 1 addition & 0 deletions source/server/config_validation/server.h
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,7 @@ class ValidationInstance : Logger::Loggable<Logger::Id::main>,
}
Runtime::Loader& runtime() override { return *runtime_loader_; }
void shutdown() override;
bool isShutdown() override { return false; }
void shutdownAdmin() override { NOT_IMPLEMENTED_GCOVR_EXCL_LINE; }
Singleton::Manager& singletonManager() override { return *singleton_manager_; }
OverloadManager& overloadManager() override { return *overload_manager_; }
Expand Down
34 changes: 16 additions & 18 deletions source/server/server.cc
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ InstanceImpl::InstanceImpl(Options& options, Event::TimeSystem& time_system,
ComponentFactory& component_factory,
Runtime::RandomGeneratorPtr&& random_generator,
ThreadLocal::Instance& tls)
: options_(options), time_system_(time_system), restarter_(restarter),
: shutdown_(false), options_(options), time_system_(time_system), restarter_(restarter),
start_time_(time(nullptr)), original_start_time_(start_time_), stats_store_(store),
thread_local_(tls), api_(new Api::Impl(options.fileFlushIntervalMsec())),
secret_manager_(std::make_unique<Secret::SecretManagerImpl>()),
Expand Down Expand Up @@ -388,15 +388,15 @@ void InstanceImpl::loadServerFlags(const absl::optional<std::string>& flags_path

uint64_t InstanceImpl::numConnections() { return listener_manager_->numConnections(); }

RunHelper::RunHelper(Event::Dispatcher& dispatcher, Upstream::ClusterManager& cm,
HotRestart& hot_restart, AccessLog::AccessLogManager& access_log_manager,
RunHelper::RunHelper(Instance& instance, Event::Dispatcher& dispatcher,
Upstream::ClusterManager& cm, AccessLog::AccessLogManager& access_log_manager,
InitManagerImpl& init_manager, OverloadManager& overload_manager,
std::function<void()> workers_start_cb) {

// Setup signals.
sigterm_ = dispatcher.listenForSignal(SIGTERM, [this, &hot_restart, &dispatcher]() {
sigterm_ = dispatcher.listenForSignal(SIGTERM, [&instance]() {
ENVOY_LOG(warn, "caught SIGTERM");
shutdown(dispatcher, hot_restart);
instance.shutdown();
});

sig_usr_1_ = dispatcher.listenForSignal(SIGUSR1, [&access_log_manager]() {
Expand All @@ -413,8 +413,8 @@ RunHelper::RunHelper(Event::Dispatcher& dispatcher, Upstream::ClusterManager& cm
// this can fire immediately if all clusters have already initialized. Also note that we need
// to guard against shutdown at two different levels since SIGTERM can come in once the run loop
// starts.
cm.setInitializedCb([this, &init_manager, &cm, workers_start_cb]() {
if (shutdown_) {
cm.setInitializedCb([&instance, &init_manager, &cm, workers_start_cb]() {
if (instance.isShutdown()) {
return;
}

Expand All @@ -424,8 +424,11 @@ RunHelper::RunHelper(Event::Dispatcher& dispatcher, Upstream::ClusterManager& cm
cm.adsMux().pause(Config::TypeUrl::get().RouteConfiguration);

ENVOY_LOG(info, "all clusters initialized. initializing init manager");
init_manager.initialize([this, workers_start_cb]() {
if (shutdown_) {

// Note: the lamda below should not capture "this" since the RunHelper object may
// have been destructed by the time it gets executed.
init_manager.initialize([&instance, workers_start_cb]() {
if (instance.isShutdown()) {
return;
}

Expand All @@ -440,16 +443,10 @@ RunHelper::RunHelper(Event::Dispatcher& dispatcher, Upstream::ClusterManager& cm
overload_manager.start();
}

void RunHelper::shutdown(Event::Dispatcher& dispatcher, HotRestart& hot_restart) {
shutdown_ = true;
hot_restart.terminateParent();
dispatcher.exit();
}

void InstanceImpl::run() {
// We need the RunHelper to be available to call from InstanceImpl::shutdown() below, so
// we save it as a member variable.
run_helper_ = std::make_unique<RunHelper>(*dispatcher_, clusterManager(), restarter_,
run_helper_ = std::make_unique<RunHelper>(*this, *dispatcher_, clusterManager(),
access_log_manager_, init_manager_, overloadManager(),
[this]() -> void { startWorkers(); });

Expand Down Expand Up @@ -505,8 +502,9 @@ void InstanceImpl::terminate() {
Runtime::Loader& InstanceImpl::runtime() { return *runtime_loader_; }

void InstanceImpl::shutdown() {
ASSERT(run_helper_.get() != nullptr);
run_helper_->shutdown(*dispatcher_, restarter_);
shutdown_ = true;
restarter_.terminateParent();
dispatcher_->exit();
}

void InstanceImpl::shutdownAdmin() {
Expand Down
9 changes: 3 additions & 6 deletions source/server/server.h
Original file line number Diff line number Diff line change
Expand Up @@ -113,19 +113,14 @@ class InstanceUtil : Logger::Loggable<Logger::Id::main> {
*/
class RunHelper : Logger::Loggable<Logger::Id::main> {
public:
RunHelper(Event::Dispatcher& dispatcher, Upstream::ClusterManager& cm, HotRestart& hot_restart,
RunHelper(Instance& instance, Event::Dispatcher& dispatcher, Upstream::ClusterManager& cm,
AccessLog::AccessLogManager& access_log_manager, InitManagerImpl& init_manager,
OverloadManager& overload_manager, std::function<void()> workers_start_cb);

// Helper function to inititate a shutdown. This can be triggered either by catching SIGTERM
// or be called from ServerImpl::shutdown().
void shutdown(Event::Dispatcher& dispatcher, HotRestart& hot_restart);

private:
Event::SignalEventPtr sigterm_;
Event::SignalEventPtr sig_usr_1_;
Event::SignalEventPtr sig_hup_;
bool shutdown_{};
};

/**
Expand Down Expand Up @@ -170,6 +165,7 @@ class InstanceImpl : Logger::Loggable<Logger::Id::main>, public Instance {
}
Runtime::Loader& runtime() override;
void shutdown() override;
bool isShutdown() override final { return shutdown_; }
void shutdownAdmin() override;
Singleton::Manager& singletonManager() override { return *singleton_manager_; }
bool healthCheckFailed() override;
Expand All @@ -196,6 +192,7 @@ class InstanceImpl : Logger::Loggable<Logger::Id::main>, public Instance {
void startWorkers();
void terminate();

bool shutdown_;
Options& options_;
Event::TimeSystem& time_system_;
HotRestart& restarter_;
Expand Down
1 change: 1 addition & 0 deletions test/mocks/server/mocks.h
Original file line number Diff line number Diff line change
Expand Up @@ -329,6 +329,7 @@ class MockInstance : public Instance {
MOCK_METHOD0(rateLimitClient_, RateLimit::Client*());
MOCK_METHOD0(runtime, Runtime::Loader&());
MOCK_METHOD0(shutdown, void());
MOCK_METHOD0(isShutdown, bool());
MOCK_METHOD0(shutdownAdmin, void());
MOCK_METHOD0(singletonManager, Singleton::Manager&());
MOCK_METHOD0(startTimeCurrentEpoch, time_t());
Expand Down
16 changes: 11 additions & 5 deletions test/server/server_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -16,12 +16,14 @@
#include "gtest/gtest.h"

using testing::_;
using testing::Assign;
using testing::HasSubstr;
using testing::InSequence;
using testing::Invoke;
using testing::InvokeWithoutArgs;
using testing::Property;
using testing::Ref;
using testing::Return;
using testing::SaveArg;
using testing::StrictMock;

Expand Down Expand Up @@ -53,23 +55,24 @@ TEST(ServerInstanceUtil, flushHelper) {

class RunHelperTest : public testing::Test {
public:
RunHelperTest() {
RunHelperTest() : shutdown_(false) {
InSequence s;

sigterm_ = new Event::MockSignalEvent(&dispatcher_);
sigusr1_ = new Event::MockSignalEvent(&dispatcher_);
sighup_ = new Event::MockSignalEvent(&dispatcher_);
EXPECT_CALL(cm_, setInitializedCb(_)).WillOnce(SaveArg<0>(&cm_init_callback_));
EXPECT_CALL(overload_manager_, start());
ON_CALL(server_, shutdown()).WillByDefault(Assign(&shutdown_, true));

helper_ = std::make_unique<RunHelper>(dispatcher_, cm_, hot_restart_, access_log_manager_,
init_manager_, overload_manager_,
[this] { start_workers_.ready(); });
helper_ =
std::make_unique<RunHelper>(server_, dispatcher_, cm_, access_log_manager_, init_manager_,
overload_manager_, [this] { start_workers_.ready(); });
}

NiceMock<MockInstance> server_;
NiceMock<Event::MockDispatcher> dispatcher_;
NiceMock<Upstream::MockClusterManager> cm_;
NiceMock<MockHotRestart> hot_restart_;
NiceMock<AccessLog::MockAccessLogManager> access_log_manager_;
NiceMock<MockOverloadManager> overload_manager_;
InitManagerImpl init_manager_;
Expand All @@ -79,6 +82,7 @@ class RunHelperTest : public testing::Test {
Event::MockSignalEvent* sigterm_;
Event::MockSignalEvent* sigusr1_;
Event::MockSignalEvent* sighup_;
bool shutdown_ = false;
};

TEST_F(RunHelperTest, Normal) {
Expand All @@ -89,6 +93,7 @@ TEST_F(RunHelperTest, Normal) {
TEST_F(RunHelperTest, ShutdownBeforeCmInitialize) {
EXPECT_CALL(start_workers_, ready()).Times(0);
sigterm_->callback_();
EXPECT_CALL(server_, isShutdown()).WillOnce(Return(shutdown_));
cm_init_callback_();
}

Expand All @@ -99,6 +104,7 @@ TEST_F(RunHelperTest, ShutdownBeforeInitManagerInit) {
EXPECT_CALL(target, initialize(_));
cm_init_callback_();
sigterm_->callback_();
EXPECT_CALL(server_, isShutdown()).WillOnce(Return(shutdown_));
target.callback_();
}

Expand Down