-
-
Notifications
You must be signed in to change notification settings - Fork 520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
heap-use-after-free after heartbeats timed out when using UDP connection #2207
Comments
Thanks for the detailed report! Nice debugging work! My first thought is that we need to add the call to unregister the message subscription in the destructor of However, you mention that access of that object is not synchronized though, so I wonder if we also need a mutex somewhere. The message registration and calling is already protected/synchronized, so maybe that's enough... 🤔 |
A mutex, perhaps. But the problem is capturing raw Instead, if the Using I understand the sequence of events but I'm not sure that a proposed solution fixes the problem until a regression test can reliably crash at the same location. I'm not 100% sure how to build a regression test for it. diff --git a/src/mavsdk/plugins/gimbal/gimbal_impl.cpp b/src/mavsdk/plugins/gimbal/gimbal_impl.cpp
index ee0e3d22..002e677c 100644
--- a/src/mavsdk/plugins/gimbal/gimbal_impl.cpp
+++ b/src/mavsdk/plugins/gimbal/gimbal_impl.cpp
@@ -6,6 +6,7 @@
#include <cmath>
#include <functional>
#include <thread>
+#include <mutex>
namespace mavsdk {
@@ -50,7 +51,7 @@ void GimbalImpl::enable()
void GimbalImpl::disable()
{
- _gimbal_protocol.reset(nullptr);
+ _gimbal_protocol.reset();
}
void GimbalImpl::receive_protocol_timeout()
@@ -58,7 +59,7 @@ void GimbalImpl::receive_protocol_timeout()
// We did not receive a GIMBAL_MANAGER_INFORMATION in time, so we have to
// assume Version2 is not available.
LogDebug() << "Falling back to Gimbal Version 1";
- _gimbal_protocol.reset(new GimbalProtocolV1(*_system_impl));
+ _gimbal_protocol = GimbalProtocolV1::make(*_system_impl);
_protocol_cookie = nullptr;
}
@@ -74,8 +75,8 @@ void GimbalImpl::process_gimbal_manager_information(const mavlink_message_t& mes
_system_impl->unregister_timeout_handler(_protocol_cookie);
_protocol_cookie = nullptr;
- _gimbal_protocol.reset(new GimbalProtocolV2(
- *_system_impl, gimbal_manager_information, message.sysid, message.compid));
+ _gimbal_protocol = GimbalProtocolV2::make(
+ *_system_impl, gimbal_manager_information, message.sysid, message.compid);
}
}
diff --git a/src/mavsdk/plugins/gimbal/gimbal_impl.h b/src/mavsdk/plugins/gimbal/gimbal_impl.h
index 7a7ab1f8..1e1374c9 100644
--- a/src/mavsdk/plugins/gimbal/gimbal_impl.h
+++ b/src/mavsdk/plugins/gimbal/gimbal_impl.h
@@ -59,7 +59,7 @@ public:
const GimbalImpl& operator=(const GimbalImpl&) = delete;
private:
- std::unique_ptr<GimbalProtocolBase> _gimbal_protocol{nullptr};
+ std::shared_ptr<GimbalProtocolBase> _gimbal_protocol{nullptr};
void* _protocol_cookie{nullptr};
diff --git a/src/mavsdk/plugins/gimbal/gimbal_protocol_base.h b/src/mavsdk/plugins/gimbal/gimbal_protocol_base.h
index 5bb5a4d5..ca64588e 100644
--- a/src/mavsdk/plugins/gimbal/gimbal_protocol_base.h
+++ b/src/mavsdk/plugins/gimbal/gimbal_protocol_base.h
@@ -1,14 +1,21 @@
#pragma once
+#include <memory>
+
#include "plugins/gimbal/gimbal.h"
#include "plugin_impl_base.h"
#include "system_impl.h"
namespace mavsdk {
-class GimbalProtocolBase {
+class GimbalProtocolBase : public std::enable_shared_from_this<GimbalProtocolBase> {
+protected:
+ struct Protected {};
public:
- GimbalProtocolBase(SystemImpl& system_impl) : _system_impl(system_impl) {}
+ using shared_ptr = std::shared_ptr<GimbalProtocolBase>;
+ using weak_ptr = std::weak_ptr<GimbalProtocolBase>;
+
+ GimbalProtocolBase(Protected, SystemImpl& system_impl) : _system_impl(system_impl) {}
virtual ~GimbalProtocolBase() = default;
virtual Gimbal::Result set_pitch_and_yaw(float pitch_deg, float yaw_deg) = 0;
diff --git a/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.cpp b/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.cpp
index 434c0907..16136ddb 100644
--- a/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.cpp
+++ b/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.cpp
@@ -1,12 +1,18 @@
#include <functional>
#include <cmath>
+#include <memory>
#include "gimbal_protocol_v1.h"
#include "gimbal_impl.h"
#include "unused.h"
namespace mavsdk {
-GimbalProtocolV1::GimbalProtocolV1(SystemImpl& system_impl) : GimbalProtocolBase(system_impl) {}
+GimbalProtocolV1::shared_ptr GimbalProtocolV1::make(SystemImpl& system_impl)
+{
+ return std::make_shared<GimbalProtocolV1>(Protected{}, system_impl);
+}
+
+GimbalProtocolV1::GimbalProtocolV1(Protected, SystemImpl& system_impl) : GimbalProtocolBase(Protected{}, system_impl) {}
GimbalProtocolV1::~GimbalProtocolV1()
{
@@ -206,7 +212,15 @@ void GimbalProtocolV1::control_async(Gimbal::ControlCallback callback)
if (_control_callback == nullptr && callback != nullptr) {
_control_callback = callback;
_system_impl.add_call_every(
- [this]() { _control_callback(_current_control_status); }, 1.0, &_control_cookie);
+ [w = weak_from_this()]() {
+ if (auto s = w.lock())
+ {
+ auto self = std::dynamic_pointer_cast<GimbalProtocolV1>(s);
+ self->_control_callback(self->_current_control_status);
+ }
+ },
+ 1.0,
+ &_control_cookie );
} else if (_control_callback != nullptr && callback == nullptr) {
_control_callback = callback;
diff --git a/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.h b/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.h
index 25595fac..c9f55ce5 100644
--- a/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.h
+++ b/src/mavsdk/plugins/gimbal/gimbal_protocol_v1.h
@@ -10,7 +10,11 @@ namespace mavsdk {
class GimbalProtocolV1 : public GimbalProtocolBase {
public:
- GimbalProtocolV1(SystemImpl& system_impl);
+ using shared_ptr = std::shared_ptr<GimbalProtocolV1>;
+ using weak_ptr = std::weak_ptr<GimbalProtocolV1>;
+ static shared_ptr make(SystemImpl& system_impl);
+
+ GimbalProtocolV1(Protected, SystemImpl& system_impl);
~GimbalProtocolV1() override;
Gimbal::Result set_pitch_and_yaw(float pitch_deg, float yaw_deg) override;
diff --git a/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.cpp b/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.cpp
index 81f4f4ef..67a2a89d 100644
--- a/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.cpp
+++ b/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.cpp
@@ -7,12 +7,28 @@
namespace mavsdk {
+GimbalProtocolV2::shared_ptr GimbalProtocolV2::make(
+ SystemImpl& system_impl,
+ const mavlink_gimbal_manager_information_t& information,
+ uint8_t gimbal_manager_sysid,
+ uint8_t gimbal_manager_compid)
+{
+ return std::make_shared<GimbalProtocolV2>(
+ Protected{},
+ system_impl,
+ information,
+ gimbal_manager_sysid,
+ gimbal_manager_compid
+ );
+}
+
GimbalProtocolV2::GimbalProtocolV2(
+ Protected,
SystemImpl& system_impl,
const mavlink_gimbal_manager_information_t& information,
uint8_t gimbal_manager_sysid,
uint8_t gimbal_manager_compid) :
- GimbalProtocolBase(system_impl),
+ GimbalProtocolBase(Protected{}, system_impl),
_gimbal_device_id(information.gimbal_device_id),
_gimbal_manager_sysid(gimbal_manager_sysid),
_gimbal_manager_compid(gimbal_manager_compid)
@@ -286,8 +302,12 @@ void GimbalProtocolV2::control_async(Gimbal::ControlCallback callback)
_system_impl.register_mavlink_message_handler(
MAVLINK_MSG_ID_GIMBAL_MANAGER_STATUS,
- [this](const mavlink_message_t& message) {
- process_gimbal_manager_status(message);
+ [w = weak_from_this()](const mavlink_message_t& message) {
+ if (auto s = w.lock())
+ {
+ auto self = std::dynamic_pointer_cast<GimbalProtocolV2>(s);
+ self->process_gimbal_manager_status(message);
+ }
},
this);
}
diff --git a/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.h b/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.h
index 3c2b1d4b..e3b519bf 100644
--- a/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.h
+++ b/src/mavsdk/plugins/gimbal/gimbal_protocol_v2.h
@@ -7,7 +7,17 @@ namespace mavsdk {
class GimbalProtocolV2 : public GimbalProtocolBase {
public:
+ using shared_ptr = std::shared_ptr<GimbalProtocolV2>;
+ using weak_ptr = std::weak_ptr<GimbalProtocolV2>;
+
+ static shared_ptr make(
+ SystemImpl& system_impl,
+ const mavlink_gimbal_manager_information_t& information,
+ uint8_t gimbal_manager_sysid,
+ uint8_t gimbal_manager_compid);
+
GimbalProtocolV2(
+ Protected,
SystemImpl& system_impl,
const mavlink_gimbal_manager_information_t& information,
uint8_t gimbal_manager_sysid, |
Hmm, I see, there is more to this... |
@keith-bennett-airmap can you still reproduce your issue after #2208? I still wonder if that isn't enough. |
As stated at the top of this:
We upgraded to c263fde on January 16. Previously, this crash would occur as rarely as once every two or three days and sometimes as often as multiple times in a single day. I checked the logs today and do not see this crash since January 16. So it looks like it might be resolved. But I'm not sure without a regression test to prove it. I saw that #2208 didn't include a regression test. |
Right, that's good to hear but you're right. We could try to create a system test that excites it that we can then run many times over night with address sanitizer enabled. |
tl;dr
This appears to be a timing issue and the application typically crashes at intermittent intervals. Shared or overused compute clusters can cause delays to network traffic (especially with a cloud-based simulation). Without directly observing a packet capture, this is a proposed sequence of events to reproduce in a regression test:
Description
This is observed with mavsdk 7fe1b6083545a0760dc306388ea946ded78316da when compiled with Address Sanitizer.
mavsdk library is used by a DroneUp application, the
mavlink_shim
and the whole application is built in a Docker-based workflow. We see a heap-use-after-free error which we believe is related to mavsdk and UDP network connectivity. The issue's reproducibility is intermittent -- we don't see it frequently nor for every heartbeats timed out event. Debugging an intermittent issue is difficult 🤣We observe these events:
mavlink_client
object is using mavsdk. It's configured to use a udp connection and has successfully connected.The crash provides the following stack trace information.
Address Sanitizer Defect Report
Understanding the Stack Trace
The crash here:
is running in thread
T17
, which was created by mavsdk when we asked it to open a udp connection:Therefore,
T17
is mavsdk's internal socket thread.The object was freed by thread
T10
while processing the heartbeat-timeout event:That thread was created by us when we created the
mavsdk
object library:Therefore,
T10
is mavsdk's event thread.Summary:
GimbalProtocolBase
object during continued notifications of the disconnected status.heartbeats_timed_out()
installed as a timeout handler to hereheartbeats_timed_out()
callsset_disconnected()
hereset_disconnected()
notifies us (so we see a log message first) and then it disables all of the plugins hereGimbalImpl
's implementation ofdisable()
just resets theunique_ptr
, which is an unsynchronized delete ofGimbalProtocolV2
.GimbalProtocolBase
object after it had been freed, and crashed the application.GimbalProtocolV2
and crashes inprocess_gimbal_manager_status()
here.this
here for the callback.UDP, being connectionless, still processes received datagrams even though the event thread is handling a disconnect event and deletes things that the socket thread needs for continued processing.
The text was updated successfully, but these errors were encountered: