-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deadlock when calling get_unread_count() and destroying subscription #646
Comments
ros2/rmw_fastrtps#646 Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
I cannot reproduce the deadlock from sample code, but core crash as following. stacktrace with debug build[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/root/ros2_ws/colcon_ws/install/prover_rclcpp/lib/prover_rclcpp/rmw_fastrtps_64'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fff50c71950 in ?? ()
[Current thread is 1 (Thread 0x7f2c32304640 (LWP 37624))]
(gdb) bt
#0 0x00007fff50c71950 in ?? ()
#1 0x000055df7a3cc575 in std::function<void (unsigned long)>::operator()(unsigned long) const (this=0x55df7bc2fda0, __args#0=1)
at /usr/include/c++/11/bits/std_function.h:590
#2 0x000055df7a3c852c in rclcpp::SubscriptionBase::set_on_new_message_callback(std::function<void (unsigned long)>)::{lambda(unsigned long)#1}::operator()(unsigned long) const (__closure=0x55df7bc2fda0, number_of_messages=1) at /root/ros2_ws/colcon_ws/install/rclcpp/include/rclcpp/rclcpp/subscription_base.hpp:328
#3 0x000055df7a3dda9a in std::__invoke_impl<void, rclcpp::SubscriptionBase::set_on_new_message_callback(std::function<void (unsigned long)>)::{lambda(unsigned long)#1}&, unsigned long>(std::__invoke_other, rclcpp::SubscriptionBase::set_on_new_message_callback(std::function<void (unsigned long)>)::{lambda(unsigned long)#1}&, unsigned long&&) (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#4 0x000055df7a3d7741 in std::__invoke_r<void, rclcpp::SubscriptionBase::set_on_new_message_callback(std::function<void (unsigned long)>)::{lambda(unsigned long)#1}&, unsigned long>(rclcpp::SubscriptionBase::set_on_new_message_callback(std::function<void (unsigned long)>)::{lambda(unsigned long)#1}&, unsigned long&&) (__fn=...)
at /usr/include/c++/11/bits/invoke.h:111
#5 0x000055df7a3d378a in std::_Function_handler<void (unsigned long), rclcpp::SubscriptionBase::set_on_new_message_callback(std::function<void (unsigned long)>)::{lambda(unsigned long)#1}>::_M_invoke(std::_Any_data const&, unsigned long&&) (__functor=..., __args#0=@0x7f2c32301ef0: 1) at /usr/include/c++/11/bits/std_function.h:290
#6 0x000055df7a3cc575 in std::function<void (unsigned long)>::operator()(unsigned long) const (this=0x55df7bc3cbe8, __args#0=1)
at /usr/include/c++/11/bits/std_function.h:590
#7 0x000055df7a3cc871 in rclcpp::detail::cpp_callback_trampoline<void const*, unsigned long, void> (user_data=0x55df7bc3cbe8)
at /root/ros2_ws/colcon_ws/install/rclcpp/include/rclcpp/rclcpp/detail/cpp_callback_trampoline.hpp:61
#8 0x00007f2c399a3dd5 in SubListener::on_data_available(eprosima::fastdds::dds::DataReader*) ()
from /root/ros2_ws/colcon_ws/install/rmw_fastrtps_shared_cpp/lib/librmw_fastrtps_shared_cpp.so
#9 0x00007f2c38fe0961 in eprosima::fastdds::dds::DataReaderImpl::InnerDataReaderListener::on_data_available (this=0x55df7bc4ad28, writer_guid=..., first_sequence=...,
last_sequence=..., should_notify_individual_changes=@0x7f2c3230204e: false)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/DataReaderImpl.cpp:904
#10 0x00007f2c38ec6a22 in eprosima::fastrtps::rtps::StatefulReader::NotifyChanges (this=0x55df7bc470d0, prox=0x55df7bc4faa0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/reader/StatefulReader.cpp:1168
#11 0x00007f2c38ec666b in eprosima::fastrtps::rtps::StatefulReader::change_received (this=0x55df7bc470d0, a_change=0x55df7bc2d8a0, prox=0x55df7bc4faa0,
unknown_missing_changes_up_to=0) at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/reader/StatefulReader.cpp:1099
#12 0x00007f2c38ec4bf8 in eprosima::fastrtps::rtps::StatefulReader::processDataMsg (this=0x55df7bc470d0, change=0x55df7bc0a730)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/reader/StatefulReader.cpp:597
#13 0x00007f2c38e0a429 in eprosima::fastrtps::rtps::StatefulWriter::intraprocess_delivery (this=0x55df7bc12ca0, change=0x55df7bc0a730, reader_proxy=0x55df7bc4e940)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/writer/StatefulWriter.cpp:473
#14 0x00007f2c38e0aef5 in eprosima::fastrtps::rtps::StatefulWriter::deliver_sample_to_intraprocesses (this=0x55df7bc12ca0, change=0x55df7bc0a730)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/writer/StatefulWriter.cpp:625
#15 0x00007f2c38e111dd in eprosima::fastrtps::rtps::StatefulWriter::deliver_sample_nts (this=0x55df7bc12ca0, cache_change=0x55df7bc0a730, group=...,
locator_selector=..., max_blocking_time=...) at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/writer/StatefulWriter.cpp:2070
#16 0x00007f2c39299730 in eprosima::fastdds::rtps::FlowControllerImpl<eprosima::fastdds::rtps::FlowControllerSyncPublishMode, eprosima::fastdds::rtps::FlowControllerFifoSchedule>::add_new_sample_impl<eprosima::fastdds::rtps::FlowControllerSyncPublishMode> (this=0x55df7bab61a0, writer=0x55df7bc12ca0, change=0x55df7bc0a730,
max_blocking_time=...) at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/flowcontrol/FlowControllerImpl.hpp:1151
#17 0x00007f2c392968c6 in eprosima::fastdds::rtps::FlowControllerImpl<eprosima::fastdds::rtps::FlowControllerSyncPublishMode, eprosima::fastdds::rtps::FlowControllerFifoSchedule>::add_new_sample (this=0x55df7bab61a0, writer=0x55df7bc12ca0, change=0x55df7bc0a730, max_blocking_time=...)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/flowcontrol/FlowControllerImpl.hpp:998
#18 0x00007f2c38e0a168 in eprosima::fastrtps::rtps::StatefulWriter::unsent_change_added_to_history (this=0x55df7bc12ca0, change=0x55df7bc0a730, max_blocking_time=...)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/writer/StatefulWriter.cpp:434
#19 0x00007f2c38e485c2 in eprosima::fastrtps::rtps::WriterHistory::notify_writer (this=0x55df7bc0ff20, a_change=0x55df7bc0a730, max_blocking_time=...)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/rtps/history/WriterHistory.cpp:123
--Type <RET> for more, q to quit, c to continue without paging--
#20 0x00007f2c38f8d581 in eprosima::fastrtps::rtps::WriterHistory::add_change_with_commit_hook<eprosima::fastdds::dds::DataWriterImpl::perform_create_new_change(eprosima::fastrtps::rtps::ChangeKind_t, void*, eprosima::fastrtps::rtps::WriteParams&, const InstanceHandle_t&)::<lambda(eprosima::fastdds::dds::DataWriterImpl::CacheChange_t&)> >(eprosima::fastrtps::rtps::CacheChange_t *, eprosima::fastrtps::rtps::WriteParams &, struct {...}, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1, 1000000000> > >) (this=0x55df7bc0ff20, a_change=0x55df7bc0a730, wparams=..., pre_commit=..., max_blocking_time=...)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/include/fastdds/rtps/history/WriterHistory.h:183
#21 0x00007f2c38f8d1c3 in eprosima::fastdds::dds::DataWriterHistory::add_pub_change_with_commit_hook<eprosima::fastdds::dds::DataWriterImpl::perform_create_new_change(eprosima::fastrtps::rtps::ChangeKind_t, void*, eprosima::fastrtps::rtps::WriteParams&, const InstanceHandle_t&)::<lambda(eprosima::fastdds::dds::DataWriterImpl::CacheChange_t&)> >(eprosima::fastrtps::rtps::CacheChange_t *, eprosima::fastrtps::rtps::WriteParams &, struct {...}, std::unique_lock<std::recursive_timed_mutex> &, const std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1, 1000000000> > > &) (this=0x55df7bc0ff20, change=0x55df7bc0a730, wparams=...,
pre_commit=..., lock=..., max_blocking_time=...) at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/publisher/DataWriterHistory.hpp:139
#22 0x00007f2c38f86e70 in eprosima::fastdds::dds::DataWriterImpl::perform_create_new_change (this=0x55df7bc0fa70, change_kind=eprosima::fastrtps::rtps::ALIVE,
data=0x7f2c32302da0, wparams=..., handle=...) at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/publisher/DataWriterImpl.cpp:947
#23 0x00007f2c38f87470 in eprosima::fastdds::dds::DataWriterImpl::create_new_change_with_params (this=0x55df7bc0fa70, changeKind=eprosima::fastrtps::rtps::ALIVE,
data=0x7f2c32302da0, wparams=...) at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/publisher/DataWriterImpl.cpp:1021
#24 0x00007f2c38f86709 in eprosima::fastdds::dds::DataWriterImpl::create_new_change (this=0x55df7bc0fa70, changeKind=eprosima::fastrtps::rtps::ALIVE,
data=0x7f2c32302da0) at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/publisher/DataWriterImpl.cpp:868
#25 0x00007f2c38f84fe4 in eprosima::fastdds::dds::DataWriterImpl::write (this=0x55df7bc0fa70, data=0x7f2c32302da0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/publisher/DataWriterImpl.cpp:548
#26 0x00007f2c38f7ece1 in eprosima::fastdds::dds::DataWriter::write (this=0x55df7bc096d0, data=0x7f2c32302da0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/publisher/DataWriter.cpp:84
#27 0x00007f2c399ed445 in rmw_fastrtps_shared_cpp::__rmw_publish(char const*, rmw_publisher_s const*, void const*, rmw_publisher_allocation_s*) ()
from /root/ros2_ws/colcon_ws/install/rmw_fastrtps_shared_cpp/lib/librmw_fastrtps_shared_cpp.so
#28 0x00007f2c39ac6213 in rmw_publish (publisher=0x55df7bc06950, ros_message=0x7f2c32303310, allocation=0x0)
at /root/ros2_ws/colcon_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/src/rmw_publish.cpp:34
#29 0x00007f2c3a291f77 in rcl_publish (publisher=0x55df7bc06690, ros_message=0x7f2c32303310, allocation=0x0)
at /root/ros2_ws/colcon_ws/src/ros2/rcl/rcl/src/rcl/publisher.c:249
#30 0x000055df7a3d1ab6 in rclcpp::Publisher<std_msgs::msg::String_<std::allocator<void> >, std::allocator<void> >::do_inter_process_publish (this=0x55df7ba4b730,
msg=...) at /root/ros2_ws/colcon_ws/install/rclcpp/include/rclcpp/rclcpp/publisher.hpp:453
#31 0x000055df7a3ce850 in rclcpp::Publisher<std_msgs::msg::String_<std::allocator<void> >, std::allocator<void> >::publish<std_msgs::msg::String_<std::allocator<void> > > (this=0x55df7ba4b730, msg=...) at /root/ros2_ws/colcon_ws/install/rclcpp/include/rclcpp/rclcpp/publisher.hpp:297
#32 0x000055df7a3caf3d in MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}::operator()() const (__closure=0x55df7bc141a0)
at /root/ros2_ws/colcon_ws/src/ros2/ros2_test_prover/prover_rclcpp/src/rmw_fastrtps_646.cpp:18
#33 0x000055df7a435b80 in rclcpp::GenericTimer<MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}, (void*)0>::execute_callback_delegate<MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}, (void*)0>() (this=0x55df7bc14170) at /root/ros2_ws/colcon_ws/install/rclcpp/include/rclcpp/rclcpp/timer.hpp:244
#34 0x000055df7a432e87 in rclcpp::GenericTimer<MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}, (void*)0>::execute_callback() (
this=0x55df7bc14170) at /root/ros2_ws/colcon_ws/install/rclcpp/include/rclcpp/rclcpp/timer.hpp:230
#35 0x00007f2c3acbdb95 in rclcpp::Executor::execute_timer (timer=warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<rclcpp::WallTimer<MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}, (void*)0>, std::allocator<rclcpp::WallTimer<MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}, (void*)0> >, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr_inplace<rclcpp::WallTimer<MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}, (void*)0>, std::allocator<rclcpp::WallTimer<MinimalPublisher::MinimalPublisher(rclcpp::NodeOptions)::{lambda()#1}, (void*)0> >, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr<rclcpp::TimerBase> (use count 3, weak count 1) = {...})
at /root/ros2_ws/colcon_ws/src/ros2/rclcpp/rclcpp/src/rclcpp/executor.cpp:656
#36 0x00007f2c3acbc5c7 in rclcpp::Executor::execute_any_executable (this=0x55df7bbe4a50, any_exec=...)
at /root/ros2_ws/colcon_ws/src/ros2/rclcpp/rclcpp/src/rclcpp/executor.cpp:518
#37 0x00007f2c3accd8a1 in rclcpp::executors::SingleThreadedExecutor::spin (this=0x55df7bbe4a50)
at /root/ros2_ws/colcon_ws/src/ros2/rclcpp/rclcpp/src/rclcpp/executors/single_threaded_executor.cpp:37
#38 0x000055df7a3c4d2e in operator() (__closure=0x55df7bc144c8) at /root/ros2_ws/colcon_ws/src/ros2/ros2_test_prover/prover_rclcpp/src/rmw_fastrtps_646.cpp:63
#39 0x000055df7a3c5558 in std::__invoke_impl<void, main(int, char**)::<lambda()> >(std::__invoke_other, struct {...} &&) (__f=...)
--Type <RET> for more, q to quit, c to continue without paging--
at /usr/include/c++/11/bits/invoke.h:61
#40 0x000055df7a3c551b in std::__invoke<main(int, char**)::<lambda()> >(struct {...} &&) (__fn=...) at /usr/include/c++/11/bits/invoke.h:96
#41 0x000055df7a3c54c8 in std::thread::_Invoker<std::tuple<main(int, char**)::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x55df7bc144c8)
at /usr/include/c++/11/bits/std_thread.h:253
#42 0x000055df7a3c549c in std::thread::_Invoker<std::tuple<main(int, char**)::<lambda()> > >::operator()(void) (this=0x55df7bc144c8)
at /usr/include/c++/11/bits/std_thread.h:260
#43 0x000055df7a3c5480 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<main(int, char**)::<lambda()> > > >::_M_run(void) (this=0x55df7bc144c0)
at /usr/include/c++/11/bits/std_thread.h:211
#44 0x00007f2c3a0d92b3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#45 0x00007f2c39e49b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#46 0x00007f2c39edba00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 |
Hi @fujitatomoya, yes I forgot to mention. This example I created to reproduce easily the error you just got. The fix is ros2/rclcpp#2024 . |
@mauropasse Please see #657 for a possible fix of the deadlock. |
Hi, I tested it today and it fixes the issue! |
I've identified a deadlock when a subscription (having assigned an
on_new_message_callback
) is being destroyed, while is also getting a message from a publisher. I'm using the latest master branch of FastDDS.Attached, an example to reproduce the issue and the resulting cleaned backtrace.
FastDDS-example-and-backtrace.zip
You can run the example with GDB, wait until it deadlocks, and
ctrl+c
to check where happened.The issue seems to be in
rmw_fastrtps/rmw_fastrtps_shared_cpp/include/rmw_fastrtps_shared_cpp/custom_subscriber_info.hpp
Line 166 in ccea7a3
According to @jefferyyjhsu, one mutex belongs to
rmw_fastrtps
while the other tofast-dds
.This issue happens since this PR got merged: https://github.com/ros2/rmw_fastrtps/pull/633/files#diff-26246d1a82280de3805240670a0e83425979cd7598ac0eaad21db840ebab7d7dR118-R122
Besides this, I can also see that in
custom_service_info.hpp
, there is no check about ifon_new_request_cb_
exists before calling it:rmw_fastrtps/rmw_fastrtps_shared_cpp/include/rmw_fastrtps_shared_cpp/custom_service_info.hpp
Lines 209 to 220 in ccea7a3
Let me know if you need any more details.
The text was updated successfully, but these errors were encountered: