-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Access to RTCP member variable participants_ is not thread safe, leading to a crash #172
Comments
Hi. I'm sorry to hear that you have encountered a bug. Are you setting the SDES items by calling send_sdes_packet()-function? From your description, the problem seems to be that new SDES items appear between the time the compound_packet_size is calculated and the packet itself is constructed. There is a mutex called packet_mutex_ which should prevent all modifications to packet data while they are being sent, so I'm unable to figure out the cause for this error at this time. BR, Joni |
No, I am not calling send_sdes_packet() |
Are you recreating or restarting the stream? I was just wondering if there is something wrong with the stopping of the RTCP, which could explain this behavior. BR, Joni |
I am not sure, I have observed a crash after running a test for a day, and after running for a minute. I cannot say whether stream recreation has something to do with it |
@jrsnen I have found the problem. I don't know how to fix it, hopefully you have some ideas uint8_t reports = 0;
for (auto& p : participants_)
{
if (p.second->stats.received_rtp_packet)
{
++reports;
}
} a little later it is used to allocate memory for packet array: uint32_t compound_packet_size = size_of_compound_packet(reports, sr_packet, rr_packet, sdes_packet, app_packets_size, bye_packet); Later in this function, when the memory is already allocated, we check participant stats again to populate the array: for (auto& p : participants_)
{
// only add report blocks if we have received data from them
if (p.second->stats.received_rtp_packet)
{
uint32_t dropped_packets = p.second->stats.dropped_pkts;
// TODO: Fraction should be the number of packets lost compared to number of packets expected (see fraction lost in RFC 3550)
// see https://datatracker.ietf.org/doc/html/rfc3550#appendix-A.3
//uint8_t fraction = dropped_packets ? p.second->stats.received_bytes / dropped_packets : 0;
uint8_t fraction = 0; // disabled, because it was incorrect
uint64_t diff = (u_long)uvgrtp::clock::hrc::diff_now(p.second->stats.sr_ts);
uint32_t dlrs = (uint32_t)uvgrtp::clock::ms_to_jiffies(diff);
/* calculate delay of last SR only if SR has been received at least once */
if (p.second->stats.lsr == 0)
{
dlrs = 0;
}
construct_report_block(frame, write_ptr, p.first, fraction, dropped_packets,
p.second->stats.cycles, p.second->stats.max_seq, (uint32_t)p.second->stats.jitter,
p.second->stats.lsr, dlrs);
// we only send reports if there is something to report since last report
p.second->stats.received_rtp_packet = false;
}
} The problem is that the stats of participants can be modified from another thread. So, when allocating the array, received_rtp_packet is false, but it can happen so that when we are populating the array received_rtp_packet becomes true, and the array overflows. |
@wowaser thanks, based on your description I think I know what is the problem. The problem seems to be that while modifying the SDES structures is guarded by a mutex, adding participants is not guarded by any mutexes and that leads to this crash. This problem seems to exist also in other functions in RTCP class. This crash and other possible problems would in my option be best solved by adding a new mutex and locking it around every usage of participants_-variable and in generate_report()-function the locked section should be around both uses so that no modification can take place in between. BR, Joni |
@jrsnen Should I add a pull request where I place a couple mutexes around, or do you have more major changes in mind? |
@wowaser A PR would be great! I guess the smart way to do this is to the reverse of what reception does using the same structures, but this does not seem worth the effort at this point. |
Fixed by #188 |
Hello,
I've been trying to debug this for a while now, but can't quite get it. All I can do is share the clues that I have gathered.
The problem is with the compound packet. Shortly before the heap corruption, this function gets weird values, especially header.length gets a ridiculous value like 80000 or something.
After that a random set of errors occurs (version is constant 2, so it being wrong points at memory problems):
Shortly after the whole program crashes.
The problem might be with sent bytes, because I encountered this once (running on localhost):
The text was updated successfully, but these errors were encountered: