Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Fixed CRcvBuffer::getAvailSize() may jump around. #2490

Merged
merged 2 commits into from
Nov 8, 2022

Conversation

gou4shi1
Copy link
Contributor

@codecov-commenter

This comment was marked as off-topic.

@gou4shi1
Copy link
Contributor Author

gou4shi1 commented Oct 18, 2022

Another related issue:

17:21:33.710592/SRT:TsbPd!W:SRT.br: @540757540:RCV-DROPPED 8 packet(s). Packet seqno %1121649207 delayed for 0.061 ms
17:29:41.975790/SRT:RcvQ:w1*E:SRT.qr: @540757540:SEQUENCE DISCREPANCY. BREAKING CONNECTION. seq=1121707496 buffer=(1121699532:1121707476+1121707722), 7719 past max. Reception no longer possible. REQUESTING TO CLOSE.
17:29:42.618912/SRT:GC.N:SRT.sm: @540757540 IS MEMBER OF $1614499363 - REMOVING FROM GROUP

1121707496(imcoming_seqno) - 1121699532(m_iRcvLastAck) = 7964 < 8191(capacity) and the rcv_buf is empty, but getAvailSize() == 246

@gou4shi1
Copy link
Contributor Author

any comment?

@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Nov 1, 2022
@maxsharabayko maxsharabayko added this to the v1.6.0 milestone Nov 1, 2022
@maxsharabayko
Copy link
Collaborator

I think the data[ACKD_BUFFERLEFT] = (int) getAvailRcvBufferSizeNoLock(); in sendCtrlAck() would not suffer from this issue, because it's after ackDataUpTo(). It ensures m_iRcvLastSkipAck == m_iRcvLastAck >= rcv_buf.m_iStartSeqNo, so I think we can safely change that code block to

I would mention this in a comment near

data[ACKD_BUFFERLEFT] = (int) getAvailRcvBufferSizeNoLock();

@maxsharabayko
Copy link
Collaborator

Might also make sense to add an assertion 🤔

SRT_ASSERT(m_iRcvLastSkipAck == m_iRcvLastAck);

@gou4shi1
Copy link
Contributor Author

gou4shi1 commented Nov 8, 2022

Might also make sense to add an assertion

done

@maxsharabayko maxsharabayko merged commit b76c8b2 into Haivision:master Nov 8, 2022
@gou4shi1 gou4shi1 deleted the fix-getAvailSize branch November 8, 2022 14:09
@maxsharabayko
Copy link
Collaborator

And suddenly I've got this during a long-term test with an artificial 3% packet loss.
A debugger didn't catch up, sadly I don't know the full state and actual values.

16:57:07.036350 [W] [METRICS] Detected loss of 1 packets (seqno [1190757; 1190758))
Assertion failed: m_iRcvLastSkipAck == m_iRcvLastAck, file srtcore\core.cpp, line 8003

@gou4shi1
Copy link
Contributor Author

gou4shi1 commented Nov 12, 2022

Oops, it may caused by this un_bufflock

        ackDataUpTo(ack);
#if ENABLE_BONDING
        const int32_t group_read_seq = m_pRcvBuffer->getFirstReadablePacketInfo(steady_clock::now()).seqno;
#endif
        InvertedLock un_bufflock (m_RcvBufferLock);

@gou4shi1
Copy link
Contributor Author

If https://github.com/Haivision/srt/pull/2530/files#r1020280948 was applied, I'm OK to revert this PR.

@gou4shi1
Copy link
Contributor Author

But it's strange to return capacity() - CSeqNo::seqlen(iFirstUnackSeqNo, iRBufSeqNo) + 1 to the sender if the buffer was dropped/read during the un_bufflock. The sender should suppose [ACK, ACK+ACKD_BUFFERLEFT] is ready to accept packets. And the received ACKD_BUFFERLEFT may jump around without this PR.

@ethouris
Copy link
Collaborator

The value that should be reported back to the sender should be the number of packet cells in the receiver buffer that can be still filled in. Not sure whether the retransmission ignores this value or not, the intention is to make the sender stop sending if there's no storage on the receiver side that could store it. This means that a retransmission should be always allowed, while only new packets could be held back, but then the value reported should be the distance between the last received seequence and the first sequence taken out from capacity. So you should do simply:

int occupied = CSeqNo::seqoff(m_pRcvBuffer->getStartSeqNo, m_iRcvCurrPhySeqNo);
bufferleft = m_pRcvBuffer->capacity() - occupied; // with check if less than 0

The value of occupied should not be negative (guaranteed) and should not exceed capacity (should never happen, but might be on buffer overflow, in the moment when the link should break, so it still should be checked). I don't know exactly what it had to mean in UDT that it was using the last ACK-ed packet, but this makes little sense - the buffer remaining capacity is an information important for sending a new packet, but for a retransmission there should always be reserved place in the buffer, as well as the range between the last received and the last ACK-ed doesn't belong to the place available for new incoming packets whatsoever, hence it shouldn't be included in this number.

Note that there will be little significance of the border values here in live mode, but in file mode this functionality is crucial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants