Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LATEREXMIT now uses LastACK timer, instead of connection expiration timer #741

Merged
merged 3 commits into from
Jul 22, 2019

Conversation

maxsharabayko
Copy link
Collaborator

As reported by a user, there is an issue in the file mode. When transfering a big file via SRT file mode, the connection may be idle forever before all data have been transferred.

In the method CUDT::checkTimers(), for the file mode, the loss packets retransmission will be delayed by the time calculated by the following:
exp_int_tk = (m_iEXPCount * (m_iRTT + 4 * m_iRTTVar) + COMM_SYN_INTERVAL_US) * m_ullCPUFrequency;

The loss packets will be retransmitted at the time:
next_exp_time_tk = m_ullLastRspTime_tk + exp_int_tk;

This time will never be reached if exp_int_tk > (COMM_KEEPALIVE_PERIOD_US * m_ullCPUFrequency). Because m_ullLastRspTime_tk may be updated by keepalive packets every COMM_KEEPALIVE_PERIOD_US microseconds.

Further investigation showed, LATEREXMIT packet retransmission logic relies on connection expiration timer (m_ullLastRspTime_tk), instead of last ACK timer (m_ullLastRspAckTime_tk). And the connection expiration timer is updated with every incoming KEEPALIVE packet, therefore under certain conditions the retransmission will never happen and the connection remains idle.

This PR fixes the above issue by moving the LATEREXMIT logic together with FASTREXMIT into a separate function checkRexmitTimer, that shares the same timeout value for retransmission.

Expiration timer used by LATE_REXMIT:

uint64_t exp_int_tk =
    (m_iEXPCount * (m_iRTT + 4 * m_iRTTVar) + COMM_SYN_INTERVAL_US) * m_ullCPUFrequency;
if (exp_int_tk < m_iEXPCount * m_ullMinExpInt_tk)
    exp_int_tk = m_iEXPCount * m_ullMinExpInt_tk;
next_exp_time_tk = m_ullLastRspTime_tk + exp_int_tk;

New time value

uint64_t rtt_syn = (m_iRTT + 4 * m_iRTTVar + 2 * COMM_SYN_INTERVAL_US);
uint64_t exp_int = (m_iReXmitCount * rtt_syn + COMM_SYN_INTERVAL_US) * m_ullCPUFrequency;

next_exp_time_tk = m_ullLastRspAckTime_tk + exp_int;

⚠️
This change basically makes the logic of FAST_REXMIT and LATE_REXMIT almost similar, except for FAST_REXMIT will not work if the receiver is expected to send NAK reports (m_bPeerNakReport).

Also might be worth adding m_ullMinNakInt_tk as the minimum time before retransmission.

@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Jun 27, 2019
@maxsharabayko maxsharabayko added this to the v.1.3.3 milestone Jun 27, 2019
@maxsharabayko
Copy link
Collaborator Author

The LATEREXMIT is the default blind retransmission mode of UDT.

UDT Draft Section 4. Timers.

EXP is used to trigger data packets retransmission and maintain
connection status. Its period is dynamically updated to N * (4 * RTT + RTTVar + SYN),
where N is the number of continuous timeouts. To
avoid unnecessary timeout, a minimum threshold (e.g., 0.5 second)
should be used in the implementation.

UDT logic for the connection expiration is the following.
Each time EXP timer is triggered, resend unacknowledged packets currently in the sender's buffer if there are some, otherwise send KEEPALIVE message.
In SRT a KEEPALIVE is sent if current time exceeds m_ullLastSndTime_tk + COMM_KEEPALIVE_PERIOD_US.

Note. m_ullLastSndTime_tk is not protected by a mutex.

@maxsharabayko
Copy link
Collaborator Author

Another option is to keep EXP event triggering LATEREXMIT, and to additionally trigger a LATEREXMIT upon receiving a KEEPALIVE message if
tKA - LastRspAck > RTT + 4 × RTTVar + SYN

LastRspAck time is updated when SRT socket is created, when a connection is established (since PR #745), an ACK packet for yet unacknowledged packet is received, and also if the sender's buffer has no unacknowledged packets on the new data submitted by srt_sendmsg().

In case the sending is resumed after an idle period, a keepalive message might be already underway from the receiver to the sender, while sender sends a new DATA packet. After sending a DATA packet, LastRspAck is updated by the sender (because there were no data in the sender's buffer), and thus we should expect some response after ~RTT period. It a keepalive message is received after (RTT + 4 × RTTVar + SYN), then it is likely the DATA packet was lost during transmission and LATEREXMIT should be triggered.

@maxsharabayko maxsharabayko marked this pull request as ready for review July 5, 2019 08:09
srtcore/core.cpp Outdated Show resolved Hide resolved
@maxsharabayko
Copy link
Collaborator Author

maxsharabayko commented Jul 19, 2019

Validation tests to do.

  1. Send 50 packets, force loosing last 10 packets, check the retransmission is triggered.
    1.1. File mode NAK report ON - LATEREXMIT should trigger retransmission
    1.2. File mode NAK report OFF - LATEREXMIT should trigger retransmission
    1.3. Live mode NAK report ON - should hang up (no retransmission will happen)
    1.4. Live mode NAK report OFF - should retransmit
  2. Start live streaming. Turn off loss reports on the receiver. Turn off NAK report. Set latency to 4 * RTT. Sender should detect losses after RTT + 4 * RTTVar and retransmit unacknowledged packets.

@maxsharabayko
Copy link
Collaborator Author

The hardcoded changes to force loosing certain packets.

  • Forcing initial sequence number to be 2048.
  • Sender looses packets with sequence numbers in range [2098; 2148] unless retransmission flag is set.
diff --git a/srtcore/api.cpp b/srtcore/api.cpp
index 87706be..3341a8a 100644
--- a/srtcore/api.cpp
+++ b/srtcore/api.cpp
@@ -841,6 +841,8 @@ int CUDTUnited::connect(const SRTSOCKET u, const sockaddr* name, int namelen, in
    // (CONNECTED vs. CONNECTING).
    s->m_Status = SRTS_CONNECTING;
 
+   forced_isn = 2048;
+
    /* 
    * In blocking mode, connect can block for up to 30 seconds for
    * rendez-vous mode. Holding the s->m_ControlLock prevent close
diff --git a/srtcore/queue.cpp b/srtcore/queue.cpp
index b9509b9..74543e2 100644
--- a/srtcore/queue.cpp
+++ b/srtcore/queue.cpp
@@ -607,7 +607,9 @@ void* CSndQueue::worker(void* param)
-            self->m_pChannel->sendto(addr, pkt);
+
+            if (pkt.getRexmitFlag() || (pkt.getSeqNo() < 2048 + 50 || pkt.getSeqNo() > 2048 + 100))
+                self->m_pChannel->sendto(addr, pkt);
 
 #if      defined(SRT_DEBUG_SNDQ_HIGHRATE)
             self->m_WorkerStats.lSendTo++;

@maxsharabayko
Copy link
Collaborator Author

maxsharabayko commented Jul 19, 2019

Retransmission tests

1.1 File mode NAK report ON ✔️

RTT=90ms RTTVar=17ms ExpInt=187ms
LATEREXMIT by ACK-TMOUT happened after:
310 ms since 50-th DATA packet
210 ms since ACK of the 50th packet

1.2 File mode NAK report OFF ✔️

RTT=94ms RTTVar=13ms ExpInt=178ms
LATEREXMIT by ACK-TMOUT happened after:
310 ms since 50-th DATA packet
209 ms since ACK of the 50th packet

1.3. Live NAK report ON ✔️

Retransmission does not happen.

1.4. Live NAK report OFF ✔️

Link properies: RTT=98443, RTTVar=31238, ExpInt=253395.
FASTREXMIT by ACK-TMOUT happened after:
412 ms since 50-th DATA packet
310 ms since ACK of the 50th packet

@rndi rndi merged commit 759ae8d into Haivision:master Jul 22, 2019
@maxsharabayko maxsharabayko deleted the hotfix/laterexmit branch August 20, 2019 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants