Floating point exception in slowdownSndPeriod() with Ubuntu NetEm tool #887

Closed
stevenwff opened this issue Sep 26, 2019 · 4 comments · Fixed by #888
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior

@stevenwff

I'm testing the performance of srt-file-transmit. When I used tc and netem to emulate an environment with 1% packet loss, srt-file-transmit crashed with a "Floating point exception".

Related Info:
SRT version : v1.4.0, Commit fe2858d
OS : Client - Ubuntu 18.04 64bits, Server - Ubuntu 16.04 64bits
Network : 1Gbps Lan
NetEm command : tc qdisc add dev eth0 root netem loss 1%
srt command : ./srt-file-transmit -s:1000 -pf:csv -statsout:statsnd.csv -logfile:log.txt -logfa:general file:///home/lzx/srt/4K2013.mp4 srt://10.48.40.251:7708

I used the netem command to set up the network and then used the srt command to send a file of about 200 MB. Swapping the client and server made no difference.

The exception was located in "FileCC::slowdownSndPeriod", Line 497:

      const int pktsInFlight = m_parent->RTT() / m_dPktSndPeriod;
      const int numPktsLost = m_parent->sndLossLength();
      const int lost_pcent_x10 = (numPktsLost * 1000) / pktsInFlight;

The code assumes pktsInFlight is never 0, but in this case m_parent->RTT() kept decreasing while m_dPktSndPeriod kept increasing. Once m_parent->RTT() < m_dPktSndPeriod, the integer division truncated pktsInFlight to 0 and the FPE occurred.
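To make the arithmetic concrete, here is a minimal standalone sketch, not the SRT code itself. The helper names `packetsInFlight` and `lostPcentX10` are hypothetical stand-ins for the expressions quoted above. Clamping the divisor to 1 is only one possible guard (returning 0 when nothing is in flight is another), and it is not necessarily what the library does. On x86 Linux an integer division by zero raises SIGFPE, which the shell reports as "Floating point exception" even though no floating-point operation faults.

```cpp
#include <algorithm>

// Hypothetical stand-in for `m_parent->RTT() / m_dPktSndPeriod`.
// Both values are in microseconds; the double result is truncated
// toward zero when converted to int, so RTT < period yields 0.
int packetsInFlight(int rtt_us, double pkt_snd_period_us)
{
    return static_cast<int>(rtt_us / pkt_snd_period_us);
}

// Hypothetical guarded version of the loss-ratio computation
// (loss in tenths of a percent). Assumption: treat "fewer than one
// packet in flight" as exactly one packet so the divisor is never 0.
int lostPcentX10(int num_pkts_lost, int pkts_in_flight)
{
    const int divisor = std::max(pkts_in_flight, 1);
    return (num_pkts_lost * 1000) / divisor;
}
```

With the values from the last log entry (RTT 342, PSP 347.0), `packetsInFlight` returns 0, which is the divisor that crashes the unguarded expression; `lostPcentX10(1, 0)` returns 1000 under the clamping assumption instead of faulting.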

I inserted some test code to show the process. RTT stands for m_parent->RTT(), PSP for m_dPktSndPeriod, and "Loss size" for losslist_size.

Target connected (caller)
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 1.000000 
RTT: 95733, PSP: 11.768721 
pktsInFlight: 8134 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 11.768721 
RTT: 95733, PSP: 11.768721 
pktsInFlight: 8134 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 11.768721 
RTT: 95733, PSP: 11.768721 
pktsInFlight: 8134 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 11.768721 
RTT: 95733, PSP: 11.768721 
pktsInFlight: 8134 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 11.768721 
RTT: 95733, PSP: 11.768721 
pktsInFlight: 8134 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 11.768721 
RTT: 92166, PSP: 11.768721 
pktsInFlight: 7831 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 11.768721 
RTT: 92166, PSP: 11.768721 
pktsInFlight: 7831 
m_dPktSndPeriod KEPT

.............................................
.............................................

#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 283.000000 
RTT: 364, PSP: 283.000000 
pktsInFlight: 1 
m_dPktSndPeriod AFTER: 292.000000 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 291.914761 
RTT: 362, PSP: 291.914761 
pktsInFlight: 1 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 291.065099 
RTT: 359, PSP: 291.065099 
pktsInFlight: 1 
m_dPktSndPeriod AFTER: 300.000000 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 300.000000 
RTT: 357, PSP: 300.000000 
pktsInFlight: 1 
m_dPktSndPeriod AFTER: 309.000000 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 309.000000 
RTT: 353, PSP: 309.000000 
pktsInFlight: 1 
m_dPktSndPeriod KEPT 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 307.102109 
RTT: 344, PSP: 307.102109 
pktsInFlight: 1 
m_dPktSndPeriod AFTER: 317.000000 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 316.899543 
RTT: 338, PSP: 316.899543 
pktsInFlight: 1 
m_dPktSndPeriod AFTER: 327.000000 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 327.000000 
RTT: 338, PSP: 327.000000 
pktsInFlight: 1 
m_dPktSndPeriod AFTER: 337.000000 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 336.773015 
RTT: 342, PSP: 336.773015 
pktsInFlight: 1 
m_dPktSndPeriod AFTER: 347.000000 
#1 =============Running..........
Loss size: 1
m_dPktSndPeriod BEFORE: 347.000000 
RTT: 342, PSP: 347.000000 
pktsInFlight: 0 
Floating point exception (core dumped)

It looks a little weird, because when I tried the same srt command on a real network with packet loss, everything seemed all right. I tried to get the log file, but only a single line of connection information was logged.

Any idea why the problem happened? Thanks!

The stats file is attached.

statsnd.zip

@ethouris
Collaborator

I smell division by 0. It might even be that this pktsInFlight was the divisor. I don't know where the logs above come from, but if this pktsInFlight is the one defined at congctl.cpp:495, this line would definitely do it in this case:

        const int lost_pcent_x10 = (numPktsLost * 1000) / pktsInFlight;

I'll fix it.

@ethouris
Collaborator

Fixed by #888

@stevenwff
Author

I modified slowdownSndPeriod() to generate the logs above and see why pktsInFlight kept decreasing to zero, and it is exactly the pktsInFlight defined in congctl.cpp:495.
If it doesn't affect performance, #888 surely fixes it.
Thanks for the quick response!

@ethouris
Collaborator

Thanks; I wouldn't have had an environment for testing it quickly, I was just 99.9% sure that this would fix it. So thanks for making it 100% certain.

@maxsharabayko maxsharabayko added this to the v1.4.1 milestone Oct 4, 2019
@maxsharabayko maxsharabayko added [core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior labels Oct 4, 2019