Aaron Brown edited this page Mar 31, 2015 · 2 revisions

Data collected by the NDT

Abstract

The Network Diagnostic Tool (NDT) is a client/server program that provides network configuration and performance testing to a user's computer. The measured factors, along with some additional analysis and detailed test traces, are stored by the server for future reuse. Moreover, multi-level results allow both novice and expert users to view and understand the test results. This document describes what data the NDT server stores, where, and when. The information contained here should be enough to support parsing of the collected data for analysis and visualization purposes.

Introduction

The NDT uses the following files to store the test results data:

File | Scope of the collected data | Always used? | File extension
tcpdump trace | C2S/S2C tests | NO (enabled by the -t, --tcpdump options) | *.c2s_ndttrace / *.s2c_ndttrace
web100 snaplog trace | C2S/S2C tests | NO (enabled by the --snaplog option) | *.c2s_snaplog / *.s2c_snaplog
web100srv.log | test session | YES | *.log
cputime trace | test session | NO (enabled by the --cputime option) | *.cputime
meta data file | test session | YES | *.meta

The web100srv.log file is by default written under the /usr/local/ndt/ directory. However, this can be changed by using the -l, --log options.

All other files are saved under the data directory (by default /usr/local/ndt/serverdata/), which can be set by using the -L, --log_dir options. To minimize the number of files in one directory, the test results are further grouped by the year, month and day on which the test started.

To sum up, the result files are saved in the following directories:

/DataDirPath/YYYY/MM/DD/

For example:

/usr/local/ndt/serverdata/2011/08/24/
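
The directory layout above can be sketched in a few lines of Python. This is only an illustration: the helper name day_directory is made up for this example and is not part of the NDT.

```python
from datetime import date
from pathlib import PurePosixPath


def day_directory(data_dir: str, day: date) -> PurePosixPath:
    """Return the directory holding the result files of tests started on `day`.

    `data_dir` is the NDT data directory (the value of -L / --log_dir).
    """
    return (
        PurePosixPath(data_dir)
        / f"{day.year:04d}"
        / f"{day.month:02d}"
        / f"{day.day:02d}"
    )


print(day_directory("/usr/local/ndt/serverdata", date(2011, 8, 24)))
```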

The mentioned files (all except web100srv.log) are saved using the following filename format:

ISOtime_serverFQDN:portNumber.extension

where:

  • ISOtime - current time in the ISO format
  • serverFQDN - server host fully qualified name
  • portNumber - currently used port number (ephemeral ports for the snaplog and tcpdump traces, control session port for the cputime and meta files)
  • extension - extension used to distinguish file types

For example:

20110824T21:11:45.99598000Z_jeremian-laptop.local:33687.c2s_ndttrace
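
A file name in this format can be split back into its parts as sketched below. The names ResultFileName and parse_result_filename are hypothetical. The sketch assumes that the timestamp never contains an underscore and that the port number follows the last colon, which holds for the examples in this document.

```python
from typing import NamedTuple


class ResultFileName(NamedTuple):
    iso_time: str   # timestamp part, kept as text
    host: str       # host name or address as written in the file name
    port: int       # port number
    extension: str  # e.g. "c2s_ndttrace" or "c2s_snaplog.gz"


def parse_result_filename(name: str) -> ResultFileName:
    """Split 'ISOtime_host:port.extension' into its four parts."""
    # Assumption: the timestamp contains no "_", so the first "_" ends it.
    iso_time, sep, rest = name.partition("_")
    if not sep:
        raise ValueError(f"no '_' separator in {name!r}")
    # The timestamp's own colons are gone now; the last ":" precedes the port.
    host, sep, tail = rest.rpartition(":")
    if not sep:
        raise ValueError(f"no ':port' part in {name!r}")
    port, sep, extension = tail.partition(".")
    if not sep or not port.isdigit():
        raise ValueError(f"malformed port or extension in {name!r}")
    return ResultFileName(iso_time, host, int(port), extension)


print(parse_result_filename(
    "20110824T21:11:45.99598000Z_jeremian-laptop.local:33687.c2s_ndttrace"
))
```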

Collected data formats

meta data file

The meta data file is created for every test session. If a particular test has no corresponding meta data file, the likely causes are:

  • the test was abruptly interrupted by an unexpected error
  • there was an error during file creation (for example, lack of disk space)
  • the test was performed by an older NDT version that did not write meta data files

The meta data file contains the names of the other files, the client's and server's IP addresses and FQDNs, and the encoded line of web100/analysis data. It contains lines in the following order:

# Name Description/Comment
1. Date/Time Timestamp of the beginning of the test session
2. C2S snaplog file Filename of the snaplog taken during C2S Throughput test
3. C2S ndttrace file Filename of the tcpdump trace taken during C2S Throughput test
4. S2C snaplog file Filename of the snaplog taken during S2C Throughput test
5. S2C ndttrace file Filename of the tcpdump trace taken during S2C Throughput test
6. cputime file Filename of the cputime trace taken during the whole test session
7. server IP address IP address of the NDT server
8. server hostname Hostname of the NDT server
9. server kernel version Kernel version of the NDT server's OS
10. client IP address IP address of the NDT client
11. client hostname Fully qualified hostname of the NDT client
12. client OS name (optional) Name of the NDT client's OS
13. client browser name (optional) Name of the NDT client's web browser (if applicable)
14. Summary data A one-line summary of the test containing the final values of some of the web100 variables with additional analysis details. The values are separated with commas without any spaces. The results are written in the same order and contain the same values as the one-line summary from the web100srv.log file (except the date and client name)
15. Additional data (optional) Multi-line information containing additional data sent by the client using the META protocol. Currently defined key/value pairs can be found in the NDT Protocol document

The optional values from the meta data file are set during the META test. This means that these values are empty when the client does not support the META test or it does not send these values.

Please look at the following sample meta data file:

Date/Time: 20091009T20:46:28.141927000Z
c2s_snaplog file: 20091009T20:46:28.141927000Z_149.20.53.166:60102.c2s_snaplog.gz
c2s_ndttrace file: 20091009T20:46:28.141927000Z_149.20.53.166:60102.c2s_ndttrace.gz
s2c_snaplog file:
s2c_ndttrace file: 20091009T20:46:28.141927000Z_149.20.53.166:61730.s2c_ndttrace.gz
cputime file:
server IP address: 4.71.254.147
server hostname: mlab1.atl01.measurement-lab.org
server kernel version: 2.6.22.19-vs2.3.0.34.32.mlab.pla
client IP address: 149.20.53.166
client hostname: nb.tech.org
client OS name:
client_browser name:
Summary data:313,2246,1658,0,69963,1020,19,8,1961,0,1448,392,1428,525600,112560,
43440,0,10001122,25307,2899352,0,15,15,21720,270,525600,100,0,0,0,1,6,2,8,10,8,1961,
66,10,22,964,0,166,0,-1208754176,3,0,31,0,360,1448,43440,8
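
A meta data file like the sample above could be read with a small parser such as the following sketch. The function name parse_meta is made up for this example. The sketch assumes that every entry has the form "name: value" and that a line without a colon (such as the wrapped summary lines in the sample) continues the previous value.

```python
def parse_meta(text):
    """Split the text of a meta data file into a {name: value} dictionary."""
    fields = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # Only the first ":" separates name and value; timestamps and
        # file names in the value contain further colons.
        key, sep, value = line.partition(":")
        if sep:
            last_key = key.strip()
            fields[last_key] = value.strip()
        elif last_key is not None:
            # Assumption: a line without ":" continues the previous value.
            fields[last_key] += line.strip()
    return fields


meta = parse_meta(
    "Date/Time: 20091009T20:46:28.141927000Z\n"
    "s2c_snaplog file:\n"
    "client IP address: 149.20.53.166\n"
)
print(meta["Date/Time"])                # the full timestamp, colons included
print(repr(meta["s2c_snaplog file"]))   # empty string: no snaplog was written
```

An empty value, as for the s2c_snaplog file above, means that the corresponding file was not written for this test.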

tcpdump trace

Writing tcpdump trace files is enabled by the -t, --tcpdump options.

These are standard trace files containing all packets (except the initial SYN and SYN/ACK exchange) sent during the individual C2S/S2C subtests. Only the throughput test traffic, which is sent on a newly created connection, is gathered; NDT control protocol communication is not captured in these dump files.

The packets from the C2S Throughput test are stored in the *.c2s_ndttrace file, the packets from the S2C Throughput test are stored in the *.s2c_ndttrace file.

These tcpdump files are written using the pcap library.

The tcpdump files can be read using the tcpdump and tcptrace programs.
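
For a quick look without external tools, the trace files can also be walked with a few lines of Python. The sketch below assumes the classic pcap file layout with microsecond timestamps (a 24-byte file header followed by 16-byte record headers); the helper names open_trace and read_packets are made up for this example. It reports only the timestamp and original length of each packet and does not decode the packets themselves.

```python
import gzip
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap file, microsecond timestamps


def open_trace(path):
    """Open a trace file, transparently handling a .gz suffix."""
    path = str(path)
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")


def read_packets(stream):
    """Yield (timestamp_in_seconds, original_length) for every packet record."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("file too short for a pcap header")
    # The magic number tells us the byte order the file was written in.
    for endian in ("<", ">"):
        if struct.unpack(endian + "I", header[:4])[0] == PCAP_MAGIC:
            break
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            return  # end of file (or a truncated last record)
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        stream.read(incl_len)  # skip the captured packet bytes
        yield ts_sec + ts_usec / 1e6, orig_len
```

For example, sum(length for _, length in read_packets(open_trace(path))) gives the total number of bytes on the wire for one subtest, where path is the name of a trace file.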

web100 snaplog trace

Writing snaplog files is enabled by the --snaplog option.

These files contain the values of all the web100 kernel MIB variables, written at fixed time increments (5 ms by default) for the individual C2S/S2C subtests (these snapshots are taken only for the newly created connection used for the throughput test).

The snapshots from the C2S Throughput test are stored in the *.c2s_snaplog file, the snapshots from the S2C Throughput test are stored in the *.s2c_snaplog file.

The snaplog files are binary files written in the format defined by the web100 library. In addition to the standard web100 variables' values written in each snapshot, the 'Duration' variable is also stored to indicate the number of seconds from the beginning of the snaplog session.

The list of logged variables with a short description can be found in the NDT Protocol document.

The NDT package comes with a utility (called genplot) that can convert snaplog trace files into text or xplot graph files. Alternatively the user can write their own analysis program using the web100 library functions.

web100srv.log file

The NDT always writes a one-line summary of the test to the web100srv.log file. The web100srv.log file is a standard log file and may contain lines that aren't summary lines (e.g. if debugging is enabled, a large amount of additional information is logged to this file).

This one-line summary contains the final values of some of the web100 variables with additional analysis details. The values are separated with commas without any spaces. The results are stored in the following order:

# Name Description/Comment
1. date start time of the test session (in the form "Jun 30 21:49:08"). Warning: this field does not contain information about the year
2. client name remote (client) host fully qualified name (if the fully qualified name cannot be determined, then the numeric form of the address will be used here)
3. MID throughput speed CWND limited throughput speed measured during Middlebox test (value in kb/s). Details of how this value is computed can be found in the NDT Test Methodology document.
4. S2C throughput speed measured throughput speed from server to client (value in kb/s). Details of how this value is computed can be found in the NDT Test Methodology document.
5. C2S throughput speed measured throughput speed from client to server (value in kb/s). Details of how this value is computed can be found in the NDT Test Methodology document.
6. Timeouts (*tcp-kis.txt) The number of times the retransmit timeout has expired when the RTO backoff multiplier is equal to one.
7. SumRTT (*tcp-kis.txt) The sum of all sampled round trip times.
8. CountRTT (*tcp-kis.txt) The number of round trip time samples included in tcpEStatsPathSumRTT and tcpEStatsPathHCSumRTT.
9. PktsRetrans (*tcp-kis.txt) The number of segments transmitted containing at least some retransmitted data.
10. FastRetran (*tcp-kis.txt) The number of invocations of the Fast Retransmit algorithm.
11. DataPktsOut (*tcp-kis.txt) The number of segments sent containing a positive length data segment.
12. AckPktsOut (*tcp-kis.txt) The number of pure ack packets that have been sent on this connection by the Local Host.
13. CurMSS (*tcp-kis.txt) The current maximum segment size (MSS), in octets.
14. DupAcksIn (*tcp-kis.txt) The number of duplicate ACKs received.
15. AckPktsIn (*tcp-kis.txt) The number of valid pure ack packets that have been received on this connection by the Local Host.
16. MaxRwinRcvd (*tcp-kis.txt) The maximum window advertisement received, in octets.
17. Sndbuf (*tcp-kis.txt) The socket send buffer size in octets. Note that the meaning of this variable is implementation dependent. Particularly, it may or may not include the retransmit queue.
18. MaxCwnd (*tcp-kis.txt) The maximum congestion window used during Slow Start, in octets.
19. SndLimTimeRwin (*tcp-kis.txt) The cumulative time spent in the 'Receiver Limited' state.
20. SndLimTimeCwnd (*tcp-kis.txt) The cumulative time spent in the 'Congestion Limited' state.
21. SndLimTimeSender (*tcp-kis.txt) The cumulative time spent in the 'Sender Limited' state.
22. DataBytesOut (*tcp-kis.txt) The number of octets of data contained in transmitted segments, including retransmitted data. Note that this does not include TCP headers.
23. SndLimTransRwin (*tcp-kis.txt) The number of transitions into the 'Receiver Limited' state from either the 'Congestion Limited' or 'Sender Limited' states. This state is entered whenever TCP transmission stops because the sender has filled the announced receiver window.
24. SndLimTransCwnd (*tcp-kis.txt) The number of transitions into the 'Congestion Limited' state from either the 'Receiver Limited' or 'Sender Limited' states. This state is entered whenever TCP transmission stops because the sender has reached some limit defined by congestion control (e.g. cwnd) or other algorithms (retransmission timeouts) designed to control network traffic.
25. SndLimTransSender (*tcp-kis.txt) The number of transitions into the 'Sender Limited' state from either the 'Receiver Limited' or 'Congestion Limited' states. This state is entered whenever TCP transmission stops due to some sender limit such as running out of application data or other resources and the Karn algorithm. When TCP stops sending data for any reason which can not be classified as Receiver Limited or Congestion Limited it MUST be treated as Sender Limited.
26. MaxSsthresh (*tcp-kis.txt) The maximum slow start threshold, excluding the initial value.
27. CurRTO (*tcp-kis.txt) The current value of the retransmit timer RTO.
28. CurRwinRcvd (*tcp-kis.txt) The most recent window advertisement received, in octets.
29. link The link type detected by a set of custom heuristics. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the following values:
  • 100 - link type cannot be detected (tests did not recognize the link)
  • 0 - detection algorithm failed (due to some error condition)
  • 10 - Ethernet link (Fast Ethernet)
  • 3 - wireless link
  • 2 - DSL/Cable modem link

30. duplex mismatch A detected duplex mismatch condition. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the following values:
  • 0 - no duplex mismatch condition was detected
  • 1 - possible duplex mismatch condition was detected by the Old Duplex-Mismatch algorithm
  • 2 - possible duplex mismatch condition was detected by the new algorithm: Switch=Full and Host=Half
31. bad_cable A detected bad cable condition. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the following values:
  • 0 - no bad cable condition was detected
  • 1 - possible bad cable condition was detected
32. half_duplex A detected half duplex condition. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the following values:
  • 0 - no half duplex condition was detected (it means a full duplex subnet in the case of the Fast Ethernet link)
  • 1 - possible half duplex condition was detected (it means a half duplex subnet in the case of the Fast Ethernet link)
33. congestion A detected congestion condition. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the following values:
  • 0 - no congestion condition was detected
  • 1 - possible congestion condition was detected (it means that other network traffic was congesting the link during the test)
34. c2sdata A link type detected by the Bottleneck Link Detection algorithm using the Client --> Server data packets' inter-packet arrival times. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the following values:
  • -1 - System Fault
  • 0 - RTT
  • 1 - Dial-up Modem
  • 2 - Cable/DSL modem
  • 3 - 10 Mbps Ethernet or WiFi 11b subnet
  • 4 - 45 Mbps T3/DS3 or WiFi 11 a/g subnet
  • 5 - 100 Mbps Fast Ethernet subnet
  • 6 - a 622 Mbps OC-12 subnet
  • 7 - 1.0 Gbps Gigabit Ethernet subnet
  • 8 - 2.4 Gbps OC-48 subnet
  • 9 - 10 Gbps 10 Gigabit Ethernet/OC-192 subnet
  • 10 - Retransmissions
35. c2sack A link type detected by the Bottleneck Link Detection algorithm using the Client <-- Server Ack packets' inter-packet arrival times. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the same values as the c2sdata variable.
36. s2cdata A link type detected by the Bottleneck Link Detection algorithm using the Server --> Client data packets' inter-packet arrival times. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the same values as the c2sdata variable.
37. s2cack A link type detected by the Bottleneck Link Detection algorithm using the Server <-- Client Ack packets' inter-packet arrival times. The details of the detection algorithm can be found in the NDT Test Methodology document.

This variable can have the same values as the c2sdata variable.
38. CongestionSignals (*tcp-kis.txt) The number of multiplicative downward congestion window adjustments due to all forms of congestion signals, including Fast Retransmit, ECN and timeouts. This object summarizes all events that invoke the MD portion of AIMD congestion control, and as such is the best indicator of how cwnd is being affected by congestion.
39. PktsOut (*tcp-kis.txt) The total number of segments sent.
40. MinRTT (*tcp-kis.txt) The minimum sampled round trip time.
41. RcvWinScale (*tcp-kis.txt) The value of Rcv.Wind.Scale. Note that RcvWinScale is either zero or the same as WinScaleSent.
42. autotune (deprecated/not used) This value kept the information about web100 autotune functionality.

It could have the following values:
  • 0 - autotune is disabled
  • 1 - sbufmode autotune is enabled
  • 2 - rbufmode autotune is enabled
  • 3 - all autotune modes are enabled
  • 22 - autotune params cannot be found
  • 23 - autotune params cannot be read
43. CongAvoid (*tcp-kis.txt) The number of times the congestion window has been increased by the Congestion Avoidance algorithm.
44. CongestionOverCount (*tcp-kis.txt) The number of congestion events which were 'backed out' of the congestion control state machine such that the congestion window was restored to a prior value. This can happen due to the Eifel algorithm RFC3522 or other algorithms which can be used to detect and cancel spurious invocations of the Fast Retransmit Algorithm.
45. MaxRTT (*tcp-kis.txt) The maximum sampled round trip time.
46. OtherReductions (*tcp-kis.txt) The number of congestion window reductions made as a result of anything other than AIMD congestion control algorithms. Examples of non-multiplicative window reductions include Congestion Window Validation RFC2861 and experimental algorithms such as Vegas.
47. CurTimeoutCount (*tcp-kis.txt) The current number of times the retransmit timeout has expired without receiving an acknowledgment for new data. tcpEStatsStackCurTimeoutCount is reset to zero when new data is acknowledged and incremented for each invocation of section 5.5 in RFC2988.
48. AbruptTimeouts (*tcp-kis.txt) The number of timeouts that occurred without any immediately preceding duplicate acknowledgments or other indications of congestion. Abrupt Timeouts indicate that the path lost an entire window of data or acknowledgments. Timeouts that are preceded by duplicate acknowledgments or other congestion signals (e.g., ECN) are not counted as abrupt, and might have been avoided by a more sophisticated Fast Retransmit algorithm.
49. SendStall (*tcp-kis.txt) The number of interface stalls or other sender local resource limitations that are treated as congestion signals.
50. SlowStart (*tcp-kis.txt) The number of times the congestion window has been increased by the Slow Start algorithm.
51. SubsequentTimeouts (*tcp-kis.txt) The number of times the retransmit timeout has expired after the RTO has been doubled. See section 5.5 in RFC2988.
52. ThruBytesAcked (*tcp-kis.txt) The number of octets for which cumulative acknowledgments have been received, on systems that can receive more than 10 million bits per second. Note that this will be the sum of changes in tcpEStatsAppSndUna.
53. peaks.amount The number of times the CWND peaked (i.e. transitioned from increasing to decreasing) during the S2C test. By default, collected at 5ms intervals.
54. peaks.min The minimum CWND peak seen during the S2C test. By default, collected at 5ms intervals.
55. peaks.max The maximum CWND peak seen during the S2C test. By default, collected at 5ms intervals.

(*) web100 variable. For a more detailed and up-to-date description please look at the current version of the tcp-kis.txt file.
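
As a rough illustration of the field order above, the sketch below maps the comma-separated values of fields 3 to 55 (the form stored in the Summary data line of the meta data file, without the date and client name) to names. The names SUMMARY_FIELDS, BOTTLENECK_LINK and parse_summary are made up for this example, and treating every value as an integer is an assumption based on the sample meta data file shown earlier.

```python
# Field names in the order listed in the table above (fields 3 to 55).
SUMMARY_FIELDS = [
    "MID throughput", "S2C throughput", "C2S throughput",
    "Timeouts", "SumRTT", "CountRTT", "PktsRetrans", "FastRetran",
    "DataPktsOut", "AckPktsOut", "CurMSS", "DupAcksIn", "AckPktsIn",
    "MaxRwinRcvd", "Sndbuf", "MaxCwnd", "SndLimTimeRwin", "SndLimTimeCwnd",
    "SndLimTimeSender", "DataBytesOut", "SndLimTransRwin", "SndLimTransCwnd",
    "SndLimTransSender", "MaxSsthresh", "CurRTO", "CurRwinRcvd",
    "link", "duplex mismatch", "bad_cable", "half_duplex", "congestion",
    "c2sdata", "c2sack", "s2cdata", "s2cack",
    "CongestionSignals", "PktsOut", "MinRTT", "RcvWinScale", "autotune",
    "CongAvoid", "CongestionOverCount", "MaxRTT", "OtherReductions",
    "CurTimeoutCount", "AbruptTimeouts", "SendStall", "SlowStart",
    "SubsequentTimeouts", "ThruBytesAcked",
    "peaks.amount", "peaks.min", "peaks.max",
]

# Values of the c2sdata, c2sack, s2cdata and s2cack fields.
BOTTLENECK_LINK = {
    -1: "System Fault", 0: "RTT", 1: "Dial-up Modem", 2: "Cable/DSL modem",
    3: "10 Mbps Ethernet or WiFi 11b subnet",
    4: "45 Mbps T3/DS3 or WiFi 11 a/g subnet",
    5: "100 Mbps Fast Ethernet subnet", 6: "622 Mbps OC-12 subnet",
    7: "1.0 Gbps Gigabit Ethernet subnet", 8: "2.4 Gbps OC-48 subnet",
    9: "10 Gbps 10 Gigabit Ethernet/OC-192 subnet", 10: "Retransmissions",
}


def parse_summary(line):
    """Map a comma-separated summary (fields 3 to 55) to a {name: int} dict."""
    values = line.strip().split(",")
    if len(values) != len(SUMMARY_FIELDS):
        raise ValueError(f"expected {len(SUMMARY_FIELDS)} values, got {len(values)}")
    # Assumption: all values are integers, as in the sample meta data file.
    return dict(zip(SUMMARY_FIELDS, (int(v) for v in values)))


# Summary data from the sample meta data file, joined into one line.
SAMPLE = (
    "313,2246,1658,0,69963,1020,19,8,1961,0,1448,392,1428,525600,112560,"
    "43440,0,10001122,25307,2899352,0,15,15,21720,270,525600,100,0,0,0,1,6,2,8,10,8,1961,"
    "66,10,22,964,0,166,0,-1208754176,3,0,31,0,360,1448,43440,8"
)
summary = parse_summary(SAMPLE)
print(summary["CurMSS"], BOTTLENECK_LINK[summary["c2sdata"]])
```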

cputime trace

Writing the cputime file is enabled by the --cputime option.

This file contains the results of the times routine, recorded at 100 ms intervals. Each line contains the following data, separated by a single space, in this order:

# Name Description/Comment
1. time seconds from the beginning of the test
2. user time contains the CPU time spent executing instructions of the calling process
3. system time contains the CPU time spent in the system while executing tasks on behalf of the calling process
4. user time of the children contains the sum of the user time and user time of the children values for all waited-for terminated children
5. system time of the children contains the sum of the system time and system time of the children values for all waited-for terminated children

All times reported are in clock ticks.

A clock tick here is the unit in which the times routine reports CPU time; it is not a CPU cycle. The number of clock ticks per second can be obtained using:

sysconf(_SC_CLK_TCK);
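
A line of the cputime file could be converted to seconds as sketched below. The names CpuSample and parse_cputime_line and the sample line are made up for this example. The sketch assumes that the first field is already in seconds, as described in the table above, and that the four remaining fields are in clock ticks. Note that the tick rate must be that of the server that wrote the file; os.sysconf("SC_CLK_TCK") (available on Unix) returns the value for the machine running the script.

```python
import os
from typing import NamedTuple, Optional


class CpuSample(NamedTuple):
    elapsed: float          # seconds from the beginning of the test
    user: float             # user CPU time, in seconds
    system: float           # system CPU time, in seconds
    children_user: float    # user CPU time of waited-for children, in seconds
    children_system: float  # system CPU time of waited-for children, in seconds


def parse_cputime_line(line: str, ticks_per_second: Optional[int] = None) -> CpuSample:
    """Parse one cputime line and convert its tick counts to seconds."""
    if ticks_per_second is None:
        # Tick rate of the local machine; pass the server's value if it differs.
        ticks_per_second = os.sysconf("SC_CLK_TCK")
    elapsed, *ticks = line.split()
    if len(ticks) != 4:
        raise ValueError(f"expected 5 fields, got {len(ticks) + 1}")
    user, system, c_user, c_system = (int(t) / ticks_per_second for t in ticks)
    return CpuSample(float(elapsed), user, system, c_user, c_system)


# Hypothetical line, not taken from a real cputime file.
print(parse_cputime_line("0.10 5 3 1 0", ticks_per_second=100))
```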

Displaying collected data

The NDT server also includes a Java application, JAnalyze, that looks through the NDT log files, links log entries to trace files, and simplifies the viewing of this data.

The JAnalyze application can be started by running a single jar file:

$ java -jar JAnalyze.jar

The NDT web100srv.log file can be loaded by using the Load option from the File menu.

[Screenshot: the JAnalyze application in action]

The utility will need to be rewritten slightly to take advantage of the meta files instead of (or in addition to) the log files.

The detailed tcpdump and snaplog files can be used to re-examine the test to see what happened, for example:

  • was there packet loss during the s2c test, and if so when/how often did it occur?
  • what was the maximum throughput during TCP's slow-start growth phase?
  • how many times did TCP oscillate in the Congestion Avoidance phase?
  • did the test compete with other traffic?
  • was there non-congestive loss on the path?
  • was the test limited by the user's PC (default tunable settings)?
  • how often did TCP retransmit packets, and were any of these unnecessary?
  • were packets being reordered (e.g., sent 1, 2, 3, 4, 5 but received 1, 2, 4, 3, 5)?
  • what was the capacity of the bottleneck link in this path?
  • did the test produce the expected results?
  • was the client connected to a wired or wireless (WiFi) network?
  • is there a firewall and/or NAT box in the path?

More information about NDT:
http://www.internet2.edu/performance/ndt/