Introduction
For years, network adapter manufacturers have taught their customers that network monitoring applications can’t live without hardware packet timestamps (i.e. the ability of the network adapter to report to the driver the time a given packet was sent or received). State-of-the-art FPGA-based network adapters [1, 2, 3] provide hardware timestamps with a resolution of about +/- 10 nsec and an accuracy of about +/- 50 nsec, so monitoring applications doing sub-usec measurements can safely assume an accuracy of 100 nsec. Commodity adapters such as Intel 1 Gbit provide both RX and TX timestamps out of the box with IEEE 1588 time synchronisation, so the problem is limited to 10 Gbit (at least until Intel comes up with a 10G adapter with hardware timestamps).
Who Really Needs Sub-microsecond Packet Timestamps?
This is a good question. Everyone seems to want them, but in practice they might not need them. Let’s clarify this point in a bit more detail. For RTT (Round-Trip Time) measurements (i.e. I want to see how long a packet takes to go from location X to location Y and back) over long distances (e.g. Italy to USA and back), the order of magnitude is msec (actually tens or hundreds of msec), so usec are not needed. On a LAN they are not needed either: if the probe packet used to monitor RTT is sent and received on the same adapter, 1 Gbit commodity adapters can do the trick, and PF_RING supports them. For one-way delay (i.e. measuring the time from A to B) on a WAN, 1G adapters plus IEEE 1588 can do the trick (the delay is in msec); on a LAN the same reasoning as above applies.
So who really needs sub-microsecond hardware timestamps at 10 Gbit (at 1 Gbit we have the solution explained so far)? Reading around on the Internet, it seems that one of the few markets where they are needed is microburst detection [1, 2], in particular on critical networks such as high-frequency trading and industrial plants.
Can ntop Provide Sub-microsecond Timestamps in Software at 10 Gbit?
In short: yes, we can. When we developed our n2disk application for 10 Gbit, we were faced with the problem of timestamps, as no commodity adapter supported them. We spent quite some time optimising this application, and these are our findings:
- We assume a server machine with a good motherboard (e.g. Dell, Supermicro, HP), not a toy PC. This guarantees that the clock on the board is of good quality.
- The call to clock_gettime() used to read the timestamp in software takes ~30 nsec in our tests. As at 10 Gbit the maximum packet ingress rate is 14.88 Mpps, i.e. one packet every 67 nsec, reading the timestamp once the packet has been received is too expensive (not to mention that the reported time would be shifted into the future with respect to the real packet arrival).
- We decided to create a thread (we called it pulse thread) that calls clock_gettime() at full speed and shares the time with the capture thread.
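The pulse-thread idea can be illustrated with a minimal sketch. This is not n2disk code: the class name PulseClock and its methods are hypothetical, Python is used only to show the structure, and time.time_ns() stands in for clock_gettime(). A Python thread will not reach the ~30 nsec behaviour discussed here; the point is only that one thread publishes the time while the capture side reads the last published value.

```python
import threading
import time


class PulseClock:
    """Hypothetical helper: a thread that keeps a shared timestamp fresh."""

    def __init__(self):
        # Shared value read by the capture side (nanoseconds since the epoch).
        self._now_ns = time.time_ns()
        self._running = False
        self._thread = None

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._pulse, daemon=True)
        self._thread.start()

    def _pulse(self):
        # Spin at full speed: read the clock and publish it, nothing else.
        while self._running:
            self._now_ns = time.time_ns()

    def now_ns(self):
        # The capture side only loads the last published value; it never
        # pays for a clock read itself.
        return self._now_ns

    def stop(self):
        self._running = False
        if self._thread is not None:
            self._thread.join()


if __name__ == "__main__":
    clock = PulseClock()
    clock.start()
    stamps = [clock.now_ns() for _ in range(5)]  # stand-in for a capture loop
    clock.stop()
    print(stamps)
```

In a real capture application the two threads would typically run on separate cores, so that the spinning clock reader does not steal cycles from packet processing.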
On our E3-1230 (CPU cost ~200 USD) starting n2disk as follows
n2disk10g -o /tmp/ -p 1024 -b 2048 -i dna0 --active-wait -C 1024 -w 0 -S 2 -c 4 -v -R 6 --nanoseconds
we can achieve both 10 Gbit to disk
25/Apr/2013 10:20:37 [n2disk.c:576] [PF_RING] Total stats: 90843997 pkts rcvd/90843997 pkts filtered/0 pkts dropped [0.0%]
25/Apr/2013 10:20:37 [n2disk.c:592] Capture Duration: 00:00:06
25/Apr/2013 10:20:37 [n2disk.c:594] Average Capture Throughout: 10.00 Gbit / 14.88 Mpps
25/Apr/2013 10:20:37 [n2disk.c:1593] [writer] Thread terminated
25/Apr/2013 10:20:37 [n2disk.c:3664] Writer thread terminated
25/Apr/2013 10:20:37 [n2disk.c:2805] Packet capture thread terminated
25/Apr/2013 10:20:37 [n2disk.c:3668] Reader thread terminated
25/Apr/2013 10:20:37 [n2disk.c:3673] Time thread terminated
and high-accuracy timestamps. In fact this is what happens:
< 30 nsec timestamps (as in the above test)
60 1366418032.342040270 6726 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342040355 6727 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342040430 6728 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342040502 6729 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342040613 6730 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342040728 6731 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342040767 6732 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342040890 6733 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342041036 6734 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342041238 6735 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342041427 6736 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342041610 6737 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342041685 6738 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342041835 6739 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342041982 6740 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342042056 6741 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342042167 6742 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342042327 6743 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342042441 6744 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366418032.342042515 6745 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
As you can see, all packets have different timestamps.
100 nsec timestamps
60 1366417070.119899160 5183 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119899386 5184 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119899501 5185 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119899615 5186 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119899615 5187 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119899731 5188 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119899846 5189 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119899960 5190 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900073 5191 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900187 5192 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900301 5193 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900417 5194 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900532 5195 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900646 5196 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900646 5197 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900762 5198 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900877 5199 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119900989 5200 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119901104 5201 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417070.119901218 5202 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
Bad: some packets have the same timestamp.
500 nsec timestamps
60 1366417709.563877691 3466 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878226 3467 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878226 3468 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878226 3469 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878226 3470 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878226 3471 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878226 3472 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878763 3473 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878763 3474 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878763 3475 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878763 3476 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878763 3477 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563878763 3478 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563879297 3479 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563879297 3480 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563879297 3481 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563879297 3482 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563879297 3483 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563879297 3484 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
60 1366417709.563879832 3485 192.85.1.2 -> 192.0.0.1 IP Unknown (0xfd)
Very bad: too many packets have the same timestamp.
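Checking listings like these by eye does not scale, so here is a small sketch of how repeated timestamps could be counted automatically. The function names are hypothetical and the input is assumed to be the "seconds.nanoseconds" text from the second column of each line. The values are parsed as integers because a floating-point number cannot hold all 19 digits of such a timestamp.

```python
from collections import Counter


def to_nanoseconds(stamp):
    """Convert 'seconds.nanoseconds' text to an integer nanosecond count."""
    seconds, _, fraction = stamp.partition(".")
    fraction = (fraction + "0" * 9)[:9]  # pad or trim to exactly 9 digits
    return int(seconds) * 1_000_000_000 + int(fraction)


def count_shared_timestamps(stamps):
    """Return how many packets carry a timestamp also used by another packet."""
    counts = Counter(to_nanoseconds(s) for s in stamps)
    return sum(n for n in counts.values() if n > 1)


# First six timestamps from the 100 nsec listing in this post.
sample = [
    "1366417070.119899160",
    "1366417070.119899386",
    "1366417070.119899501",
    "1366417070.119899615",
    "1366417070.119899615",
    "1366417070.119899731",
]

if __name__ == "__main__":
    print(count_shared_timestamps(sample))  # prints 2
```

A result of zero over a whole capture at line rate means every packet got its own timestamp, which is the property the < 30 nsec test shows.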
Conclusion
Using software timestamps and our “timestamp trick” you can achieve ~30 nsec timestamp precision, so that at 10 Gbit line rate all packets have a different timestamp (so we’re below 67 nsec timestamp resolution). This means that you can use n2disk for detecting microbursts at 10 Gbit line rate as:
- It can handle 14.88 Mpps with no drops when dumping them to disk with nsec timestamps
- You can avoid using hardware timestamps for sub-usec precision and leave them only for specific tasks where you need very accurate ~100 nsec timestamps. At this point in time however, we have not received any request from people who really need them, so we’re confident that our approach can be enough for most people.
- Hardware timestamps still make sense in those cases where you need a NIC with a GPS signal ingress, so that you can accurately sync the time over long distance with an accuracy better than what IEEE 1588 can offer you.
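For reference, the 14.88 Mpps and 67 nsec figures used throughout this post follow from standard Ethernet framing: a minimum-size 64-byte frame is preceded by an 8-byte preamble and followed by a 12-byte inter-frame gap, so it occupies 84 bytes on the wire. The short calculation below is only a worked example; the constant and function names are ours.

```python
LINK_BPS = 10_000_000_000   # 10 Gbit/s
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including the 4-byte FCS
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def wire_bits_per_frame(frame_bytes=MIN_FRAME_BYTES):
    """Bits a frame occupies on the wire, including per-frame overhead."""
    return (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8


def max_packets_per_second(link_bps=LINK_BPS, frame_bytes=MIN_FRAME_BYTES):
    """Highest possible packet rate for a given link speed and frame size."""
    return link_bps / wire_bits_per_frame(frame_bytes)


def nanoseconds_per_packet(link_bps=LINK_BPS, frame_bytes=MIN_FRAME_BYTES):
    """Time between the starts of two back-to-back frames."""
    return wire_bits_per_frame(frame_bytes) * 1e9 / link_bps


if __name__ == "__main__":
    print(f"{max_packets_per_second() / 1e6:.2f} Mpps")   # 14.88 Mpps
    print(f"{nanoseconds_per_packet():.1f} nsec/packet")  # 67.2 nsec/packet
```

The packet listings show a length of 60 because the 4-byte FCS is not part of the captured data.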