Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change resolution to 1us rather than 10us #101

Closed
tfgm-bud opened this issue Aug 30, 2016 · 13 comments · Fixed by #136
Closed

Change resolution to 1us rather than 10us #101

tfgm-bud opened this issue Aug 30, 2016 · 13 comments · Fixed by #136

Comments

@tfgm-bud
Copy link

Currently the minimum resolution for fping is 10us. Ideally that would be lowered to 1us. This is especially useful for modern low latency devices. Ultimately, I'm sure measurements under 1us will be desired.

@schweikert
Copy link
Owner

This is going to be quite difficult to achieve because fping uses '10 us' as unit for time internally, so it would mean refactoring quite a few parts, and making sure that this doesn't introduce any issues (in particular about overflowing the time interval variables).

@tycho
Copy link
Contributor

tycho commented Apr 1, 2018

This is a really old issue now, but I have some thoughts.

First of all, gettimefoday is a bad clock source for measuring elapsed time. It's subject to system clock changes, NTP adjustments and NTP slewing, etc. So you can get really wacky numbers at times.

A better choice would be clock_gettime with CLOCK_MONOTONIC. First of all, struct timespec supports up to nanosecond resolution, although you almost never actually see 1ns resolution because just reading the clock takes time. CLOCK_MONOTONIC isn't subject to clock changes or NTP adjustments. but it is subject to NTP slewing (which is why CLOCK_MONOTONIC_RAW exists). It costs about 15-25ns to read it, assuming /sys/devices/system/clocksource/clocksource0/current_clocksource is set to tsc. On the other hand, it costs about 300ns to read CLOCK_MONOTONIC_RAW, so it's probably not worth bothering with that one for now.

Note that CLOCK_MONOTONIC isn't guaranteed by POSIX to be defined. CLOCK_REALTIME is, has the same cost to read as CLOCK_MONOTONIC, but it has the same system clock adjustment pitfalls as gettimeofday. It does have the benefit of being much higher resolution, though. I'd suggest using CLOCK_MONOTONIC when possible and using CLOCK_REALTIME elsewhere.

I'm looking at refactoring stuff to use these clocks instead, but I don't know how far I'll get. There's a lot of magic multiplication and such to keep things at the 10us granularity.

@tycho
Copy link
Contributor

tycho commented Apr 1, 2018

The reason this was bothering me, by the way, was this behavior with Smokeping:

image

It clamps to the nearest multiple of 10us, which makes the result look noisier than it actually is.

I did a first pass of moving to clock_gettime and it seems to work alright. I'm not super happy with it yet, but it appears to work for both netdata and smokeping:

https://github.com/tycho/fping/tree/wip-high-resolution-clock-sources

@tfgm-bud
Copy link
Author

tfgm-bud commented Apr 1, 2018

@tycho I thought this was an April Fools day joke! I scanned your WIP branch and it looks reasonable to me. When I get a chance next week, I'll check it out, build it and give it a test.

@tfgm-bud
Copy link
Author

tfgm-bud commented Apr 3, 2018

Got it working. Pretty straightforward. Updated one of my smokeping servers by modifying the Probes file to specify binary = /usr/local/sbin/fping. No more square waves in my graphs. Nice!

Thanks so much for doing this!

server_last_10800

@tfgm-bud
Copy link
Author

tfgm-bud commented Apr 4, 2018

One thing to note, it is important to setcap cap_net_raw+ep /usr/local/sbin/fping if your smokeping is not running as root. Skipping this step on a couple of servers caused me some grief...

@tycho
Copy link
Contributor

tycho commented Apr 4, 2018

That's probably a better idea than setuid root, yeah.

@schweikert
Copy link
Owner

@tycho Thanks a lot for your work! It looks very good and maybe we could integrate this for fping 4.2 ? I plan to do a fping 4.1 release with some bugfixes soon.
I wonder: what about SO_TIMESTAMP / SO_TIMESTAMPNS ?

@tycho
Copy link
Contributor

tycho commented Jul 29, 2018

Sure, that'd be fine. I should clean up the commit first though.

IIRC, I gutted SO_TIMESTAMP because it was too low resolution and used struct timeval. Didn't know that SO_TIMESTAMPNS existed. That might actually be usable!

@tycho
Copy link
Contributor

tycho commented Sep 30, 2018

Moved to a new branch and added SO_TIMESTAMPNS:
https://github.com/tycho/fping/tree/high-resolution-clock-sources

Submitting a pull request momentarily.

@aneagoe
Copy link

aneagoe commented May 9, 2019

Any progress on this? Seems that the PR is opened since Sep 30 2018 but was not yet merged. I'd love having this in. Thx for the work @tycho

@iustin
Copy link

iustin commented Jan 4, 2020

Friendly ping as well; would very much like this.

@luckman212
Copy link

luckman212 commented Feb 27, 2020

I think this also might fix netdata/netdata#8148 - I'll test shortly. Update: sadly, did not fix but at least I found a workaround.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants