-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: time.Sleep takes more time than expected on Windows (1ms -> 10ms) #44343
Comments
It looks like this happened in 8fdc79e. https://go-review.googlesource.com/c/go/+/232298 CC: @ChrisHines |
This is reproducible with a trivial benchmark in time package: func BenchmarkSimpleSleep(b *testing.B) {
for i := 0; i < b.N; i++ {
Sleep(50*Microsecond)
}
} amd64/linux, before/after http://golang.org/cl/232298:
|
For reference, across different sleep times:
|
Looking at the 100µs, the immediate problem is the delay resolution in Prior to http://golang.org/cl/232298, 95% of timer expirations in the 100µs case are detected by After http://golang.org/cl/232298, this path is gone and the wakeup must come from I'm not sure why I'm seeing ~500µs on the 10µs and 50µs benchmarks, but I may have bimodal distribution where ~50% of cases a spinning M is still awake long enough to detect the timer before entering I'm also not sure why @egonelbre is seeing ~14ms on Windows, as that also appears to have 1ms resolution on |
I think the ideal fix to this would be to increase the resolution of As it happens, Linux v5.11 includes In the past, I've also prototyped changing the Both of these are Linux-specific solutions, I'd have to research other platforms more to get a sense of the options there. We also may just want to bring the |
I guess that Perhaps when |
While working on CL232298 I definitely observed anecdotal evidence that the netpoller has more latency than other ways of sleeping. From #38860 (comment):
I didn't try to address that in CL232298 primarily because it was already risky enough that I didn't want to make bigger changes. But an idea for something to try occurred to me back then. Maybe we could improve the latency of non-network timers by having one M block on a I haven't fully gauged how messy that would get. Questions and concerns:
One other oddity that I noticed when testing CL232298: The linux netpoller sometimes wakes up from the timeout value early by a several microseconds. When that happens |
As somewhat of a combination of these, one potential option would be to make
Hm, this sounds like another bug, or perhaps a spurious |
It seems that on Windows, |
My first thought on the Windows behavior is that somehow |
That could be, but I was logging at least some of the calls to netpollBreak as well and don't recall seeing seeing that happen. I saved my logging code in case it can help. https://github.com/ChrisHines/go/tree/dlog-backup |
For reference, output on my Windows 10:
A totally different results from the #44343 (comment).
|
@vellotis this could be because there's something running in the background changing Windows timer resolution. This could be some other Go service/binary built using older Go version. Of course there can plenty of other programs that may change it. You can use https://github.com/tebjan/TimerTool to see what the current value is. There's some more detail in https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted. |
For what it's worth, go1.16.2 darwin/amd64 is also exhibiting this. In a program I'm running, the line |
@prattmic Strangely, I just checked the output of the script now and (although I just restarted it a few hours before) it was spot on at one hour twice in a row. However, yesterday it was "generally" 3 minutes. I don't have an exact time since we were only seeing the minute logged. It was always 3 minutes (rounded) from the previous run. 1:00pm, 2:03pm, 3:06pm, 4:09pm, etc. Some things to note about this loop I'm running, it is calling a shell script using If you have some code you would like me to run I would be happy to - otherwise I will keep an eye on it here as well and report anything else I see. Hopefully some of this is helpful in narrowing down the cause if you haven't already. UPDATE 3-20-2021 10am ET: I just ran my program for about the last 24 hours and sure enough, it would run fine for the first few iterations and then start to sleep longer based on the logging. Mostly it would hover between 3-5 minutes late but once it was 9 minutes! I've (this morning) written another program using the same logic but with simulated activity (since the actual tasks did not seem to be part of the problem) and much more detailed logging. I have it running now in 3 environments, two using |
3 minutes per hour is rather worrisome, especially if some cron-like functionality required in a long-running service... |
In case it's related; on Windows, timers tick while the system is in standby/suspend, but they pause on other platforms. This is supposed to have been addressed already #36141 |
@prattmic et al, I took the liberty of writing a testbed to experiment with this issue. See all the details of test runs on Linux, Intel, and M1, along with source code here: https://github.com/zobkiw/sleeptest Comments welcome - especially if you find a bug in the code. Also, see the |
cc @zx2c4 :-) |
@zobkiw could it be something due to cpu frequency scaling? Can you try locking the cpu to a specific frequency (something low) and see whether the problem disappears? |
Is there any workaround I can use for my program to sleep with a high resolution while this gets fixed? |
|
With time.Sleep now being particularly low resolution on windows in go (see golang/go#44343), waiting for the rest of a frame after it's complete isn't viable. I'm now basing my wait in my own programs on the OS asking for more sound data / the sound buffer growing too full, which doesn't rely on sleep resolution (thanks to using a channel that sleeps/wakes based (ultimately) on an OS audio system callback)
NtCreateWaitCompletionPacket and friends required for #44343 (comment) have been officially documented 😄 @thepudds. This means that I'll try to land CL 488675 in go1.23, which will improve timer resolution to ~1ms. |
That’s great @qmuntal! This should be a very nice step forward for Windows users. |
This is a very long issue, with 150 hidden comments currently. For a period of time, the title here was:
In Sept 2023, the title was changed to add In the interests of making things more manageable, I suggest we re-focus this issue back on Windows so that @qmuntal's https://go.dev/cl/488675 could hopefully close this issue. (This issue also has a decent amount of external visibility, including it was cited as a reason some Windows users couldn't upgrade to modern Go versions, so it would be nice for clarity if @qmuntal's change could close this issue). If that sounds reasonable, I'd be happy to open a new issue for Linux if there isn't one already. Alternatively, if @prattmic thinks the WIP https://go.dev/cl/363417 (which switches to epoll_pwait2 on Linux) or something similar has a chance of landing for Go 1.23 to improve things for Linux as well, then maybe this issue could be both for Windows and Linux. (Even so, I'm not sure there is an improvement pending in the near term for "other" OSes). Separately, I wanted to copy here for posterity an explanation @prattmic gave earlier today in Gopher Slack in response to someone observing that on Linux they were "seeing a weird mix of certain sub-ms sleeps being serviced timely, and some rounding to 1ms":
@prattmic had some related comments about One final reference for posterity is that I had tried in #44343 (comment) to summarize the state of this issue with some links to other key comments as of March 2023. (In short, after re-skimming this issue just now, I'm hoping to avoid re-skimming this issue ever again 😅). |
Seeing a few thumbs up and hearing no objection so far, I am going to change this issue title to re-focus back on Windows. If someone has a different opinion, feel free of course to edit the title again. It looks like #53824 is already tracking using use |
FYI: CL 488675 has been merged. The new time resolution on Windows with high resolution timers is now ~0.5ms, measured using the code from the issue description: avg 513µs; min 17.4µs; p50 531µs; max 722µs;
p90 558µs; p99 722µs; p999 722µs; p9999 722µs;
17.4µs [ 2] ▌
100µs [ 2] ▌
200µs [ 2] ▌
300µs [ 1]
400µs [ 3] █
500µs [ 84] ████████████████████████████████████████
600µs [ 5] ██
700µs [ 1]
800µs [ 0]
900µs [ 0] |
time.Sleep is not precise, instead use a ticker. Relates to golang/go#44343 Co-authored-by: Twometer <twometer@outlook.de>
go1.23 has a fix for the windows Sleep bug discussed in issue #44343. See golang/go#44343. This works so go1.23rc1 is the new minimum version. Also move the icosphere demo code from the engine (vu/loader.go) to the examples (eg/ps.go).
Change https://go.dev/cl/614435 mentions this issue: |
I missed to add timer resolution improvements in the release notes for Go 1.23. This is a significant change, so better late than never. For golang/go#44343 Change-Id: I7fe22a1752a45df9094d5233bed762823322c7a9 Reviewed-on: https://go-review.googlesource.com/c/website/+/614435 Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
This seems to be a regression with Go 1.16
time.Sleep
.What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
hrtime
is a package that usesRDTSC
for time measurements.Output on Windows 10:
Output on Linux (Debian 10):
The time granularity shouldn't be that bad even for Windows. So, there might be something going wrong somewhere.
The text was updated successfully, but these errors were encountered: