-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-101558: Add time.sleep_until() #101559
base: main
Are you sure you want to change the base?
Conversation
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🌮
@@ -2168,7 +2206,8 @@ pysleep(_PyTime_t timeout) | |||
int ret; | |||
Py_BEGIN_ALLOW_THREADS | |||
#ifdef HAVE_CLOCK_NANOSLEEP | |||
ret = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &timeout_abs, NULL); | |||
ret = clock_nanosleep(absolute ? CLOCK_REALTIME : CLOCK_MONOTONIC, | |||
TIMER_ABSTIME, &timeout_abs, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why need to use CLOCK_REALTIME
? @vstinner mentioned in a mailing thread that CLOCK_REALTIME
can jump forwards and backwards as the system time-of-day clock is changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quoting my reply from the email:
Yes, using the system clock has its caveats, which is why I mentioned it in the documentation, and I wouldn't mind highlighting these even more.
The use case in my specific case is long-running recurring measurements, for example a measurement every minute or every ten minutes over a period of many hours or often days. In these cases it makes much more sense to tie the measurements to wall-clock time, because often there are other measurement systems also using wall-clock time.
Although of course not everyone uses it, an NTP daemon like Chrony is able to adjust the system clock by changing the system clock frequency and thereby adjusting the system clock gradually, which keeps the system clock monotonic. Having multiple measurement computers stay synchronized via NTP is then much more reasonable. In the past I've built a system consisting of multiple machines, one of them using a GPS clock and acting as an NTP server, that does exactly this.
Since both clock_nanosleep and SetWaitableTimerEx allow sleeping until a specific time, my reasoning is simply that this functionality available in the lower-level API could be made available to the user so they can write simple loops - aside from the precision of clock_nanosleep appealing to my perfectionist side 😉
While there are schedulers like cron or similar, sometimes, a measurement may need to happen e.g. every second, where then the (admittedly small) inaccuracy introduced by calculating the sleep() timeout based on the current time can start to make a difference.
The force push from yesterday was just a rebase; the force push from today was a rebase + squash; no changes to the code itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that I like the overall idea of exposing such function in Python. It can give false hope to users, since it's hard to write a reliable function to "sleep until". Did you actually test the behavior of the function when the system clock is changed on different operating systems (ex: Windows, Linux, macOS).
I tried to make such study when I designed time.monotonic(): https://peps.python.org/pep-0418/#monotonic-clocks
Additional potential sources of inaccuracies include: | ||
|
||
* Because this function uses the system clock as a reference, this means the | ||
reference clock is adjustable and may jump backwards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear to me what is the expected behavior when the system clock jumps backward or forward. Does the function sleeps longer/shorter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a caller asks to sleep until an absolute time on a given clock and that clock then "jumps", the caller should still wake up when that absolute time has elapsed. For example, if the clock is set backwards after the caller blocks, then the caller should sleep for a longer duration because it will take a longer time for the clock to reach the chosen absolute value. This is not a violation of the caller's request: they asked to block until an absolute time, not to block for a given duration.
If sleeping a longer/shorter duration is unacceptable for a caller, then either (a) the caller should use a relative sleep, or (b) the caller should schedule the sleep against a clock that cannot "jump".
* On Unix, if ``clock_nanosleep()`` is not available, the absolute timeout | ||
is emulated using ``nanosleep()`` or ``select()``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you consider that it's important to document the implementation, I suggest to do it in a separated section, since it's unclear to me what's the relationship between the chosen implementation and inaccuracies. If you care about the theorical clock resolution, you should document it, it's not obviously to users what is the resolution of these functions.
self.assertGreater(delta, -0.050) | ||
# allow sleep_until to take up to 1s longer than planned | ||
# (e.g. in case the system is under heavy load during testing) | ||
self.assertLess(delta, 1.000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, such test tends to fail on busy buildbots. I would prefer to not test the "performance" (accuracy) of the function in such unit test.
I'm not sure that it's useful to write an unit test for this function. It's very hard to write a reliable on the clock accuracy.
What if the system clock changes during the test? :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree, and I actually already had to increase that final assertLess
because the test failed on a busy runner with a smaller value. When writing the test I took my cue from tests like test_monotonic
, but I'm also fine with simplifying the test case, let me know.
On Windows, an absolute due time doesn't change if the system time is adjusted. |
Sorry, it's unclear to me. It reminds me traveling in a flight trough different time zones which gives me deadheads... Let's say that I do:
During this sleep, the system clock is modified: either goes 30 seconds backward or forward. Does Python sleep 30, 60 or 90 seconds? |
A timer object is signaled as soon as the system time is greater than or equal to the scheduled time. If the system time is adjusted backward by 30 seconds, the sleep will be for 90 seconds. If the system time is adjusted forward by 30 seconds, the sleep will be for 30 to 60 seconds, depending on when the system time is adjusted. On the other hand, adjusting the system time also adjusts relative due times by the same amount. So |
@vstinner Thanks for all the feedback!
The Windows
And the Ubuntu Linux
I wrote a test that does a I ran this test on a Linux machine (4.15.0-206-generic #217-Ubuntu x86_64 GNU/Linux), a Linux VM (5.15.0-67-generic #74-Ubuntu x86_64 GNU/Linux), a Windows machine (Win 10 Pro 22H2 19045.2364), and a Windows VM (Win 10 Pro 22H2 19045.2486), and the results were the same on all four: When the system clock is adjusted backwards by 30 seconds, the total time slept by This confirms what the documentation says and what @eryksun wrote - it appears to be well-defined behavior on systems with
The way I look at it is that this is simply exposing a functionality already available in the OS's libraries, and in my comment above I name a couple of useful uses for it. Putting this function in an external module instead requires a ton of code duplication from * Anyway, one remaining question is of course what happens on *NIX systems that don't have
I hope you'll agree that As for your other comments that I haven't replied to above, I'll re-write the documentation once we've clarified the best way forward. |
This statement from the Windows documentation is vague and thus possibly misleading. Adjusting the system time actually has no effect on an absolute due time. If a timer is due at 17:00 hours, then the executive signals the timer as soon as the system time is greater than or equal to 17:00 hours, even if that's because the time was manually adjusted past the scheduled due time. It's actually the case, in terms of the implementation in the kernel, that relative due times have to be adjusted when the system time is adjusted. Internally, due times are stored as absolute times, except relative due times are flagged as such. When the system time is adjusted, the executive has to recompute all of the 'relative' due times by the adjusted amount. |
@eryksun Thanks for the clarification! |
Adds the `time.sleep_until` function, which allows sleeping until the specified absolute time.
Co-authored-by: Victor Stinner <vstinner@python.org>
Bump. Of the options I named above, I personally would probably prefer option 3, that is, simply don't provide |
.. function:: sleep_until(secs) | ||
|
||
Like :func:`sleep`, but sleep until the specified time of the system clock (as | ||
returned by :func:`time`). This can be used, for example, to schedule events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced from this documentation what is the benefit of using this function instead of existing code:
dt = deadline - time.time()
if dt > 0: time.sleep(dt)
How does it behave differently? If it does behave the same, the function has no benefit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is your point about improving the documentation or are you questioning the usefulness of the function in general? I'd be happy to improve the documentation if that's all that's missing, but I'd prefer to finalize the implementation first (see above).
The code you showed does differ from sleep_until
in at least one IMHO significant way: The fact that sleep_until
(when backed by the SetWaitableTimerEx
or clock_nanosleep
implementations) is sensitive to changes to the system clock is actually desirable in some situations. Imagine for example a measurement system consisting of multiple processes on a single machine, or spread across several machines, where time is kept synchronized by NTP, running for several days with a measurement interval on the order of minutes. On this timescale, differences in the clocks of a few ms don't matter, but if one of the measurement processes were to drift off relative to the others, as it could with the code you showed, that wouldn't be too good.
Other than that, for short measurement intervals, in theory the code you showed introduces a small offset due to there being several instructions in between taking the current time and scheduling the sleep, which may make a difference on a heavily loaded machine.
I'd love to see this merged. To implement a periodic task in a Python loop, one is currently forced to do a racy subtraction with the current time. This PR fixes that. To print 'A' at 100hz, one might naively write:
...but of course, that's not really going to yield a coherent 100hz. There are many applications where that matters, such as sampling a sensor to produce timeseries data. If you actually care, you're forced to do something like this:
...but that's inherently racy. If this proposal is implemented, that loop could become:
...which both simplifies the code, and eliminates the race condition. On platforms where there isn't support for TIMER_ABSTIME, it could fall back to the racy subtraction (which is what the user would've done anyway). |
I'm not convinced that the time.sleep_until() code respects 100 Hz if the clock jumps backward or forward. You should use time.monotonic() for that. time.sleep_until() uses the "realtime" clock which is affected by changes of a system administrator, NTP, or anything else. Example using import sys
import time
SEC_TO_NS = 10 ** 9
def sync_at_frequency(frequency):
if frequency < 1:
raise ValueError
delay = SEC_TO_NS // frequency
deadline = time.monotonic_ns() + delay
while True:
yield
dt = deadline - time.monotonic_ns()
if dt > 0.0:
time.sleep(dt / SEC_TO_NS)
deadline += delay
def main():
stdout = sys.stdout
i = 1
for _ in sync_at_frequency(10):
if (i % 10) == 0:
stdout.write(f'. {time.time()}\n')
else:
stdout.write('.')
stdout.flush()
i += 1
main() Output on Linux:
The precision is under 1 ms. |
Hum, another more complete example which injects a random sleep between 0 and 100 ms: import sys
import time
import random
SEC_TO_NS = 10 ** 9
def random_sleep():
# max: 100 millisecond
delay = 0.100 * random.random()
time.sleep(delay)
def sync_at_frequency(frequency):
if frequency < 1:
raise ValueError
delay = SEC_TO_NS // frequency
deadline = time.monotonic_ns() + delay
while True:
yield
dt = deadline - time.monotonic_ns()
if dt > 0:
time.sleep(dt / SEC_TO_NS)
deadline += delay
def main():
stdout = sys.stdout
i = 1
for _ in sync_at_frequency(10):
if (i % 10) == 0:
stdout.write(f'. {time.time()}\n')
else:
stdout.write('.')
stdout.flush()
# inject random delay to test sync_at_frequency() accuracy
random_sleep()
i += 1
main() Output on an idle Linux system:
I manually added some spaces in timing numbers for readability, to highlight the precision of 1 ms. You can see that there is a precision of 1 ms even with the random sleep. And the code is not affected by system clock changes. |
I can't believe you wasted time testing that... it was a toy example, you missed the point. It is common to sample things at rates greater than 100hz. Obviously, sampling at 10khz, 1ms of inaccuracy is substantial.... |
I also think you're missing that it is the sum of the error over time that matters: the timed sleep approach will drift in phase over long timescales. An abstime time sleep eliminates that possibility (if one assumes the local clock is accurate, of course...). If you care about this, you currently have to do all that ugly subtraction. You might argue the error will effectively cancel out over time, but that's not true at least on Linux because it strictly over-sleeps its timers. |
Re-reading this... I replied too quickly, I'm not sure which of these two things you're claiming:
I think (1) is just flat out wrong, for the reasons I explained above. If you're saying (2)... yes, that's true, I wasn't trying to claim it did. You could make the same argument against clock_nanosleep() supporting TIMER_ABSTIME in the first place. But it does exist, and given that it makes sampling code simpler, mathematically race free, and easier to read... why not include it? I suppose something being "ugly" is a poor argument... but the inherent race condition is very ugly IMHO. |
@vstinner Regarding both of the code examples you showed, imagine the kernel scheduler decides to hand a slice of time to another process right in between the calculation of the delay and the call to Just to summarize a few of the points I made above:
Are these not things that can be described in documentation? As I said, I'd be happy to rewrite it if the implementation is settled. @jcalvinowens https://pypi.org/project/sleep-until/, though of course this duplicates a bunch of code from
I actually argued against this above, instead I now think it is better to only provide |
|
You're proposing that I have to test for the existence of sleep_until before using it? That's uglier than the racy subtraction is IMHO, I would never use that API. I think it's clearly simpler on all fronts to fall back to the racy subtraction if TIMER_ABSTIME doesn't exist. The point of this change is simplifying code, not increasing precision. The increase in precision is entirely meaningless on intervals long enough they can reliably work on a general purpose operating system. I'm still not sure if it was the point @vstinner was trying to make, but if so I completely agree. What you're proposing (forcing the user to test for the existence of sleep_until) makes things more complicated, not simpler, IMHO. |
@vstinner Can I interpret your comment to mean that if the user had a choice between the two clocks in the @jcalvinowens I understand your reservations, but take a look at the whole thread; you'll see that before the comment I linked to, I proposed three different alternative solutions, was asking which one I should implement, and was just stating my preference for the third - I now see it as a case of "do one thing and do it well". There was also some discussion of user expectations.
The way you've worded this, I have to disagree. I feel like this amounts to telling the user "maybe we'll give you the accurate version of |
I read the whole thread, I just disagree with the reviewers. I started to work on a patch to do this, but then discovered yours, I'm trying to jump in here rather than fragment the discussion with another PR. Why not add
I understand where you're coming from. But it feels a bit circular to me: if no equivalent of ABSTIME exists, the user must necessarily emulate it themselves if they require it... but, because no ABSTIME exists, they have no way to do that, except by falling back to the racy math, themselves. I may be biased by my own usecase, but I think the vast majority of users would find it more convenient for the fallback behavior be part of the general API (not a hypothetical os.nanosleep() API, to be clear: that's different). Emulating it by racily checking the current time is, on average, far more precise than required. One only "loses" the race if the process is interrupted between
I'm sorry for being confusing about this: I was arguing that it is truly necessary to reference the current time if one cares about the phase of the timer expiry drifting, because the error accumulates. But it is not strictly necessary to have |
Just to drive this home: I think the precision is a red herring, and we shouldn't be discussing it at all. The problem I want to solve is that sleeping until a fixed time is a ubiquitous operation available in every programming environment I have ever worked in, except python. My approach was to implement In principle, I think the approach @haukex took is superior, because it's a more general portable API. But that generality benefit is lost if the function disappears on some platforms. |
@jcalvinowens Thanks for the feedback, I understand better now where you're coming from. (Though several people in this thread have commented on whether the use of the absolute time is the best idea...) I don't have the time to test right now, but I think that @vstinner As I said I'd be happy to look at implementing an option to As for my personal feelings on this, all of this discussion and potential issues mentioned above is driving me more towards the opinion of "just do one thing and do it well"... 😬 |
@vstinner Sorry for the multiple replies but I just realized your question actually has a really simple answer. (Also @Livius90 because you asked about the use of
Could you clarify "which has issues", is there anything else I didn't address in my research? |
I would prefer to only use a monotonic clock to sleep. If Windows doesn't support that, only add the function to operating systems supporting sleeping with an absolute time of a monotonic clock. |
If you ask to sleep 60 seconds, it should sleep 60 seconds: it should not sleep 30 or 90 seconds. That's a flaw in your API coming from the usage of the "realtime" clock (or called the "system" clock). I don't see the point of adding a new API with such flaw, whereas existing sleep(60) doesn't have such flaw. For me, it's a blocker issue. IMO better accuracy is not enough to justify adding such function if there is such flaw. |
Would you mind to elaborate what drift in my example and how?
Would you mind to elaborate which kind of code do you run every 10 us in Python? On which operating system? Which kind of accuracy do you expect? In Python, it's not easy to have an "effective" precision better than 1 ms. Python is not designed for hard real time, it has a "stop the world" GC. Just reading a clock can take 1 us: https://peps.python.org/pep-0418/#performance On Windows, time.monotonic() has a precision of... 15.6 ms, it's far from 1 ms (16x worse). |
@vstinner I don't follow your reasoning here:
If you want to sleep 60 seconds, that's what
Different task, different tool. |
@vstinner my later replies clarified everything you asked about... could you respond to those points please? |
The main UI issue I see with this function is that there are three slightly different use cases for it, and if the function is designed with one use case in mind, but the user expects another one, it's going to cause problems (and may be worse than if the interface didn't exist at all). The three use cases I see:
Writing it out like this, I think that #1 sounds like the most useful use case, but the current implementation actually gives you #2, which is a bit dangerous. I'm not sure why you'd want #2 other than to implement something like #3. I also think that if you are worried about phase drift accumulating in the case of #3, doing something like this should suffice: def sample_every(interval_in_ms: int, num_samples: n) -> Sequence[float]:
output = []
interval_in_ns = interval * 1_000_000
for _ in range(n):
# Find the next time the montonic clock will be an even multiple of the interval
next_sample = ((time.monotonic_ns() // interval_in_ns) + 1) * interval_in_ms
time.sleep(next_sample)
output.append(sample())
return output In this implementation, if you sample too fast, you'll probably miss some intervals, but the errors will not accumulate over time. I am not sure we could write a general function that handles this for everyone, though, because there are different trade-offs with different methods of writing a "clock" of this sort. Some people want to minimize the variation in the gap between samples (and thus would prefer to sample immediately if you miss a sample), while other people want to align as closely as possible to the clock even if it means missing samples. Other people might want a hybrid approach, where they are keeping track of phase shifts and switch strategies if the drift gets too high. I feel like with regards to #1 vs #2, it might work to have a
When I started writing this comment, I thought it was going to be generally in favor of this and the interface just needed to be clarified, but the more I think about this, the more complicated it feels. I'd be inclined to look at the PyPI package and see if it gets wide adoption, and among the people adopting it, how many people end up using it incorrectly 😅. |
@pganssle Thanks for the feedback!
Seeing how
After reading @jcalvinowens's arguments on this, I no longer think this is the best example of a use case for this function - you'd need an RTOS to get a clean 100+ Hz. Just to reiterate a real-world example - I've built several systems for work similar to this - imagine two Raspberry Pi's in different remote locations, just sitting out in a field somewhere, without any network connection but with a GPS receiver that provides a PPS signal. These RPis are to run for weeks and take simultaneous measurements every 15 minutes (or imagine any other time periods where you think clock drift can become significant). In this case you want
Sure, sleep-until works for now of course. It's just too bad I've had to duplicate so much code from |
This modifies
pysleep
to give it an "absolute
" argument, and adds atime.sleep_until()
function.(I haven't been able to test this on Windows yet, I hope the GitHub Actions will take care of that)✓