-
-
Notifications
You must be signed in to change notification settings - Fork 665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Lolex for tests involving time #3886
Conversation
744a98b
to
219bacb
Compare
Previous test breakage (only mostly due to late refactor) is now fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ray-kraesig! This will be an excellent upgrade for our ability to write good tests for code that relies on timers. Comments below.
src/utils/__tests__/async-test.js
Outdated
// If this stalls, Jest will report failure. | ||
expect(await expectation).toBeTrue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm interesting! What does this mean -- how does Jest decide that it's stalled, vs. that it just hasn't completed yet (but later will)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks.
Just "stalls" doesn't communicate that to me. How about "If this never completes, Jest will time out and report failure."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Expanded somewhat more than even that, including a documentation link.)
src/utils/__tests__/async-test.js
Outdated
// cannot be guaranteed: arbitrary delays are possible | ||
// expect(durationMs).toBeLessThan(10 * expectedMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a regrettable loss in test coverage: it means that if our sleep
acquired a bug where, say, we were confused about units and always slept for as many seconds as we'd intended milliseconds, it'd still pass the test. Is there a reason that switching to Lolex means we need to sacrifice our test coverage of that failure mode?
It's true that a sleep
in real life can't guarantee a timely wakeup. That's why the test only asks that the interval be within 10x of the target, which I think has been enough to make flakes nonexistent even on the very messy real systems that Travis CI runs our tests in. When we move to a spiffy mock setup where we have total control of the (virtual) environment, what I'd expect to happen is that the behavior only becomes more predictable, not less -- so that we could tighten this to like 1.01 * expectedMs
, and it would pass every time. That would be a very appealing kind of thing to be able to do when trying to test tricky timer-using code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be reasonable... so long as about half of this comment is in the code near the 10x test. Especially the bit where a 10x test is specifically intended to guard against a unit-scaling bug, rather than just being an unusually-large fudge factor.
(Done.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I think the origin of it really is as a large fudge factor. 🙂 It had been 10%, and in 935c1f5 we loosened it to 10x because that had flaked.
The possible unit-scaling bug is just a nice example of how some code like this plausibly could actually have a bug that this test with a large fudge factor would catch.
191eaf8
to
6634ecd
Compare
The new version has an additional final commit ( |
src/utils/__tests__/async-test.js
Outdated
// [α] In theory, we can't be sure of this test; the time between the | ||
// specified timeout expiring and a timeslice actually becoming available | ||
// may be arbitrarily long. (To simulate this with Lolex, one can set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true with Lolex? It seems like one of the great advantages of systematically mocking out the clock as well as timers, provided the mock implementation delivers this, is that we could expect the timers to behave ideally.
Which effectively means that for timer-intensive code like this, we'd have written much more of a true unit test than was ever possible with real clocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true with Lolex?
It's not – hence the trailing parenthetical. That could stand to be made more clear, though.
It seems like one of the great advantages of systematically mocking out the clock as well as timers, provided the mock implementation delivers this, is that we could expect the timers to behave ideally.
Alas, "ideally" contrasts with "in reality", making this simultaneously one of the great disadvantages of mocking out the clock. We don't have any simple way to ensure that our code behaves properly under real-world conditions like load-induced greater-than-epsilon timeslice delays.
Which effectively means that for timer-intensive code like this, we'd have written much more of a true unit test than was ever possible with real clocks.
That's true, but it's not an unalloyed good. 😞 An integration test may well be more informative than a unit test.
src/utils/__tests__/async-test.js
Outdated
const FUDGE_FACTOR = 1.01; // see discussion of [α], below | ||
lolex.advanceTimersByTime(expectedMs * FUDGE_FACTOR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The FUDGE_FACTOR here seems like it makes this test loose in a way that it isn't now. I still don't see why any tradeoff like that should be a consequence of adopting a nice spiffy mock setup, where we get to control the clock's behavior in a way we didn't have before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite the opposite: it's intended to replicate the looseness that is present now. (Clarifying the preceding would also clarify this, I expect.)
I suppose we could have a separate test that's "sharp", so to speak. I'm not sure it would be terribly informative, though. (EDIT: I have been convinced otherwise of that last.)
src/utils/__tests__/async-test.js
Outdated
// `sleepPromise` should have set up its timer before returning control. | ||
expect(lolex.getTimerCount()).toBe(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something this tests about the function's behavior beyond what's tested below? This feels to me like it's only poking into the specifics of the function's implementation. (What if there were two timers involved? Not super likely for this particular function, given the simplicity of its spec... but I don't see how it'd be the caller's business if there were.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was intended not as a test of sleep
's behavior, but as a metatest of sleepPromise
's behavior (resulting from an actual failure in its original implementation).
It's probably not necessary.
src/utils/__tests__/async-test.js
Outdated
const sleepAsync = async () => { | ||
const start = Date.now(); | ||
await sleep(expectedMs); | ||
const actualMs = Date.now() - start; | ||
|
||
return actualMs; | ||
}; | ||
const sleepPromise: Promise<number> = sleepAsync(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this simplification, thanks!
Can we make this another step more straightforward from here? I think it'd be helpful to basically inline this single-use function, like so:
const start = Date.now();
const sleepPromise: Promise<void> = sleep(expectedMs);
// ...
await sleepPromise;
const actualMs = Date.now() - start;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's very much less straightforward, not more – it mixes the two control paths, and in subtly fragile ways.
In particular, sleepPromise
will hang unless you advance the timer before await
ing on it, and doing so would make the computation of const actualMs
[0] meaningless.
src/__tests__/metatests.js
Outdated
test('timers fail on overflow', () => { | ||
expect(() => { | ||
setTimeout(() => {}, 10e10); | ||
}).toThrow(); | ||
|
||
expect(() => { | ||
setInterval(() => {}, 10e10); | ||
}).toThrow(); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 10e10 in these tests and not something close to the required threshold, like 3e9 or 2**31?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like 10e10 more than 3e9. That's about it, really.
It's probably worth adding tests that are just barely on either side of the limit, as well as versions with NaN
and Infinity
.
src/__tests__/aux/lolex.js
Outdated
// In Node.js, a timeout or interval of greater than INT32_MAX is reduced to | ||
// exactly 1. (In browsers, and probably in React, it is instead truncated | ||
// to an int32 value.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"in React" is a confusing expression here -- React also runs in browsers :-) and I think also in Node. I think you probably mean "React Native"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... yes. 🤦♂️
src/utils/__tests__/async-test.js
Outdated
// If this stalls, Jest will report failure. | ||
expect(await expectation).toBeTrue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks.
Just "stalls" doesn't communicate that to me. How about "If this never completes, Jest will time out and report failure."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ray-kraesig for the revisions!
The added check in the new commit looks fine to me. Some comments below on that and the others. I think, zooming out a bit, the main concern I have at this point is:
- In master, the
sleep
test is quite simple, just a few straightforward lines, for testing a simple function. - With these changes, it gets a lot more complex -- many times longer, with several additional wrinkles. It also becomes imprecise on a check that the existing version makes precisely (and doesn't materially tighten the other, quite loose, existing check.)
- That doesn't seem like a sacrifice I'd expect to have to make in order to use this nice mocking library. Sure, a few lines of setup, and perhaps a line here and there in the test to drive the mocks forward; but greater control ought to mean the rest of the test logic just gets simpler and more precise, not the reverse of those.
At present, it's not possible (or at least not easy) to test any code which relies on `Date.now()` for timing purposes. Timer functions such as `setInterval()` are mockable with Jest's existing mock-timer functionality, but its mock timers do not include a mock time. [0] Fortunately, there is work underway to have Jest v25^H^H26 use the Lolex time-mocking library for its fake timer implementation, which will do exactly that! Unfortunately, we need that capability now, rather than in the indefinite future, so we'll jump the gun a bit. To ease conversion of tests once Lolex is available natively, we use a small shim taken from the code on Jest's current master branch which implements Jest's timer-control functions (not projected to change) in terms of Lolex. Tests for this shim -- which may also serve, in part, as examples for Zulip contributors who would like to use Lolex -- are included. (Additionally, inform Jest that files present in src/__tests__/aux are not supposed to contain executable tests.) [0] S2E1, 1967-09-15. GitHub-PR: #3886
Most of these commits LGTM, including the main one, so I've gone and merged them! Thanks again. dd53b0d eslint: Extend test-file import permissions to subdirectories. That leaves two commits in progress: the |
6634ecd
to
f611e1b
Compare
The previous muddled Lolex-based An optional final refactoring commit has also been included. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ray-kraesig for these revisions! I like the new structure of a unit test plus an integration test (the latter being what we have now), and the new shape of the unit test.
Just one item below, which I think will be easy to fix.
|
||
test('waits for exactly the right number of milliseconds', async () => { | ||
const expectedMs = 1000; | ||
const sleepPromise: Promise<number> = sleepMeasure(expectedMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line appears in a commit where sleepMeasure
hasn't yet 🙂
Maybe put the "factor out sleepMeasure" commit earlier, with just one callsite, and then add this new callsite right after that? Or other sequences would be good too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restored(?) to what I thought it was originally: sleepMeasure
created in this commit, within the unit-test's describe
block.
"Lolex" !== "lolex".
The second `expect` test is theoretically unsound, and the reason for a fudge-factor of 10, specifically (as opposed to other possibly- interesting magic numbers like 2 or 137.036) is not obvious. Add a justification and explanation. Additionally, rename `durationMs` to more clearly contrast with `expectedMs`.
Test that `sleep` has ideal behavior in the presence of an ideal clock. The original test remains in place, to test that `sleep` also has reasonably robust behavior in the presence of a non-ideal clock.
Lolex follows Node's behavior in (silently!) limiting timer durations to a maximum of INT32_MAX (2^31 - 1) milliseconds (or approximately 24.86 days), substituting a timeout of 1ms on overflow. Browsers have subtly different behavior here, as they truncate to 32 bits, rather than substituting. React Native's JavaScript engine probably follows browser behavior rather than Node's, but it hardly matters -- if we issue a timer with a duration over the limit, even in tests, we're almost certainly not going to get the expected behavior. Wrap Lolex's overrides of setTimeout and setInterval to complain if a timer is ever created with an incompatibly-long timeout duration.
Factor `sleepMeasure` out of both the "ideal" and "real" tests for `sleep`.
f611e1b
to
16c1687
Compare
And merged. Thanks again! |
Separated out from #3727, with some improvements.