Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve & simplify time sync in pthreads #18267

Merged
merged 4 commits into from
Dec 2, 2022
Merged

Conversation

RReverser
Copy link
Collaborator

@RReverser RReverser commented Nov 29, 2022

This supersedes solution from #6084 / #10137 with performance.timeOrigin.

This property provides the high resolution baseline for performance.now() results, which is exactly what we need. Unlike custom __performance_now_clock_drift sent as a message to a worker, it won't suffer from the small time difference incurred by postMessage itself, making synchronisation more precise, while at the same time allows to remove some custom code, making it a bit simpler too.

I checked caniuse.com and it's supported in all browsers where SAB are supported, so should be always safe to use when pthreads are enabled.

Note that this changes the absolute value returned from emscripten_get_now, but this is fine because that value is already different across engines (e.g. it might already return Date.now() in some old browsers), and it's explicitly documented as "only meaningful in comparison to other calls to this function".

I verified that test_pthread_reltime still passes too.

@RReverser RReverser requested review from kripken, juj and sbc100 November 29, 2022 16:35
Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantasitic!

@sbc100 sbc100 enabled auto-merge (squash) November 29, 2022 16:48
@RReverser RReverser force-pushed the better-worker-time-sync branch from b67345b to 10151d0 Compare November 29, 2022 16:54
@RReverser
Copy link
Collaborator Author

Just in case - even if tests pass, want to get @kripken and/or @juj to take a look as well since they worked on the original implementation and might know if I missed some edge case.

@sbc100 sbc100 disabled auto-merge November 29, 2022 17:29
@RReverser
Copy link
Collaborator Author

RReverser commented Nov 29, 2022

Hm, I can't reproduce that wasm2js failure locally:

> python .\test\runner.py wasm2js1.test_pthread_abort
Test suites:
['test_core']
Running test_core: (1 tests)
(checking sanity from test runner)
shared:INFO: (Emscripten: Running sanity checks)
test_pthread_abort (test_core.wasm2js1) ... ok

----------------------------------------------------------------------
Ran 1 test in 1.725s

OK

And I don't see how it could be related either. Flaky test?

@sbc100
Copy link
Collaborator

sbc100 commented Nov 29, 2022

Hm, I can't reproduce that wasm2js failure locally:

> python .\test\runner.py wasm2js1.test_pthread_abort
Test suites:
['test_core']
Running test_core: (1 tests)
(checking sanity from test runner)
shared:INFO: (Emscripten: Running sanity checks)
test_pthread_abort (test_core.wasm2js1) ... ok

----------------------------------------------------------------------
Ran 1 test in 1.725s

OK

And I don't see how it could be related either. Flaky test?

Yup, its a known flaky test: #15014

@RReverser
Copy link
Collaborator Author

Ah thanks. So weird, it doesn't seem to rely on timing in any way :/

@RReverser
Copy link
Collaborator Author

And now it's bunch of pthread tests on Firefox instead :/ I guess one of you will have to force-merge this manually once reviews are done.

@sbc100
Copy link
Collaborator

sbc100 commented Nov 29, 2022

And now it's bunch of pthread tests on Firefox instead :/ I guess one of you will have to force-merge this manually once reviews are done.

You should also be able to "re-run from failed" in circleci

@RReverser
Copy link
Collaborator Author

You should also be able to "re-run from failed" in circleci

Ah right. I think I tried that in the past on a different PR, but was just getting different set of failing tests each time. Might give it a go though.

@RReverser
Copy link
Collaborator Author

Yup, re-run didn't help. I sometimes get a feeling that once CI runs into this bad state, ~same pthread tests start falling more or less consistently.

@brendandahl
Copy link
Collaborator

I just ran into pthreads failing in #18252 too. Re-run fixed it for me though.

@sbc100
Copy link
Collaborator

sbc100 commented Nov 29, 2022

Re-running here also seems to have fixed it.

@sbc100
Copy link
Collaborator

sbc100 commented Nov 29, 2022

@kripken can you take a look?

@RReverser
Copy link
Collaborator Author

Re-running here also seems to have fixed it.

Heh so it's the 3rd rerun that does the charm. (I also did re-run above, but it didn't help.)

This supersedes solution from #6084 / #10137 with `performance.timeOrigin`.

This property provides the high resolution baseline for `performance.now()` results, which is exactly what we need. Unlike custom `__performance_now_clock_drift` sent as a message to a worker, it won't suffer from the small time difference incurred by `postMessage` itself, making synchronisation more precise, while at the same time allows to remove some custom code, making it a bit simpler too.

I checked caniuse and it's supported in all browsers where SAB are supported, so should be always safe to use when pthreads are enabled.

Note that this changes the absolute value returned from emscripten_get_now, but this is fine because that value is already different across engines (e.g. it might already return Date.now() in some browser), and it's explicitly documented as "is only meaningful in comparison to other calls to this function".

I verified that test_pthread_reltime still passes too.
@RReverser RReverser force-pushed the better-worker-time-sync branch from fd72b36 to 6837754 Compare November 30, 2022 12:51
@RReverser
Copy link
Collaborator Author

Rebased & moved changelog item to 3.1.28 now that 3.1.27 is released.

@RReverser
Copy link
Collaborator Author

@kripken Ping? This is mostly a trivial change but would like another pair of eyes.

@juj
Copy link
Collaborator

juj commented Dec 2, 2022

Ops, sorry, missed the ping from before.

Unfortunately Safari only has support for performance.timeOrigin since about a year ago starting from Safari 15, so this change would regress compatibility with earlier Safari versions?

Would it be possible to add a polyfill for browsers that don't support timeOrigin? That way there won't be an impact to browser support.

@juj
Copy link
Collaborator

juj commented Dec 2, 2022

When polyfilling is not needed, this line

msg.time = performance.now();
has then become obsolete, and can be removed when only targeting browsers that do support performance.timeOrigin.

@juj
Copy link
Collaborator

juj commented Dec 2, 2022

Actually, I now realize that only Safari 15.2 adds support for multithreading, so there is no need to polyfill then. LGTM by removing the above redundant line msg.time = performance.now(); as well.

@RReverser
Copy link
Collaborator Author

Actually, I now realize that only Safari 15.2 adds support for multithreading

Yup, I was about to add my own polyfill too but then checked caniuse, as mentioned in the PR description, and found both are in the same version.

@RReverser
Copy link
Collaborator Author

LGTM by removing the above redundant line msg.time = performance.now(); as well.

Hm, I think that's the one I already removed in this PR?

@RReverser
Copy link
Collaborator Author

@juj Thanks for taking a look, sounds like this should be good to go then.

@RReverser RReverser enabled auto-merge (squash) December 2, 2022 14:35
@juj
Copy link
Collaborator

juj commented Dec 2, 2022

Yeah, looks great!

@RReverser
Copy link
Collaborator Author

Ughhh these flaky tests are driving me crazy :( @sbc100 it seems to be always the same 3 that are flaky in FF, maybe we can at least disable them for now?

@RReverser RReverser merged commit dd96490 into main Dec 2, 2022
@RReverser RReverser deleted the better-worker-time-sync branch December 2, 2022 16:00
@sbc100
Copy link
Collaborator

sbc100 commented Dec 2, 2022

Ughhh these flaky tests are driving me crazy :( @sbc100 it seems to be always the same 3 that are flaky in FF, maybe we can at least disable them for now?

If you have identified them could you open a bug and we can track the flakeyness. There is a tag you can use on the issue for flakey tests.

@RReverser
Copy link
Collaborator Author

There is a tag you can use on the issue for flakey tests.

Found it. #18307

@sbc100
Copy link
Collaborator

sbc100 commented Feb 16, 2023

Apparently this change causing some safari-specific issues with SDL: #18677

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants