user activation tests might be racy #36221

marcoscaceres · 2022-10-03T09:41:25Z

@mustaqahmed, I noticed that some tests in #35836 are missing await on calls to test_driver.click(). I want to confirm to it won't be an issue?

I think we should add await to those tests regardless... if those tests get updated, that might be overlooked leading to problems.

The text was updated successfully, but these errors were encountered:

mustaqahmed · 2022-10-03T14:43:00Z

Most of these tests respond to test_driver.click() action through the click event received in JS, so they wait for the click to happen anyways. That said, just now I looked into the click flow in these tests and felt like using await would make the test more readable in some cases (e.g. chained-setTimeout.tentative.html).

As for racy behavior, in Chrome the tests with sub-frames (those containing "sameorigin" and "crossorigin" in the names) are marked as flaky because of possible delays in loading hit-test data for a subframe. I suspect Test Driver can't help here because it should dispatch the click to whichever element is "ready", but I could be wrong here.

marcoscaceres · 2022-10-05T00:25:16Z

The chained-setTimeout test is also flaky on the WebKit side... so I'm wondering if we really need that test? or can we refactor it in some other way so it's not setTimeout dependent?

The transient activation duration is UA-determined, so I'm wondering if it's worth testing that at all?

mustaqahmed · 2022-10-05T14:19:55Z

This test is about setTimeout chaining, and the total delay tested here is small (30ms = 3 call depths of 10ms each). Can we test if this delay is the root cause of the flake? I would check if the flakiness "score" gets better if the delay is reduced to, say, 3*1ms.

Also, please let me know if the flakiness happens at an assert before or after the click, to determine if event targeting could be a probable cause.

marcoscaceres · 2022-10-06T02:53:41Z

The "consumption-sameorigin.html" appears to be super flaky too. Here are two test runs (repeat 20 times):

Running 1 WebKitTestRunner.

[2/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[7/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[10/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[14/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[16/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)

Running 1 WebKitTestRunner.
[1/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[7/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[10/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[13/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[16/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[18/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)
[19/20] imported/w3c/web-platform-tests/html/user-activation/consumption-sameorigin.html failed unexpectedly (text diff)

I think loading both iframes is going to lead to the above. They should be loaded in order or in some more predictable manner.

marcoscaceres · 2022-10-06T03:37:21Z

Sent #36304 to fix "consumption-sameorigin.html"

marcoscaceres · 2022-10-06T03:41:48Z

Also, please let me know if the flakiness happens at an assert before or after the click, to determine if event targeting could be a probable cause.

They are just timing out and not finishing to run. I'll need to investigate further where they are getting stuck.

marcoscaceres · 2022-10-06T04:16:55Z

This test is about setTimeout chaining, and the total delay tested here is small (30ms = 3 call depths of 10ms each). Can we test if this delay is the root cause of the flake? I would check if the flakiness "score" gets better if the delay is reduced to, say, 3*1ms.

I still don't think this test is correct or very useful, to be honest... setTimeout is for a at least N ms, but that's not guaranteed to be exactly N (i.e., it could be a lot more, depending on testing infrastructure, virtualization, etc.). So, while transient activation should last some amount of time, it's up to the user agent to decide how much time that is (for testing purposes, it could be 0ms or it could be seconds, but that's totally left up to the user agent).

These tests assume that it will be at least 30ms (it's not a bad assumption!), but it's not a correct assumption either as it's left to the user agent.

mustaqahmed · 2022-10-06T14:39:05Z

The test is about chaining setTimeout calls (one setTimeout calls another recursively) because this particular situation didn't work in many browsers I tested few years ago (including Chrome before it shipped UAv2). This is a real case we need to test, so avoiding the chaining of setTimeout calls (as suggested in the PR) is not an option, sorry!

The exact timing is not important so I suggested reducing the delay to 1ms*3 (I would avoid 0ms*3 in case UAs have special handling for this case). I agree that the user agent can choose to arbitrarily extend any time-limit but it sounds like a very extreme case if main thread unresponsiveness causes the 30ms or 3ms delay getting extended to 5 seconds---I believe a lot of tests would fail in that case.

They are just timing out and not finishing to run. I'll need to investigate further where they are getting stuck.

Finding out where the test JS gets stuck would be helpful. Please keep me posted.

marcoscaceres · 2022-10-06T22:04:02Z

This is a real case we need to test, so avoiding the chaining of setTimeout calls (as suggested in the PR) is not an option, sorry!

No problem. I'm just trying to understand.

I'm still not following how chaining/recursive calls to setTimeout() would affect transient activation? My understanding from the code (and the refactor) is that the tests are just checking:

before click + some time N: does it have transient activation? (yes/no)
.click()
After click + some time N: does it have transient activation? (yes/no) (3 times)

What am I missing?

marcoscaceres · 2022-10-11T03:00:35Z

@mustaqahmed about the general issue with these tests, the problem (for webkit at least) is that the body doesn't appear to be ready before the tests run. This appears to be problem because the <body><script> pattern these tests are using.

A general mitigation strategy that's worked for me so far has been to run them onload. However, I just realized (but haven't tested) that adding defer on the scripts might also make them work.

For a bit more context, we are implementing UserActivation in WebKit, which is why were are interested in all these tests right now.

mustaqahmed · 2022-10-11T15:15:26Z

I'm still not following how chaining/recursive calls to setTimeout() would affect transient activation?

You are perfectly on track: this shouldn't affect any activation states because UAv2 defines them simply as a state of the Window. Before UAv2 this was not the case, and the browsers had to pass a "token" (as per the old spec) across setTimeout calls or Promise chains in a limited manner to prevent possible token replication. This test will catch that particular limitation of the old model with setTimeout. (And we still need a similar test for Promise chains!)

... the body doesn't appear to be ready before the tests run.

In Chrome, we have the same challenge in subframes but the top-frame seems fine with just <body onload=...>. Does adding one or two requestAnimationFrame() calls after onload=... help?

For a bit more context, we are implementing UserActivation in WebKit, which is why were are interested in all these tests right now.

We sensed that from your first comment here, and we are looking forward to a solid interop story here 😃! Thanks for cc-ing me in discussions that may benefit from our past impl experience.

marcoscaceres · 2022-10-12T00:51:48Z

Ah, ok. Now it makes sense! Thank you! 🙏

When implementing (and having not been part of the UAv1 discussions), I was really confused when I saw this test as it didn't match anything in the spec.

I totally get that we might want to have regression testing in WPT, but I wonder if we should move things like this to something like a legacy/ folder instead?

I know this test doesn't break anything, but it was quite confusing to try work out what it was doing and why.

mustaqahmed · 2022-10-13T15:48:09Z

I am a bit reserved about a "legacy" folder testing current expectations. We need to avoid any misinterpretation that these are "old behavior" or "failures here are not that important".

The fact that this test (and similar ones based on Promises etc---which I am hoping to add one day 🤞🏼) catches a non-compliant implementation (like pre-UAv2 Chrome), does a folder named "impl-compliance" make more sense? I know this is not a great suggestion 😞.

marcoscaceres · 2022-10-15T23:36:27Z

I've seen "legacy-" used a lot in various filenames. See:

find . -name "*legacy-*"

Maybe we can just prefix them with that?

mustaqahmed · 2022-10-20T14:37:28Z

Thanks for the suggestion to check existing "legacy" tests. I just sampled uievents/legacy* folders, and their change-logs seem to suggest they are old tests (I checked this and this). Most likely because of this, these folders doesn't appear in wpt.fyi dashboard.

The tests we are discussing here are asserting behaviors that are unsupported by the old user activation spec. So "legacy" in the name would be misleading IMHO.

marcoscaceres · 2022-11-23T09:02:14Z

Closing, as we addressed the missing awaits. The other problem was WebKit's infrastructure not supporting hosts[www] and friends, but only hosts[alt] and just {host}.

Thanks again for your help and time @mustaqahmed! Really appreciate your patience while I (continue to) get up to speed.

marcoscaceres assigned marcoscaceres and mustaqahmed Oct 6, 2022

marcoscaceres mentioned this issue Oct 7, 2022

Simplify chained-setTimeout.html #36305

Closed

marcoscaceres closed this as completed Nov 23, 2022

CanadaHonk mentioned this issue Oct 19, 2023

User activation tests are flaky #42629

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

user activation tests might be racy #36221

user activation tests might be racy #36221

marcoscaceres commented Oct 3, 2022

mustaqahmed commented Oct 3, 2022

marcoscaceres commented Oct 5, 2022

mustaqahmed commented Oct 5, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022 •

edited

Loading

mustaqahmed commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 11, 2022 •

edited

Loading

mustaqahmed commented Oct 11, 2022

marcoscaceres commented Oct 12, 2022

mustaqahmed commented Oct 13, 2022

marcoscaceres commented Oct 15, 2022

mustaqahmed commented Oct 20, 2022

marcoscaceres commented Nov 23, 2022 •

edited

Loading

user activation tests might be racy #36221

user activation tests might be racy #36221

Comments

marcoscaceres commented Oct 3, 2022

mustaqahmed commented Oct 3, 2022

marcoscaceres commented Oct 5, 2022

mustaqahmed commented Oct 5, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022 • edited Loading

mustaqahmed commented Oct 6, 2022

marcoscaceres commented Oct 6, 2022

marcoscaceres commented Oct 11, 2022 • edited Loading

mustaqahmed commented Oct 11, 2022

marcoscaceres commented Oct 12, 2022

mustaqahmed commented Oct 13, 2022

marcoscaceres commented Oct 15, 2022

mustaqahmed commented Oct 20, 2022

marcoscaceres commented Nov 23, 2022 • edited Loading

marcoscaceres commented Oct 6, 2022 •

edited

Loading

marcoscaceres commented Oct 11, 2022 •

edited

Loading

marcoscaceres commented Nov 23, 2022 •

edited

Loading