-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix flaky consumption-sameorigin.html iframe loading #36304
Conversation
} else if (msg.type.endsWith("-report")) { | ||
if (--num_children_to_report == 0) | ||
finishReportPhase(); | ||
} | ||
}); | ||
async function createIframes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really improve the flakes? I am asking because the block in Line 46-55 already handles the case added here: the child frames pings the top frame when they see load
events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. The problem is that the (html) iframes were randomly in different order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In L85-87, the "load phase" ends after all 3 frames are loaded in any order.
child1.src = "resources/child-one.html"; | ||
child1.id = "child1"; | ||
document.body.appendChild(child1); | ||
await new Promise((resolve) => (child1.onload = resolve)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Is iframe.onload
event really tied to child frame's window.onload
? The normative note at the end of this spec section seems to suggest we have to wait explicitly for child frame's window.onload
, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be more correct, yes. However, my testing this on a few hundred runs seems to confirm it's fine (all that is needed here is that the child1 loads first).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're misreading that note @mustaqahmed. That's just about event propagation. iframe
's load event will fire after its inner document load
event. (In general I don't recommend reading the UI Events spec for this as it doesn't define dispatching. I'm somewhat surprised the load
event is still there.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Questions for @annevk: Does the dispatching algorithm enforce a "load" event order across a frame boundary (even with a possibly slow cross-origin frame)?
I am asking because a "yes" here means the test could be simpler because it would be sufficient to wait for only the top window.onload
(the top window.onload
would follow the iframe.onload
event which would follow inner window.onload
event). Is this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mustaqahmed, can you help me understand what cause the consumption of the user activation here?
I'd still prefer not to rewrite the test if I can avoid it.
However, if I flatten the test to:
async function runTest() {
promise_test(async (t) => {
const child1 = document.getElementById("child1");
const child1Activation = child1.contentWindow.navigator.userActivation;
const child2 = document.getElementById("child2");
const child2Activation = child2.contentWindow.navigator.userActivation;
const grandChild = child2.contentDocument.getElementById("grandchild");
const grandChildActivation = grandChild.contentWindow.navigator.userActivation;
assert_false(navigator.userActivation.isActive, "Parent frame isActive initial state");
assert_false(navigator.userActivation.hasBeenActive, "Parent frame hasBeenActive initial state");
assert_false(child1Activation.isActive, "Child1 frame isActive initial state");
assert_false(child1Activation.hasBeenActive, "Child1 frame hasBeenActive initial state");
assert_false(child2Activation.isActive, "Child2 frame isActive initial state");
assert_false(child2Activation.hasBeenActive, "Child2 frame hasBeenActive initial state");
assert_false(grandChildActivation.isActive, "Grandchild frame isActive initial state");
assert_false(grandChildActivation.hasBeenActive, "Grandchild frame hasBeenActive initial state");
await test_driver.bless("click", null, child1.contentWindow);
assert_true(navigator.userActivation.isActive, "Parent frame isActive after click");
assert_true(navigator.userActivation.hasBeenActive, "Parent frame hasBeenActive after click");
assert_true(child1Activation.isActive, "Child1 frame isActive after click");
assert_true(child1Activation.hasBeenActive, "Child1 frame hasBeenActive after click");
assert_true(child2Activation.isActive, "Child2 frame isActive after click");
assert_true(child2Activation.hasBeenActive, "Child2 frame hasBeenActive after click");
assert_true(grandChildActivation.isActive, "Grandchild frame isActive after click");
assert_true(grandChildActivation.hasBeenActive, "Grandchild frame hasBeenActive after click");
});
}
I get different results? I can't see anything in the iframes that consume the user activation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get different results? I can't see anything in the iframes that consume the user activation?
We shouldn't get a different result, because a consumption call affects the whole frame tree.
Could you please elaborate which assertions are different between flattened vs un-flattened frame load code?
Is there a chance test_driver implementation is causing the flake here? This same-origin test is not flaky in Chrome's internal bots, but the cross-origin one is flaky and we can't fix that because of a test_driver impl limitation.
In any case, I think a virtual meeting would be more efficient here to share our thoughts. Let's plan one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't get a different result, because a consumption call affects the whole frame tree.
Oh, wait! But the test again does something non-standard:
window.open().close();
That shouldn't consume the user activation: at least, there is nothing in HTML saying that either window.open()
or window.close()
consumes it.
The flattened version should be this (all false, click(), all true):
async function runTest() {
promise_test(async (t) => {
const child1 = document.getElementById("child1");
const child1Activation = child1.contentWindow.navigator.userActivation;
const child2 = document.getElementById("child2");
const child2Activation = child2.contentWindow.navigator.userActivation;
const grandChild = child2.contentDocument.getElementById("grandchild");
const grandChildActivation = grandChild.contentWindow.navigator.userActivation;
assert_false(navigator.userActivation.isActive, "Parent frame isActive initial state");
assert_false(navigator.userActivation.hasBeenActive, "Parent frame hasBeenActive initial state");
assert_false(child1Activation.isActive, "Child1 frame isActive initial state");
assert_false(child1Activation.hasBeenActive, "Child1 frame hasBeenActive initial state");
assert_false(child2Activation.isActive, "Child2 frame isActive initial state");
assert_false(child2Activation.hasBeenActive, "Child2 frame hasBeenActive initial state");
assert_false(grandChildActivation.isActive, "Grandchild frame isActive initial state");
assert_false(grandChildActivation.hasBeenActive, "Grandchild frame hasBeenActive initial state");
await test_driver.bless("click", null, child1.contentWindow);
assert_true(navigator.userActivation.isActive, "Parent frame isActive after click");
assert_true(navigator.userActivation.hasBeenActive, "Parent frame hasBeenActive after click");
assert_true(child1Activation.isActive, "Child1 frame isActive after click");
assert_true(child1Activation.hasBeenActive, "Child1 frame hasBeenActive after click");
assert_true(child2Activation.isActive, "Child2 frame isActive after click");
assert_true(child2Activation.hasBeenActive, "Child2 frame hasBeenActive after click");
assert_true(grandChildActivation.isActive, "Grandchild frame isActive after click");
assert_true(grandChildActivation.hasBeenActive, "Grandchild frame hasBeenActive after click");
});
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I think a virtual meeting would be more efficient here to share our thoughts. Let's plan one.
Happy to chat, but I think the biggest points of contention are that we don't have a consistent way of consuming the user activation.
So far, the tests have used two non-standard ways to try to consume it:
- requestFullscreen() + exitFullscreen()
- window.open() + window.close()
We should also avoid window.open()
tests. They don't work well on mobile devices.
I'm thinking we should remove or "-tentative" such tests until we have an actual solution to prevent codifying non-standard behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kindly request if we can just deal with the flakiness issue, as this is blocking me from landing the API implementation in WebKit. If the tests do need to be refactored more, let's please do that as a followup as I'm spending more cycles on this than I have available. I'd also appreciate more help resolving these issues with direct JavaScript code contributions/suggestions as it's really delaying things 🙏 |
I don't think it's good to rush these things. If a test is flaky in one browser and not another, that could indicate a problem with the test, but it could also indicate a problem with the browser in which it's flaky. Quickly changing the test to eliminate the flakiness without investigating the situation more holistically is not a good, as it may hide relevant bugs. For example, it's possible Safari has an issue here with iframes and load events, or with user activation propagation when multiple iframes are loaded, which we would not want to hide. |
That said, after looking further it seems like maybe this test is flaky in Chrome too, in which case everything I said above is inapplicable. (Or maybe it's only flaky after the changes in this PR?) |
I agree. I did investigate this one and it's why I proposed the change. This test was always reporting one iframe winning over the other, which made it flaky. In WebKit, at least, my change fixes the problem by loading the iframes sequentially. I ran this test a lot (and also why I don't want to go poking at the test's internals).
I honestly don't think so. My testing is showing that the two frames can load at different times. If you take a look you will see that what I mean: there is literally two |
Having said that, if we can get rid of the message passing from the same origin tests, I really do think we should do that. These tests are extremely difficult to debug on mobile. |
If that helps you, please go ahead. But I still believe frame loading order is not the cause of the flakiness you are seeing because the test waits for all three iframes to load. So I am more interested about which assertions change for you when you flatten, see my comment above. In any case, I would love to discuss over a VC. |
yes, but those report in different order. I'm literally seeing the following:
And then occasionally:
In the output. I don't know how much more evidence you need? It's the iframes loading out of order. This is why I'm adding the iframes sequentially - so the above doesn't happen. |
Oh! This is the WebKit testrunner bug! Where they can't deal with tests that succeed in a different order! The WPT project contract does not require that tests all be run in the same order each time. That's a WebKit-specific thing, because they have This is also why it's so confusing that Marcos keeps using the word "flaky"! "Flaky" generally doesn't mean "tests run in a different order". It means "tests sometimes succeed and sometimes fail". But sometimes people from WebKit get confused and call tests that run in a different order, with the same results each time, "flaky". In the past we've accepted changes to accomodate this WebKit-ism, as long as they don't change the test very much. (Similar to how WebKit keeps changing tests to not use the WPT server featuers like I haven't look at whether the change here is minimal or problematic, but at least now we know what's going on... |
Yes, sorry that was not clear: I was under the mistaken assumption that Chromium also used -expected.txt files.
Correct. But it's not harming the test to change it in the manner I proposed.
Sorry, yes. That's marked as flaky on our infrastructure: a test that randomly fails is "flaky" to WebKit folks.
Correct 😊
Yes, this is why I'm asking for very small changes. Same with the
Right, but see also the use of (same with requestFullscreen() + exitFullscreen()... if we want to use that as a means of consuming user activation, we should have that standardized - or come up with something better that consumes it explicitly and directly) |
We are on the same page on this, that's why I commented to "go ahead". But the source of the flake was a mystery to me, and I wanted to see a solution that works for the cross-origin case too. I am seeing the full picture now from the two most recent comments: in WebKit the asserts are not failing but the order of the
Assuming my last comment here is correct, let me suggest an alternative change that would work even for the cross-origin case: move all "initial state"
etc. Similarly, move all "final state" I still hope (like @domenic) that WebKit's test order rigidness would be fixed one day, but let's move forward here.
Without a standardized solution here, I had to add stop-gap measures like these 😢 because we can't possibly standardize any user-activation gated APIs (including their consumption behavior) before standardizing the core user activation model itself! We needed to start standardizing the chicken or the egg. If you see a better way (including making it conditional based on the user agent), let's discuss it in a separate issue. We (in Chrome) know how frustrating and time-consuming every minor failure in user activation tests might feel---we have gone through this! Let's continue our frank discussion here and elsewhere so that we can avoid revisting the same pain points! |
That is correct. Sorry again I didn't communicate what was happening clearly. My bad.
Me too. Though, at the same time, I can appreciate the reproducibility that webkit's CI requires (i.e., if you squint just right: "this is a feature, not a bug").
No, I totally get it and I'm not blaming anyone (I've done some questionable things myself in the Payment Request tests to consume user activation 🙈). At the same time, for someone coming at this fresh working on the implementation I was surprised to find the stop gaps because they don't match anything in the specs. But we probably shouldn't do that as a lot of engineers "code to the tests". This is a bit dangerous, because the specs are codifying non-standard behavior. Where we need stopgaps, at a minimum, we should document them in the code... or, better, we should stop and do as you suggest below (work together to standardize something).
Absolutely. But we have the model, we just need a way of accessing it.
Ok, but can we acknowledge that these tests should be Filed: #36727 Let's chat! |
@mustaqahmed or @domenic, would you mind approving as this addresses the (webkit) flakiness and also addresses missing We can then continue the discussion about how to consume user activation in #36727 |
childSO.id = "child-so"; | ||
childSO.src = "resources/consumption-sameorigin-child.html"; | ||
document.body.appendChild(childSO); | ||
await new Promise((resolve) => (childSO.onload = resolve)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we are still relying on the load
event behavior @annevk mentioned above...I mean the fact that childSO
's load
is ultimately blocked by "grandchild" frame's load
, right? I find this mix of flattening-vs-not a bit too awkward!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this line. There is no need to await for it to load. What matters is just that the first iframe loads in order.
I've made similar changes to all the other dual iframe tests. I'll have them re-viewed upstream as part of my WebKit patch and will just send them reviewed as part of that.
Closing. Will send all the WebKit reviewed changes in one go. |
Pull request was closed
Tests were flaky in that the iframes could load in different order, leading to different results.