Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[🚀 Feature]: Implement high level BiDi script commands #13992

Open
titusfortner opened this issue May 20, 2024 · 11 comments
Open

[🚀 Feature]: Implement high level BiDi script commands #13992

titusfortner opened this issue May 20, 2024 · 11 comments

Comments

@titusfortner
Copy link
Member

titusfortner commented May 20, 2024

Feature and motivation

At the Selenium Dev Summit we agreed on this API to be generally applied across the bindings; we'll keep this labeled beta while we make sure that it works for what is needed

We want the methods to be accessible from a script() method available directly from the Driver class (e.g., driver.script.pin(script), driver.Script().Pin(script)). We can't do everything just like this in all the languages, because, for example, .NET uses events with a += and -= for adding and removing handler events so we don't went "add" and "remove" methods.

Implementations:

Method Java NodeJS Python Ruby .NET
pin() #14250 #14305
unpin() #14305 #14250
execute() #14330 #14293
addConsoleMessageHandler() #14225 #14135 #14107 #14073 #14057
removeConsoleMessageHandler() #14225 #14135 #14107 #14073 #14057
addDomMutationHandler() #14304 #14238
removeDomMutationHandler() #14304 #14238
addJavaScriptErrorHandler() #14225 #14135 #14107 #14073 #14057
removeJavaScriptErrorHandler() #14225 #14135 #14107 #14073 #14057

Considerations

  • some of these "script" methods are in the "log" domain in BiDi spec, but "log" is overloaded and we were concerned that it would not be obvious what driver.log() was supposed to give access to. As such we agreed it would work well in the Script implementation
  • we agreed to split out console messages from javaScript errors even though both come through the same BiDi method
  • we agreed not to have an async version of execute method in the BiDi API
@Mr0grog
Copy link
Contributor

Mr0grog commented Jun 27, 2024

Hi, are you interested in feedback on the design of this? If so, where, and Is there any kind of design document/plan? (I’m not sure what might be implementation-specific or only partially implemented, other than a vague sense based on the table in this issue.)

I’ve been playing a bit with the Ruby implementation as a replacement for some devtools-based code I wrote recently and had some thoughts I hope might be useful.

@p0deje
Copy link
Member

p0deje commented Jun 27, 2024

@Mr0grog Absolutely, please share your feedback. We don't have the document describing all of this because the API was designed during the in-person DevSummit of Selenium TLC. As we work on implementing these APIs, the feedback is greatly appreciated!

@Mr0grog
Copy link
Contributor

Mr0grog commented Jun 27, 2024

👍 Great!

First off, the Script API is very pleasant to use. I’m really glad to see it! ❤️ The straightforward unsubscribe methods are also a nice improvement over devtools.

That said, one thing I’d love is for console log and javascript error objects from the high-level Script API to include the URL of the page they occurred on or, even better, the URL of both the frame they occurred in and the top-level/source frame they occurred in (when iframes or popups are involved). Or maybe a BrowsingContextInfo object? (Since that includes links up and down the context tree.)

Barring that, the raw BiDi API’s log.entryAdded event (when a console log or unhandled JS error occurs) has a source property with the ID of the context, but that field gets dropped in the mid-level (LogHandler) and high-level APIs. Having it there would allow me to line up a log message with the context using other BiDi commands and events (namely browsingContext.getTree and browsingContext.fragmentNavigated), instead of working around the new high-level API entirely. (In this case, it would also be nice to have some high-level access to the context tree and various navigation events, but I’m hoping they’re already planned for even if I can’t find the issue. 🙂 🤞)

Background/use case: I’ve been using the devtools API to keep track of what page JS logs and errors are coming from during browser-based tests. This is just generally useful in testing workflows that cover multiple pages or that involve iframes or popups. In my case, an app I recently started working on has tests that include navigations to other sites (e.g. a signup flow that ends by kicking someone over to a marketing page managed by another system at a different subdomain). I use information about the URL to decide whether to ignore JS logs/errors (since we don’t want our tests to fail and prevent a build if the other site is having some issues).

Obviously pulling multiple info sources together like I’m suggesting in my ideal scenario adds a lot of complexity to the implementation, but I suspect this info might be useful in a lot of ways whenever automations navigate across multiple pages, perform form submissions, etc.

@p0deje
Copy link
Member

p0deje commented Jun 27, 2024

I think we should preserve source in the log entry and make sure it has the browsingContext that is properly serialized. @titusfortner @pujagani Any objections?

@titusfortner
Copy link
Member Author

@Mr0grog Thanks for the feedback! We're definitely trying to figure this out as we go along.
Can you make a Pull Request with what you'd like to see?

@pujagani
Copy link
Contributor

Yes, the source bit was added in the web driver BiDi spec relatively recently. I am guessing for the same reason as requested above. We should make everything that is part of types defined in spec https://w3c.github.io/webdriver-bidi/#types-log-logentry, available to the user.

@titusfortner
Copy link
Member Author

Oh, I see what you're asking; we just need to update the Struct with source:

        ConsoleLogEntry = BiDi::Struct.new(:level, :text, :timestamp, :method, :args, :type)
        JavaScriptLogEntry = BiDi::Struct.new(:level, :text, :timestamp, :stack_trace, :type)

That's weird, I would have sworn I added things based on the spec, not sure how I missed source

@Mr0grog
Copy link
Contributor

Mr0grog commented Jun 28, 2024

Oh, I see what you're asking; we just need to update the Struct with source

Yeah, that is the simplest solution. I guess I thought I’d go ahead and propose something that did more given the description of Script as “high-level” API, which generally implies to me that the intent isn’t simply to marshall messages over the protocol, but to combine parts of the protocol in more use-case-specific ways when reasonable (up to you if this use case is common/reasonable or too narrow!).

It would also be nice if the things in the source field were more meaningful objects, i.e. if the source.context was a BrowserContextInfo that I could use to navigate up the context tree instead of just an ID string (and same for source.realm, although it looks like there is nothing at all written to wrap realm stuff yet). I can appreciate that might be fairly complex to do in a performant way, though. (In that case, I also need some API to get contexts based on ID, since it doesn’t exist yet and in #14201 I learned I should not actually be using the BiDi API directly, which was not clear to me when I made my earlier comments.)

@titusfortner
Copy link
Member Author

Right, the first step is to just expose what's available. I'm a little resistant to adding a bunch of objects for everything in Ruby, but we definitely want to make sure that the output is in a state that is usable in other parts of the code. Maybe the string is what gets passed in as a parameter to another method that provides the tree parsing information.

@Mr0grog
Copy link
Contributor

Mr0grog commented Oct 21, 2024

👋 Hi, I just wanted to check back in on this — I put up a PR over three months ago (#14236) and talked with @titusfortner on Slack/IRC about it, but never got any feedback.

@p0deje
Copy link
Member

p0deje commented Oct 21, 2024

👋 Hi, I just wanted to check back in on this — I put up a PR over three months ago (#14236) and talked with @titusfortner on Slack/IRC about it, but never got any feedback.

Sorry for the delay, it's now merged. Thank you!

sandeepsuryaprasad pushed a commit to sandeepsuryaprasad/selenium that referenced this issue Oct 29, 2024
Related to SeleniumHQ#13992

Co-authored-by: Titus Fortner <titusfortner@users.noreply.github.com>
sandeepsuryaprasad pushed a commit to sandeepsuryaprasad/selenium that referenced this issue Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants