Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dotnet] implement high level API for WebDriver BiDi Log Events #14057

Closed
wants to merge 10 commits into from

Conversation

titusfortner
Copy link
Member

Status

  • Note that this is based on [dotnet] implement WebDriver BiDi with WebDriverBiDi dependency #14052 and I have it set to merge into that branch to make it easier to see what is specific to the commits in this proposal (except ignore the Navigation code I need to update the other PR)
  • All the status limitations in that PR apply to this PR
  • I just realized I need to write tests for removing events

Description

  • Implements ConsoleMessageHandler event (which allows adding and removing events)
  • implements JavaScriptErrorHandler event (which allows adding and removing events)
  • It uses the .NET convention of making the users start and stop the monitoring separate from adding and removing the handler (this still seems weird to me), so StartMonitoringLogEntries and StopMonitoringLogEntries
  • I've set the classes and methods to show they are beta and shouldn't be relied on yet (as kind of a feature toggle so we don't have to have everything figured out before we do anything)

Motivation and Context

This is the first part of implementing #13992 in .NET

Feedback

  • While this implementation works, I'm mostly interested in the user-facing API. Does the test code represent how we want users to work with BiDi going forward? (especially based on what was discussed at the last Dev Summit)
  • @p0deje / @shs96c does this update to the Bazel file look right for having separate tests with BiDi enabled, or is it overkill? I also need to figure out how to tag specific tests to also have them run as part of dotnet_bidi_test_suite

/// <summary>
/// Add and remove handlers for console messages.
/// </summary>
[Obsolete("This event is in beta and may change in future releases.")]
Copy link
Member

@jimevans jimevans May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something of an abuse of the Obsolete attribute. No, the .NET API (before .NET 8, which introduces the Experimental attribute) doesn't have anything like the @Beta annotation, and I get what you're going for (an attribute that automatically warns when compiling). But seeing this warning will be surprising to users, because the warning code will be the one that means "You're using a method that may be removed."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I'm abusing it. The question is whether we have to commit to this API before merging/releasing. We probably are ok to commit to it, so I can remove it.

/// Add and remove handlers for console messages.
/// </summary>
[Obsolete("This event is in beta and may change in future releases.")]
public event EventHandler<EntryAddedEventArgs> ConsoleMessageHandler;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design is pretty much correct, but typical .NET API design names events with verbs, usually past tense for "something just happened," or present tense with "-ing" for "something is about to happen." In this case, you'd want something like ConsoleMessageWritten and JavaScriptErrorMessageWritten. Yes, I know it's bikeshedding, and we can negotiate about the names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about ConsoleMessageHanding?

Copy link
Member Author

@titusfortner titusfortner Jun 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both Playwright & PuppeteerCSharp use page.Console += as public facing API
I'm reading this inenglish as "Add event to Page Console"
So this PR implementation would be: "Add event to Console Message Handler"
Which seems equivalent (and more descriptive)?
It's also more closely following the naming conventions we had on the board at the Dev Summit

/// <returns>An <see cref="IScript"/> object allowing the user to access
/// WebDriverBiDi events.</returns>
[Obsolete("This method is in beta and may change in future releases.")]
public IScript Script()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the kind of thing I wish I'd changed in the initial creation of the .NET bindings rather than just blindly copying Java. I would have made these properties instead of methods. We can maintain consistency or try to move to idiomatic .NET code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a dumb question. If we're going to ask people to update to the Async methods anyway, would it make sense to have users access those methods via a property? (e.g., we tell people to change from driver.Navigate().Back() to driver.Navigate.BackAsync())

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bingo. I was always thinking about why driver.Navigate().Back() when driver.Back() is simpler.

My personal opinion is very simple: introduce new API besides old. That's it. Just open a door to new API for users: var bidiDriver = await driver.AsBiDiAsync().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we don't want people to spend time trying to figure out if they want a driver or a bidiDriver. Whether it is classic or BiDi doesn't matter to the user, it only matters if it is sync or async. So let's make async use the latest whatever it is and people only have to know driver.

The reason not to put everything on driver is that it makes discoverability harder. In the IDE you want to click the . and see what all can be done, and you don't want it to be too much. We will definitely disagree on what is or isn't obvious, but putting navigation stuff in a navigation namespace seems straightforward.

Copy link
Member

@nvborisenko nvborisenko May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I clicked the . and I see:

  • Close()
  • CurrentWindowHandle
  • Dispose()
  • FindElement()
  • FindElements()
  • Manage()
  • Navigate()
  • PageSource
  • Quit()
  • SwitchTo()
  • Title
  • Url
  • WindowHandles

Is it a lot?!

When I type primitive int. - I see even more, even more static methods for primitive type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's too much right now, but I'd rather only add a few more methods than dozens

Copy link
Member

@nvborisenko nvborisenko Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you really want to namespace methods for the driver object, than driver.Navigate.BackAsync() doesn't look good. Navigate is a verb if I am not wrong, so it should look like a method. And Navigate looks like a state of the object, what is incorrect.
If I would really want to split methods by "spaces", I would choose driver.AsNavigable().BackAsync() (if you really really don't like driver.NavigateBackAsync()).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Back() it makes sense to be a verb for it to sound fluent: "Navigate Back," But then that breaks with "Navigate Go To Url" instead of just "Navigate To Url"

In general, though, we need to be very cautious about making major changes to the existing API. We want as small a lift as possible for people to update their code. Changing from a method to a property seems straightforward, but changing the name entirely is a lot more intrusive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then do not change it at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's definitely the safest option

/// </summary>
/// <returns>A task object representing the asynchronous operation.</returns>
[Obsolete("This task is in beta and may change in future releases.")]
Task StartMonitoringLogEntries();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the need for exposing start/stop API? I would probably just start when the first event handler is added and stop when the last one is removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems less intuitive to me that you need to register and then explicitly call to start monitoring, but apparently that's what is expected in the .NET world 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thread promises to be interesting, and this thread is very important.

but apparently that's what is expected in the .NET world

No, native events is a standard implementation of pub/sub pattern, or observable pattern, or any other notification based pattern. Events born at the beginning of .net. Technically you are able to know when somebody subscribes/unsubscribes on the event.

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/add - pretty easy.

Now, in 2024, let's think about it deeper:

  • add method is sync, but we need to call another async method. Sync-over-async problem.
  • OK, we know how to call async methods from sync context even when Microsoft doesn't recommend it.
  • What about async event handlers? User wants to call async methods in theirs event handlers. This is async void problem. Again, Microsoft doesn't recommend it. Ah, this is exceptional case, async event handlers... move on.
  • Now, when user is able to define sync/async event handlers, we want to make that all events are dispatched to users, and users handlers are finished. The problem here is how to await async void? Kind of impossible. There is tricky trick, but this trick is very rare in .net world. Nobody wants to code workarounds.

How to be?

await driver.OnConsoleLogEntryAsync(e => Console.WriteLine(e));

await driver.OnBeforeRequestAsync(async req => await req.ContinueAsync());

Why? It is fully manageable!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my first implementation had private variables for the handlers and then reference them to add/remove

 public event EventHandler<EntryAddedEventArgs> ConsoleMessageHandler
        {
            add
            {
                bool startMonitoring = consoleMessageHandler == null;
                consoleMessageHandler += value;
                if (startMonitoring && !isMonitoringStarted)
                {
                    Task.Run(StartMonitoring);
                }
            }
            remove => consoleMessageHandler -= value;
        }

But from what I've seen, .NET practices leans toward explicitness and clarity which means not assuming that adding or removing events will start and stop monitoring. (ChatGPT agrees with this 😁)

I'm not following all of your comment, though (and I'm not well-versed in async approach in general). Is there a use case for where we want to make one of these events manage async actions?

Copy link
Member

@nvborisenko nvborisenko May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For user it doesn't matter what you do behind: starting, stopping, laughing. He just subscribe on the event, what means he wants to get notifications every time when the event is occurred.

Use cases for async event handlers?.. My initial example can be considered as use case:

await driver.OnBeforeRequestAsync(async req => await req.ContinueAsync());

To be more precise:

await context.OnBeforeRequestSentAsync("https://**", async req =>
{
    await req.ContinueAsync("POST");
});

await context.NavigateAsync("https://selenium.dev");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my question is whether there is something that can't be done with a sync method, or if it's just that people want to do everything async on principle?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With respect to explicit start/stop:

So I took a look at PuppeteerCSharp and Playwright .NET implementations; I don't want to take what they are doing as necessary indications of proper, idiomatic .NET behavior, but, neither of them require an extra step to start monitoring.

I generally see our priorities as:

  1. Language idiomatic
  2. Matches ease of use in alternative projects
  3. Consistent across Selenium bindings

I agree that explicit stop/starts follows the convention of explicitness, but that neither other project requires it and none of the other bindings will do it this way, I'm leaning toward having adding the event start it, but keep the methods for starting and stopping public in case the user wants more control.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as the async implementation, I did some reading, and yes, we need to support it, I'll see about updating this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, looks like Async works out of the box...

            driver.Script.ConsoleMessageHandler += async (_, msg) =>
            {
                if (msg.Level == WebDriverBiDi.Log.LogLevel.Info)
                {
                    logs.Add(msg);
                }
                await Task.Delay(1000);
            };

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your example is known async void problem. Just try slightly modified handler:

await Task.Delay(1000);
if (msg.Level == WebDriverBiDi.Log.LogLevel.Info)
{
  logs.Add(msg);
}

@@ -100,3 +103,39 @@ dotnet_nunit_test_suite(
framework("nuget", "NUnit"),
],
)

dotnet_bidi_test_suite(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to create separate test targets for BiDi, but I think it might be easier for you to automatically handle this logic in dotnet_nunit_test_suite to automatically generate extra targets for BiDi and extend env based on the test file path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a pattern of this I can follow? It's a little complicated because I want to figure out both:

  1. Test suite runs all tests except what's in BiDi directory without BiDi Enabled
  2. Test suite runs all tests in BiDi directory, plus specific tagged tests with BiDi enabled

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing like that is implemented, if the current approach works, I suggest you stick to it and then we can improve.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't fixed the other things preventing bazel from working, yet. 😂

@titusfortner titusfortner force-pushed the dotnet_bidi_new branch 2 times, most recently from 5d176cb to 69c568e Compare June 7, 2024 13:13
@titusfortner
Copy link
Member Author

I'm closing this PR because we are re-evaluating how we want the .NET API to look like. We're investigating moving away from Events as they do not work great with async pattern. It does not make sense to hold this PR open while waiting for that work.

@diemol diemol deleted the dotnet_bidi_script branch December 26, 2024 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-devtools BiDi or Chrome DevTools related issues C-dotnet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants