Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: dom.createPort() #679

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rdcronin
Copy link
Contributor

@rdcronin rdcronin commented Aug 22, 2024

This adds a proposal for a new dom.createPort() method, which will allow communication between different JavaScript "worlds". It leverages the proposed dom.execute() method (#678).

This adds a proposal for a new dom.createPort() method, whic will allow
communication between different javscript "worlds". It leverages the
proposed dom.execute() method.
Copy link
Contributor

@carlosjeurissen carlosjeurissen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdcronin @Rob--W Thanks for putting this together and continue where we left off during the San Diego meeting!

Comment on lines +109 to +113
const foo = getFoo();
port.sendMessage({foo});
} else if (message == 'getBar') {
const bar = getBar();
port.sendMessage({bar});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligning with the code below to make the examples more readable / prevent perceived differences where they do not apply.

Suggested change
const foo = getFoo();
port.sendMessage({foo});
} else if (message == 'getBar') {
const bar = getBar();
port.sendMessage({bar});
port.sendMessage({foo: getFoo()});
} else if (message == 'getBar') {
port.sendMessage({bar: getBar()});

}

// The rest of this code executes in the content script world.
const mainWorldPort = browser.dom.createPort({world: 'MAIN'});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the port needs to be passed to the main world anyway. What is the reason to require explicitly specifying the world?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above from Oliver's comment. This is to make it explicit and obvious to both the browser and the developer that this port is implicitly bound to one world, and disallow any accidental passing to another world or multiple worlds.

Copy link
Member

@Rob--W Rob--W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments but also suggestions to specify underspecified behavior. For ease of editing, you can commit these suggestions with Github's UI (https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/incorporating-feedback-in-your-pull-request#applying-suggested-changes) and then use git pull origin refs/pull/679/head to pull the new changes locally.

I'll use "Squash & Merge" before committing the PR, so the number of intermediate commits do not matter.

browser.dom.execute(
{
func: setUpMainWorld,
args: mainWorldPort,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
args: mainWorldPort,
args: [mainWorldPort],


interface MessagePort {
sendMessage(args?: any): void
onMessage(args?: any): void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm that you are intentionally using a new design here:

There is no precedent in extension APIs for using properties to control API behavior - API interaction is always through methods. Some APIs can accept function members on objects passed to functions (e.g. contextMenus.update(id, { onclick })), but there is no namespace whose behavior changes through properties.

If this new pattern is not intentional, setMessageListener( MessageListenerCallback? ); would fit the usual conventions (I don't mention getMessageListener because there is presumably no need to change the listener later).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant to update this earlier. This should be an event, not a function. In our yet-to-be-finalized new syntax, this would be something like:

callback OnMessageListener = void (any message);

interface OnMessageEvent : ExtensionEvent {
  static void addListener(OnMessageListener listener);
  static void removeListener(OnMessageListener listener);
  static boolean hasListener(OnMessageListener listener);
}

interface MessagePort {
  static attribute OnMessageEvent onMessage;
  ...
}

Not sure if we want to use that here, or a more typescript-centric approach; lemme know if you have a preference


### Open Web API

The open web is unaware of the concept of multiple Javascript worlds, so this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the open web does not support the concept of multiple worlds, it does already have a concept of a MessagePort (https://developer.mozilla.org/en-US/docs/Web/API/MessagePort). It's worth calling this out explicitly, and also mention that we cannot use this because (1) it is async and (2) the use of web prototypes would enable web pages to intercept messages (e.g. through MessagePort.prototype or Event.prototype).

Here is an example:

mc = new MessageChannel();
mc.port1.onmessage = e => console.log(e.data);
mc.port2.postMessage(2);
console.log(3);

Result:
3
2
// ( 2 logged after 3 means that the message delivery was async )

Copy link
Member

@dotproto dotproto Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename MessagePort to SyncMessagePort? That would both address the naming collision called out in the previous comment and describe what is different about this interface.


## Implementation Notes

N/A.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expand with notes on the prototype (maybe at the behavior section?):

Suggested change
N/A.
### Prototype
To avoid interception by web pages, the prototype chain of the `MessagePort`
instance should terminate at `null` instead of `Object.prototype`.
Concretely, this means that an instance of `MessagePort` could either inherit
from a (hidden) MessagePort class \* (which itself has a `null` Prototype), or
be a plain object with a null Prototype and all specified members as own
property descriptors.
\* hidden means: not exposed as a global class. If anyone (e.g. the web page)
ever gets access to a `MessagePort` instance, it may see the hidden prototype
via `port.__proto__` or `port.constructor.prototype` and through that intercept
& interfere with all future port communications.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to find an existing web specification that has interfaces that define an interface or object that inherits from null. The closest thing I was able to find was the create an exports object algorithm from the WebAssembly JavaScript Interface working draft.

}
};

// These notify the content script of changes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// These notify the content script of changes.
// These notify the content script of changes.
// NOTE: addEventListener used in this example is not part of the API.

Adding emphasis on the fact that addEventListener is NOT part of the API. Because if it were, it would easily be intercepted by the web page.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack; will incorporate

### Additional Security Considerations

N/A.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm suggesting to include a short write-up of the brittle safety offered and not-offered by the API. This clarifies what the API does and does not do to those who are not experts in the subject.

Suggested change
#### Secrecy of data
This API is designed to offer a secure communication channel between one world
and an untrusted world. It is the responsibility of the extension to maintain
further secrecy beyond that. In general, this requires avoiding the use of
prototype members.
For example, imagine a func that reads `message.foo` in the main world. This
is safe if `message` is always an object with the `foo` member. If this is not
the case, then it is a language feature of JavaScript to look up the `foo`
member in the prototype chain. Most objects ultimately inherit from
`Object.prototype`, which can be manipulated by other (untrusted) scripts to
intercept the message.
Example:
```javascript
// Attack:
Object.defineProperty(Object.prototype, "foo", {
get() { console.log("Intercepted via foo:", this); return "intercepted" }
});
// Safe use: foo always defined, web page cannot intercept access.
message = { secret: 1, foo: true };
console.log("message.foo=", message.foo);
// Logs "message.foo= true"
// Unsafe use: foo not set; web page can intercept access attempt.
message = { secret: 1 };
console.log("message.foo=", message.foo);
// Logs "Intercepted via foo: { secret: 1 }
// Logs "message.foo= intercepted"
```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, will incorporate

Comment on lines +56 to +57
sendMessage(args?: any): void
onMessage(args?: any): void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sendMessage(args?: any): void
onMessage(args?: any): void
sendMessage(message?: any): void
onMessage(message?: any): void

To avoid confusion with args in dom.execute, suggest renaming to message.

A message port can be passed to another world (in order to establish a
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be
expanded to have special serialization logic for the MessagePort type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a more explicit description of the message dispatch semantics. This defines synchronicity, error handling and non-errors.

Suggested change
### Message dispatch
Upon calling `sendMessage`, `onMessage` is synchronously called before
`sendMessage` returns. `sendMessage` returns successfully if `onMessage` has
been called, even if the function itself threw an error.
`sendMessage` may throw if the input message cannot be serialized.
`sendMessage` may throw if the return value cannot be serialized.
`sendMessage` may throw when the recipient is unavailable. For example, if the
recipient did not assign an `onMessage` listener. Or if the port was never
delivered to the other world.

I am also willing to use the phrase throws instead of may throw if you agree with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to touch on the synchronocity aspect a bit more here (and I think return values are a natural byproduct of that). I know we discussed this briefly before, but I'd like some more clarity on what this enables. Given we have a synchronous dom.execute(), what the use cases in which synchronous message passing (as opposed to just FIFO) to a port is critical? Is it for cases where the main world needs synchronous execution with the extension world? If so, why? Or for some other use case?

A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronocity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, synchronous message passing enables the already injected main world script to reach back to the isolated world to look up relevant state that the script in the main world needs.

Other extension authors have already commented on some use cases at #284

As an example of a concrete use case from personal experience: I have an extension called Don't Track Me Google, which is just a content script that ensures that becomes. Among other things, the extension intercepts the setter of HTMLAnchorElement.prototype.href`, to hook in on calls from the web page. It sanitizes it and calls the original setter.

To minimize interference from the web page, the sanitization logic should run in an isolated world. And because the original caller expects a synchronous API, the messaging API has to be synchronous.

This specific use case also shows why sometimes it is desirable to be able to receive DOM objects, so that the page and caller can pass the DOM elements (that they already share!) without requiring appending to the DOM.

Note: this extension was originally a MV2 extension. To convert to MV3, I had to duplicate the sanitization logic in the main world and use a shared DOM element for communicating specific state (preferences from the options page) from the content script to the main world script: https://github.com/Rob--W/dont-track-me-google/blob/a73ab8b3bfa9794c75cb4e111c6424e31e6c019e/main_world_script.js. The shared DOM element can be detected by the web page. In my case it's a reasonable trade-off, especially because I don't expect Google web properties to intentionally break the extension. This assumption is not necessarily true for extensions that are privacy-focused and designed for the whole web.

A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronocity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).

There is already a synchronous communication between worlds - they share the same DOM, and therefore DOM events.


A message port can be passed to another world (in order to establish a
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be
expanded to have special serialization logic for the MessagePort type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expanded to have special serialization logic for the MessagePort type.
expanded to have special serialization logic for the MessagePort type.
Once a message port has been passed to a `dom.execute` call, it cannot be passed
again - an error will be thrown because it is not structurally cloneable.

I'm explicitly noting that the port cannot be reused. And also emphasizing that the port is not structurally cloneable, because making the type cloneable could be a trivial way to make it fit in the dom.execute requirement of args being strucurally cloneable (which would trigger a bunch of new issues, such as the "clone" potentially inverting roles without dom.execute, e.g. when calling structuredClone(port)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed; will incorporate

declare namespace dom {
interface PortProperties {
world: ExecutionWorld;
worldId?: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both @carlosjeurissen and @xeenon had mentioned that this might be redundant, and I can't immediately see a strong need for this (it is only useful if an extension sends the port to multiple worlds, which seems unlikely - and we might be able to just make it so the port only works in the first world it is sent to). Any thoughts on that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't to support multiple worlds (this wouldn't do that, since this still only supports one), but rather to ensure that:
a) a developer is aware that a port is bound to a single world. This is distinct from the existing uses of ports, which are a many-to-one relationship between receivers and senders.
b) a developer can't accidentally sending a port to an incorrect world.

Technically, this is redundant, since it could be derived from the first world it's sent to (and, since dom.executeScript() is proposed to be synchronous, there in theory aren't any TOCTOU issues here?), but that still seems a bit less obvious to me than binding the port at the start. (And makes browser implementation slightly more complex, though we can always work around that.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure what you mean by "this might be redundant". Are you saying that worldId is not necessary, that world should not be required if worldId is present, or something else?

The Allow multiple user script worlds proposal introduced the concept of worldId (along side world) as a property of RegisteredUserScript. Right now the presence of both properties on PortProperties follows that pattern. The main difference between how these properties are used is that this proposal defines world as a required property while in the other prooposal its optional. That allows developers to omit the world property when targeting a specific user script world or to omit worldId when targeting MAIN or USER_SCRIPTS worlds.

The only scenario I can think of where requiring world might be useful is if we anticipate allowing other types of script injection to occur in separate, isolated worlds. Or, at least want to reserve the ability to introduce such a concept in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that worldId is not necessary

This was the intended meaning of my original comment. Thinking about it some more (and as Devlin touched on below) the world property is also not strictly necessary.

I can see the argument for requiring these, but right now I'm not convinced the benefit is worth it. If a developer tries to send a port to a second world, we can easily throw a "Port cannot be sent to multiple worlds" error.

That seems like enough for developers to discover the right path without making the API more verbose.

Copy link
Contributor Author

@rdcronin rdcronin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, folks! Just responding to open comments to seed some more discussion; haven't made changes to the commit just yet.

declare namespace dom {
interface PortProperties {
world: ExecutionWorld;
worldId?: string;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't to support multiple worlds (this wouldn't do that, since this still only supports one), but rather to ensure that:
a) a developer is aware that a port is bound to a single world. This is distinct from the existing uses of ports, which are a many-to-one relationship between receivers and senders.
b) a developer can't accidentally sending a port to an incorrect world.

Technically, this is redundant, since it could be derived from the first world it's sent to (and, since dom.executeScript() is proposed to be synchronous, there in theory aren't any TOCTOU issues here?), but that still seems a bit less obvious to me than binding the port at the start. (And makes browser implementation slightly more complex, though we can always work around that.)


interface MessagePort {
sendMessage(args?: any): void
onMessage(args?: any): void
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant to update this earlier. This should be an event, not a function. In our yet-to-be-finalized new syntax, this would be something like:

callback OnMessageListener = void (any message);

interface OnMessageEvent : ExtensionEvent {
  static void addListener(OnMessageListener listener);
  static void removeListener(OnMessageListener listener);
  static boolean hasListener(OnMessageListener listener);
}

interface MessagePort {
  static attribute OnMessageEvent onMessage;
  ...
}

Not sure if we want to use that here, or a more typescript-centric approach; lemme know if you have a preference


A message port can be passed to another world (in order to establish a
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be
expanded to have special serialization logic for the MessagePort type.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed; will incorporate

A message port can be passed to another world (in order to establish a
connection) by leveraging the new `dom.execute()` API. `dom.execute()` will be
expanded to have special serialization logic for the MessagePort type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to touch on the synchronocity aspect a bit more here (and I think return values are a natural byproduct of that). I know we discussed this briefly before, but I'd like some more clarity on what this enables. Given we have a synchronous dom.execute(), what the use cases in which synchronous message passing (as opposed to just FIFO) to a port is critical? Is it for cases where the main world needs synchronous execution with the extension world? If so, why? Or for some other use case?

A not-necessarily-synchronous design leaves this more open -- we could potentially have a single world blocked while another is unblocked. I'm honestly not sure if this exists today in Chromium (I'm not sure about other browsers), but I could see definite situations in which it could (e.g., we might pause the main world, but allow extensions to continue running). If we guarantee this synchronocity of message passing, that means we also need to block execution in the extension world on any execution that may be happening in the main world (preventing extensions from doing work they may otherwise be able to).

}
};

// These notify the content script of changes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack; will incorporate

}

// The rest of this code executes in the content script world.
const mainWorldPort = browser.dom.createPort({world: 'MAIN'});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above from Oliver's comment. This is to make it explicit and obvious to both the browser and the developer that this port is implicitly bound to one world, and disallow any accidental passing to another world or multiple worlds.

### Additional Security Considerations

N/A.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, will incorporate

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants