Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure out Function constructor + TrustedScript #174

Closed
koto opened this issue May 23, 2019 · 13 comments
Closed

Figure out Function constructor + TrustedScript #174

koto opened this issue May 23, 2019 · 13 comments

Comments

@koto
Copy link
Member

koto commented May 23, 2019

First n-1 parameters to Function constructor form the argument lists, but still have the capability of executing code due to a weird way how a function is constructed (string concatenation, essentially):

https://www.ecma-international.org/ecma-262/9.0/index.html#sec-createdynamicfunction

It would be best if all of the function parameters were passed to a single (default) policy createScript function, instead of separate invocations for each parameter, but that means the createScript function signature changes (argument list and the meaning of the return value).

cc @otherdaniel @mikesamuel

@otherdaniel
Copy link
Member

After reading the spec...

I think my original proposal (checking only the function body) is simply not sane.

Likewise, I think checking parameters separately is effectively useless, since it'd be super easy to 'hide' things by splitting functionality between different bits.

My current thinking is to pass the entire array of arguments to the callback, which could work.

However, after having re-read the spec on this... maybe all of this is too complicated. The spec specifies how to assemble all of the args into a single source string, which means just not special casing this at all would seem to work just fine:

  • Since all arguments are ToString-ified, any 'safe type' will be discarded, and the multi-argument function constructor will never pass the trusted type check.
  • We can call the default policy with the fully assembled string. (I had previously assumed that the source string assembly was an implementation detail one wouldn't expose to the developer, since every browser might to this differently, but in reality this is all fully spec-ed.)
  • I.e., do the trusted type check at or around step 18 of CreateDynamicFunction, with possibly some custom logic that if it's a single argument which is a 'safe' type, to then use that instead.
  • Whoever would want to use the (full) Function constructor with safe types has a spec-supported work-around: They can just assemble the source string themselves and pass that through a policy.

@koto
Copy link
Member Author

koto commented May 24, 2019

I.e., do the trusted type check at or around step 18 of CreateDynamicFunction, with possibly some custom logic that if it's a single argument which is a 'safe' type, to then use that instead.

So, have a single parameter type check early in the algorithm, and if it passes, the stringified function is interpreted as bodyText, with the assembly steps skipped?

They can just assemble the source string themselves and pass that through a policy.

That would require changing the Function constructor to support single-parameter function body with arguments calls (but since it's typed, that should work).

There is some JS spec machinery here to build, so deferring to Mike on how easy it is to add to the spec. If we have @@evalable, perhaps it's an easy task in the end.

@mikesamuel
Copy link
Collaborator

We can call the default policy with the fully assembled string. (I had previously assumed that the source string assembly was an implementation detail one wouldn't expose to the developer, since every browser might to this differently, but in reality this is all fully spec-ed.)

How does this work when body is a TrustedScript and the arguments are strings?

Yeah, I think there are two options:

  1. pass the concatenated argument list (P) as one string and the body as a pre-stringified value.
  2. pass the parsed form of P as a |FormalArgumentList| AST and the body as a pre-stringified value.
    and then define a "safe" argument list.

Neither of these need to affect the order of stringification in CreateDynamicFunction, but the second might affect whether something like Function('a a', '') results in a TypeError or a SyntaxError which might raise web-compat issues.

I don't think this affects generality though. Just how easy it is to express complicated list arguments.
Obviously, any call that passes arguments like

new Function(x, policy.createScript(body))

can be rewritten to one that does not

TrustedTypes.isScript(body) && new Function(
    policy.createScript(`return function (${ x }) { ${ body } }`)

which means that different tolerances for argument safety can be specified in user code.

@otherdaniel Should we go through the dynamic semantics of FormalArgumentList and figure out what's a good tradeoff?

@otherdaniel
Copy link
Member

@koto @mikesamuel Honestly, I feel like I'm out of my depth, here, since every time I re-read the spec I discover some reason why my previous argument doesn't quite hold...

@koto: "have a single parameter type check early in the algorithm" Yeah. I'm no longer sure that's best, though.

@mikesamuel: "Should we go through the dynamic semantics [...]" Yes, we should. I'd prefer to have a list of options, though, so we can present that to V8 team for fact-checking. Or maybe just get Toon in there.


#1 My idea so far has been to basically not support Trusted Types for any case other than the single argument case, and to then tell the developer(s) that if they must use the Function constructor, then they must build a single string. I (now) see that they can only do that by indirectly returning a function, e.g.:
Function('a', 'b', 'return a+b;') => Function("return function(a,b) { return a+b; }");

That is kind of awkward. Also, if that was our choice, it's interesting what the 'default' policy should receive as an argument.


#2 I'd be slightly less awkward if we could reduce everything into the two-argument case, because the spec distribguishes between the combined parameter list P and the bodyText. Also, P is trivially constructable from the arguments. I don't see any good way to retrofit that into our current interfaces. (But: See #4.)


#3 Maybe one could go back to only checking the body, and demand as a pre-condition for trusted types that IsSimpleParameterList is true (as demanded for 'use strict;'). However, I fear this would be too restrictive for authors trying to move to TT from a legacy codebase. (It's really the same issue as #1 above, since it'd force people to use the indirect function definition thing.


#4: A better (bug "bigger") solution might be have a separate 'trusted type' just for this case. And in this case, one that doesn't wrap a string, but would instead wrap an array of any. (Or: Two separate strings, to pick up case #2 again.)

The Function constructor could then (via the embedder) check for that type and - if found - unwrap the argument list from the trusted type and proceed as normal. This might be the cleanest to specify, but arguably more work to implement.

E.g., in addition to createHTML, createURL, createScript, createScriptURL, there's a createFunction(...args), which returns a TrustedFunction. TrustedFunction isn't string-ifiable, and the Function constructor is the only thing reading it.

@mikesamuel
Copy link
Collaborator

Re #2 do the below fit the two argument case?

new Function('a, b, c', 'return a+b+c')
new Function(myPolicy.createScript('a, b, c'), myPolicy.createScript('return a+b+c'))

I like the idea of using IsSimpleParameterList since that'll be more palatable for the committee.

Does the following flow fit #2?

  1. If the count of args is 0 or 1, let P = the empty string.
  2. Else if the count of args is 2, let P = the zeroth argument. Do not stringify.
  3. Else, let P = the result of stringifying args[:-1] and joining on ","
  4. Let body = args[args.length - 1].
  5. Pass P and body to the host callout.

The host callout does:

  1. if P is not a TrustedScript
    1. Let P be ToString(P)
    2. Let fpl = the result of parsing P as a FormalParameterList.
    3. If there as a SyntaxError, throw
    4. If ! IsSimpleParameterList(fpl)
      1. TODO. throw or apply default policy as createScript
  2. body checks

@mikesamuel
Copy link
Collaborator

E.g., in addition to createHTML, createURL, createScript, createScriptURL, there's a createFunction(...args), which returns a TrustedFunction. TrustedFunction isn't string-ifiable, and the Function constructor is the only thing reading it.

Would a single suffice work for all the Function constructors: Function, AsyncFunction, etc. ?

@mikesamuel
Copy link
Collaborator

mikesamuel commented May 29, 2019

@koto As discussed, here's a user-library class that can stand in for all arguments:

class C {
  *[Symbol.iterator]() {
    for (let a of [...this.args, this.body]) { yield a }
  }

  constructor(...args) {
    this.args = args.slice(0, args.length - 1); this.body = args[args.length - 1] || '';
  }

  toString() {
    return `function (${ this.args.join(', ') }\n) { ${ this.body } }`;
  }
}
//undefined
const c = new C('a', 'b, c', 'return [a, b, c]')
//undefined
Array.from(c)
// (3) ["a", "b, c", "return [a, b, c]"]
new Function(...c)
// ƒ anonymous(a,b, c
// ) {
// return [a, b, c]
// }

@koto
Copy link
Member Author

koto commented May 29, 2019

That class could serve as a basis for the new type (TrustedFunctionConstructorArguments :) ). It would wrap over multiple strings essentially, last of which is a function body.

With that iterable type instance in place, one would call a Function constructor, spreading the arguments.

The default policy is also accounted for - at the right place, its createFunctionConstructorArguments is being called with the original string (or mixed string and typed) arguments, returning the iterator (or an array of TrustedScripts, whichever is easier to spec.

@otherdaniel
Copy link
Member

"Does the following flow fit #2?" Yes, that's the idea. I'm not sure if specifying works that way. I guess we'd need an algorithm that does something like:

  • if trusted type check is required:
    • if it's TrustedScript, use its string value
    • else if there's a default policy, call it with ToString(value)
    • else, throw an exception
  • it trusted type check is not required:
    • use ToString(value)

And in CreateDynamicFunction, do someting like:

  • zero args: P="", body=""
  • one arg: P="", body=ThatNewAlgorithm(args[0])
  • two args: P=ThatNewAlgorithm(args[0]), body=ThatNewAlgorithm(args[1])
  • 2 args: P=ThatNewAlgorithm(string-ify procedure), body=ThatNewAlgorithm(args[N-1])
    Note that for that last case, P will always 'loose' the trusted value.

On second thought, I do think a new 'trusted function' type that wraps all the arguments is a more understandable way. In that case, there could be one check for the trusted type early on, and then the entire procedure could continue undisturbed.

@mikesamuel
Copy link
Collaborator

Is this a fair statement of requirements:

  1. We want new Function(...simpleParameterStrings, trustedBody) to work
  2. We want new Function(...simpleParameterStrings, stringBody) to work out of the box with a default policy callout for stringBody.
  3. We want a way for clients willing to change code to migrate new Function(...complexParameters, trustedBody)
  4. It'd be nice if new Function(...mixOfStringAndTrustedParameters, anyBody) worked but not required.

and similarly for {Async,}{Generator,}Function?

@mikesamuel
Copy link
Collaborator

Is the idea that

  1. CreateDynamicFunction passes arguments to the host callout before any stringification, concatenation, or FormalArgumentList validity checks.
  2. The host callout returns an argument list
  3. The host callout has 2 new normative notes:
    1. If Host... stringifies argument[i] it MUST stringify all of arguments[0:i+1] in left-to-right order
    2. If Host... receives one argument which is to be parsed as a ScriptBody, it MUST return an argument list of length 1. (To not require eval to do complicated error recovery after the host callout).
  4. After the host callout, CreateDynamicFunction stringifies the returned arguments and proceeds.
  5. Browser's implementation of the host callout can use the default policy to convert args to a TrustedFunctionBitsAndBobs if arguments is not [...simpleParameters, TrustedScript] or [TrustedFunctionBitsAndBobs].

@koto
Copy link
Member Author

koto commented May 31, 2019 via email

@koto koto closed this as completed Jun 1, 2019
mikesamuel added a commit to mikesamuel/proposal-hostensurecancompilestrings-passthru that referenced this issue Jun 2, 2019
w3c/trusted-types#174 derives TT's ask
of TC39 and explains how that will affect dynamic function creation.
@mikesamuel
Copy link
Collaborator

@koto, IIUC, Toon noticed this problem. I missed it entirely.

New proposal text at https://mikesamuel.github.io/proposal-hostensurecancompilestrings-passthru/ and I adjusted the explainer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants