Parameterized Terms #176

stasm · 2018-10-01T16:51:41Z

a.k.a. The Return of Macros.

In #102 we added VariantLists. They are a solution to abusing SelectExpressions by not defining the selector. Variants of a VariantList can be retrieved externally from where the term is referenced with a VariantExpression.

The design of VariantLists has its own shortcomings which I outlined in #102 (comment) and which required the addition of specialized code in abstract.mjs to tightly control where exactly VariantLists are allowed.

At the time, we accepted these shortcomings as trade-offs and also because we didn't find other alternatives. I think I may have stumbled across an idea which is worth bringing it up here.

Proposal

Rather than treating terms with VariantLists for value as dictionaries whose members must be accessed from the outside, what if we encapsulated the member access in a form of a function?

In other words, rather than defining the following term as a VariantList:

# Before, with VariantLists
-fxaccount-brand-name = {
    [lowercase] account Firefox
   *[uppercase] Account Firefox
}

sync-signedout-account-title = Connetti il tuo { -fxaccount-brand-name[lowercase] }

We could parameterize it with the $case local variable and use a regular Pattern as the value:

# After, with parameterized terms
-fxaccount-brand-name($case) = { $case ->
    [lowercase] account Firefox
   *[uppercase] Account Firefox
}

# Call it similar to a function
sync-signedout-account-title = Connetti il tuo { -fxaccount-brand-name("lowercase") }

Using the default variant, or referencing a term which doesn't have any parameters would use the current TermReference syntax without any paranthesis:

# English FTL
sync-signedout-account-title = Connect to your { -fxaccount-brand-name }

Advantages

Values of terms can always be Patterns again.
Accessing a specific variant uses the CallExpression syntax. No need for the specialized VariantExpression.
Accessing variants would now use the same semantics as SelectExpressions in Patterns do right now. New features of SelectExpressions, like Allow selectors to be lists #4, would automatically be available in terms.
Local variables can be used inside of the term's value, too. The above example doesn't show case this, but you could imagine a different term using a $number variable. In the current design, the name of this variable is effectively hardcoded in the term and all messages referencing it must have the variable of this exact name passed into them by the app. This would limit the utility of private terms, which we might introduce in the future.

Open questions

How should the scoping rules work exactly? Should terms be allowed to access variables passed to messages which reference them, or should all variables be passed to them explicitly?
Should attributes use the parameters defined on the term id, or should there be a way to define (possibly different) parameters on the attribute id?
Related, when referencing attributes, should the parameters be attached to the term ({-brand-name("lowercase").gender}) or to the attribute ({-brand-name.gender("lowercase")})?
Could we use named arguments to define default values of parameters?

The text was updated successfully, but these errors were encountered:

This is a re-write of the runtime parser. It supports Fluent Syntax 0.7, runs against the reference fixtures, has half the lines of code, and is as fast in SpiderMonkey as the old one (and slightly faster in V8). Goals 1. Support 100% of Fluent Syntax 0.7. This includes the indentation relaxation, dropping tabs and CR as syntax whitespace, normalizing new lines to LF, and only allowing numbers and identifiers as variant keys. 2. Maintain good performance. The parser is used in performance-critical code paths. Back in the days of Firefox OS it had to be both fast _and_ produce tightly packed results so that translations don't take up too much space on the device. I think the storage requirements can be relaxed these days. 3. Write code which will be easy to maintain in the future. The parser was first written even before Fluent branched off from L20n. It's seen many changes and additions over the last two years. As new features accrued it became hard to maintain it and also to keep track of all known bugs. My goal for the re-write was not only to clean it up but also to define the conformance story for the future and to improve the testing infrastructure. Design The parser focuses on minimizing the number of false negatives at the expense of increasing the risk of false positives. In other words, it aims at parsing _valid_ Fluent messages with a success rate of 100%, but it may also parse some invalid messages which the reference parser would reject. The parser doesn't perform any validation and may produce entries which wouldn't make sense in the real world. For best results users are advised to validate translations with the fluent-syntax parser pre-runtime. The main parser loop iterates over the beginnings of messages and terms. This is to efficiently skip over comments (which have no use on runtime), and to recover from errors. When a fatal error is encountered, the parser instantly bails out of the currently-parsed message and moves on to the next one. Errors are discarded and are not visible to the users of `FluentResource`. The do carry a minimal description of what went wrong which may be useful when reading the code and for debugging, though. The parser makes an extensive use of sticky regexes which can be anchored to any offset of the source string without slicing it. In some places, it's easier to just check the character currently at the cursor, so it does a fair share of that, too. Conformance My original plan was to base the parser on the EBNF and only parse well-formed syntax. In this PR, I went for something a bit wider than that: a superset of well-formed syntax. The main deviation from the EBNF is related to parsing `VariantExpressions` and `CallExpressions`. The EBNF verifies that the they are called on `Terms` and `Functions` respectively. The optimistic parser doesn't differentiate between `Messages`, `Terms` and `Functions`. I decided to implement it this way because this code might soon change anyways (see projectfluent/fluent#176). Another deviation is that the parser treats commas in argument lists as whitespace, similar to how Clojure treats them in sequence lists. I might suggest we upstream this in the spec, too, because it makes the implementation of args lists _much_ simpler. I based this PR on top of the `zeroseven` branch. The `fluent-syntax` parser already supports Syntax 0.7 and passes the [reference fixtures](https://github.com/projectfluent/fluent/tree/master/test/fixtures). This made it possible to also turn on the reference testing in the runtime parser, too. `make fixtures` creates the parsed results for all reference fixtures; for now they must be verified manually before they're committed. `make test` can be used in development to assert that the output of the runtime parser still matches the committed one.

stasm · 2018-10-19T11:24:50Z

More open questions from the last time we talked about this in person:

Should both positional and named arguments be allowed?
If positional arguments are allowed, how can the caller invoke the default value of an argument which precedes another one? I.e. for -term($one, $two), should it be possible to somehow just pass the value of $two?
If named arguments are allowed, should we name them using the $ sigil? -term(arg: "value") vs. -term($arg: "value").
Or maybe we should drop the $ entirely and keep it reserved only for data coming directly from the app? In this case, we'd write:
```
-fxaccount-brand-name(case) = { case ->
    [lowercase] account Firefox
   *[uppercase] Account Firefox
}
```

stasm · 2018-10-19T13:59:38Z

I have a proposal which solves questions: 1, 4, 5, 6, and maybe 7 and 8.

# @param $case - Letter case to use.
-fxaccount-brand-name = { $case ->
    [lowercase] account Firefox
   *[uppercase] Account Firefox
}

sync-signedout-account-title =
    Connetti il tuo { -fxaccount-brand-name($case: "lowercase") }

Let's treat all variables referenced in Terms as references to named arguments passed to the term via a CallExpression. This means that terms are not allowed to access any variables passed to the message they are referenced from, unless these variables are passed to the term explicitly via the CallExpression. This answers question 1.

Terms don't need to define their format parameters in this proposal at all (-term(arg) = ...). The parameters are accessed lazily during the resolution of the term from the list of named arguments passed to it. This answers question 4.

When calling a term, both positional and named arguments can be used, but positional are ignored, because, by their anonymous nature, they can't be looked up via variable names. This answers question 5 and 6.

I'm still not entirely sure about question 7 and 8. I can see arguments to both using -fxaccount-brand-name($case: "lowercase") and -fxaccount-brand-name(case: "lowercase") (without the $). I'm erring on the side of consistency from the localizer's POV, which means using $ everywhere.

stasm · 2018-10-19T14:02:56Z

As far as questions 2 and 3 are concerned, and assuming term attributes are only used to describe grammatical or stylistic properties of the term itself... do we actually need parameterization there?

-brand-name =
    { $case ->
       *[nominative] Firefox
        [accusative] Firefoxa
    }
    .gender = masculine

has-updated =
    { -brand-name.gender ->
        [masculine] { -brand-name($case: "nominative") } został zaktualizowany.
        [feminine] { -brand-name($case: "nominative") } została zaktualizowana.
       *[other] Program { -brand-name($case: "nominative") } został zaktualizowany.
    }

Pike · 2018-10-19T14:07:31Z

Vowel harmony could be a use-case (didn't read the article in full) where you'd like to denote to the calling site that the term's dominating vowels depend on case or plurality?

stasm · 2018-10-19T14:09:38Z

Interesting, thanks for the link.

I'd also like to point out that it's not possible to achieve this with the current design of VariantLists, either: -term.attr[name] is not valid syntax. I filed #189 about it.

stasm · 2018-10-19T18:04:01Z

As I was researching this issue, I realized the current design of VariantLists has more limitations than I initially considered. I filed #187 and #188. Macros could be a way to fix them, if we removed VariantLists along the way. If not, we should look into fixing them in some other way.

stasm · 2018-11-06T10:57:36Z

I'd like to move forward on this proposal. I opened #205 with a WIP implementation allowing the -term(arg: "value") syntax. I'm leaving accessing variants of attributes out of scope for now; we'll add them in #189. I also filed #204 to remove VariantLists.

stasm mentioned this issue Oct 10, 2018

The Optimistic Runtime Parser projectfluent/fluent.js#289

Merged

stasm added the syntax label Oct 16, 2018

stasm mentioned this issue Oct 19, 2018

Allow accessing nested VariantLists #187

Closed

stasm mentioned this issue Oct 22, 2018

Change named arguments syntax to $name = value #192

Open

This was referenced Nov 6, 2018

Deprecate VariantLists #204

Closed

Add Parameterized Terms #205

Merged

stasm mentioned this issue Nov 6, 2018

Consider accessing variants of term attributes #189

Closed

stasm closed this as completed in #205 Nov 8, 2018

eemeli mentioned this issue Apr 19, 2019

Parameterized Messages #258

Closed

SirNickolas mentioned this issue Sep 17, 2019

Term references with positional arguments? #293

Open

snyk-bot mentioned this issue Oct 26, 2019

[Snyk] Upgrade fluent from 0.6.4 to 0.12.0 ajesse11x/send#4

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parameterized Terms #176

Parameterized Terms #176

stasm commented Oct 1, 2018

stasm commented Oct 19, 2018 •

edited

Loading

stasm commented Oct 19, 2018

stasm commented Oct 19, 2018

Pike commented Oct 19, 2018

stasm commented Oct 19, 2018 •

edited

Loading

stasm commented Oct 19, 2018

stasm commented Nov 6, 2018

Parameterized Terms #176

Parameterized Terms #176

Comments

stasm commented Oct 1, 2018

Proposal

Advantages

Open questions

stasm commented Oct 19, 2018 • edited Loading

stasm commented Oct 19, 2018

stasm commented Oct 19, 2018

Pike commented Oct 19, 2018

stasm commented Oct 19, 2018 • edited Loading

stasm commented Oct 19, 2018

stasm commented Nov 6, 2018

stasm commented Oct 19, 2018 •

edited

Loading

stasm commented Oct 19, 2018 •

edited

Loading