Skip to content
This repository has been archived by the owner on Jul 22, 2020. It is now read-only.

Short-circuiting concerns #3

Closed
ljharb opened this issue Mar 20, 2018 · 26 comments
Closed

Short-circuiting concerns #3

ljharb opened this issue Mar 20, 2018 · 26 comments

Comments

@ljharb
Copy link
Member

ljharb commented Mar 20, 2018

(raised by @bakkot in the meeting today)

Currently, every assignment combo operator in JS can be described as a = a <op> b being equivalent to a <op>= b.

Including the short-circuiting semantics would make a = a <op> b equivalent to if (a) { a = a <op> b; }, which is a very large inconsistency.

The argument that "Ruby and CoffeeScript have these semantics" imo does not hold its weight compared to "it would be inconsistent with the rest of JS".

@bakkot
Copy link

bakkot commented Mar 20, 2018

The pedant in me is compelled to note that in cases where evaluating the LHS is observable the sugar is a little more complicated because you need to avoid triggering the side effects multiple times (though most implementations get it wrong, and do in fact trigger a subset of the side effects several times). No one should care about this, though.

@jridgewell
Copy link
Member

Ruby and CoffeeScript have these semantics

On this, both Ruby and CoffeeScript have mathematical operators that always assign, and logical operators that short-circuit assign.

it would be inconsistent with the rest of JS

Meaning the mathematical assignment operators? but none of them could ever have short circuiting semantics.

I think the better argument for always-set semantics path is that it's closer to (my personal intuition) the common de-sugared form (a = a || b). But, I'm pretty sure this is the common because the short-circuit-set semantics is usually a lint error (useless expression).

But, I'm open to either semantics.

where evaluating the LHS is observable the sugar is a little more complicated because you need to avoid triggering the side effects multiple times

This is separate from short-circuiting semantics, but I intend for this to follow whatever the rest of the spec uses, which I hope accounts for deep properties and computed keys.

obj.a.b ||= 'test';
(_a = obj.a).b || (_a.b = 'test');

obj.a[key++] ||= 'test';
(_a = obj.a)[_key = key++] || (_a[_key] = 'test');

@hilbix
Copy link

hilbix commented Aug 18, 2019

Sorry to comment here late and long. But I am not so familiar with the background of the JS spec and who is allowed to express thoughts here. And please forgive if this is a FAQ, I do not know where and how to look for such.

However I'd always vote for the most performant way to implement things if this does not break mathematical correctness. This also means, to allow engines to skip calls to setters if those calls look invariant (which might not necessarily be procedurally correct then) to the value.

So my idea is to turn it around and allow shortcuts like in the ||= case to be allowed in the += case, too!

a += b is internally done by somethings like var _=a; _=_+b; a=_ so I'd vote that engines are allowed to optimize that to

var _ = a
_ = _ + b
if (_ !== a) a = _    // optionally spare the if() and do a = _ here

which skips calls to the setter if nothing changes. (Note that this should not be imperative, because there will be many cases where calling the setter is more performant than implementing the identity check. And I am talking about performance here.). Now a ||= b flies just the same route:

var _ = a
_ = _ || b;
if (_ !== a) a = _

which effectively is

if (!a) a = b;

To keep the idea that a = a <op> b should be the expanded form of a <op>= b, this means, that it should be allowed to skip the assignment (=setter) if the (known) value is unchanged.

The motivation behind this is that setters should do the obvious thing. Storing the same value multiple times or only store it a single time should not catastrophically change things from a stability point of view. (Also getters should never return a value different from what setter expect to set the value returned by the getter again.) On JS this might break some constructs of Observables, which use such a bad approach to enforce some UI updates/refreshes. However for this a clear call to something like a.refresh() or a.refresh++ or a.refresh_trigger = !a.refresh_trigger should be a far superior approach, as this creates clear and concise code which everybody can understand out of the box.

Note:

  • I am not into JS spec, so perhaps this is a fundamental change in some principles there. But a good language should be allowed to optimize for performance (as long as this does not break some very imperative other things which cannot be expressed otherwise). And adding some internal assumptions on how getters/setters should be implemented, is not a bad thing.

  • Statically compiled languages like C are not a good choice to compare here. The C++ compiler has a good way to examine (inline) the getters/setters at compile time and optimize that. However JS is an interpreted language where everything can change on runtime, so it shall have other ways of optimization, possibly sacrificing some constraints on how side-effects must be handled in-detail, especially if those side-effects are not obvious to the common programmer anyway and hinder performance and generalization a lot.

  • Also the C++-optimization is, strictly speaking, not free from side-effects, as at assembly level the "add"-instruction is optimized away which can change behavior of a program. This could make differences on things like Spectre and the like. Hence there always is some tradeoff between speed and compliance, where I always go for speed, as long as the pure mathematical result stays correct.

Compare:

  • Spectre is not a bug from my perspective, it's just a side-effect of a speed feature which only has some possibly bad side-effects in certain few and from my perspective very negligible situations. So yes, Spectre has an impact on Cloud setups, but you always needed to be aware of side-channels on things like this, so you had to harden your code. This already was true in the last millennium with non-speculating processors (you were able to measure the power consumption to detect things you are not supposed to access, and there were even exploits known using sound cards to record things like password keystrokes). This is comparable to the shortcut taken here. Yes, it might break some existing code. But only bad code which relies on side-effects, which are not obvious to everybody, and this creates very hard do maintain code and should be changed anyway.

  • In contrast, Meltdown and SWAPGS are bugs, because both have a vast impact on kernel security guarantees. But I cannot see any similar bad thing happen here if you allow shortcuts for += as expressed above. The vast majority of sane code should continue to work completely unharmed, regardless of minified or not.

@ExE-Boss
Copy link

But what if I want to have:

let hasErrors = testForSomeErrors(foo, logger);
hasErrors = testForOtherErrors(foo, logger) || hasErrors;

So that testForOtherErrors() is also executed, even if testForSomeErrors() already detected some errors, so that the other errors are also logged.

@claudepache
Copy link
Contributor

claudepache commented Aug 19, 2019

But I am not so familiar with the background of the JS spec

@hilbix For information, the current state of the spec is: Implementations are allowed to do any optimisation provided that it does not change the observable behaviour, which is probably much more restrictive than what you want. Number of wasted CPU cycles does not count as observable; but observable side-effects not related to time and memory do count.

If you want to allow implementations to do more optimisations, that might change the observable behaviour in edge cases, this is not the place to propose it, because it is a major change in the language semantics. See https://github.com/tc39/ecma262/blob/master/CONTRIBUTING.md#new-feature-proposals for places where you can express your ideas (the es-discuss mailing list is one place to discuss them). (Although I am personally sceptical that this particular idea will fly , because the tendency is to reduce observable variations between implementations.)

@claudepache
Copy link
Contributor

claudepache commented Aug 20, 2019

I disagree with this argument:

Currently, every assignment combo operator in JS can be described as a = a <op> b being equivalent to a <op>= b.

Including the short-circuiting semantics would make a = a <op> b equivalent to if (a) { a = a <op> b; }, which is a very large inconsistency.

for the simple reason that descriptions like “a = a <op> b is equivalent to a <op>= b” are willingly incorrect for the sake of simplicity. You already know that you must add: “(except that, of course, a is not evaluated twice)”; now you just need to add: “(and except that, of course, in case of short-circuiting, the value just read in a is not re-written in a)”; there is no need to resort to some alternative (incorrect) expansion such as if (a) { a = a <op> b; }.

In fact, if we look at real semantics instead of simplistic expansions, there is absolute consistency. Here is a synoptic view of the differences of semantics between a + b, a += b, a || b and a ||= b:

<OP> a <OP> b a <OP>= b
non short-circuiting operator (e.g., +)
  • assign the final result to a
  • short-circuiting operator (e.g., ||)
  • possibly skip the rest of the algorithm (short-circuit)
  • possibly skip the rest of the algorithm (short-circuit)
  • assign the final result to a
  • The question is whether the “assign the final result to a” step is part of the algorithm that is possibly short-circuited. I say: yes, of course, because the goal of short-circuiting is precisely to avoid operations that are useless in “regular” cases (i.e., assuming that evaluating subexpressions does not trigger observable side-effects, assuming that re-writing the same value just read in some binding is a no-op, and other whatnots)

    Concretely, the only case where the difference is observable, is when assignment does something weird or perform some non-trivial work. E.g.

    previewElement.innerHTML ||= '<i>Nothing to show</i>'

    In that specific case—and I expect in most cases—you most probably don’t want to trigger the .innerHTML setter, as it could have unwanted side-effects (such as the user losing the focus in case it was on some child element of previewElement). If, for some obscure reason, you do want to always trigger the .innerHTML setter, you’d better be more explicit.

    @jridgewell
    Copy link
    Member

    @claudepache The el.innerHTML ||= 'default' is a great example. Want to add it to the readme?

    @hax
    Copy link
    Member

    hax commented Aug 22, 2019

    "a = a <op> b being equivalent to a <op>= b" may be too simple, but I feel it reflect the mental model of most programmers.

    The problem is, when programmers see a ||= b, will they consider short-circuiting semantic first?

    If, for some obscure reason, you do want to always trigger the .innerHTML setter, you’d better be more explicit.

    We could also say, if you do not want to trigger the setter, you'd better be more explicit --- write if (el.innerHTML) el.innerHTML = ....

    In which direction more explicit is better choice? I don't know.

    Some programmers think <op>= as a shorthand, and the doc of ESLint operator-assignment rule said:

    Use of operator assignment shorthand is a stylistic choice. Leaving this rule turned off would allow developers to choose which style is more readable on a case-by-case basis.

    Whichever we choose, it's not a stylistic choice anymore.

    So the most conservative teams might even choose enforcing explicit for both directions, aka. disallow logical assignment operators (via eslint).

    Note, actually I will choose current short-circuiting semantic if I'm forced to choose one. Just feel it's really an issue if the feature is added while a lint rule is also added to forbid it 🤓

    @Mouvedia
    Copy link

    Whichever we choose, it's not a stylistic choice anymore.

    So the most conservative teams might even choose enforcing explicit for both directions, aka. disallow logical assignment operators (via eslint).

    This.

    @claudepache
    Copy link
    Contributor

    Some programmers think <op>= as a shorthand, and the doc of ESLint operator-assignment rule said:

    Use of operator assignment shorthand is a stylistic choice. Leaving this rule turned off would allow developers to choose which style is more readable on a case-by-case basis.

    Whichever we choose, it's not a stylistic choice anymore.

    The doc of ESLint can continue to lie, and say that a ||= b is equivalent to a = a || b. In most cases, skipping a = a makes zero differences. In few cases, it makes a small difference, but it is sufficiently subtle for most coders not to care.

    In rare cases it makes a noticeable difference, and:

    • even in those cases, skipping the self-assignment is probably the most secure choice;
    • programmers that do care of the difference will also care to understand correctly the code.

    @ExE-Boss
    Copy link

    • programmers that do care of the difference will also care to understand correctly the code.

    I’ve seen far too much horrible, buggy code in my life that this sadly feels very unlikely to me.

    @hilbix
    Copy link

    hilbix commented Aug 28, 2019

    If a language is created for programmers who do not care, then the language will not take care of the programmers who do care.

    @jridgewell
    Copy link
    Member

    Closing this, as I feel #3 (comment) answers everything.

    @hax
    Copy link
    Member

    hax commented Jan 20, 2020

    Could we add a link to this discussion in README which can help the programmers understand the decision if they have similar question?

    I also think we'd better "fix" the doc of eslint sometime :)

    @DanielRosenwasser
    Copy link
    Member

    el.innerHTML ||= 'default'

    Is this actually a great example? It feels very off to me. The idea that this doesn't set a side effect is cute and clever, not something that'd be apparent to anyone familiar with +=. It would be such a weird artifact of language evolution to have

    a += b

    desugar into

    a = a + b

    but have

    a ||= b

    desugar into something like

    a || (a = b)

    @claudepache
    Copy link
    Contributor

    @DanielRosenwasser The fact is, a += b does not desugar to a = a + b, because a is not evaluated twice. This is a convenient lie.

    @jridgewell
    Copy link
    Member

    The idea that this doesn't set a side effect is cute and clever, not something that'd be apparent to anyone familiar with +=

    What else would += have been? Also note, as @claudepache points out, that the simple desugaring is only correct in the most basic case. Once you add more complicated LHS, it breaks down:

    // obj evaluated twice in "desugaring"
    obj.counter += 1;
    obj.counter = obj.counter + 1;
    
    // key incremented twice in "desugaring"
    array[key++] += 1;
    array[key++] = array[key++] + 1;

    There's already a difference in the how += evaluates the base reference. That ||= short-circuits like || will be easy enough to learn.

    @ljharb
    Copy link
    Member Author

    ljharb commented Jan 21, 2020

    The important part tho is that all the X= operators desugars to an unconditional assignment, so that a setter is observably invoked no matter what.

    @jridgewell
    Copy link
    Member

    The important part tho is that all the X= operators desugars to an unconditional assignment, so that a setter is observably invoked no matter what.

    For math operators, what else could we have chosen? Of course they always set.

    The logical operators are control flow. They allow us to have a branching behavior to prevent the set. And I'm willing to bet that the majority of setters don't actually need to be re-set to the same value, and it's possibly very destructive (DOM APIs).

    @DanielRosenwasser
    Copy link
    Member

    Having spoken to some friends on the C# team, I found out that they've also picked the behavior of short-circuiting the assignment (a ??= b -> a ?? (a = b)) rather than performing unconditional assignment (a ??= b -> a = a ?? b). I think that in this case, regardless of whatever I would find preferable, the advantages either way are probably marginal and it makes more sense to drive towards consistency with other languages.

    @jridgewell
    Copy link
    Member

    Excellent, thank you for reaching out to them!

    @icywolfy
    Copy link

    icywolfy commented Feb 29, 2020

    • As a library developer, how can I inform users that ||= is absolutely prohibited for some objects?
    • As a library user, how can I find out whether it is safe to call ||=?

    There will be much user overhead to know that
    transaction.currency = transaction.currency || CURRENCY_EURO; works as expected, but
    transaction.currency ||= CURRENCY_EURO; will cause numerous hard to track bugs, and bypass functionality.

    The primary difference:
    One must pre-verify whether the assigned property is going through a setter or not.
    This implies that users need to dig into third-party library code to know internal details.
    Any use of the ||= operator, on an object property must now be researched.

    // Heavily simplifed example from a library 
    
    const transaction = {
      _currency: 0,
      get currency() {
        return this._currency;
      },
      set currency(val) {
        if (val !== this._currency) {
           // Trigger a cancellable event, that may override the change
           console.log(`.dispatch(CURRENCY_CHANGE_EVENT, {old: this._currency, new: val);`);
        }
        this._currency = val;
        // Trigger a non-cancellable event, used for mandatory code auditing, invoking third-party libraries and other service integrations.
        console.log(`.dispatch(CURRENCY_SET_EVENT, {val: val);`);
      }
    };
    
    const CURRENCY_EURO = 978;
    transaction.currency = transaction.currency || CURRENCY_EURO;
    // .dispatch(CURRENCY_CHANGE_EVENT, {old: 0, new: 978);
    // .dispatch(CURRENCY_SET_EVENT, {val: 978);
    
    transaction.currency = transaction.currency || CURRENCY_EURO;
    // .dispatch(CURRENCY_SET_EVENT, {val: 978);
    
    
    transaction.currency ||= CURRENCY_EURO;
    // .dispatch(CURRENCY_CHANGE_EVENT, {old: 0, new: 978);
    // .dispatch(CURRENCY_SET_EVENT, {val: 978);
    transaction.currency ||= CURRENCY_EURO;
    /*
     * Auditor is not executed.
     * Region based runtime added third-party module, data is not yet synced, currency is not set.
     * Mandatory attempted change of information process is not executed.
     *
     */

    @ljharb
    Copy link
    Member Author

    ljharb commented Feb 29, 2020

    That’s already something you have to know - i can already always check a property’s value before conditionally setting it.

    @arye-eidelman
    Copy link

    arye-eidelman commented Apr 1, 2020

    @icywolfy If you need to log/report when transaction.currency ||= CURRENCY_EURO; is run even if the currency never changed you should be logging in the getter.

    I think part of the confusion stems from the fact that all existing assignment combo operates are effectively written backwards, += instead of =+. Expandinga += b requires one to place the equals before the plus a = a + b.
    On the other hand the new logical assignment operators are written in order. a ||= b expands to a || (a = b).

    Also I wouldn't be comparing this to assignment combo operators. Thearetically the language could support prepending && to other operators besides for the equals operater. (The same would apply to a non nullish operator if it is added in the future.)

    For example:
    a &&*= b expands to a && (a *= b)
    a &&-= b expands to a && (a -= b)
    a&&++ expands to a && a++
    &&++a expands to a && ++a
    etc.

    There is obviously less demand for these shorthand operators, but they bring out the point that this has different semantics than assignment combo operators.

    @asktheworld
    Copy link

    asktheworld commented Jun 4, 2020

    (raised by @bakkot in the meeting today)

    Currently, every assignment combo operator in JS can be described as a = a <op> b being equivalent to a <op>= b.

    Including the short-circuiting semantics would make a = a <op> b equivalent to if (a) { a = a <op> b; }, which is a very large inconsistency.

    The argument that "Ruby and CoffeeScript have these semantics" imo does not hold its weight compared to "it would be inconsistent with the rest of JS".
    class MyClass { print() { console.log(create table student( id int primary key name text )) } }

    @DanielRosenwasser
    Copy link
    Member

    As a person who was on that side of the fence, I'd just like to raise:

    1. Combing through our own codebase, practically all of the code that we've written prior to this feature actually uses the conditional set (even though it is likely never observable). We actually have many instances of this for lazy assignment. Ideally, a naive refactoring should preserve that behavior.
    2. C#, Ruby, and CoffeeScript all have the same difference in behaviors, and as far as I can tell it's been just fine in over a decade of support (Ruby has had it well before 2008, and C# just shipped with the same behavior). Making it inconsistent with the rest of the programming ecosystem is ignoring the cowpath and actually adds potential to surprise a user coming from another language.

    Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
    Labels
    None yet
    Projects
    None yet
    Development

    No branches or pull requests