Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

June spec review #155

Open
26 of 27 tasks
waldemarhorwat opened this issue Jun 11, 2024 · 7 comments · Fixed by #156
Open
26 of 27 tasks

June spec review #155

waldemarhorwat opened this issue Jun 11, 2024 · 7 comments · Fixed by #156

Comments

@waldemarhorwat
Copy link

waldemarhorwat commented Jun 11, 2024

I took another look at the Decimal spec for the June meeting. The spec forms a good foundation, but I found some pervasive issues that make it self-contradictory or produce incorrect results and should be addressed before seeking stage 2.

My review follows. Some of the items are nontrivial and will require some discussion and agreement on how to fix. I'm offering to provide whatever help is needed to understand and fix these.

Confusion

  • Throughout the spec there are errors where a subroutine is called with different types of arguments or different data than what it expects. Document what the arguments should be and check all call sites to make sure that they match. One example is EnsureDecimal128Value, where part of the spec assumes that the first argument is the full mathematical value (which I believe to be the correct interpretation), while another part of the spec assumes that it's a mantissa. In some places multiple bugs overlap: The Decimal128 constructor confuses parse nodes, which causes Decimal128("3.14") to return 33.14𝔻; if the parse node confusion is fixed, then Decimal128("3.14") will return 314𝔻 due to EnsureDecimal128Value argument type confusion.
  • Handling of zeroes is incorrect in nearly all arithmetic operations due to the above type confusion. For example, Decimal128.prototype.add tries to do real number arithmetic on +0𝔻 and -0𝔻, which are not real numbers. IsIntegralDecimal128 returns false on +0𝔻 and -0𝔻.
  • All operations that produce new zeroes via arithmetic either throw or encounter the above type confusion.
  • Decimal128ToDecimalString confuses a value with its length and contains buffer overruns.

Rounding

  • None of the rounding modes is implemented.
  • The spec throws whenever an inexact operation happens. Dividing 1𝔻 by 3𝔻 will throw, as will adding 1050𝔻 to 1𝔻. This is a major change to the spec that I hadn't been expecting. Is this intentional?

Quantum Handling

  • Arithmetic on the quantum is incorrect in many places. If we're going to distinguish cohort members, we should have arithmetic operations set the quantum in the same way as the IEEE spec does it.
  • The spec neglects to set the quantum on zero values in methods such as Decimal128.prototype.round.
  • Per the IEEE spec, the quantum should not affect mathematical results. Interrogating the quantum in methods such as IsIntegralDecimal128 is incorrect.
  • The quantum appears in other places where it shouldn't, I'm guessing due to the confusion above. toPrecision with a user-provided precision shouldn't interrogate the quantum. It's not the exponent.

Canonicalization

  • The spec confuses string canonicalization with Decimal128 value canonicalization. Those are quite different concepts:
  • What we want is for toString to produce strings independent of the input quantum unless one explicitly sets canonicalize to false: All Decimal values with the same v should produce the same string.
  • The spec attempts define a CanonicalizeDecimal128 helper method that tries to pick a specific "canonical" Decimal128 value to represent a cohort, but that doesn't work:
    • The choices are to pick either the one with the highest possible q, or the one with the lowest possible q. That's what's needed in some arithmetic contexts.
    • Neither choice works correctly for both toString and toExponential.
  • To fix this, restructure the spec to delete CanonicalizeDecimal128 and anything like it and just make the conversions not depend on the quantum unless the user requests it.

Comparisons

  • The compare method should compare mathematical values and return one of -1, 0, 1, or NaN. The current implementation also compares the quantum, which turns it from a generally useful method to an extremely obscure one that should be omitted in the first version.
  • A method for the < comparison is provided, but >, ≤, ≥ are missing, and the latter two are nontrivial for users to implement correctly. Naively turning a ≤ b into !(b < a) is incorrect due to NaN's.

Conversions

  • Invoking toString on 101000𝔻 produces 1 followed by 1000 zeroes. This is fine if printing a BigInt, but is undesirable for floating-point numbers because it provides a false sense of precision. Users will expect that they can add 1 to it, and, being a Decimal, get 100…001, which obviously won't work. Instead, toString should follow existing Number precedent and switch to scientific notation to not produce significantly more digits than is possible to represent in a Decimal128 value. If someone really wants fixed-point notation, they can use a different method such as toFixed.
  • One would expect calling toString with canonicalize set to false to not canonicalize the output, but that's not what it does. It will sometimes canonicalize and sometimes not, depending on what Decimal128 value it sees. So canonicalize set to false actually means "maybe". It should mean "don't canonicalize".
  • None of the conversions respect the quantum of zero values. But they should: if canonicalize is set to false, then we do want to distinguish 0 from 0.0000 and 0e75.
  • The quantum of large Numbers converted to Decimal128 is inconsistent. Converting 1e20 yields a quantum of 0, but converting 1e21 yields a quantum of 21. As a fix, I recommend doing the logical equivalent of always converting Numbers to Decimal128 via scientific notation.

ECMAScript Mechanics

  • ECMAScript object identity is inconsistent. Some methods such as Decimal128.prototype.round sometimes reuse input values and sometimes produce fresh values equal to the input values. We should be consistent here.
  • Number(x) now calls toString on every x, of any type. This is an undesirable change in existing behavior and not necessary for correct Decimal128 semantics.
  • The spec uses the wrong grammar for converting strings to Decimal128. As a result, Decimal128("07") throws.

Missing Operations

Structure

  • The introduction section uses concepts such as "zero" and "finite" before defining them.
  • For readability I'd prefer to use spec methods such as cohort and quantum on Decimal128 values rather than [0] and [1].
@jessealama
Copy link
Collaborator

Thanks for taking a look! I'll get to work on these issues. Stay tuned.

@waldemarhorwat
Copy link
Author

I added a sketch of how to fix things.

@jessealama
Copy link
Collaborator

(This was unintentionally closed in #156 )

@jessealama
Copy link
Collaborator

jessealama commented Jun 12, 2024

Thanks so much for this second round of detailed review! We plan to make changes to address almost all of your points below. Specifically:

Type errors

  • Thanks, this is a good guide. We'll annotate the specification text with more clear expectations about the types of various values, which we hope will help avoid this issue in the future.
  • The changes for quantum handling will address the type errors.

Rounding

  • It was not intentional to make rounding throw. Throwing an error there was a placeholder to come back and describe the operation.
  • We'll write this description with the same semantics as IEEE-754 and Intl (they already coincide via that table).

Quantum handling

  • Your suggestion to replace EnsureDecimal128Value with PickQuantum is an excellent one, and we'll rewrite our algorithms in that form.
  • The current data model does not handle different quanta for zero values, so of course algorithms don't handle them. We will have to add this.
  • To distinguish multiple quanta for +-0, we need to treat them as cohorts, rather than full decimal values. To stick with the convention where the 𝔻 subscript means that the whole value is a Decimal128 value, we'll name these cohorts PositiveDecimal128Zero and NegativeDecimal128Zero, and treat these cohorts specially in all arithmetic operations.
  • It was not intentional to differ from IEEE 754 in determining the quantum. We'll match the spec.
  • We'll rephrase the arithmetic operations to be based on the mathematical value of the operation applied to the operand cohorts, which is then passed to an algorithm to round it to the nearest cohort.

Canonicalization

  • We'll delete the canonicalize algorithm–as you say, we just need to get the cohort when we need the mathematical value, which toString and friends will use.

Comparisons

  • We'll remove the .compare function (i.e., a function that can distinguish mathematically equal decimals). We don't have it for Number or BigInt, and people get by. You can use the coefficient or quantum accessors (which we will add) if you want to distinguish mathematically equal values.
  • We're not sure whether greaterThan, lessThanOrEquals, etc. are necessary to add, but they are easy to define; let's discuss the tradeoff in plenary. Our current thinking is to resurrect an earlier idea, of having a single, differentcompare method, working with mathematical value, from which <, ≤, >, ≥ are trivially defined in user code.

Conversions

  • We'll change the default toString algorithm to overflow to exponential notation using the same cutoffs Number uses.
  • We'd like to replace the "canonicalize" option for toString with a separate operation that can be used just to convert a decimal value to an exact string that cleanly represents the decimal in a natural way, e.g., using exponential notation or trailing zero decimal points based on the quantum (e.g., could be "300", "3e2", or "300.00" depending on the member of the cohort, and may have 6000 zeroes after it).
  • Then, there will be no canonicalize flag for the other string operations, simplifying them.
  • For conversion from Number, we'll apply your fix of always using the quantum as if it were written in exponential notation.

ECMAScript mechanics

  • We'll consistently return new object identities
  • We'll delete that stray line of spec that does that conversion, which is not useful or a good idea
  • We'll define a new grammar for this particular case, which permits leading zeroes

Missing operations

  • We'll define methods to get the coefficient (as a BigInt) and quantum (as a Number)
  • Sqrt: This is easy to define, but let's discuss with the committee whether it's desired. We don't understand the use case

Structure

  • Good suggestions; we've applied them

@waldemarhorwat
Copy link
Author

waldemarhorwat commented Jun 13, 2024

Thank you!

Some comments:

  • "The current data model does not handle different quanta for zero values, so of course algorithms don't handle them. We will have to add this.": I don't understand this response. The current data model does support different quanta for zero values. That's what the q in «+0𝔻q»𝔻 and «-0𝔻q»𝔻 is.
  • Renaming the +0𝔻 and -0𝔻 symbols sounds good to reduce confusion, but PositiveDecimal128Zero and NegativeDecimal128Zero are a bit of a mouthful. Can we use something shorter such as PlusZero and MinusZero? There's no need to include "Decimal128" in the name — note that all the other finite cohorts such as 1.5 and 42 are represented by real numbers that don't have a Decimal128 tag.
  • I think it's essential to provide the five comparison methods that do the =, <, >, ≤, and ≥ comparisons. Not having those would make decimal code and algorithms awkward to translate into Decimal128 methods.
  • A compare method that compares mathematical values and returns one of {-1, 0, 1, NaN} is a very-nice-to-have.
  • The Decimal128 toString threshold for switching to scientific notation should be higher than for Number due to Decimal128's higher precision. For Number we intentionally picked a threshold greater than its 17 digits of precision. Decimal128 offers 34 digits of precision, so it's reasonable to switch to scientific notation for large numbers when the magnitude exceeds 1034. When adding leading zeroes for small numbers, the same threshold as Number (10-6) seems fine.
  • There are existing grammars in the language for converting user-provided strings to numbers. See StringNumericLiteral. Use that.

@waldemarhorwat
Copy link
Author

I added a proposed design of the methods for querying and working with exponents, significands, and the quantum, along with sample code for many use cases.

@jessealama
Copy link
Collaborator

I think all of these are taken care of, with the exception of

  • adding sqrt (I'm open to this but need a bit of evidence that this is actually used or, barring that, an argument for how it might be used)
  • using ~constants~ rather than the subscript D (I agree with the idea but prefer to do this in a single editorial change, later, after we've nailed down other details)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants