-
Notifications
You must be signed in to change notification settings - Fork 603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Numbers: support compact mode #715
Conversation
Hi @sieverk, it's awesome to see your WIP PR. It's on my TODO list to review and provide you feedback about your proposal. Thanks so far and sorry for the delay on my answer. |
@rxaviers We tried deploying @sieverk's branch at twitter and saw that some short decimal formats break the number formatter. For example, trying to format the number 222.604 in hu ends up executing this in src/number/format.js: compactPattern = "000 E"; // hu cldr format
compactProperties = compactPattern.match( numberPatternRe ); // returns null The cause appears to be in numberPatternRe: /^(('([^']|'')*'|[^*#@0,.E])*)(\*.)?((([#,]*[0,]*0+)(\.0*[0-9]*#*)?)|([#,]*@+#*))(E\+?0+)?(('[^']+'|''|[^*#@0,.E])*)$/ Specifically, the last |
It seems the compact number pattern defined by CLDR is slightly different in a way that the number pattern must be extracted from it. The links I provided should contain the definition to allow that to be done correctly. If you run into any troubles please just let me know |
@@ -40,6 +40,14 @@ zh = new Cldr( "zh-u-nu-native" ); | |||
|
|||
QUnit.module( "Number Format" ); | |||
|
|||
function oneOrOtherPluralGenerator( plural ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could call this enPluralGenerator
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you're reusing it for es
too, please ignore my comment.
prefix += compactProperties[ 1 ]; | ||
suffix = compactProperties[ 11 ] + suffix; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to handle cases bigger than maximum available scale provided by CLDR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on this? I'm afraid I don't quite understand what "scale" means in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, basically my concern is what happens when you try to format a number bigger (in orders of magnitude) than 100000000000000-count-other": "000 trillion
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I see what you mean. We'll have to detect when no pattern is available for the number of digits and just return the raw number, I guess. Ideally, we'd be able to use scientific notation as a fallback when there is support for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like commented below, it would be ideal to use the biggest available instead of ignoring compact mode for big numbers.
src/number/format.js
Outdated
rounded = parseInt(numberFormatIntegerFractionDigits( number, minimumIntegerDigits, | ||
minimumFractionDigits, maximumFractionDigits, round, roundIncrement )); | ||
pluralForm = pluralGenerator ? pluralGenerator(rounded) : "other"; | ||
compactPattern = compactMap[ "1" + zeroes + "-count-" + pluralForm ] || compactPattern; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, what does || compactPattern
prevent? A missing pluralForm
? Let's add a comment explaining. In theory, that should never be needed, i.e., CLDR has the necessary forms for the locale; but it's better to be safe and not throw on runtime. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It guards against CLDR languages (Japanese, for example, but there are several) that only specify patterns for the "other" plural. In that case, checking compactMap
for a "one" pluralForm
would give us undefined. Hence the fallback to compactPattern
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap, but in theory that should never happen, for example the Japanese example, pluralGenerator
should never return one
. Also, CLDR has all the necessary forms for a locale. For example, for cases like Arabic where there are various plural forms, compactMap should contain all of them.
src/number/format.js
Outdated
|
||
// calculate rounded value to determine plural form and extract correct pattern | ||
rounded = parseInt(numberFormatIntegerFractionDigits( number, minimumIntegerDigits, | ||
minimumFractionDigits, maximumFractionDigits, round, roundIncrement )); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can drop this line and move the below code after numberFormatSignificantDigits
and numberFormatIntegerFractionDigits
calls, like this:
diff --git a/src/number/format.js b/src/number/format.js
index 85ed8fbe..5ca2cda2 100644
--- a/src/number/format.js
+++ b/src/number/format.js
@@ -68,26 +68,15 @@ return function( number, properties, pluralGenerator ) {
number *= 1000;
}
- var compactPattern, compactDigits, compactProperties, divisor, pluralForm, rounded;
+ var compactPattern, compactDigits, compactProperties, divisor, pluralForm, zeroes;
if ( compactMap ) {
- var zeroes = Array(Math.floor(number).toString().length).join( "0" );
+ zeroes = Array(Math.floor(number).toString().length).join( "0" );
if ( zeroes.length >= 3 ) {
// use default plural form to perform initial decimal shift
compactPattern = compactMap[ "1" + zeroes + "-count-other" ];
compactDigits = compactPattern.split( "0" ).length - 1;
divisor = zeroes.length - ( compactDigits - 1 );
number = number / Math.pow( 10, divisor );
-
- // calculate rounded value to determine plural form and extract correct pattern
- rounded = parseInt(numberFormatIntegerFractionDigits( number, minimumIntegerDigits,
- minimumFractionDigits, maximumFractionDigits, round, roundIncrement ));
- pluralForm = pluralGenerator ? pluralGenerator(rounded) : "other";
- compactPattern = compactMap[ "1" + zeroes + "-count-" + pluralForm ] || compactPattern;
- compactProperties = compactPattern.match( numberPatternRe );
-
- // update prefix/suffix with compact prefix/suffix
- prefix += compactProperties[ 1 ];
- suffix = compactProperties[ 11 ] + suffix;
}
}
@@ -102,6 +91,16 @@ return function( number, properties, pluralGenerator ) {
minimumFractionDigits, maximumFractionDigits, round, roundIncrement );
}
+ if ( compactMap && zeroes.length >= 3 ) {
+ pluralForm = pluralGenerator ? pluralGenerator(+number) : "other";
+ compactPattern = compactMap[ "1" + zeroes + "-count-" + pluralForm ] || compactPattern;
+ compactProperties = compactPattern.match( numberPatternRe );
+
+ // update prefix/suffix with compact prefix/suffix
+ prefix += compactProperties[ 1 ];
+ suffix = compactProperties[ 11 ] + suffix;
+ }
+
// Remove the possible number minus sign
number = number.replace( /^-/, "" );
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such change actually fixed this:
diff --git a/test/unit/number/format.js b/test/unit/number/format.js
index ae8f9825..cd7520e0 100644
--- a/test/unit/number/format.js
+++ b/test/unit/number/format.js
@@ -120,7 +120,7 @@ QUnit.test( "decimals should format in compact mode", function( assert ) {
assert.equal( format( 1273, properties( "#0.#", en, { compact: "long" } ) ), "1.3 thousand" );
assert.equal( format( 1273000, properties( "#0.#", es, {
compact: "long"
- } ), oneOrOtherPluralGenerator ), "1,3 millón" );
+ } ), oneOrOtherPluralGenerator ), "1,3 millones" );
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nice suggestion! Lets us take full advantage of sig/decimal figure options.
QUnit.test( "percents should format in compact mode", function( assert ) { | ||
assert.equal( format( 127, properties( "#0%", en, { compact: "short" } ) ), "13K%" ); | ||
assert.equal( format( 127, properties( "#0%", en, { compact: "long" } ) ), "13 thousand%" ); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, let's pass oneOrOtherPluralGenerator
. It's weird to see 1.3 million
with wrong pluralization 😄. We can add one specific test to assert the missing pluralGenerator and a missing plural form data to check both preventions you added in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another tests I'd like to see are: (a) round tests, (b) significant digits tests.
Example for b:
+ assert.equal( format( 1273500, properties( "@@@", en, { compact: "long" } ) ), "1.27 million" );
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some round and significant digits tests -- let me know if you think there could be better coverage.
Not sure what you mean about the oneOrOtherPluralGenerator
. I'm only passing it when using es
because it's the only language we use in the tests that has plural forms. We could pass in all of the tests, I suppose, but that will make things a bit verbose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, what I mean is this:
enPluralGenerator = function(n) {
var s = String(n).split('.'), v0 = !s[1];
return (n == 1 && v0) ? 'one' : 'other';
}
esPluralGenerator = function(n) {
return (n == 1) ? 'one' : 'other';
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, my original message was confusing wrt that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can make that change. It shouldn't affect English, though. The plural/single case is identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, thanks
@sieverk this is looking great, good job so far... I've left a couple of comments. |
Something I like about the API is that significant digits can be used very conveniently to control the display digits. var fmt = .numberFormatter({
minimumSignificantDigits: 1,
maximumSignificantDigits: 2,
compact: "short"
});
fmt( 1273500 ); // > 1.3 million
fmt( 12735000 ); // > 13 million
fmt( 127350000 ); // > 130 million
fmt = .numberFormatter({
minimumSignificantDigits: 1,
maximumSignificantDigits: 3,
compact: "short"
});
fmt( 1273500 ); // > 1.27 million
fmt( 12735000 ); // > 12.7 million
fmt( 127350000 ); // > 127 million |
Something that seems NOT possible is to lock the scale, for example at |
@sieverk please could you sign the CLA? Thanks |
I'm also pretty interested in getting this in. Let me know if I can help. |
Sorry about the delayed response -- got swept up in a few other things this past month. Thanks for the feedback! I'll try to set some time aside this week to address questions / make some revisions / drop a few extra notes. Went ahead and got that CLA signed. |
…act logic for numbers in the 1000s
…nding logic, added tests
src/number/compact-pattern-re.js
Outdated
*/ | ||
return ( /^(('([^']|'')*'|[^*#@0,.E])*)(\*.)?((([#,]*[0,]*0+)(\.0*[0-9]*#*)?)|([#,]*@+#*))(E\+?0+)?(.*?)$/ ); | ||
|
||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diff between this and the src/number/pattern-re.js is:
--- compact-pattern-re.js 2017-06-22 11:34:01.000000000 -0300
+++ pattern-re.js 2017-03-13 11:49:00.000000000 -0300
@@ -26,7 +26,7 @@
*
* suffix = non_number_stuff
*
- * non_number_stuff = regexp(.*?)
+ * non_number_stuff = regexp(('[^']+'|''|[^*#@0,.E])*)
*
*
* Regexp groups:
@@ -45,6 +45,6 @@
* 11: suffix
* 12: -
*/
-return ( /^(('([^']|'')*'|[^*#@0,.E])*)(\*.)?((([#,]*[0,]*0+)(\.0*[0-9]*#*)?)|([#,]*@+#*))(E\+?0+)?(.*?)$/ );
+return ( /^(('([^']|'')*'|[^*#@0,.E])*)(\*.)?((([#,]*[0,]*0+)(\.0*[0-9]*#*)?)|([#,]*@+#*))(E\+?0+)?(('[^']+'|''|[^*#@0,.E])*)$/ );
});
Can we update src/number/pattern-re.js
instead and reuse that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The definition seems a valid number pattern re.
src/number/format.js
Outdated
var infinitySymbol, maximumFractionDigits, maximumSignificantDigits, minimumFractionDigits, | ||
minimumIntegerDigits, minimumSignificantDigits, nanSymbol, nuDigitsMap, padding, prefix, | ||
primaryGroupingSize, pattern, ret, round, roundIncrement, secondaryGroupingSize, suffix, | ||
symbolMap; | ||
symbolMap, compactMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: please sort the variables, therefore compactMap would be the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D'oh, that's what I get for renaming too quickly.
|
Is there anything I can do to get this over the finish line? I'm really looking forward to using it. |
Hi @alex-stripe, I think this is close to complete. I have to review the tests and I see one comment above not yet answered. @sieverk please any thoughts? @alex-stripe do you use globalize-webpack-plugin or the precompilation on your side? If so, a great help would be getting those projects in good shape for this change. I believe very few will change there. Therefore, it would be mostly about adding tests to assert this works fine. |
(Sorry I keep switching accounts) I don't use either, but I'll see if I can add some tests there in the next day or so just to keep things smooth. |
Awesome! That will certainly make things smooth and quicker to release. Thanks |
Yeah, I think as far as code changes, there's that issue and handling long numbers (that exceed defined patterns). Both should be pretty easy. Will see if I can set aside an hour or so today. |
Awesome! I see updates and I believe it's entirely on my side now :). Will review in a couple of days. Thanks! |
|
||
QUnit.test( "unsupported numbers should apply no compacting in compact mode", function( assert ) { | ||
assert.equal( format( 0.01, properties( "#0.##", en, { compact: "short" } ) ), "0.01" ) | ||
assert.equal( format( 1234000000000000, properties( "#0", en, { compact: "short" } ) ), "1234000000000000" ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of ignoring compact mode for the big case, can it use the biggest available, e.g., "000 trillion" => "1,234 trillion"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I think that's probably more in line with what the user would want.
if ( zeroes.length >= 3 ) { | ||
|
||
// use default plural form to perform initial decimal shift | ||
compactPattern = compactMap[ "1" + zeroes + "-count-other" ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point (the format execution itself), compactMap
data structure should be the most convenient possible for use here (prepared by format-properties stage), e.g.:
compactMap: {
3: {
one: "0 thousand",
other: "0 thousand"
},
4: {one, other},
...
15: {one, other}
}
That would eliminate the need of creating the zeros array and would simplify the key above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, shifting the work into compactMap makes a lot of sense.
// If no pattern, original number should remain uncompacted. | ||
if ( compactPattern === "0" ) { | ||
number = originalNumber; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this if
block be moved up like:
if ( compactPattern === "0" ) {
number = originalNumber;
// Could you do this and avoid entering the other if block below
// (the one after numbers are processed)
compactPattern = null;
} else if ( compactPattern ) {
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. As you did in your branch, don't even need to apply the division / restore the number if arranged that way.
prefix += compactProperties[ 1 ]; | ||
suffix = compactProperties[ 11 ] + suffix; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like commented below, it would be ideal to use the biggest available instead of ignoring compact mode for big numbers.
|
||
// Some languages specify no pattern for certain digit lengths, represented as "0". | ||
// Only apply compact pattern if one is specified. | ||
if ( compactPattern !== "0" ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above, could you simply set compactPattern = null
in this case and avoid this if completely?
QUnit.test( "percents should format in compact mode", function( assert ) { | ||
assert.equal( format( 127, properties( "#0%", en, { compact: "short" } ) ), "13K%" ); | ||
assert.equal( format( 127, properties( "#0%", en, { compact: "long" } ) ), "13 thousand%" ); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed changes...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needed changes...
I've added a couple of comments requesting changes, but I have simultaneously implemented them in here: https://github.com/rxaviers/globalize/tree/compact-mode It needs another review round... |
Looked over your changes -- they look pretty good to me, other than some minor questions. Did you want a review over there? |
Thanks @sieverk and sure your review is welcome. Let me create a PR. |
Here we go... #759 |
Seems like all the failures are commit message related. Can we just rebase it to something valid and merge? |
Continued in #759 |
@alex-stripe thanks for the push... It would be great if you could help by updating the examples and add a compact case if you have time. Thanks |
By the way, this is merged via #759... |
Oh awesome. I'll whip up some examples asap.
…On Tue, Aug 22, 2017 at 8:16 AM Rafael Xavier de Souza < ***@***.***> wrote:
By the way, this is merged via #759
<#759>...
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#715 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAF5KoH535txWdceCoI-3I6FqOL-zLEzks5satSngaJpZM4NJli1>
.
|
WIP addressing #406 that I wanted to solicit some feedback on. I'm trying to add support for short/long compact numbers (long/short patterns already included in decimalFormats in CLDR). My proposed API is the following:
The
compact
option plays nicely with the other options with one small asterisk: thelong
pattern can look a bit goofy with percents. ("14 million%") And currency/unit should be able to make use of this options in their number formatters as well.Because compact patterns can only be determined when the number is in hand, we have to parse/apply that pattern at format-time. And since we can't determine which plural form applies until we have the rounded, decimal-shifted number, we have to check the pattern twice (the first time, figure out the decimal-shift divisor based on the "other" plural format string; the second time, use the rounded number to get the plural form and select the correct pattern for that plural). If I've missed some better way of doing this, I'm open to suggestions.
Anyway, wanted to solicit feedback on this approach--thanks for looking!