CLDR-17336 fix parent locales for migration (#3534)
* CLDR-17336 Fix parentLocales for migration

* CLDR-17336 Fix JSON test

* CLDR-17336 Fix spec
macchiati authored Feb 27, 2024
1 parent 60df7f1 commit d09ef3e
Showing 6 changed files with 30 additions and 10 deletions.
5 changes: 4 additions & 1 deletion common/dtd/ldmlSupplemental.dtd
@@ -944,8 +944,11 @@ CLDR data files are interpreted according to the LDML specification (http://unic
<!ELEMENT parentLocale EMPTY >
<!ATTLIST parentLocale parent NMTOKEN #REQUIRED >
<!--@MATCH:validity/locale-->
<!ATTLIST parentLocale localeRules NMTOKENS #IMPLIED >
<!--@MATCH:set/literal/nonlikelyScript-->
<!--@VALUE-->
<!ATTLIST parentLocale locales NMTOKENS #REQUIRED >
<!--@MATCH:or/set/validity/locale||literal/nonlikelyScript-->
<!--@MATCH:set/validity/locale-->
<!--@VALUE-->

<!ELEMENT personNamesDefaults ( alias | ( nameOrderLocalesDefault* ) ) >
2 changes: 1 addition & 1 deletion common/supplemental/supplementalData.xml
@@ -5423,7 +5423,7 @@ XXX Code for transations where no currency is involved
</codeMappings>

<parentLocales>
<parentLocale parent="root" locales="nonlikelyScript"/>
<parentLocale parent="root" localeRules="nonlikelyScript" locales="az_Arab az_Cyrl bal_Latn blt_Latn bm_Nkoo bs_Cyrl byn_Latn cu_Glag dje_Arab dyo_Arab en_Dsrt en_Shaw ff_Adlm ff_Arab ha_Arab iu_Latn kk_Arab ks_Deva ku_Arab kxv_Deva kxv_Orya kxv_Telu ky_Arab ky_Latn ml_Arab mn_Mong mni_Mtei ms_Arab pa_Arab sat_Deva sd_Deva sd_Khoj sd_Sind shi_Latn so_Arab sr_Latn sw_Arab tg_Arab ug_Cyrl uz_Arab uz_Cyrl vai_Latn wo_Arab yo_Arab yue_Hans zh_Hant"/>
<parentLocale parent="en_001" locales="en_150 en_AG en_AI en_AU en_BB en_BM en_BS en_BW en_BZ en_CC en_CK en_CM en_CX en_CY en_DG en_DM en_ER en_FJ en_FK en_FM en_GB en_GD en_GG en_GH en_GI en_GM en_GY en_HK en_ID en_IE en_IL en_IM en_IN en_IO en_JE en_JM en_KE en_KI en_KN en_KY en_LC en_LR en_LS en_MG en_MO en_MS en_MT en_MU en_MV en_MW en_MY en_NA en_NF en_NG en_NR en_NU en_NZ en_PG en_PK en_PN en_PW en_RW en_SB en_SC en_SD en_SG en_SH en_SL en_SS en_SX en_SZ en_TC en_TK en_TO en_TT en_TV en_TZ en_UG en_VC en_VG en_VU en_WS en_ZA en_ZM en_ZW"/>
<parentLocale parent="en_150" locales="en_AT en_BE en_CH en_DE en_DK en_FI en_NL en_SE en_SI"/>
<parentLocale parent="en_IN" locales="hi_Latn"/>
22 changes: 15 additions & 7 deletions docs/ldml/tr35.md
@@ -1790,18 +1790,21 @@ then a mixture of child and parent textual data is a mishmash of different scrip
Thus there are two cases where the truncation inheritance needs to be overridden:

1. When the parent locale would have a different script, and text would be mixed.
2. In certain exceptional circumstances where the parent.
2. In certain exceptional circumstances where the 'truncation' parent needs to be adjusted.

The `parentLocale` element is used to override the normal inheritance when accessing CLDR data.

For case 1, there is a special value for the locales, `nonlikelyScript`,
which includes all locales of the form <lang>_<script>, where the <script> is not the likely script for <lang>.
For case 1, there is a special attribute and value, `localeRules="nonlikelyScript"`,
which specifies **all locales** of the form <lang>_<script>, wherever the <script> is **not** the likely script for <lang>.
For migration, the previous short list of locales (a subset of the nonlikelyScript locales) is retained,
but those locales are slated for removal in the future.
For example, `ru_Latn` is not included in the short list but is included (programmatically) in the rule.

```xml
<parentLocale parent="root" locales="nonlikelyScript"/>
<parentLocale parent="root" localeRules="nonlikelyScript" locales="az_Arab az_Cyrl bal_Latn … yue_Hans zh_Hant"/>
```

This is used for the main component.
The `localeRules` is used for the main component, for example.
It is not used for components where text is not mixed,
such as the collations component or the plurals component.
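
As a rough illustration of how the rule and the explicit overrides might combine during parent lookup, here is a minimal Java sketch. It is not the CLDR implementation: the class name `ParentLocaleSketch`, the `parent` method, and the two small maps are hypothetical stand-ins (real code would read the likely-subtags data and the full `parentLocale` list), and it assumes that a language with no known likely script simply falls back to truncation.

```java
import java.util.Map;

class ParentLocaleSketch {
    // Hypothetical excerpt of likely-script data: language -> likely script.
    static final Map<String, String> LIKELY_SCRIPT =
            Map.of("ru", "Cyrl", "sr", "Cyrl", "zh", "Hans", "en", "Latn");

    // Hypothetical excerpt of explicit overrides (case 2): child -> parent.
    static final Map<String, String> EXPLICIT_PARENT =
            Map.of("es_AR", "es_419", "en_GB", "en_001");

    static String parent(String locale) {
        // Explicit parentLocale entries take precedence.
        String explicit = EXPLICIT_PARENT.get(locale);
        if (explicit != null) {
            return explicit;
        }
        // Rule "nonlikelyScript": <lang>_<script> where <script> is not
        // the likely script for <lang> inherits directly from root.
        String[] parts = locale.split("_");
        if (parts.length == 2 && parts[1].length() == 4) {
            String likely = LIKELY_SCRIPT.get(parts[0]);
            if (likely != null && !likely.equals(parts[1])) {
                return "root";
            }
        }
        // Otherwise use ordinary truncation inheritance.
        int cut = locale.lastIndexOf('_');
        return cut < 0 ? "root" : locale.substring(0, cut);
    }

    public static void main(String[] args) {
        for (String loc : new String[] {"ru_Latn", "ru_Cyrl", "zh_Hant", "es_AR", "en_US"}) {
            System.out.println(loc + " -> " + parent(loc));
        }
    }
}
```

In this sketch `ru_Latn` resolves to `root` through the rule alone, without appearing in any explicit list, which mirrors the `ru_Latn` example given above.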

@@ -1811,7 +1814,11 @@ For case 2, the children and parent share the same primary language, but the reg
<parentLocale parent="es_419" locales="es_AR es_BO … es_UY es_VE"/>
```

There are certain components that require addenda to the common parent fallback rules. For a locale like `zh_Hant` in the example above, the `parentLocale` element would dictate the parent as `root` when referring to main locale data, but for collation data, the parent locale should still be `zh`, even though the `parentLocale` element is present for that locale. To address this, components can have their own fallback rules that inherit from the common rules and add additional parents that supplement or override the common rules:
There are certain components that require addenda to the common parent fallback rules.
For a locale like `zh_Hant` in the example above, the `parentLocale` element would dictate the parent as `root` when referring to main locale data,
but for collation data, the parent locale should still be `zh`,
even though the `parentLocale` element is present for that locale.
To address this, components can have their own fallback rules that inherit from the common rules and add additional parents that supplement or override the common rules:

```xml
<parentLocales component="segmentations">
@@ -1827,7 +1834,8 @@ the parentLocale information is contained in CLDR’s [supplemental data.](tr35-

When a `parentLocale` element is used to override normal inheritance, the following guidelines apply in most cases:

1. If X is the parentLocale of Y, then either X is the root locale, or X has the same base language code as Y. For example, the parent of `en` cannot be `fr`, and the parent of `en_YY` cannot be `fr` or `fr_XX`.
1. If X is the parentLocale of Y, then either X is the root locale, or X has the same base language code as Y.
For example, the parent of `en` cannot be `fr`, and the parent of `en_YY` cannot be `fr` or `fr_XX`.
2. If X is the parentLocale of Y, Y must not be a base language locale. For example, the parent of `en` cannot be `en_XX`.
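
The two guidelines above can be expressed as a simple check. The following Java sketch is illustrative only: `ParentLocaleGuidelines`, `baseLanguage`, and `followsGuidelines` are hypothetical names, and locale IDs are assumed to use `_` as the subtag separator.

```java
class ParentLocaleGuidelines {
    // The base language is everything before the first underscore.
    static String baseLanguage(String locale) {
        int cut = locale.indexOf('_');
        return cut < 0 ? locale : locale.substring(0, cut);
    }

    // Returns true if "parent" is an acceptable parentLocale for "child"
    // under the two general guidelines (documented exceptions are not modeled).
    static boolean followsGuidelines(String parent, String child) {
        // Guideline 2: the child must not be a base language locale.
        if (!child.contains("_")) {
            return false;
        }
        // Guideline 1: the parent is root or shares the child's base language.
        return parent.equals("root")
                || baseLanguage(parent).equals(baseLanguage(child));
    }

    public static void main(String[] args) {
        System.out.println(followsGuidelines("en_001", "en_GB"));
        System.out.println(followsGuidelines("fr", "en_YY"));
        System.out.println(followsGuidelines("en_XX", "en"));
    }
}
```

A check like this would flag an entry such as `hi_Latn` with parent `en_IN`, so a real validator would need an allowance for the specific exceptions.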

There may be specific exceptions to these for certain closely-related languages or language-script combinations, for example:
@@ -322,6 +322,7 @@ public static class SplittableAttributeSpec {
.add("systems")
.add("origin")
.add("component") // for parentLocales - may need to be more specific here
.add("localeRules") // for parentLocales
.add("values") // for unitIdComponents - may need to be more specific here
.freeze();

@@ -1906,10 +1906,17 @@ private void handleParentLocales(XPathParts parts) {
}
String parent = parts.getAttributeValue(-1, "parent");
String locales = parts.getAttributeValue(-1, "locales");
String localeRules = parts.getAttributeValue(-1, "localeRules");
Set<String> localeRuleSet =
localeRules == null
? Set.of()
: Set.copyOf(split_space.splitToList(localeRules));

for (ParentLocaleComponent component : components) {
Map<String, String> componentParentLocales = parentLocales.get(component);
if (locales.equals(NONLIKELYSCRIPT)) {
if (localeRuleSet.contains(NONLIKELYSCRIPT)) {
// This will need to be modified if we add any other rules,
// particularly if any rules are based on the particular parent
parentLocalesSkipNonLikely.add(component);
continue;
}
@@ -418,6 +418,7 @@
//supplementalData/windowsZones/mapTimezones/mapZone[@other="%A"][@territory="%A"]/_type ; Supplemental ; WZoneMapping ; $1 ; $2 ; HIDE

//supplementalData/parentLocales/parentLocale[@parent="%A"]/_locales ; Supplemental ; Locale ; Parent ; $1 ; HIDE
//supplementalData/parentLocales/parentLocale[@parent="%A"]/_localeRules ; Supplemental ; Locale ; Parent Rules ; $1 ; HIDE
//supplementalData/parentLocales[@component="%A"]/parentLocale[@parent="%A"]/_locales ; Supplemental ; Locale ; Parent ($1) ; $2 ; HIDE
//supplementalData/metadata/skipDefaultLocale/_services ; Supplemental ; Locale ; SkipDefault ; plain ; HIDE
//supplementalData/metadata/defaultContent/_locales ; Supplemental ; Locale ; DefaultContent ; plain ; HIDE