Depends on the runtime parsing discussed in #3849.
In ICU4C/J, transliterators can be loaded not only by a single ID but also by chaining several transliterators (including filters) together. Example: `[a-z] ; [a] Remove ; Latin-Greek/BGN`. These "chains" are equivalent to the transform rule source obtained by applying `chain.split(";").map(|elt| format!(":: {elt} ;")).collect::<String>()`, e.g. `:: [a-z] ; :: [a] Remove ; :: Latin-Greek/BGN ;`. In other words, the same data struct can be reused, at an overhead cost of only a few empty `VarZeroVec`s (VZVs).
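To make the equivalence concrete, here is a minimal sketch of the chain-to-rule-source conversion. The function name `chain_to_rule_source` is hypothetical and not part of any ICU4X API. It also differs slightly from the one-liner above: it trims each element and joins with a single space so the output matches the example exactly. It assumes that `;` never occurs inside an element (for instance inside a filter set), which a real implementation would have to handle.

```rust
/// Hypothetical helper (not an ICU4X API): converts a transliterator chain
/// such as "[a-z] ; [a] Remove ; Latin-Greek/BGN" into the equivalent
/// transform rule source made of `::` rules.
///
/// Assumption: ';' only appears as the separator between chain elements,
/// never inside a filter set or an ID.
fn chain_to_rule_source(chain: &str) -> String {
    chain
        .split(';')
        // Drop the whitespace surrounding each element.
        .map(str::trim)
        // Skip empty elements, e.g. those produced by a trailing ';'.
        .filter(|elt| !elt.is_empty())
        // Wrap each element in a transform rule.
        .map(|elt| format!(":: {elt} ;"))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Calling `chain_to_rule_source("[a-z] ; [a] Remove ; Latin-Greek/BGN")` returns `:: [a-z] ; :: [a] Remove ; :: Latin-Greek/BGN ;`, which could then be handed to the runtime rule parser discussed in #3849.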
Chaining is primarily a convenience feature for runtime construction: it saves users from writing a dummy source file containing the mapping explained above. Because these chains use legacy IDs while ICU4X data uses BCP-47 IDs, the whole issue of mapping legacy IDs to BCP-47 IDs applies (#3891). I suggest supporting chains of BCP-47 IDs instead of chains of legacy IDs. Support for this is also on the roadmap for ICU: https://unicode-org.atlassian.net/browse/ICU-22474