-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace DataLocale
by LanguageIdentifier
on DataRequest
#4984
Conversation
DataLocale
by LanguageIdentifier
on DataRequest
, Locale
on APIDataLocale
by LanguageIdentifier
on DataRequest
, ~Locale
on API~
DataLocale
by LanguageIdentifier
on DataRequest
, ~Locale
on API~DataLocale
by LanguageIdentifier
on DataRequest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Praise: It looks approximately what I expect to see but it's a big PR so I haven't finished looking at every file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Down through provider/core
; next stop provider/datagen
let metadata_payload: DataPayload<crate::provider::CollationMetadataV1Marker> = provider | ||
.load(req) | ||
.or_else(|_| { | ||
provider.load(DataRequest { | ||
langid: req.langid, | ||
..Default::default() | ||
}) | ||
})? | ||
.take_payload()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion, here and elsewhere: perhaps refactor to make more clear why you are making two requests
let metadata_payload: DataPayload<crate::provider::CollationMetadataV1Marker> = provider | |
.load(req) | |
.or_else(|_| { | |
provider.load(DataRequest { | |
langid: req.langid, | |
..Default::default() | |
}) | |
})? | |
.take_payload()?; | |
let metadata_payload: DataPayload<crate::provider::CollationMetadataV1Marker> = provider | |
.load(req_specific) | |
.or_else(|_| { | |
provider.load(req_default) | |
})? | |
.take_payload()?; |
Default::default(), | ||
) | ||
.unwrap(); | ||
assert_writeable_eq!( | ||
dtf.format(&datetime), | ||
*expected, | ||
"\n\ | ||
locale: `{}`,\n\ | ||
locale: `{:?}`,\n\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought/Issue: We probably want to keep a Display
impl on Preferences?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to make that assumption now. It's not a problem to use Debug
in a test.
// Always retain -u-sd | ||
SUBDIVISION_KEY => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought/Issue: We don't currently use -u-sd
but it would be nice to support it in LocaleFallbacker for future compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new requirement, please don't block the PR on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't block the PR on it but it's not a new requirement; the old fallbacker supported -u-sd so it's a regression in behavior that it doesn't support it now and doesn't have a clear path to supporting it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a design discussion in March to drop unicode keywords from the fallbacker and this was not brought up since then. This will require a lot of rethinking.
TestCase { | ||
// TODO(#4413): -u-rg is not yet supported; when it is, this test should be updated | ||
input: "en-u-rg-gbxxxx", | ||
requires_data: false, | ||
extension_key: None, | ||
fallback_supplement: None, | ||
expected_language_chain: &["en"], | ||
expected_region_chain: &["en"], | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue: We should fix this in 2.0 because it might require an architecture change in LocaleFallabcker. #4413
("en-019", "", "Hello from 🌎"), // AMERICAS | ||
("en-142", "", "Hello from 🌏"), // ASIA | ||
("en-GB", "", "Hello from 🇬🇧"), // GREAT BRITAIN | ||
("en-GB", "england", "Hello from 🏴"), // ENGLAND |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue: england
should remain a subdivision, not a key attribute... please at least add a TODO(#4413)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is as designed
Reasons we may want to keep
|
We had agreed on this and I have implemented this, which was a non-trivial amount of work.
What does this gain us?
Ok, but there are other ways as well
Having logic duplicated between |
We hadn't considered the restriction of being unable to support
It's a little cheaper 🤷
How? Let's discuss
I understand this point of view but we made the conscious decision to duplicate the API surface in 1.0 because we established the understanding that the needs of locales in the context of data providers was different than the needs of locales in the context of UTS 35. (I was hopeful that the new DataRequest could maybe allow us to align, but I've never been fully convinced that merging DataLocale back into LanguageIdentifier was worthwhile) |
^ If we can demonstrate that there's a good way to support |
(currently feel like no strong opinion on whether we should land this. happy to take part in discussions, but no need to wait for my review, quickly going through this I don't see a reason why I would block this) |
Current thinking of @sffc @zbraniecki:
|
Subtle adjustment:
|
We discussed this at the CLDR design meeting. There wasn't a clear conclusion on what fields should be in data locale, but it was evident that this could change over time. An approach that seems like it would make sense for us would be to restrict to a single subtag on top of LSR which contains an implementation-defined enumeration of data identity. For example, we lump all variants and subdivisions into our own namespace. |
Discussion #[derive(Copy, ...)]
pub struct DataLocale {
language: icu_locale_core::subtags::Language,
script: Option<icu_locale_core::subtags::Script>,
region: Option<icu_locale_core::subtags::Region>,
subdivision: Option<icu_locale_core::extensions::unicode::SubdivisionSuffix>,
single_variant: Option<icu_locale_core::subtags::Variant>,
}
|
Unicode extension keys are now modeled as
DataKeyAttributes
.#3632