Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[iOS][non-icu] Implement missing Locale and Casing functions #94038

Merged
merged 10 commits into from
Nov 1, 2023

Conversation

mkhamoyan
Copy link
Contributor

@mkhamoyan mkhamoyan commented Oct 26, 2023

Implement below functions

  • GetDefaultLocaleNameNative
  • IsPredefinedLocaleNative

Run all tests under System.Globalization.Tests and adjust them for hybrid mode

Contributes to #80689
cc @SamMonoRT

@ghost
Copy link

ghost commented Oct 26, 2023

Tagging subscribers to this area: @dotnet/area-system-globalization
See info in area-owners.md if you want to be subscribed.

Issue Details

Implement below functions

  • InitOrdinalCasingPageNative
  • GetDefaultLocaleNameNative
  • IsPredefinedLocaleNative

Run all tests under System.Globalization.Tests and adjust them for hybrid mode

Contributes to #80689
cc @SamMonoRT

Author: mkhamoyan
Assignees: mkhamoyan
Labels:

area-System.Globalization

Milestone: -

@ghost
Copy link

ghost commented Oct 26, 2023

Tagging subscribers to 'os-ios': @steveisok, @akoeplinger, @kotlarmilos
See info in area-owners.md if you want to be subscribed.

Issue Details

Implement below functions

  • InitOrdinalCasingPageNative
  • GetDefaultLocaleNameNative
  • IsPredefinedLocaleNative

Run all tests under System.Globalization.Tests and adjust them for hybrid mode

Contributes to #80689
cc @SamMonoRT

Author: mkhamoyan
Assignees: mkhamoyan
Labels:

os-ios

Milestone: -

@mkhamoyan
Copy link
Contributor Author

/azp run runtime-extra-platforms

@mkhamoyan mkhamoyan marked this pull request as ready for review October 26, 2023 13:13
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@mdh1418 mdh1418 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all of the tests that we are disabling for HybridGlobalizationOnOSX expected to not work?

@@ -149,4 +149,25 @@ int32_t GlobalizationNative_ChangeCaseInvariantNative(const uint16_t* lpSrc, int
}
}

void GlobalizationNative_InitOrdinalCasingPageNative(int32_t pageNumber, UChar* pTarget)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm forgetting, does pal_casing.c get trimmed when we set HybridGlobalization to true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it does not get trimmed in hybrid mode. But once we will not load ICU at all pal_casing.c will also not be loaded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But once we will not load ICU at all pal_casing.c will also not be loaded.

What would this look like? Would it get trimmed out or would it be a file exclusion?
Since this and GlobalizationNative_InitOrdinalCasingPage have the exact same logic`, it would be nice if we found a way to not duplicate this code.

Copy link
Contributor Author

@mkhamoyan mkhamoyan Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From

and we will filter out for Apple platforms pal_casing.c

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our goal is to make hybrid globalization default mode for Apple platforms and in that case we will not need most of the ICU related functions (we will not need most of the pal_"name".c). I guess at the end to not have duplicates I can move all these kind of functions in some common place like pal_common.c

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were there other instances where the logic was exactly the same as the ICU function? This is the first that I noticed was exactly the same, but maybe if we anticipate more of this moving into a pal_common.c might be a good way to not duplicate code. Does this logic need the @autoreleasepool{} if its not allocating any memory?

Copy link
Contributor Author

@mkhamoyan mkhamoyan Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this was first function with exactly same logic. Moved it to pal_common.c .

@@ -45,5 +45,6 @@ public class CompareInfoTestsBase
s_invariantCompare.Compare("\u3060", "\uFF80\uFF9E", CompareOptions.IgnoreKanaType | CompareOptions.IgnoreWidth | CompareOptions.IgnoreCase) == 0;

protected static bool IsNotWindowsKanaRegressedVersionAndNotHybridGlobalizationOnWasm() => !PlatformDetection.IsHybridGlobalizationOnBrowser && IsNotWindowsKanaRegressedVersion();
protected static bool IsNotWindowsKanaRegressedVersionAndNotHybridGlobalization() => IsNotWindowsKanaRegressedVersionAndNotHybridGlobalizationOnWasm() && !PlatformDetection.IsHybridGlobalizationOnOSX;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
protected static bool IsNotWindowsKanaRegressedVersionAndNotHybridGlobalization() => IsNotWindowsKanaRegressedVersionAndNotHybridGlobalizationOnWasm() && !PlatformDetection.IsHybridGlobalizationOnOSX;
protected static bool IsNotWindowsKanaRegressedVersionAndNotHybridGlobalization() => IsNotWindowsKanaRegressedVersion() && PlatformDetection.IsNotHybridGlobalization();

Similar to the comment in PlatformDetection.cs

Since we have IsNotHybridGlobalizationOnBrowser, maybe we can update the !PlatformDetection.IsHybridGlobalizationOnBrowser to be PlatformDetection.IsNotHybridGlobalizationOnBrowser.

Also it would be nice to be consistent with the Browser vs Wasm. Would there be HybridGlobalization on Wasi?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future maybe there will be HybridGlobalization on WASI, but now all the implementation in WASM is done using browser API and there is no browser API in WASI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I would expect IsNotWindowsKanaRegressedVersionAndNotHybridGlobalizationOnWasm in the future to be !PlatformDetection.IsHybridGlobalizationOnWasm && IsNotWindowsKanaRegressedVersion() (what I meant by consistent)

Where maybe PlatformDetection.IsHybridGlobalizationOnWasm = PlatformDetection.IsHybridGlobalizationOnBrowser || PlatformDetection.IsHybridGlobalizationOnWasi.

So to that pattern IsNotWindowsKanaRegressedVersionAndNotHybridGlobalizationOnWasm -> IsNotWindowsKanaRegressedVersionAndNotHybridGlobalizationOnBrowser here would be more consistent but thats just a nit and can be addressed when HybridGlobalization is extended to Wasi.

@autoreleasepool
{
NSString *localeIdentifier = [NSString stringWithFormat:@"%s", localeName];
NSString *desiredLocaleIdentifier = [localeIdentifier stringByReplacingOccurrencesOfString:@"-" withString:@"_"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we replace all - with _? NSLocale availableLocaleIdentifiers returns a Locale ID which may include - for script designators right?

Copy link
Contributor Author

@mkhamoyan mkhamoyan Oct 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While debugging I noticed that availableLocaleIdentifiers doesn't have any Local ID containing -.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that mean that [NSLocale availableLocaleIdentifiers] doesnt contain Local IDs with _ on your system? Could it be different based on the OS version or on other devices?

Which desiredLocaleIdentifier contained a - that we needed to cast to _ to match the availableLocales?

Given that the docs specify there is a possibility of including -, because we only cast desiredLocaleIdentifier to only have _, it seems like there is a chance to miss a match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are examples from availableLocaleIdentifiers in my system
avialableLocales
I can see 949 Local IDs and none of them has -.
You are right this can be my system specific result. Will modify to not miss a match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all calls to IcuIsEnsurePredefinedLocaleName (seems like just CultureInfo.GetCultureInfo(string, bool) and CultureData.GetCultureData(string?, bool), it doesn't seem like there is any restriction or assertion on the input name.

If we see instances of the name passed to IcuIsEnsurePredefinedLocaleName being formatted as [language designator]-[region designator] (e.g. zh-HK), I'm wondering if any names containing a script designator would be formatted as [language designator]-[script desginator]-[region designator] (e.g. zh-Hans-HK). Looking at

yield return new object[] { 0x0c04, new [] { "zh-hk" }, "zh-HK", "zho", "ZHH", "zh-Hant-HK", "zh-HK" };
, it seems that this is likely the case. Speaking of which we should double check if for zh-HK if the script should be Hans (the apple doc) or Hant (code snippet).

If the formatting will be passed as such, then neither the original input name nor the version where we replace - with _ will match the documented format for availableLocaleIdentifiers [language designator]-[script designator]_[region designator].

Perhaps we can check if the input string has three components, and if so, ensure its in the format of *-*_*. That being said, we should get a better idea of what the range of inputs we expect to this function. Maybe it would be sufficient to see if availableLocales contains either localeIdentifier or localeIdentifier.ReplaceLastOccurenceOf('-', '_').

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like that apple doc and what I get in availableLocaleIdentifiers are not same. E.g. zh-Hant_HK in availableLocaleIdentifiers is zh_Hant_HK, so if I change the implementation to have localeIdentifier.ReplaceLastOccurenceOf('-', '_') for the test case zh-Hans-HK it will return false but in reality it is in the availableLocaleIdentifiers but with the format [language designator]_[script designator]_[region designator] not like mentioned in the doc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats unfortunate... is there a place with more up-to-date docs? Now I'm not sure if its consistent across all apple OS/version/etc. But if we do have a test that tests IcuIsEnsurePredefinedLocaleName I guess that can be ran on all of the apple platforms.

For now I think it might be fine to just go with your original logic that replaced all - with _. Otherwise we would need to just extract the components (sections separated by - and _)

Copy link
Contributor Author

@mkhamoyan mkhamoyan Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the code I found out that Apple calls to ICU for that information, see CFLocaleCopyAvailableLocaleIdentifiers here https://opensource.apple.com/source/CF/CF-1153.18/CFLocale.c.auto.html. uloc_getAvailable calls _canonicalize and in that function looks like all parts in LocalID are appended by '_' https://github.com/dotnet/icu/blob/dotnet/main/icu/icu4c/source/common/uloc.cpp#L1537

@mkhamoyan
Copy link
Contributor Author

Are all of the tests that we are disabling for HybridGlobalizationOnOSX expected to not work?

Yes. These are cases which either were not possible (yet) to implement in hybrid mode, or behavioural changes after implementation

@mkhamoyan
Copy link
Contributor Author

/azp run runtime-extra-platforms

@mkhamoyan mkhamoyan requested a review from mdh1418 October 30, 2023 12:41
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mkhamoyan
Copy link
Contributor Author

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mdh1418
Copy link
Member

mdh1418 commented Oct 30, 2023

These are cases which either were not possible (yet) to implement in hybrid mode, or behavioural changes after implementation

Can we document the behavior changes/impossible to implement, and if it falls under the "not yet possible", can we have an issue tracking that?

@mkhamoyan
Copy link
Contributor Author

These are cases which either were not possible (yet) to implement in hybrid mode, or behavioural changes after implementation

Can we document the behavior changes/impossible to implement, and if it falls under the "not yet possible", can we have an issue tracking that?

Behaviour changes and things that are not supported are documented here https://github.com/dotnet/runtime/blob/main/docs/design/features/globalization-hybrid-mode.md .

Opened a ticket for the functionalities to investigate again if there is a possibility to implement #94216

Copy link
Member

@mdh1418 mdh1418 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After ensuring GlobalizationNative_IsPredefinedLocaleNative behavior is the same on all apple platforms for all localeName, looks good to me! Thanks!

@ghost
Copy link

ghost commented Oct 31, 2023

Tagging subscribers to this area: @dotnet/area-system-globalization
See info in area-owners.md if you want to be subscribed.

Issue Details

Implement below functions

  • InitOrdinalCasingPageNative
  • GetDefaultLocaleNameNative
  • IsPredefinedLocaleNative

Run all tests under System.Globalization.Tests and adjust them for hybrid mode

Contributes to #80689
cc @SamMonoRT

Author: mkhamoyan
Assignees: mkhamoyan
Labels:

area-System.Globalization, os-ios

Milestone: -

@mkhamoyan
Copy link
Contributor Author

/azp run runtime-extra-platforms

@mkhamoyan
Copy link
Contributor Author

/azp run runtime-extra-platforms

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mkhamoyan
Copy link
Contributor Author

Failures are not related.

@mkhamoyan mkhamoyan merged commit 68dde19 into dotnet:main Nov 1, 2023
185 of 188 checks passed
@mkhamoyan mkhamoyan deleted the hyrbid_locale_functions branch November 1, 2023 10:17
@ghost ghost locked as resolved and limited conversation to collaborators Dec 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants