Issue #6537: Improved name search for diacritics #6538

Merged: 2 commits merged into MegaMek:master on Feb 15, 2025

Conversation

SJuliez
Member

@SJuliez SJuliez commented Feb 12, 2025

Fixes #6537
In the unit selector, this allows base Latin characters to match their diacritic variants. Both "Gött" and "Gott" will find "Götterdämmerung", "Gun" will find "Gún", and "Arana" will find "Araña".
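
The matching behavior described above can be sketched with the JDK's `java.text.Normalizer`. This is only an illustration under assumptions: the PR's actual implementation is not shown in this conversation, and the class name `DiacriticSearchSketch` and the helpers `fold` and `search` are hypothetical. The idea is to fold both the query and each candidate name to a lower-case form without accents, then compare the folded forms.

```java
import java.text.Normalizer;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class DiacriticSearchSketch {

    // Hypothetical helper: decompose to NFD, drop combining marks, lower-case.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    // Hypothetical filter: keep names whose folded form contains the folded query.
    static List<String> search(List<String> names, String query) {
        String foldedQuery = fold(query);
        return names.stream()
                .filter(name -> fold(name).contains(foldedQuery))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Accented letters are written as Unicode escapes so the source stays ASCII-only.
        List<String> names = List.of(
                "G\u00f6tterd\u00e4mmerung", // o and a with diaeresis
                "G\u00fan",                  // u with acute
                "Ara\u00f1a",                // n with tilde
                "Atlas");
        System.out.println(search(names, "Gott"));
        System.out.println(search(names, "G\u00f6tt"));
        System.out.println(search(names, "Gun"));
        System.out.println(search(names, "Arana"));
    }
}
```

Because the query is folded as well, typing the accented form still matches, which is the "both Gött and Gott" behavior from the description.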

codecov bot commented Feb 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 29.03%. Comparing base (1d6c24b) to head (3fd3246).
Report is 36 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #6538      +/-   ##
============================================
- Coverage     29.05%   29.03%   -0.02%     
+ Complexity    15160    15158       -2     
============================================
  Files          2835     2836       +1     
  Lines        279129   279296     +167     
  Branches      49214    49237      +23     
============================================
- Hits          81104    81103       -1     
- Misses       192652   192817     +165     
- Partials       5373     5376       +3     


@pavelbraginskiy
Member

pavelbraginskiy commented Feb 14, 2025

Accent normalization doesn't cover some letters like ø. I don't know if this is a problem now, but it's possible it could be in the future. A way to solve this would be to add a dependency on ICU4J, which has a bunch of utilities for working with Unicode, including a transliterator that does exactly what we want here.

Example:

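import com.ibm.icu.text.Transliterator;

// "Latin-ASCII" maps Latin letters, including ones like ø, to plain ASCII
var transliterator = Transliterator.getInstance("Latin-ASCII");
System.out.println(transliterator.transliterate("Gøtterdämmerung"));
// prints: Gotterdammerung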

Note: the compound transliterator "Any-Latin; Latin-ASCII" will also handle non-Latin scripts, but that might be overkill.

In general, Unicode is hard, and I think it makes sense to leave figuring it out to the experts.
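
To make the ø limitation concrete, here is a small JDK-only sketch. It assumes accent stripping is done with `java.text.Normalizer` plus removal of combining marks, which this conversation does not confirm. The class `StrokeLetterSketch`, its methods, and the `EXTRA` table are hypothetical, and the table is deliberately incomplete. Letters such as ø, ł, ß and æ have no canonical decomposition, so NFD leaves them untouched unless they are mapped by hand.

```java
import java.text.Normalizer;
import java.util.Map;

public class StrokeLetterSketch {

    // Illustrative, incomplete table of letters that NFD does not decompose.
    static final Map<Character, String> EXTRA = Map.of(
            '\u00f8', "o",   // o with stroke
            '\u00d8', "O",   // O with stroke
            '\u0142', "l",   // l with stroke
            '\u0141', "L",   // L with stroke
            '\u00df', "ss",  // sharp s
            '\u00e6', "ae"); // ae ligature

    // Decompose to NFD and drop combining marks; stroke letters survive unchanged.
    static String stripMarks(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // Same as stripMarks, then apply the hand-written table for the leftovers.
    static String stripMarksWithFallback(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : stripMarks(text).toCharArray()) {
            String replacement = EXTRA.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String name = "G\u00f8tterd\u00e4mmerung"; // o with stroke, a with diaeresis
        System.out.println(stripMarks(name));             // the stroke letter is still present
        System.out.println(stripMarksWithFallback(name)); // prints: Gotterdammerung
    }
}
```

A hand-maintained table like this only covers the letters someone remembered to add, which is the argument for delegating to a library transliterator instead.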

@RaozSpaz RaozSpaz left a comment

It was suggested on Discord that I add a request here for this related issue: #6550

@HammerGS HammerGS merged commit c002435 into MegaMek:master Feb 15, 2025
6 checks passed
Development

Successfully merging this pull request may close these issues.

[RFE] Improved name search for Diacritics
4 participants