Issue #6537: Improved name search for diacritics #6538

SJuliez · 2025-02-12T18:11:23Z

Fixes #6537
In the unit selector, this allows base Latin characters to match names containing diacritics. Both "Gött" and "Gott" will find the "Götterdämmerung", "Gun" will find the "Gún", and "Arana" will find the "Aran~a" (can't even find how to type that).
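
For readers who want to see the general idea, here is a minimal sketch of diacritic-insensitive matching using only the JDK. It is not the code from this PR: the class name `DiacriticSearch` and the helpers `foldDiacritics` and `matches` are hypothetical. It assumes the approach is to decompose both the unit name and the query to NFD, drop the combining marks, and then do a case-insensitive substring check.

```java
import java.text.Normalizer;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration, not the class added by this PR.
class DiacriticSearch {

    // Decompose to NFD so "ö" becomes "o" + U+0308, then remove all combining marks.
    static String foldDiacritics(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Fold both sides the same way, so "Gött" and "Gott" behave identically.
    static boolean matches(String unitName, String query) {
        String name = foldDiacritics(unitName).toLowerCase(Locale.ROOT);
        String needle = foldDiacritics(query).toLowerCase(Locale.ROOT);
        return name.contains(needle);
    }

    public static void main(String[] args) {
        for (String query : List.of("Gött", "Gott")) {
            System.out.println(query + " -> " + matches("Götterdämmerung", query)); // true for both
        }
        System.out.println("Gun -> " + matches("Gún", "Gun"));       // true
        System.out.println("Arana -> " + matches("Araña", "Arana")); // true
    }
}
```

Folding the query as well as the name keeps the search symmetric: a user who does type the accented form still gets the same results.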

codecov · 2025-02-12T18:16:22Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 29.03%. Comparing base (1d6c24b) to head (3fd3246).
Report is 36 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff              @@
##             master    #6538      +/-   ##
============================================
- Coverage     29.05%   29.03%   -0.02%     
+ Complexity    15160    15158       -2     
============================================
  Files          2835     2836       +1     
  Lines        279129   279296     +167     
  Branches      49214    49237      +23     
============================================
- Hits          81104    81103       -1     
- Misses       192652   192817     +165     
- Partials       5373     5376       +3

pavelbraginskiy · 2025-02-14T23:44:57Z

Accent normalization doesn't cover some letters like ø. I don't know if this is a problem now, but it's possible it could be in the future. A way to solve this would be to add a dependency on ICU4J, which has a bunch of utilities for working with Unicode, including a transliterator that does exactly what we want here.

Example:

import com.ibm.icu.text.Transliterator;

var transliterator = Transliterator.getInstance("Latin-ASCII");
System.out.println(transliterator.transliterate("Gøtterdämmerung"));
// Gotterdammerung

Note: the transliterator "Any-Latin; Latin-ASCII" will also handle non-Latin scripts, but that might be overkill.

In general, Unicode is hard, and I think it makes sense to leave figuring it out to the experts.
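
To illustrate why ø is a gap, here is a small JDK-only sketch. It assumes the normalization works by NFD decomposition plus stripping combining marks, which may not be exactly what this PR does. The letter ø has no canonical decomposition, so that approach leaves it untouched. The class `StrokeLetterFolding` and its fallback table are hypothetical and deliberately incomplete; they stand in for what a library such as ICU4J would handle far more thoroughly.

```java
import java.text.Normalizer;
import java.util.Map;

// Hypothetical illustration of the limitation, not code from this PR.
class StrokeLetterFolding {

    // Hand-maintained, incomplete table of letters that NFD does not decompose.
    private static final Map<Character, String> NO_DECOMPOSITION = Map.of(
            'ø', "o", 'Ø', "O",
            'ł', "l", 'Ł', "L",
            'æ', "ae", 'Æ', "AE",
            'ß', "ss");

    // NFD plus mark stripping: handles "ä", but not "ø".
    static String stripMarks(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // Same as stripMarks, followed by a lookup for the letters listed above.
    static String foldWithFallback(String text) {
        StringBuilder folded = new StringBuilder();
        for (char c : stripMarks(text).toCharArray()) {
            String replacement = NO_DECOMPOSITION.get(c);
            folded.append(replacement != null ? replacement : String.valueOf(c));
        }
        return folded.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripMarks("Gøtterdämmerung"));       // Gøtterdammerung
        System.out.println(foldWithFallback("Gøtterdämmerung")); // Gotterdammerung
    }
}
```

A table like this avoids a new dependency but has to be extended by hand for every letter that turns up, which is the maintenance burden the ICU4J suggestion is meant to remove.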

RaozSpaz

It was suggested on Discord that I add a request to this PR to also cover this issue: #6550

SJuliez added 2 commits February 12, 2025 19:03

make finding unit names with diacritics easier

74bf201

make finding unit names with diacritics easier - improved

3fd3246

HammerGS approved these changes Feb 12, 2025

RaozSpaz suggested changes Feb 15, 2025

HammerGS merged commit c002435 into MegaMek:master Feb 15, 2025
6 checks passed
