Levenshtein: change how ratio is computed #130

Merged: 4 commits, May 23, 2021


kitbellew
Contributor

Don't subtract the length difference from the distance, as that yields low values (frequently zero) for pairs that are quite dissimilar. Instead, divide the distance by the length of the longer of the two strings.

Fixes #78.
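
To illustrate the description above, here is a minimal Python sketch, not the project's actual Scala code. The function names `adjusted_distance` and `ratio_by_longer` are hypothetical, and since the PR text does not state the denominator used by the earlier formula, only the numerator (distance minus length difference) is shown for the old behavior.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def adjusted_distance(a: str, b: str) -> int:
    """Hypothetical stand-in for the quantity the PR stops using:
    the distance with the length difference subtracted."""
    return levenshtein(a, b) - abs(len(a) - len(b))


def ratio_by_longer(a: str, b: str) -> float:
    """Hypothetical stand-in for the new ratio:
    the distance divided by the length of the longer string."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# "foo" and "foobarbaz" differ by six inserted characters.
print(adjusted_distance("foo", "foobarbaz"))  # 0, looks like a perfect match
print(ratio_by_longer("foo", "foobarbaz"))    # 6 / 9, about 0.67
```

Any pair in which one string is a prefix of the other gets an adjusted distance of zero, however long the extra suffix is. Dividing by the longer length keeps the score in the range 0 to 1 and keeps such pairs visibly apart.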

@kitbellew kitbellew requested a review from olafurpg May 22, 2021 15:55
@kitbellew kitbellew force-pushed the 78 branch 2 times, most recently from 490d8e6 to 934a68b Compare May 22, 2021 16:01
kitbellew added 3 commits May 22, 2021 09:12
Previously, we selected by the minimum Levenshtein distance, then used
additional logic to compute a ratio (which may not monotonically track
the distance values). Instead, sort by the final ratio.

As the tests can now attest, the current computation of ratio might be
too optimistic, as suggestions are not as tight.
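
As a rough illustration of why the minimum distance and the minimum ratio can pick different candidates, here is a small Python sketch. It is not the project's Scala code: `closest` and `length_normalized` are hypothetical names, and the ratio used here (distance divided by the longer length) is only one possible length-normalized score, assumed for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def length_normalized(a: str, b: str) -> float:
    """Assumed ratio for this example: distance over the longer length."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def closest(query, candidates, score):
    """Return the candidate with the lowest score, or None if there are none."""
    return min(candidates, key=lambda c: score(query, c), default=None)


candidates = ["xy", "abcde"]
by_distance = closest("ab", candidates, levenshtein)      # "xy": distance 2 vs 3
by_ratio = closest("ab", candidates, length_normalized)   # "abcde": 0.6 vs 1.0
print(by_distance, by_ratio)
```

Here "xy" shares no characters with the query yet wins on raw distance because it is short, while "abcde" contains the query as a prefix and wins on the ratio. Sorting by the same score that is later compared against the acceptance threshold avoids this mismatch.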
Don't subtract the length difference from distance, as that leads to low
numbers (frequently, zero) on pairs which are quite dissimilar. Instead,
divide by the length of the longer of the two.
Member

@olafurpg olafurpg left a comment


LGTM this looks like a nice improvement 👍

@kitbellew kitbellew merged commit cf69d53 into scalameta:main May 23, 2021
@kitbellew kitbellew deleted the 78 branch May 23, 2021 12:58

Successfully merging this pull request may close these issues.

Fine tune "did you mean" suggestions