Add jarowinkler_similarity implementation, documentation and tests #24656

Open
wants to merge 16 commits into
base: master

@Leziak Leziak commented Feb 28, 2025

Description

Add jarowinkler_similarity function

Motivation and Context

Explained in #24651

Impact

Added jarowinkler_similarity function

I found an implementation at https://www.geeksforgeeks.org/jaro-and-jaro-winkler-similarity/. It looks sound, so I adapted it for this function.
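For reviewers unfamiliar with the metric, here is a minimal Python sketch of the textbook Jaro-Winkler definition. It is not the code in this PR, and it may differ in details from the linked implementation (for example, some variants apply the prefix bonus only above a Jaro threshold, which this sketch does not do). The names `jaro_similarity` and `jaro_winkler_similarity` and the parameters `prefix_scale` and `max_prefix` are hypothetical, chosen only for illustration.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; illustrative sketch, not the PR's code."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Two characters match only if they are equal and at most `window` apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i in range(len1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s1[i] == s2[j]:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Transpositions: half the number of matched characters that are out of order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler_similarity(s1: str, s2: str,
                            prefix_scale: float = 0.1,
                            max_prefix: int = 4) -> float:
    """Boost the Jaro score for strings sharing a common prefix (assumed defaults)."""
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1 - jaro)


print(round(jaro_winkler_similarity("MARTHA", "MARHTA"), 3))  # 0.961
print(round(jaro_winkler_similarity("CRATE", "TRACE"), 3))    # 0.733
```

Python strings iterate by code point, so non-ASCII input is compared character by character here; whether the SQL function should behave the same way is a design question for this PR, not something the sketch settles.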

Test Plan

Tested manually on the TPCH catalog via presto-cli, then wrote tests that include non-ASCII characters (following the approach of the Levenshtein distance function tests).

I added test cases from https://tilores.io/jaro-winkler-distance-algorithm-online-tool and https://www.geeksforgeeks.org/jaro-and-jaro-winkler-similarity/.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== NO RELEASE NOTE ==

@Leziak Leziak requested review from steveburnett, elharo and a team as code owners February 28, 2025 23:51
@Leziak Leziak requested a review from presto-oss February 28, 2025 23:51
Contributor

@steveburnett steveburnett left a comment

Just a nit of punctuation, looks good otherwise!

Co-authored-by: Steve Burnett <burnett@pobox.com>

2 participants