Categories to calculate the edit distance between NSString
objects.
#import <MDCDamerauLevenshtein/MDCDamerauLevenshtein.h>
[@"Central Park" mdc_levenshteinDistanceTo:@"Centarl Prak"]; // => 4
[@"Central Park" mdc_damerauLevenshteinDistanceTo:@"Centarl Prak"]; // => 2
MDCDamerauLevenshtein includes two algorithms for calculating the edit distance between NSString objects:
- Levenshtein distance counts the insertions, deletions, and substitutions needed to convert one string into the other.
- Damerau-Levenshtein distance extends Levenshtein distance with a fourth operation: the transposition of two adjacent characters. Damerau states that the four operations together account for 80% of all human spelling errors.
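The two definitions above can be sketched with one dynamic-programming table. This is a minimal illustration in plain C (which also compiles inside an Objective-C source file), not the library's implementation: the names `edit_distance`, `levenshtein`, and `damerau_levenshtein` are hypothetical, the code compares bytes of C strings rather than NSString characters, and the transposition rule shown is the "optimal string alignment" variant, which is an assumption about which flavor of Damerau-Levenshtein is meant.

```c
#include <stdlib.h>
#include <string.h>

static size_t min3(size_t a, size_t b, size_t c) {
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Hypothetical helper: fills a (la+1) x (lb+1) table of prefix distances.
   allow_transposition = 0 gives Levenshtein distance; 1 also permits
   swapping two adjacent characters (optimal string alignment variant).
   Returns (size_t)-1 if the table cannot be allocated. */
static size_t edit_distance(const char *a, const char *b, int allow_transposition) {
    size_t la = strlen(a), lb = strlen(b);
    size_t cols = lb + 1;
    size_t *d = malloc((la + 1) * cols * sizeof *d);
    if (d == NULL) return (size_t)-1;

    /* Converting a prefix to or from the empty string costs its length. */
    for (size_t i = 0; i <= la; i++) d[i * cols] = i;
    for (size_t j = 0; j <= lb; j++) d[j] = j;

    for (size_t i = 1; i <= la; i++) {
        for (size_t j = 1; j <= lb; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = min3(d[(i - 1) * cols + j] + 1,         /* deletion */
                               d[i * cols + (j - 1)] + 1,         /* insertion */
                               d[(i - 1) * cols + (j - 1)] + cost /* substitution */);
            if (allow_transposition && i > 1 && j > 1 &&
                a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                size_t swapped = d[(i - 2) * cols + (j - 2)] + 1; /* transposition */
                if (swapped < best) best = swapped;
            }
            d[i * cols + j] = best;
        }
    }

    size_t result = d[la * cols + lb];
    free(d);
    return result;
}

size_t levenshtein(const char *a, const char *b) {
    return edit_distance(a, b, 0);
}

size_t damerau_levenshtein(const char *a, const char *b) {
    return edit_distance(a, b, 1);
}
```

With these functions, "Central Park" and "Centarl Prak" are 4 apart under Levenshtein (each swapped pair costs two substitutions) and 2 apart once transpositions are allowed, matching the example above.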
Potential applications for this library:
- Don't just use -[NSString compare:options:] to filter search results; also display terms with small edit distances.
- ...and many more!
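As a rough sketch of the search-filtering idea, the plain C below keeps every candidate term whose Levenshtein distance to the query is at most a chosen threshold. The names `lev` and `filter_terms` and the threshold parameter are hypothetical and not part of this library; in an app you would presumably call the category methods on NSString instead of this byte-wise helper.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: Levenshtein distance using two rolling rows.
   Returns (size_t)-1 if memory cannot be allocated. */
static size_t lev(const char *a, const char *b) {
    size_t la = strlen(a), lb = strlen(b);
    size_t *prev = malloc((lb + 1) * sizeof *prev);
    size_t *curr = malloc((lb + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return (size_t)-1;
    }

    for (size_t j = 0; j <= lb; j++) prev[j] = j;

    for (size_t i = 1; i <= la; i++) {
        curr[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t sub = prev[j - 1] + (a[i - 1] != b[j - 1]);
            size_t del = prev[j] + 1;
            size_t ins = curr[j - 1] + 1;
            size_t m = sub < del ? sub : del;
            curr[j] = m < ins ? m : ins;
        }
        /* The row just computed becomes the previous row. */
        size_t *tmp = prev;
        prev = curr;
        curr = tmp;
    }

    size_t result = prev[lb];
    free(prev);
    free(curr);
    return result;
}

/* Hypothetical filter: copies into `out` (capacity >= n) the terms within
   max_distance edits of `query`, preserving order. Returns how many were kept. */
size_t filter_terms(const char *query, const char **terms, size_t n,
                    size_t max_distance, const char **out) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (lev(query, terms[i]) <= max_distance) {
            out[kept++] = terms[i];
        }
    }
    return kept;
}
```

A small threshold such as 1 or 2 tolerates minor typos; swapping in a transposition-aware distance would also admit terms like "Centarl Prak" at the same threshold.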
A benchmarking app is included in this repository. It runs two benchmarks:
- Normal: Finding the Levenshtein distance between "sitting" and "kitten"
- Large: Finding the Levenshtein distance between two paragraphs of text (409 and 728 characters, respectively)
| Library | Avg. Time (Normal) | Avg. Time (Large) |
|---|---|---|
| MDCDamerauLevenshtein | 14,218 nanoseconds | 0.0792383 seconds |
| NSString+LevenshteinDistance | 17,812 nanoseconds (25% slower) | 0.0949104 seconds (20% slower) |
koyachi/NSString-LevenshteinDistance only computes Levenshtein distance, not Damerau-Levenshtein, so only Levenshtein benchmarks are included here. The project does not include unit tests, but when benchmarked it produced correct distances.