Pkg.add should display nearest names #616
Comments
This is definitely a good feature to have. Isn't edit distance more commonly used for this?
Several kinds of methods for measuring string similarity exist.
https://en.wikipedia.org/wiki/String_metric gives a first idea of string metrics, but this paper probably gives a better overview. I personally used the Dice coefficient in https://github.com/scls19fr/arduino_libraries_search and was quite happy with that choice. I noticed that several Julia packages exist that could help calculate string similarity.
Maybe these contributors can give us some advice on choosing a "good distance" function for such a use case.
Can you put into words at all what about the Dice coefficient has made it work well for that? I'm not doubting, just wondering why it works better than any of the others. This is the first time I've heard of it for this, whereas Damerau–Levenshtein seems to be the go-to standard for correcting spelling errors.
Sorry, but I'm not skilled enough to be the Dice advocate. I just chose this distance for three reasons.
But anyway... the choice of the distance method could be user-defined.
This is maybe a bit out of the scope of this issue... but maybe a
Julia code for the Damerau–Levenshtein distance can be found in the StringDistances package.
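To make the edit-distance option concrete, here is a minimal sketch in Python (not the StringDistances code, and not what Pkg does). It implements the optimal string alignment variant, sometimes called restricted Damerau–Levenshtein, which counts insertions, deletions, substitutions and swaps of adjacent characters. The function name osa_distance is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Illustrative sketch only: counts insertions, deletions, substitutions
    and transpositions of two adjacent characters.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i chars of a and first j chars of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # adjacent transposition, e.g. "fl" <-> "lf"
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

With this sketch, a typo such as "Gadlfy" for "Gadfly" is at distance 1, because the swapped "l" and "f" count as a single transposition rather than two substitutions.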
I second the request for
For the record, this was implemented in #2985.
Hello,
When doing Pkg.add("NameOfPackage"), if NameOfPackage is not found, an error message is shown.
If a package name is not found, maybe Pkg could try to "help" users by listing the names of some packages whose names are close to what the user is looking for.
Computing string similarity, using for example the Dice coefficient (see various implementations), between the user-provided package name and the name of each registered package could help.
Maybe the comparison should be done after upper-casing (or lower-casing) both names.
A threshold could probably be set.
The results would need to be sorted by descending coefficient, keeping only (for example) the five nearest names, but displaying them with their correct case.
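The steps above can be sketched as follows, in Python for illustration only (this is not how Pkg implements it). The sketch assumes a Dice coefficient over character bigrams. The names dice_coefficient and nearest_names, the 0.3 threshold and the sample package list are all hypothetical choices. The similarity function is passed in as a parameter, so a different measure could be swapped in.

```python
from collections import Counter


def bigrams(s: str) -> Counter:
    """Multiset of adjacent character pairs in s."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice_coefficient(a: str, b: str) -> float:
    """Dice coefficient on character bigrams, compared case-insensitively."""
    a, b = a.lower(), b.lower()
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both strings are shorter than two characters: no bigrams to compare.
        return 1.0 if a == b else 0.0
    overlap = sum((x & y).values())  # multiset intersection
    return 2.0 * overlap / total


def nearest_names(query, registered, threshold=0.3, limit=5,
                  similarity=dice_coefficient):
    """Return up to `limit` registered names whose similarity to `query`
    is at least `threshold`, best match first, in their original case."""
    scored = [(similarity(query, name), name) for name in registered]
    kept = [(score, name) for score, name in scored if score >= threshold]
    # Descending score; ties broken alphabetically for a stable order.
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in kept[:limit]]


if __name__ == "__main__":
    # Illustrative list only, not a real registry lookup.
    names = ["DataFrames", "DataStructures", "Gadfly",
             "Distributions", "StringDistances"]
    print(nearest_names("dataframe", names))
```

The threshold keeps unrelated names out of the suggestion list, and lower-casing happens only inside the comparison so the names are still shown with their original capitalization.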
Kind regards