-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a check for swapped words when we can't find an identifier #67849
Conversation
This comment has been minimized.
This comment has been minimized.
Please add some regression tests. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a src/test/ui/suggestions/
test.
Really happy to see this!
@@ -52,14 +52,16 @@ where | |||
T: Iterator<Item = &'a Symbol>, | |||
{ | |||
let max_dist = dist.map_or_else(|| cmp::max(lookup.len(), 3) / 3, |d| d); | |||
let name_vec: Vec<&Symbol> = iter_names.collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to avoid this collect
by cloning the iterator instead and passing that to find_match_by_sorted_words
, if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I'm not sure how to do that. If we call cloned()
, iter_names
is moved and we can't use that iterator again, and we still can't use the cloned one twice. Is there another way to clone it so that we can still use the original iter_names
afterwards?
Similar to #66849, wouldn't it make sense to switch from the levenshtein distance to damerau-levenshtein or something similar that considers whole-word transpositions a distance of 1 instead of word-length. Adding more and more alternative algorithms means that the benefit of each is mutually exclusive. Folding the operations into a single algorithm allows the detection of two or more different kinds of misspellings. |
📌 Commit e01e8b9 has been approved by |
|
||
fn find_match_by_sorted_words<'a>(iter_names: Vec<&'a Symbol>, lookup: &str) -> Option<Symbol> { | ||
iter_names.iter().fold(None, |result, candidate| { | ||
if sort_by_words(&candidate.as_str()) == sort_by_words(lookup) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you're only doing comparison on the sorted elements...
fn sort_by_words(name: &str) -> String { | ||
let mut split_words: Vec<&str> = name.split('_').collect(); | ||
split_words.sort(); | ||
split_words.join("_") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...I wonder if it wouldn't be more efficient to return a Vec<&str>
here.
That being said, this is only done in a case where we're already emitting an error, so it should be fine.
Add a check for swapped words when we can't find an identifier Fixes rust-lang#66968 Couple things here: 1. The matches take the precedence of case insensitive match, then levenshtein match, then swapped words match. Doing this allows us to not even check for swapped words unless the other checks return `None`. 2. I've assumed that the swapped words check is not held to the limits of the max levenshtein distance threshold (ie. we want to try and find a match even if the levenshtein distance is very high). This means that we cannot perform this check in the `fold` that occurs after the `filter_map` call, because the candidate will be filtered out. So, I've split this into two separate `fold` calls, and had to collect the original iterator into a vec so it can be copied (I don't think we want to change the function signature to take a vec or require the `Copy` trait). An alternative implemenation may be to remove the `filter_map`, `fold` over the entire iterator, and do a check against `max_dist` inside the relevant cases there. r? @estebank
Rollup of 10 pull requests Successful merges: - #67774 (Try statx for all linux-gnu target.) - #67781 (Move `is_min_const_fn` query to librustc_mir.) - #67798 (Remove wrong advice about spin locks from `spin_loop_hint` docs) - #67849 (Add a check for swapped words when we can't find an identifier) - #67875 (Distinguish between private items and hidden items in rustdoc) - #67887 (`Option::{expect,unwrap}` and `Result::{expect, expect_err, unwrap, unwrap_err}` have `#[track_caller]`) - #67955 (rustdoc: Remove more `#[doc(cfg(..))]` duplicates) - #67977 (Updates for VxWorks) - #67985 (Remove insignificant notes from CStr documentation) - #68003 (ci: fix wrong shared.sh import for publish_toolstate) Failed merges: - #67820 (Parse the syntax described in RFC 2632) - #67979 (Move `intravisit` => `rustc_hir` + misc cleanup) r? @ghost
Rollup of 10 pull requests Successful merges: - #67774 (Try statx for all linux-gnu target.) - #67781 (Move `is_min_const_fn` query to librustc_mir.) - #67798 (Remove wrong advice about spin locks from `spin_loop_hint` docs) - #67849 (Add a check for swapped words when we can't find an identifier) - #67875 (Distinguish between private items and hidden items in rustdoc) - #67887 (`Option::{expect,unwrap}` and `Result::{expect, expect_err, unwrap, unwrap_err}` have `#[track_caller]`) - #67955 (rustdoc: Remove more `#[doc(cfg(..))]` duplicates) - #67977 (Updates for VxWorks) - #67985 (Remove insignificant notes from CStr documentation) - #68003 (ci: fix wrong shared.sh import for publish_toolstate) Failed merges: - #67820 (Parse the syntax described in RFC 2632) - #67979 (Move `intravisit` => `rustc_hir` + misc cleanup) r? @ghost
Fixes #66968
Couple things here:
None
.fold
that occurs after thefilter_map
call, because the candidate will be filtered out. So, I've split this into two separatefold
calls, and had to collect the original iterator into a vec so it can be copied (I don't think we want to change the function signature to take a vec or require theCopy
trait). An alternative implemenation may be to remove thefilter_map
,fold
over the entire iterator, and do a check againstmax_dist
inside the relevant cases there.r? @estebank