Case-insensitive matching without external crates #1988
Comments
See also: https://crates.io/crates/caseless I suspect this is something that needs to be cultivated on crates.io before it could make it into |
I agree! I will say that the I have a feeling that something like this would require a macro instead of just a function, although I'm sure that there are plenty of solutions. |
That requires Unicode case folding? Can you point to some examples? Anyway, I just came up with this:

    pub fn iter_eq_any_of<A, B>(mut a: A, bs: &mut [Option<B>]) -> Option<usize>
    where A: Iterator, B: Iterator, A::Item: PartialEq<B::Item> {
        // Could be `bs.len()` if the caller pinky-swears to never pass `None` in the initial slice.
        let mut b_count = bs.iter().filter(|b| b.is_some()).count();
        loop {
            let a_item = a.next();
            for (i, maybe_b) in bs.iter_mut().enumerate() {
                if let Some(ref mut b) = *maybe_b {
                    match (&a_item, b.next()) {
                        (&None, None) => {
                            // Reached the end of two iterators without finding a difference:
                            // they match.
                            return Some(i)
                        }
                        (&Some(ref x), Some(ref y)) if x == y => {
                            // Matching so far, continue with the next b.
                            continue
                        }
                        _ => {
                            // Lexical borrows :(
                        }
                    }
                } else {
                    continue
                }
                // Found a difference, stop looking in this b.
                *maybe_b = None;
                b_count -= 1;
                if b_count == 0 {
                    // Found a difference in every b.
                    // (This is an optimization, the algorithm works without tracking b_count.)
                    return None
                }
            }
            if a_item.is_none() {
                // Reached the end of a without reaching the end of a matching b.
                return None
            }
        }
    }

    #[test]
    fn it_works() {
        extern crate caseless;

        // Allocates a String
        match &*caseless::default_case_fold_str("String2") {
            "string1" | "string2" => {},
            "something" | "else" => panic!(),
            _ => panic!(),
        }

        // No heap allocation
        match iter_eq_any_of(
            caseless::Caseless::default_case_fold("String1".chars()),
            &mut [
                // These are known to be fixed points (default_case_fold(x) == x)
                // In the general case each of them would need to be case-folded too.
                // A procedural macro could do this at compile-time.
                Some("string1".chars()),
                Some("string2".chars()),
                Some("something".chars()),
                Some("else".chars()),
            ]
        ) {
            Some(0) | Some(1) => {},
            Some(2) | Some(3) => panic!(),
            _ => panic!(),
        }
    }

Left as an exercise to the reader: use procedural-masquerade to make the latter code look like the former. This seems very niche, though. Or at least the solution is convoluted enough that I’m not convinced it belongs in
|
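For readers who want to experiment with the idea above without the caseless crate, here is a minimal sketch that uses only the standard library. It rests on an assumption made purely for illustration: case folding is approximated with `char::to_lowercase`, which is simple lowercasing and not Unicode default case folding (for example, it does not equate "ß" with "ss"), so this is not a substitute for what caseless does. The names `lower_chars` and `matches_any_ignore_case` are hypothetical.

```rust
/// Lazily lowercases a string one `char` at a time, without building a `String`.
/// Assumption: `char::to_lowercase` stands in for real Unicode case folding.
fn lower_chars(s: &str) -> impl Iterator<Item = char> + '_ {
    s.chars().flat_map(char::to_lowercase)
}

/// Returns the index of the first candidate that equals `input` once both
/// sides are lowercased, or `None` if no candidate matches.
fn matches_any_ignore_case(input: &str, candidates: &[&str]) -> Option<usize> {
    candidates
        .iter()
        .position(|&c| lower_chars(input).eq(lower_chars(c)))
}

fn main() {
    let candidates = ["string1", "string2", "something", "else"];
    match matches_any_ignore_case("String2", &candidates) {
        Some(0) | Some(1) => println!("matched string1 or string2"),
        Some(_) => println!("matched something or else"),
        None => println!("no match"),
    }
}
```

Unlike `iter_eq_any_of`, this sketch walks the input again for every candidate instead of comparing against all candidates in one pass. That keeps the code short but may do more work when there are many candidates.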
Triage ping: Any progress on this? |
I’m going to close this since there is no concrete proposal, and I feel that the set of stated constraints makes this niche enough and would require a contrived enough API that whatever we could come up with may not belong in the standard library. To push this further I’d recommend discussing on internals.rlo and/or developing on crates.io, then possibly opening a formal RFC later. |
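I see a fair amount of code that does something to the effect of:
And it'd be really nice if we could provide a way to do this with libstd or even libcore without requiring a crate like regex to do the heavy lifting.
We already have eq_ignore_ascii_case but no equivalent with full Unicode support, and there seems to be no easy way to do efficient, case-insensitive matching without allocating, short of doing something ridiculous like: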
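The gap described above can be reproduced with the standard library alone. The following is an illustrative sketch, not the snippet from the original post: it contrasts the ASCII-only `str::eq_ignore_ascii_case` with an allocating comparison built on `str::to_lowercase`. The helper name `eq_lowercase_alloc` is hypothetical, and lowercasing is used here as a rough stand-in for full Unicode case folding.

```rust
/// Compares two strings after lowercasing both.
/// Allocates two temporary `String`s, which is the cost the issue wants to avoid.
fn eq_lowercase_alloc(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}

fn main() {
    // ASCII letters compare equal regardless of case.
    assert!("HELLO".eq_ignore_ascii_case("hello"));

    // Non-ASCII letters are left untouched by the ASCII-only comparison,
    // so U+00C9 (capital E with acute) and U+00E9 (small e with acute) differ.
    assert!(!"\u{c9}COLE".eq_ignore_ascii_case("\u{e9}cole"));

    // The allocating comparison treats them as equal.
    assert!(eq_lowercase_alloc("\u{c9}COLE", "\u{e9}cole"));
}
```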
but no equivalent with full unicode support, and there seems to be no easy way to do efficient, case-insensitive matching without allocating short of doing something ridiculous like:The text was updated successfully, but these errors were encountered: