ACP: Range::cmp_scalar; comparison (less/equal/greater) to a primitive of the Range #115

golddranks · 2022-09-29T12:02:31Z

Proposal

Problem statement

Comparing ranges to their underlying primitives could be easier; currently one needs to compare with the endpoints directly, which is prone to logic and one-off errors. A comparison operation seems like a natural thing to support for a range.

Motivation, use-cases

I found the motivation for this API with code like this – searching through a list of segmented files to find a range that contains the desired point:

let files = vec![
    File { seqnum: 0, range: 0..1000 },
    File { seqnum: 1, range: 1000..2200 },
    File { seqnum: 2, range: 2200..3900 },
    File { seqnum: 3, range: 3900..5000 },
];
let target = 1600;
let index = files.binary_search_by(|f| f.range.cmp_scalar(target))?;
assert_eq!(files[index].seqnum, 1);

Solution sketches

I actually started by sending a PR: rust-lang/rust#102343 There are some alternative designs one could consider, mentioned in the PR: first, it was pointed out that returning just Ordering doesn't support the case where the Range is degenerate, i.e. the start point being after the endpoint. In this case an ordering makes hardly any sense. Perhaps returning Option<Ordering> would account for this?

I'll copy the other points for discussion here from the PR, for ease of review:

Why compare a range to a scalar, and not scalar to range?

I.e. why (3..9).cmp_scalar(5) instead of 5.cmp_range(3..9)?

Originally, I thought that an API to compare a number to a range would make more sense intuitively, and set out to implement it. However, it turned out that there were two kinds of problems:

The order in the original motivation gets reversed, and reverse() must be called after this operation to fix it:
files.binary_search_by(|f| target.cmp_range(f.range)).reverse()?;
Of course, cmp_range would make sense, if searching for a single number among of many, that fits the range. However, I had hard time imagining a common case where this would be the requirement, whereas the motivation for cmp_scalar was clear from the case introduced above.
The implementation gets gnarly, as the API must take the range as a generic parameter, and the different kind of Ranges are different types. I tried to implement it as generic T: Into<RangeBounds>, but the ergonomics kind of sucked. Also, the problem of ranges not being Copy could be mitigated when the range is passed as the &self parameter because auto-ref works in that position.

Other points for discussion

Does the name cmp_scalar make sense? To me it kind of does, but I see no precedent with calling stuff "scalars" in the stdlib. I wonder if there are better alternatives.
Ord vs PartialOrd? With contains, PartialOrd makes sense, but with a proper ordering returned by this API, I think that Ord is the only sensible choice.
The method currently takes the scalar by copy. It might make sense to take it by reference to support textual ranges? That worsens the ergonomics for numbers though.
Does it make sense to make it more generic (to support &str vs String comparisons, for example)? That might worsen the ergonomics because it makes type inference harder.
Does it make sense to actually just implement PartialOrd<T> for Range<T>, instead of having this as a separate method? Can we consider this kind of comparison "canonical" enough to be able to provide a standard implementation?

Links and related work

Again, here's my attempt at an implementation: rust-lang/rust#102343

I also checked out whether C++ supports comparing ranges. It does seem only support comparing two ranges with a lexicographical comparison: https://cplusplus.com/reference/algorithm/lexicographical_compare/ In my mind, comparing a range to a single scalar actually makes more sense, because it isn't clear what the answer should be when the ranges overlap. (Of course, the word lexicographical accounts for this, but I find the operation not well-motivated.)

The text was updated successfully, but these errors were encountered:

joshtriplett · 2023-06-27T16:59:36Z

We discussed this in today's @rust-lang/libs-api meeting. We didn't finish going through "Other points for discussion" and still need to, but we found the binary_search_by example compelling and we liked the idea of having a method like this. We'll need to finish going through "Other points for discussion" to determine the signature of the method.

golddranks · 2023-07-09T06:04:52Z

Thanks for the heads up. When do you expect to continue the discussion?

Amanieu · 2023-07-09T12:29:25Z

It slipped from the agenda in the last meeting, but I've nominated it for discussion this week.

Amanieu · 2023-07-11T17:50:09Z

We discussed this in the libs-api meeting today.

One of the key points that was mentioned was that this isn't the best way to perform a binary search since this performs multiple comparisons at each step. Instead, this can be implemented much more efficiently like this:

let files = vec![
    File { seqnum: 0, range: 0..1000 },
    File { seqnum: 1, range: 1000..2200 },
    File { seqnum: 2, range: 2200..3900 },
    File { seqnum: 3, range: 3900..5000 },
];
let target = 1600;
let index = files.partition_point(|f| f.range.end > target);
let index = files.get(index).and_then(|f| target >= f.range.start).expect("no matching range");
assert_eq!(files[index].seqnum, 1);

Perhaps it might be worth looking into a different API that makes this pattern easier to use.

We did discuss the specific questions, although because of the previous concern it's not clear that this API is the best way to solve the motivating example.

Does the name cmp_scalar make sense? To me it kind of does, but I see no precedent with calling stuff "scalars" in the stdlib. I wonder if there are better alternatives.

We couldn't think of anything better, but this can be revisited during FCP.

Ord vs PartialOrd? With contains, PartialOrd makes sense, but with a proper ordering returned by this API, I think that Ord is the only sensible choice.

We prefer to just have Ord for now, and potentially a separate function for PartialOrd if there is demand for it.

The method currently takes the scalar by copy. It might make sense to take it by reference to support textual ranges? That worsens the ergonomics for numbers though.

This should take the scalar by reference like contains.

Does it make sense to make it more generic (to support &str vs String comparisons, for example)? That might worsen the ergonomics because it makes type inference harder.

Yes, because contains does the same thing.

Does it make sense to actually just implement PartialOrd<T> for Range<T>, instead of having this as a separate method? Can we consider this kind of comparison "canonical" enough to be able to provide a standard implementation?

(We didn't get around to discussing this question, because the issue around partition_point came up)

m-ou-se · 2024-03-05T17:45:49Z

We discussed this again in the libs-api meeting, but we feel we cannot continue with this ACP without discussing whether this is actually the right solution for the use case, considering what was mentioned in the previous comment:

One of the key points that was mentioned was that this isn't the best way to perform a binary search since this performs multiple comparisons at each step. Instead, this can be implemented much more efficiently like this

Closing due to the lack of reponse since the comment above, but feel free to re-open at any time.

golddranks added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Sep 29, 2022

scottmcm mentioned this issue Feb 12, 2023

ACP template should better reflect actual feedback latency #180

Closed

Amanieu added the I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. label Jul 9, 2023

Amanieu removed the I-libs-api-nominated Indicates that an issue has been nominated for discussion during a team meeting. label Jul 11, 2023

Dylan-DPC mentioned this issue Jul 26, 2023

New API: Range::cmp_scalar; comparison (less/equal/greater) to a primitive of the Range rust-lang/rust#102343

Open

m-ou-se closed this as completed Mar 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ACP: Range::cmp_scalar; comparison (less/equal/greater) to a primitive of the Range #115

ACP: Range::cmp_scalar; comparison (less/equal/greater) to a primitive of the Range #115

golddranks commented Sep 29, 2022 •

edited

Loading

joshtriplett commented Jun 27, 2023

golddranks commented Jul 9, 2023

Amanieu commented Jul 9, 2023

Amanieu commented Jul 11, 2023

m-ou-se commented Mar 5, 2024

ACP: Range::cmp_scalar; comparison (less/equal/greater) to a primitive of the Range #115

ACP: Range::cmp_scalar; comparison (less/equal/greater) to a primitive of the Range #115

Comments

golddranks commented Sep 29, 2022 • edited Loading

Proposal

Problem statement

Motivation, use-cases

Solution sketches

Why compare a range to a scalar, and not scalar to range?

Other points for discussion

Links and related work

joshtriplett commented Jun 27, 2023

golddranks commented Jul 9, 2023

Amanieu commented Jul 9, 2023

Amanieu commented Jul 11, 2023

m-ou-se commented Mar 5, 2024

golddranks commented Sep 29, 2022 •

edited

Loading