-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a total ordering method for floating-point #53938
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -462,6 +462,40 @@ impl<T: Ord> Ord for Reverse<T> { | |
} | ||
} | ||
|
||
/// A wrapper newtype for providing a total order on a type that's normally partial. | ||
/// | ||
/// This is most commonly used for floating-point: | ||
/// | ||
/// ```rust | ||
/// #![feature(float_total_cmp)] | ||
/// | ||
/// use std::cmp::Total; | ||
/// | ||
/// assert!(-0.0 == 0.0); | ||
/// assert!(Total(-0.0) < Total(0.0)); | ||
/// assert_eq!(std::f32::NAN.partial_cmp(&0.0), None); | ||
/// assert_eq!(Total(std::f32::NAN).partial_cmp(&Total(0.0)), Some(std::cmp::Ordering::Greater)); | ||
/// | ||
/// let mut a = [3.0, 1.0, 2.0]; | ||
/// // a.sort(); // ERROR, because floats are !Ord | ||
/// a.sort_by_key(|x| std::cmp::Total(*x)); // But this works! | ||
/// assert_eq!(a, [1.0, 2.0, 3.0]); | ||
/// | ||
/// // By using `Total`, the struct can derive `Eq` and `Ord`. | ||
/// #[derive(PartialEq, Eq, PartialOrd, Ord)] | ||
/// struct MyData { | ||
/// foo: Total<f32>, | ||
/// bar: Total<f64>, | ||
/// } | ||
/// ``` | ||
/// | ||
/// It can also be used to provide both partial and total order implementations | ||
/// for an enum, if orders between variant don't make sense conceptually but | ||
/// are still desired for use as a `BTreeMap` key or similar. | ||
#[derive(Debug, Copy, Clone)] | ||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
pub struct Total<T>(#[unstable(feature = "float_total_cmp", issue = "55339")] pub T); | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The This feel like an extremely niche case for such a general API in Floats are ubiquitous, and it makes sense to make them easier to use. I think we could add In The A less intrusive solution to the problem of adjusting the default ordering of aggregates could be to just use macros to control which ordering is used, e.g., instead of #[derive(PartialOrd,Ord,PartialEq,Eq)]
struct A {
x: Total<f32>,
} one could #[derive(PartialOrd,Ord,PartialEq,Eq)]
struct A {
#[Ord = f32::total_cmp, Eq = f32::total_eq]
x: f32,
} A better long-term solution for types with multiple orderings in the ecosystem would be to evolve APIs like |
||
/// Trait for types that form a [total order](https://en.wikipedia.org/wiki/Total_order). | ||
/// | ||
/// An order is a total order if it is (for all `a`, `b` and `c`): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ | |
|
||
#![stable(feature = "rust1", since = "1.0.0")] | ||
|
||
use cmp; | ||
use mem; | ||
use num::FpCategory; | ||
|
||
|
@@ -474,4 +475,135 @@ impl f32 { | |
// It turns out the safety issues with sNaN were overblown! Hooray! | ||
unsafe { mem::transmute(v) } | ||
} | ||
|
||
/// A 3-way version of the IEEE `totalOrder` predicate. | ||
/// | ||
/// This is most commonly used with `cmp::Total`, not directly. | ||
/// | ||
/// This is useful in situations where you run into the fact that `f32` is | ||
/// `PartialOrd`, but not `Ord`, like sorting. If you need a total order | ||
/// on arbitrary floating-point numbers you should use this -- or one of | ||
/// the related `total_*` methods -- they're implemented efficiently using | ||
/// just a few bitwise operations. | ||
/// | ||
/// This function mostly agrees with `PartialOrd` and the usual comparison | ||
/// operators in how it orders floating point numbers, but it additionally | ||
/// distinguishes negative from positive zero (it considers -0 as less than | ||
/// +0, whereas `==` considers them equal) and it orders Not-a-Number values | ||
/// relative to the numbers and distinguishes NaNs with different bit patterns. | ||
/// | ||
/// NaNs with positive sign are ordered greater than all other floating-point | ||
/// values including positive infinity, while NaNs with negative sign are | ||
/// ordered the least, below negative infinity. Two different NaNs with the | ||
/// same sign are ordered first by whether the are signaling (signaling is | ||
/// less than quiet if the sign is positive, reverse if negative) and second | ||
/// by their payload interpreted as integer (reversed order if the sign is negative). | ||
/// | ||
/// This means all different (canonical) floating point bit patterns are | ||
/// placed in a linear order, given below in ascending order: | ||
/// | ||
/// - Quiet Not-a-Number with negative sign (ordered by payload, descending) | ||
/// - Signaling Not-a-Number with negative sign (ordered by payload, descending) | ||
/// - Negative infinity | ||
/// - Negative finite non-zero numbers (ordered in the usual way) | ||
/// - Negative zero | ||
/// - Positive zero | ||
/// - Positive finite non-zero numbers (ordered in the usual way) | ||
/// - Positive infinity | ||
/// - Signaling Not-a-Number with positive sign (ordered by payload, ascending) | ||
/// - Quiet Not-a-Number with positive sign (ordered by payload, ascending) | ||
/// | ||
/// # Examples | ||
/// | ||
/// Most follow the normal ordering: | ||
/// ``` | ||
/// #![feature(float_total_cmp)] | ||
/// use std::cmp::Ordering::*; | ||
/// | ||
/// assert_eq!(f32::total_cmp(1.0, 2.0), Less); | ||
/// assert_eq!(f32::total_cmp(1.0, 1.0), Equal); | ||
/// assert_eq!(f32::total_cmp(2.0, 1.0), Greater); | ||
/// assert_eq!(f32::total_cmp(-1.0, 1.0), Less); | ||
/// assert_eq!(f32::total_cmp(1.0, -1.0), Greater); | ||
/// assert_eq!(f32::total_cmp(-1.0, -2.0), Greater); | ||
/// assert_eq!(f32::total_cmp(-1.0, -1.0), Equal); | ||
/// assert_eq!(f32::total_cmp(-2.0, -1.0), Less); | ||
/// assert_eq!(f32::total_cmp(-std::f32::MAX, -1.0e9), Less); | ||
/// assert_eq!(f32::total_cmp(std::f32::MAX, 1.0e9), Greater); | ||
/// ``` | ||
/// | ||
/// Zeros and NaNs: | ||
/// ``` | ||
/// #![feature(float_total_cmp)] | ||
/// use std::cmp::Ordering::*; | ||
/// | ||
/// assert_eq!(f32::partial_cmp(&0.0, &-0.0), Some(Equal)); | ||
/// assert_eq!(f32::total_cmp(-0.0, 0.0), Less); | ||
/// assert_eq!(f32::total_cmp(0.0, -0.0), Greater); | ||
/// | ||
/// assert_eq!(f32::partial_cmp(&std::f32::NAN, &std::f32::NAN), None); | ||
/// assert_eq!(f32::total_cmp(-std::f32::NAN, std::f32::NAN), Less); | ||
/// assert_eq!(f32::total_cmp(std::f32::NAN, std::f32::NAN), Equal); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤔 We should probably have examples demonstrating the ordering by signaling bit and payload, constructing some non-default NaNs via |
||
/// assert_eq!(f32::total_cmp(std::f32::NAN, -std::f32::NAN), Greater); | ||
/// assert_eq!(f32::total_cmp(std::f32::NAN, std::f32::INFINITY), Greater); | ||
/// assert_eq!(f32::total_cmp(-std::f32::NAN, -std::f32::INFINITY), Less); | ||
/// ``` | ||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
#[inline] | ||
pub fn total_cmp(self, other: f32) -> cmp::Ordering { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could this function take references like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I contemplated that, but decided that would be inconsistent with all the other things that would take references if f32 wasn't copy, but takes owned because it is copy. I note that there's currently not a single Hopefully the copy type ergonomics stuff and better fn type coercions will make There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was wondering about this too. If one just can't There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, the other solution to that would be to just make it safe... |
||
self.total_cmp_key().cmp(&other.total_cmp_key()) | ||
} | ||
|
||
#[inline] | ||
fn total_cmp_key(self) -> i32 { | ||
let x = self.to_bits() as i32; | ||
// Flip the bottom 31 bits if the high bit is set | ||
x ^ ((x >> 31) as u32 >> 1) as i32 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This really, really needs either an in-depth explanation justifying its correctness, or a reference to an external write-up of such an explanation. It sounds really plausible but I'm also kind of unsure about various aspects, and I've already spent more time thinking about this order and about the binary exchange format than any human being should. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't have a proof for this, but it's equivalent to the one in rust-lang/rfcs#1249 (comment). I suspect the predicate was chosen specifically to allow such a simple implementation, because of things like "the same floating-point datum" being ordered by exponent first. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have the exact same question for @jpetkau then 😉 I agree with your suspicion (tho note that "same floating point datum" with different exponents only happens in decimal floating point) but I'd rather have it confirmed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about something like: "With their bits interpreted as signed integers, positive IEEE754 floats are ordered as:
With the reverse order for floats with the sign bit set. So if we reverse the order of the negative cases, by xoring the sign bit with the other bits, we get the order:
IEEE754.2008 calls this the 'totalOrder' predicate. (It also specifies an interpretation of the signaling bit which implies -QNANs < -SNANs < +SNANs < +QNANs, but if we happen to be on an older MIPS architecture with the opposite interpretation of the signaling bit, this won't be true.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That sounds mostly convincing! The main thing that isn't obvious to me is why negating the non-sign bits reverses the order but maybe that's just someone you have to convince yourself of by spot checks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's true that this reinterprets the whole 32- or 64-bit unit, but it's possible that an architecture uses different endianness for floats and ints, then the bytes will appear reversed when stored to memory as a float and loaded back as an integer. However, I don't think we currently support such architectures, and if we wanted to, we'd have to update more code than just this method (in particular There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I dispute that it's possible that an architecture uses different endianness for floats and ints. That would be deeply insane and break a lot more than this code. [edit] of course it's technically possible, just like an architecture could use different endianess for signed and unsigned ints if it really wanted to be perverse. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What do you mean you dispute that it's possible? There's real physical processors (old ARMs) doing so. We don't support them, and I would be open to concluding we don't ever want to support them (just like e.g. architectures with non-octet memory granularity – also breaks tons of code, but definitely a real thing!), but they exist. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, I was just being flippant there. Such architectures do exist. But they would break far more than just this code; they'd break everything that relied on the bit pattern of floats, which is, well, everything. Every serialization library, formatter, parser, implementation of math libraries, you name it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Thanks for the answer, I just wasn't sure if this had to be the case, but it seems that for Rust this will need to be the case. |
||
} | ||
} | ||
|
||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl PartialEq for cmp::Total<f32> { | ||
fn eq(&self, other: &Self) -> bool { | ||
self.0.to_bits() == other.0.to_bits() | ||
} | ||
} | ||
|
||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl Eq for cmp::Total<f32> {} | ||
|
||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl PartialOrd for cmp::Total<f32> { | ||
fn partial_cmp(&self, other: &Self) -> Option<cmp::Ordering> { | ||
Some(self.cmp(other)) | ||
} | ||
} | ||
|
||
/// See `f32::total_cmp` for details. | ||
/// | ||
/// Using this to sort floats: | ||
/// ``` | ||
/// #![feature(float_total_cmp)] | ||
/// | ||
/// let mut a = [ | ||
/// 1.0, | ||
/// -1.0, | ||
/// 0.0, | ||
/// -0.0, | ||
/// std::f32::NAN, | ||
/// -std::f32::NAN, | ||
/// std::f32::INFINITY, | ||
/// std::f32::NEG_INFINITY, | ||
/// ]; | ||
/// a.sort_by_key(|x| std::cmp::Total(*x)); | ||
/// assert_eq!( | ||
/// format!("{:?}", a), | ||
/// "[NaN, -inf, -1.0, -0.0, 0.0, 1.0, inf, NaN]" | ||
/// ); | ||
/// ``` | ||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl Ord for cmp::Total<f32> { | ||
fn cmp(&self, other: &Self) -> cmp::Ordering { | ||
self.0.total_cmp(other.0) | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,7 @@ | |
|
||
#![stable(feature = "rust1", since = "1.0.0")] | ||
|
||
use cmp; | ||
use mem; | ||
use num::FpCategory; | ||
|
||
|
@@ -487,4 +488,135 @@ impl f64 { | |
// It turns out the safety issues with sNaN were overblown! Hooray! | ||
unsafe { mem::transmute(v) } | ||
} | ||
|
||
/// A 3-way version of the IEEE `totalOrder` predicate. | ||
/// | ||
/// This is most commonly used with `cmp::Total`, not directly. | ||
/// | ||
/// This is useful in situations where you run into the fact that `f64` is | ||
/// `PartialOrd`, but not `Ord`, like sorting. If you need a total order | ||
/// on arbitrary floating-point numbers you should use this -- or one of | ||
/// the related `total_*` methods -- they're implemented efficiently using | ||
/// just a few bitwise operations. | ||
/// | ||
/// This function mostly agrees with `PartialOrd` and the usual comparison | ||
/// operators in how it orders floating point numbers, but it additionally | ||
/// distinguishes negative from positive zero (it considers -0 as less than | ||
/// +0, whereas `==` considers them equal) and it orders Not-a-Number values | ||
/// relative to the numbers and distinguishes NaNs with different bit patterns. | ||
/// | ||
/// NaNs with positive sign are ordered greater than all other floating-point | ||
/// values including positive infinity, while NaNs with negative sign are | ||
/// ordered the least, below negative infinity. Two different NaNs with the | ||
/// same sign are ordered first by whether the are signaling (signaling is | ||
/// less than quiet if the sign is positive, reverse if negative) and second | ||
/// by their payload interpreted as integer (reversed order if the sign is negative). | ||
/// | ||
/// This means all different (canonical) floating point bit patterns are | ||
/// placed in a linear order, given below in ascending order: | ||
/// | ||
/// - Quiet Not-a-Number with negative sign (ordered by payload, descending) | ||
/// - Signaling Not-a-Number with negative sign (ordered by payload, descending) | ||
/// - Negative infinity | ||
/// - Negative finite non-zero numbers (ordered in the usual way) | ||
/// - Negative zero | ||
/// - Positive zero | ||
/// - Positive finite non-zero numbers (ordered in the usual way) | ||
/// - Positive infinity | ||
/// - Signaling Not-a-Number with positive sign (ordered by payload, ascending) | ||
/// - Quiet Not-a-Number with positive sign (ordered by payload, ascending) | ||
/// | ||
/// # Examples | ||
/// | ||
/// Most follow the normal ordering: | ||
/// ``` | ||
/// #![feature(float_total_cmp)] | ||
/// use std::cmp::Ordering::*; | ||
/// | ||
/// assert_eq!(f64::total_cmp(1.0, 2.0), Less); | ||
/// assert_eq!(f64::total_cmp(1.0, 1.0), Equal); | ||
/// assert_eq!(f64::total_cmp(2.0, 1.0), Greater); | ||
/// assert_eq!(f64::total_cmp(-1.0, 1.0), Less); | ||
/// assert_eq!(f64::total_cmp(1.0, -1.0), Greater); | ||
/// assert_eq!(f64::total_cmp(-1.0, -2.0), Greater); | ||
/// assert_eq!(f64::total_cmp(-1.0, -1.0), Equal); | ||
/// assert_eq!(f64::total_cmp(-2.0, -1.0), Less); | ||
/// assert_eq!(f64::total_cmp(-std::f64::MAX, -1.0e9), Less); | ||
/// assert_eq!(f64::total_cmp(std::f64::MAX, 1.0e9), Greater); | ||
/// ``` | ||
/// | ||
/// Zeros and NaNs: | ||
/// ``` | ||
/// #![feature(float_total_cmp)] | ||
/// use std::cmp::Ordering::*; | ||
/// | ||
/// assert_eq!(f64::partial_cmp(&0.0, &-0.0), Some(Equal)); | ||
/// assert_eq!(f64::total_cmp(-0.0, 0.0), Less); | ||
/// assert_eq!(f64::total_cmp(0.0, -0.0), Greater); | ||
/// | ||
/// assert_eq!(f64::partial_cmp(&std::f64::NAN, &std::f64::NAN), None); | ||
/// assert_eq!(f64::total_cmp(-std::f64::NAN, std::f64::NAN), Less); | ||
/// assert_eq!(f64::total_cmp(std::f64::NAN, std::f64::NAN), Equal); | ||
/// assert_eq!(f64::total_cmp(std::f64::NAN, -std::f64::NAN), Greater); | ||
/// assert_eq!(f64::total_cmp(std::f64::NAN, std::f64::INFINITY), Greater); | ||
/// assert_eq!(f64::total_cmp(-std::f64::NAN, -std::f64::INFINITY), Less); | ||
/// ``` | ||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
#[inline] | ||
pub fn total_cmp(self, other: f64) -> cmp::Ordering { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could these be implemented using a macro so that all tests and docs can be shared? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Plausibly, but that's a general There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just because the other methods could be improved does not mean that we have to make the situation ""worse"" with this PR. It would be better to just add a If someone wants to reduce duplication for the other methods, they can do the same, and once we have a couple of files we could move them all to a sub-directory or something, but we should start somewhere. |
||
self.total_cmp_key().cmp(&other.total_cmp_key()) | ||
} | ||
|
||
#[inline] | ||
fn total_cmp_key(self) -> i64 { | ||
let x = self.to_bits() as i64; | ||
// Flip the bottom 31 bits if the high bit is set | ||
x ^ ((x >> 31) as u64 >> 1) as i64 | ||
} | ||
} | ||
|
||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl PartialEq for cmp::Total<f64> { | ||
fn eq(&self, other: &Self) -> bool { | ||
self.0.to_bits() == other.0.to_bits() | ||
} | ||
} | ||
|
||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl Eq for cmp::Total<f64> {} | ||
|
||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl PartialOrd for cmp::Total<f64> { | ||
fn partial_cmp(&self, other: &Self) -> Option<cmp::Ordering> { | ||
Some(self.cmp(other)) | ||
} | ||
} | ||
|
||
/// See `f64::total_cmp` for details. | ||
/// | ||
/// Using this to sort floats: | ||
/// ``` | ||
/// #![feature(float_total_cmp)] | ||
/// | ||
/// let mut a = [ | ||
/// 1.0, | ||
/// -1.0, | ||
/// 0.0, | ||
/// -0.0, | ||
/// std::f64::NAN, | ||
/// -std::f64::NAN, | ||
/// std::f64::INFINITY, | ||
/// std::f64::NEG_INFINITY, | ||
/// ]; | ||
/// a.sort_by_key(|x| std::cmp::Total(*x)); | ||
/// assert_eq!( | ||
/// format!("{:?}", a), | ||
/// "[NaN, -inf, -1.0, -0.0, 0.0, 1.0, inf, NaN]" | ||
/// ); | ||
/// ``` | ||
#[unstable(feature = "float_total_cmp", issue = "55339")] | ||
impl Ord for cmp::Total<f64> { | ||
fn cmp(&self, other: &Self) -> cmp::Ordering { | ||
self.0.total_cmp(other.0) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The brief summary seems to indicate that only
T: PartialOrd
s are accepted but the signature does not require that.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's completely true, but note that
cmp::Reverse
also doesn't bound itself.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, do you know why that is? Appears to be an oversight, what use is reverse if
T
does not implementPartialOrd
? - If this was an oversight forReverse
, we don't have to make the same mistake here.