Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add {f32,f64}::approx_unchecked_to<Int> unsafe methods #66841

Merged
merged 3 commits into from
Dec 7, 2019

Conversation

SimonSapin
Copy link
Contributor

@SimonSapin SimonSapin commented Nov 28, 2019

As discussed in #10184

Currently, casting a floating point number to an integer with as is Undefined Behavior if the value is out of range. -Z saturating-float-casts fixes this soundness hole by making as “saturate” to the maximum or minimum value of the integer type (or zero for NaN), but has measurable negative performance impact in some benchmarks. There is some consensus in that thread for enabling saturation by default anyway, but provide an unsafe fn alternative for users who know through some other mean that their values are in range.

The “fit” wording is copied from https://llvm.org/docs/LangRef.html#fptoui-to-instruction, but I’m not certain what it means exactly. Presumably this is after rounding towards zero, and the doc-test with i8::MIN seems to confirm this. Clang presumably uses those LLVM intrinsics to implement C and C++ casts, whose respective standard specify that the value after truncating to keep its integral part must be representable in the target type.

@rust-highfive
Copy link
Collaborator

r? @KodrAus

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 28, 2019
@SimonSapin
Copy link
Contributor Author

Regarding naming: we already have f32::round(self) -> Self and f64::round(self) -> Self which round to the nearest integer.

These new methods differ from those in that they round towards zero. However they are already different enough in their return type that perhaps documentation is enough and this different rounding behavior doesn’t need to be called out in the method name?

@Centril
Copy link
Contributor

Centril commented Nov 28, 2019

r? @rkruppe

src/librustc_codegen_llvm/intrinsic.rs Outdated Show resolved Hide resolved
///
/// # Safety
///
/// The value must be finite and fit in the return type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be elaborated upon what finite and fitting entails?

Copy link
Contributor Author

@SimonSapin SimonSapin Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finite is defined in the docs of the is_finite method, but I can repeat in this comment.

As discussed in the PR description, “fit” is copied from https://llvm.org/docs/LangRef.html#fptoui-to-instruction and I couldn’t find anything more precise. I’d prefer to find out what LLVM does exactly rather than documenting suppositions as fact, but I’m not sure how.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. finite, maybe link to is_finite saying "as defined by is_finite"?

Hopefully @rkruppe is more knowledgeable re. "fit".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bit of doc-comment is now “The value must be finite (not infinite or NaN) and fit in the return type.” and I’ve added a (non-doc-) comment with a link to relevant LLVM docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not spelled out in LLVM's LangRef, then my working knowledge of how the instruction works probably isn't worth much. There's dozens of places in LLVM that could deviate from the expected semantics (various IR passes and utilities, as well as the lowerings in the backends) and I do not know all of them.

@hanna-kruppe
Copy link
Contributor

The implementation looks good to me.

Regarding naming: I don't think "round" is a good name. Not only do we already have APIs that use "round" in a different sense for the same pairs of types (even if the signatures are not literally identical because these new functions are generic), I would also never think to suspect that something named just "round" defaults to RTZ mode, even with the context of a float -> int signature.

I don't love any of these but here are some alternatives that seem less misleading to me:

  • unchecked_cast_to (leaning on the general precedent of float -> int casts rounding to zero)
  • unchecked_to_int
  • unchecked_round_to_zero (wow this is ugly)

@SimonSapin
Copy link
Contributor Author

The “fit” wording is copied from https://llvm.org/docs/LangRef.html#fptoui-to-instruction, but I’m not certain what it means exactly.

Presumably fptoui and fptosi exist to allow clang to implement C and C++. Emphasis mine:

C11 standard https://port70.net/~nsz/c/c11/n1570.html#6.3.1.4

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

C++17 standard http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4659.pdf#section.7.10

A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.

So “fit” is indeed after rounding/truncation.

@SimonSapin
Copy link
Contributor Author

@rkruppe What do you think of just unchecked_to? There is precedent with Into::into and TryInto::try_into of the method name not saying (in)to what, and although it’s optional the turbofish can answer that question: float_value.unchecked_to::<u16>()

@hanna-kruppe
Copy link
Contributor

Seems fine to me. I assume we don't really have to worry about a future name collision with a hypothetical std::convert::UncheckedInto trait.

@SimonSapin
Copy link
Contributor Author

I’m considering whether the trait should be something like std::convert::FloatToInt, such that in the future we could have other methods in it (also supporting inherent methods of f32 and f64) with various conversions semantics.

@SimonSapin
Copy link
Contributor Author

These methods could fit among other conversion methods: https://internals.rust-lang.org/t/pre-rfc-add-explicitly-named-numeric-conversion-apis/11395

@SimonSapin SimonSapin changed the title Add {f32,f64}::round_unchecked_to<Int> unsafe methods Add {f32,f64}::approx_unchecked_to<Int> unsafe methods Dec 2, 2019
@SimonSapin
Copy link
Contributor Author

I’ve renamed the methods to approx_unchecked_to per conversation in that internals thread, and moved/renamed the trait to convert::FloatToInt in anticipation for additional methods with different semantics.

@SimonSapin
Copy link
Contributor Author

I’ve filed tracking issue #67057 and #67058, and made the stability attribute point to them.

@rkruppe, anything else you’d like to see before we land this?

@hanna-kruppe
Copy link
Contributor

I'm happy with the final diff but FWIW I also don't have any opinion on the API design aspects (e.g., the entirety of the FloatToInt trait) other than not making the name misleading.

The individual commits seem worth revising before merging, though:

  • Contrary to its commit message, 0145ed7edee9ec9de71c0b509841d7263eb16045 seems to just delete libstd/convert.rs. libstd/convert/mod.rs file only shows up in the following commit.
  • Having separate commits that rename the file and move the numeric From/TryFrom impls to covert/num.rs seems useful for history spelunking, but the other commits should be squashed into 20f091b8948a03072adbcc3e9207c3795399f88c (including updating the commit message for the new method names and dropping the last paragraph about LLVM docs being vague).

As discussed in rust-lang#10184

Currently, casting a floating point number to an integer with `as` is Undefined Behavior if the value is out of range. `-Z saturating-float-casts` fixes this soundness hole by making `as` “saturate” to the maximum or minimum value of the integer type (or zero for `NaN`), but has measurable negative performance impact in some benchmarks. There is some consensus in that thread for enabling saturation by default anyway, but provide an `unsafe fn` alternative for users who know through some other mean that their values are in range.
This makes `libcore/num/mod.rs` slightly smaller. It’s still 4911 lines and not easy to navigate. This doesn’t change any public API.
@SimonSapin
Copy link
Contributor Author

Squashed.

Input from @rust-lang/libs about the API design is welcome here. This is all #[unstable] so we can always discuss in the tracking issues after this is merged and change it later.

@hanna-kruppe
Copy link
Contributor

Yes, let's merge this and bikeshedding can happen on the tracking issues.

@bors r+

@bors
Copy link
Contributor

bors commented Dec 6, 2019

📌 Commit a213ff8 has been approved by rkruppe

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 6, 2019
Centril added a commit to Centril/rust that referenced this pull request Dec 6, 2019
…, r=rkruppe

Add `{f32,f64}::approx_unchecked_to<Int>` unsafe methods

As discussed in rust-lang#10184

Currently, casting a floating point number to an integer with `as` is Undefined Behavior if the value is out of range. `-Z saturating-float-casts` fixes this soundness hole by making `as` “saturate” to the maximum or minimum value of the integer type (or zero for `NaN`), but has measurable negative performance impact in some benchmarks. There is some consensus in that thread for enabling saturation by default anyway, but provide an `unsafe fn` alternative for users who know through some other mean that their values are in range.

<del>The “fit” wording is copied from https://llvm.org/docs/LangRef.html#fptoui-to-instruction, but I’m not certain what it means exactly. Presumably this is after rounding towards zero, and the doc-test with `i8::MIN` seems to confirm this.</del> Clang presumably uses those LLVM intrinsics to implement C and C++ casts, whose respective standard specify that the value *after truncating to keep its integral part* must be representable in the target type.
bors added a commit that referenced this pull request Dec 6, 2019
Rollup of 10 pull requests

Successful merges:

 - #66606 (Add feature gate for mut refs in const fn)
 - #66841 (Add `{f32,f64}::approx_unchecked_to<Int>` unsafe methods)
 - #67009 (Emit coercion suggestions in more places)
 - #67052 (Ditch `parse_in_attr`)
 - #67071 (Do not ICE on closure typeck)
 - #67078 (accept union inside enum if not followed by identifier)
 - #67090 (Change "either" to "any" in Layout::from_size_align's docs)
 - #67092 (Fix comment typos in src/libcore/alloc.rs)
 - #67094 (get rid of __ in field names)
 - #67102 (Add note to src/ci/docker/README.md about multiple docker images)

Failed merges:

 - #67101 (use `#[allow(unused_attributes)]` to paper over incr.comp problem)

r? @ghost
@bors bors merged commit a213ff8 into rust-lang:master Dec 7, 2019
@SimonSapin SimonSapin deleted the float_round_unchecked_to branch March 29, 2020 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants