Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for integer_sign_cast #125882

Open
2 of 3 tasks
Rua opened this issue Jun 2, 2024 · 34 comments
Open
2 of 3 tasks

Tracking Issue for integer_sign_cast #125882

Rua opened this issue Jun 2, 2024 · 34 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. relnotes Marks issues that should be documented in the release notes of the next release. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Rua
Copy link
Contributor

Rua commented Jun 2, 2024

Feature gate: #![feature(integer_sign_cast)]

This is a tracking issue for explicit signedness casting methods for integer primitive types. Libs discussion: rust-lang/libs-team#359.

Public API

(for N in [8, 16, 32, 64, 128, size]):

impl uN {
    pub const fn cast_signed(self) -> iN {}
}

impl iN {
    pub const fn cast_unsigned(self) -> uN {}
}

impl NonZero<uN> {
    pub const fn cast_signed(self) -> NonZero<iN> {}
}

impl NonZero<iN> {
    pub const fn cast_unsigned(self) -> NonZero<uN> {}
}

Steps / History

Unresolved Questions

Mostly bikeshedding regarding naming, as mentioned in the Libs discussion. The current proposal follows the naming of cast_const and cast_mut for pointers.

Alternatively, these could be implemented as from_bits and to_bits methods for the signed types.

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@Rua Rua added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Jun 2, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Jun 2, 2024
…lacrum

Implement feature `integer_sign_cast`

Tracking issue: rust-lang#125882

Since this is my first time making a library addition I wasn't sure where to place the new code relative to existing code. I decided to place it near the top where there are already some other basic bitwise manipulation functions. If there is an official guideline for the ordering of functions, please let me know.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jun 3, 2024
Rollup merge of rust-lang#125884 - Rua:integer_sign_cast, r=Mark-Simulacrum

Implement feature `integer_sign_cast`

Tracking issue: rust-lang#125882

Since this is my first time making a library addition I wasn't sure where to place the new code relative to existing code. I decided to place it near the top where there are already some other basic bitwise manipulation functions. If there is an official guideline for the ordering of functions, please let me know.
@tbu-
Copy link
Contributor

tbu- commented Jun 3, 2024

Mostly bikeshedding regarding naming, as mentioned in the Libs discussion. The current proposal follows the naming of cast_const and cast_mut for pointers.

I think that cast_signed and cast_unsigned do not communicate intent on what to do on overflows. This is unlike cast_const and cast_mut where no such overflow can occur.

I think something like reinterpret_signed or the proposed from_bits/to_bits woul be more appropriate.

@electroCutie
Copy link

do not communicate intent on what to do on overflows

There wouldn't be any overflows, as I understand the term. The number of bits stays the same, which is expressly part of the design of the function.

@tbu-
Copy link
Contributor

tbu- commented Jun 4, 2024

It does not communicate what happens to values of unsigned integers that do not fit into the target integer values.

We're reinterpreting the bits as an unsigned integer, but cast does not say that. E.g. casting from integer to float does not reinterpret the bits, casting from i8 to i32 does not solely reinterpret the bits either, it sign-extends them. So "cast" doesn't tell me what happens.

@electroCutie
Copy link

electroCutie commented Jun 4, 2024

@tbu- ah now I see your meaning

Your suggestion of reinterpret_signed seems like a good way to phrase it

@Rua
Copy link
Contributor Author

Rua commented Jun 4, 2024

What about transmute_signed? That seems closer to Rust's typical jargon.

@aapoalas
Copy link

aapoalas commented Jun 8, 2024

IMO: A consistent API is most important, so from_bits / to_bits should be chosen.

@Rua
Copy link
Contributor Author

Rua commented Sep 3, 2024

After thinking about this for a while, and looking at how I'm actually using this function, I think I agree with the suggestion to use from_bits and to_bits instead. Do others agree? If so, can I just make a PR to change the name?

@jhpratt
Copy link
Member

jhpratt commented Sep 5, 2024

I personally find from_bits/to_bits to be too similar to the {from|to}_{n|b|l}e_bytes methods. However from_bits and to_bits are present and have been stable for years, with the behavior equivalent to that of a transmute (i.e. the same as this proposal).

So as much as I don't care for the naming, it would be consistent with that. With that said, casting a sign and reinterpreting the type entirely feel like two different things that don't necessarily have precedential impact. An integer and a float share effectively zero in common when using from_bits/to_bits, while integers will share a value for the overlapping range.

@aapoalas
Copy link

aapoalas commented Sep 6, 2024

After thinking about this for a while, and looking at how I'm actually using this function, I think I agree with the suggestion to use from_bits and to_bits instead. Do others agree? If so, can I just make a PR to change the name?

Do you happen to have any examples of this usage in public repositories?

@jhpratt
Copy link
Member

jhpratt commented Sep 6, 2024

@aapoalas While not using this feature directly, the time-rs/time repository uses an API from num-conv that is identical to the current proposal.

@SabrinaJewson
Copy link
Contributor

SabrinaJewson commented Sep 28, 2024

I’m not too keen on from_bits and to_bits. While it works for the use-case of shuttling an i64 value in a u64-shaped container, it doesn’t look so nice for the dual operation (which I think I find myself doïng more often), since it implies that i64 is your real integer value and u64 is its bitwise representation, when it’s in fact the other way around.

(re some other suggestions: transmute_* reads a danger word in my head, so probably isn’t ideal for a safe method, while reinterpret_* is kinda verbose and doesn’t have precedent. I think that cast does carry a sufficient connotation of reinterpreting the bit values directly; it’s in line with pointers’ cast method too.)

@electroCutie
Copy link

What about something like bits_to_i / bits_to_u (and if desired the bits_from_* versions in associated functions)

Hints that it is bitwise and doesn't introduce a new term (I take it as a given that someone performing this operation already knows the term "bits"), doesn't try to put one or the other as the canonical representation, still uses the familiar from/to jargon, and is a bit less verbose than some other wordier proposals

@endrift
Copy link

endrift commented Nov 16, 2024

I keep finding myself needing this for dealing with C APIs and ioctls. I don't like the to_bits/from_bits proposal because it's quite vague for people not familiar with 2's complement. May I suggest cast_wrapping or to_unsigned_wrapping/to_signed_wrapping, or similar?

@Rua
Copy link
Contributor Author

Rua commented Nov 17, 2024

I do like to_(un)signed_wrapping. I think including the destination signedness within the name is important, so that it's more robust against refactorings. If you change the signedness of the type it's called on, the compiler will immediately complain at you that the method is wrong.

@electroCutie
Copy link

@Rua I'm definitely not liking the "wrapping" or "unwrapping" terminology. They're not at all wrappers. They're just different ways of interpreting the same data

@Rua
Copy link
Contributor Author

Rua commented Nov 17, 2024

It's not wrappers as in wrapping around another thing. It's wrapping as in allowing integer wrap-around. The same as methods like wrapping_add, wrapping_neg etc.

@electroCutie
Copy link

electroCutie commented Nov 17, 2024

I didn't think at all in that direction, I see what you mean but it doesn't immediately click

@vnghia
Copy link

vnghia commented Dec 6, 2024

Hi, I would like to ask is there something planned for NonZero as well since they are also integer and will work pretty much the same as normal integer in this cases.

@DarkKirb
Copy link

👍 for cast_signed, cast_unsigned because it’s more consistent with other uses of the word cast in std (bit-preserving reinterpretation) vs as or to (which in many cases don’t preserve the underlying representation)

@CodesInChaos
Copy link

I think it's likely that we will get similar functions with different overflow semantics (e.g. panicking, saturating, etc.), so I'd go with wrapping_to_(un)signed for these, strict_to_(un)signed for the panicking version. For consistency with methods like wrapping_add, wrapping should be in the beginning of the name.

@endrift
Copy link

endrift commented Dec 11, 2024

Now that you mention it, I would love a saturating version as well and have written short implementations of that already, since it's more or less trivial in comparison (without unsafe code that is). Future proofing like that seems like a decent idea.

@joshtriplett
Copy link
Member

These have been around for a while, they're useful for eliminating uses of as, and it seems like the only concern about them is whether they have the right name. The current cast_signed and cast_unsigned seem like reasonable names here.

Shall we stabilize these?

@rfcbot merge

@rfcbot
Copy link

rfcbot commented Feb 3, 2025

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Feb 3, 2025
@tbu-
Copy link
Contributor

tbu- commented Feb 3, 2025

it seems like the only concern about them is whether they have the right name. The current cast_signed and cast_unsigned seem like reasonable names here.

I don't think they're reasonable names:

It does not communicate what happens to values of unsigned integers that do not fit into the target integer values.

We're reinterpreting the bits as an unsigned integer, but cast does not say that. E.g. casting from integer to float does not reinterpret the bits, casting from i8 to i32 does not solely reinterpret the bits either, it sign-extends them. So "cast" doesn't tell me what happens.

cast is currently not used for bit reinterpretations in the standard library.

@joshtriplett
Copy link
Member

joshtriplett commented Feb 3, 2025

I would be happy with anything that isn't wildly longer. (These are meant to replace as casts, and people shouldn't feel that doing so makes their code much more verbose.)

Personally I think to_signed and to_unsigned, or even to_i and to_u, would be clear, because there are only two possible interpretations that fit the return value, and panicking is not normally something that a to method does.

@joshtriplett joshtriplett added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Feb 3, 2025
@joshtriplett
Copy link
Member

Nominating to discuss the names in order to resolve this issue for stabilization.

@electroCutie
Copy link

What about something like bits_to_i / bits_to_u (and if desired the bits_from_* versions in associated functions)

Hints that it is bitwise and doesn't introduce a new term (I take it as a given that someone performing this operation already knows the term "bits"), doesn't try to put one or the other as the canonical representation, still uses the familiar from/to jargon, and is a bit less verbose than some other wordier proposals

I'm just going to push one more time for my names, I still think that if nothing else they accomplish a good compromise between length, not re-using the 'cast' terminology, doesn't introduce a new word for programmers to know since bits should be familiar to someone using this, and making clear the type of the result. Even if someone isn't sure what they do one glance at the documentation will cement it for them.

@tgross35
Copy link
Contributor

tgross35 commented Feb 3, 2025

Hi, I would like to ask is there something planned for NonZero as well since they are also integer and will work pretty much the same as normal integer in this cases.

@joshtriplett could this be included in the discussion?

@joshtriplett
Copy link
Member

What about something like bits_to_i / bits_to_u (and if desired the bits_from_* versions in associated functions)
Hints that it is bitwise and doesn't introduce a new term (I take it as a given that someone performing this operation already knows the term "bits"), doesn't try to put one or the other as the canonical representation, still uses the familiar from/to jargon, and is a bit less verbose than some other wordier proposals

I'm just going to push one more time for my names, I still think that if nothing else they accomplish a good compromise between length, not re-using the 'cast' terminology, doesn't introduce a new word for programmers to know since bits should be familiar to someone using this, and making clear the type of the result. Even if someone isn't sure what they do one glance at the documentation will cement it for them.

Those seem potentially reasonable and worth considering.

@joshtriplett
Copy link
Member

@tgross35 Writing a patch for that now.

@joshtriplett
Copy link
Member

#136511

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Feb 4, 2025
@rfcbot
Copy link

rfcbot commented Feb 4, 2025

🔔 This is now entering its final comment period, as per the review above. 🔔

jhpratt added a commit to jhpratt/rust that referenced this issue Feb 4, 2025
…nsigned, r=dtolnay

Add `cast_signed` and `cast_unsigned` methods for `NonZero` types

Requested in rust-lang#125882 .

Note that this keeps the same names as the methods currently present on other
integer types. If we want to rename them, we can rename them all at the same
time.
@Amanieu
Copy link
Member

Amanieu commented Feb 4, 2025

We discussed this in the libs-api meeting and decided to keep the names as-is (cast_[un]signed). "Cast" is well documented in the reference with regards to integers so it is clear that this simply re-interprets the bits. In any case this will be clearly described in the documentation.

@Amanieu Amanieu removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Feb 4, 2025
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 4, 2025
Rollup merge of rust-lang#136511 - joshtriplett:nonzero-cast-signed-unsigned, r=dtolnay

Add `cast_signed` and `cast_unsigned` methods for `NonZero` types

Requested in rust-lang#125882 .

Note that this keeps the same names as the methods currently present on other
integer types. If we want to rename them, we can rename them all at the same
time.
@tgross35
Copy link
Contributor

tgross35 commented Feb 4, 2025

FCP includes the methods added to NonZero in #136511, correct? (top post needs to be updated)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. relnotes Marks issues that should be documented in the release notes of the next release. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests