-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add carrying_add, borrowing_sub, widening_mul, carrying_mul methods to integers #85017
Conversation
r? @m-ou-se (rust-highfive has picked a reviewer for you, use r? to override) |
05e2efa
to
a5d1851
Compare
This comment has been minimized.
This comment has been minimized.
a5d1851
to
8b0b7ac
Compare
This comment has been minimized.
This comment has been minimized.
8b0b7ac
to
21a332e
Compare
This comment has been minimized.
This comment has been minimized.
21a332e
to
9981d28
Compare
This comment has been minimized.
This comment has been minimized.
cc @rust-lang/wg-const-eval some new inherent const fns on integer types. The impl for them already exists (https://github.com/rust-lang/rust/blob/master/compiler/rustc_mir/src/interpret/intrinsics.rs#L274-L280). @clarfonthey could you split out the constification into a separate commit and add tests that run these functions during the evaluation of a constant or static item? |
What's missing? |
library/core/src/num/mod.rs
Outdated
($SelfT:ty, $WideT:ty, $BITS:literal) => { | ||
/// Calculates the full multiplication of `self` and `rhs`. | ||
/// | ||
/// This returns the low-order (insignificant) bits and the high-order (significant) bits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "insignificant" the right word?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, honestly-- open to suggestions. My main goal was to get an implementation out there and to work out the specifics on stuff like this after merging, though, since it has to exist in order for people to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the terminology "least significant" and "most significant" bits? Wikipedia.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct -- I don't know why I thought insignificant was the right term.
Another u64 unsafe method (here it uses #![feature(asm)]
#[allow(unused_assignments, unused_variables)]
#[cfg(any(target_arch = "x86_64"))]
pub unsafe fn mul_mod(this: u64, b: u64, m: u64) -> u64 {
let mut q: u64;
let mut r: u64; // r = a b mod m
asm!(
"mul {b}; div {m};",
b = in(reg) b, m = in(reg) m,
inout("rax") this => q, out("rdx") r,
options(nostack, nomem)
);
r
} |
I am very wary of using the "carry" terminology for signed integers. Usually for two's complement carry is basically what I would use for an adder when I'm thinking about unsigned only. For example for something like
|
pub const fn widening_mul(self, rhs: Self) -> (Self, Self) { | ||
// note: longer-term this should be done via an intrinsic, | ||
// but for now we can deal without an impl for u128/i128 | ||
// SAFETY: overflow will be contained within the wider types | ||
let wide = unsafe { (self as $WideT).unchecked_mul(rhs as $WideT) }; | ||
(wide as $SelfT, (wide >> $BITS) as $SelfT) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe widening_mul
on signed integers should return the less-significant half as unsigned instead. For example consider i8
, and −128 × 3 (0x80 * 0x03
). Now, when widened to i16
(0xff80 * 0x0003
), this gives −384 (0xfe80
), which is then returned as (0x80
, 0xfe
). Now, 0xfe
itself is −2, so think of this as meaning −512 because it is the higher word. The correct result is −512 + 128, so the 0x80
should be interpreted as +128 not −128, so it would be u8
, not i8
.
Note: I'm using unsigned for all hex values to not make it more confusing, even though technically 0xfe_i8
is an error and would be written as -0x02_i8
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, using unsigned for the lower half would make it possible to write something like real_result = (i16::from(high) << 8) | i16::from(low)
, or +
instead of |
, both work. With signed for the lower half, i16::from(low)
would be sign extended and would actually need to be masked with something like (i16::from(low) & i16::from(u8::MAX))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should add, these kinds of suggestions are exactly the kind that I would love to see discussed, and I would definitely suggest bringing it up in the original RFC for widening_mul
. Right now I'm kind of inclined to keep this particular method as-is until it's merged and this discussion can happen before stabilisation, since the main goal is to get something out there for people to use at the moment.
I am fine with making this change before stabilisation though if anyone else would like to +1 the idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also FWIW, my justification for using the same type is because any big-integer implementation that uses these would be adding one half or the other to some part of the bigint anyway, and using a separate unsigned type for that would just require unnecessary casting. I could easily be swayed, though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have looked into the internal representation of two bigint implementations: GMP and num-bigint. Both store the digits as unsigned and use an extra flag to indicate whether the big number is positive or negative, so I don't know how an actual bignum implementation that stores digits as signed would use these, if there is such a bignum implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I did notice that most bigint implementations do use a sign flag instead of two's complement. However, in my case, I was working on an implementation of a variable-length integer encoding where integers would still be stored in two's complement as it would make sign extension to get a larger, single unit (e.g. u64
or u128
) easier. It probably does make sense for most bigint implementations to use a sign flag and unsigned int units, but that doesn't mean that two's complement bigints are completely out of the question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the ordering of the return tuple: I guess it makes sense to be consistent with other methods who's carry is the last element of the return tuple, and as such have the most significant word on the right. Personally, I prefer the most significant word on the left; then one can write nice idiomatic code e.g. to compare x * y
with a * b
with x.widening_mul(y).cmp(&a.widening_mul(b))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely think that add_with_carry
matching the order of overflowing_add
is the most important part.
But interestingly, the least-significant word on the left has a nice interpretation on (currently more common) little-endian machines: the return is just the number for wide_mul
(assuming the obvious, though unspecified, tuple layout): https://rust.godbolt.org/z/sG9E4G8vY
(Though apparently we no longer encode (u32, u32)
as i64
in LLVM, which is why the Godbolt demo uses an old rustc version, so maybe that matters less these days.)
I think it would make sense to have methods for this stuff -- the correct incantations today requires knowing the patterns that LLVM accepts, which is knowledge that shouldn't be needed and cranelift might later need something different anyway. Specifically which signatures and names should be picked I'm unsure. I do think the versions with carries are important, though -- I have an example in rust-lang/rfcs#2417 (comment) of how the most obvious use of wide multiplication is awkward without it. (And it would probably be fine for that to be the only version -- optimizing away a constant-zero carry ought to be completely trivial for any backend. Basically it just means providing a full adder, rather than giving them only half adders and requiring that they figure out how to combine them efficiently.) |
In hindsight I should have separated these out anyway, so I put them in a separate PR that can be merged before this one. We can move the discussion into #85096.
A lot, actually. While the carrying add/sub implementations do make it relatively trivial to implement addition, subtraction, and negation for big integers, multiplication requires either a naïve O(N²) algorithm or other algorithms that do the multiplication properly. These are very critical building-blocks, but alone, they aren't a bigint implementation, only the biggest piece that depends heavily on how it's compiled for performance.
I definitely agree that modular-arithmetic methods would be useful, but I find that kind of out-of-scope for this particular change, and I'm also not sure how much the assembly is needed for performance.
100% agree with you there; I'm mostly adding the |
Do you have an alternative term you think might be more accurate? Two's complement is still relevant for big integers even if it's not always used, and I think these methods would still be useful in that context. |
9981d28
to
66a6aa1
Compare
This comment has been minimized.
This comment has been minimized.
66a6aa1
to
d6b3560
Compare
This comment has been minimized.
This comment has been minimized.
d6b3560
to
64f7db2
Compare
This comment has been minimized.
This comment has been minimized.
☔ The latest upstream changes (presumably #87540) made this pull request unmergeable. Please resolve the merge conflicts. |
Ping from triage: |
Will try to remember to do this this evening. |
5bfd4ae
to
adbc328
Compare
@rustbot label -S-waiting-on-author +S-waiting-on-review Rebased, tests pass locally so it should be good on CI too. |
adbc328
to
cc15047
Compare
ping @m-ou-se, this PR has been hanging for a while and I'm not sure if it slid out of your radar, or if it might be better for someone else to review it. :) |
Sorry! This looks good now. Let's try them out and discuss on the tracking issue to see if we can resolve the questions there. :) @bors r+ |
📌 Commit cc15047 has been approved by |
Rollup of 10 pull requests Successful merges: - rust-lang#85017 (Add carrying_add, borrowing_sub, widening_mul, carrying_mul methods to integers) - rust-lang#86362 (Avoid cloning LocalDecls) - rust-lang#88391 (Fix json tuple struct enum variant ) - rust-lang#88399 (Disallow the aapcs CC on Aarch64) - rust-lang#88418 (Allow `~const` bounds on trait assoc functions) - rust-lang#88445 (Clean up the lowering of AST items) - rust-lang#88495 (Add `TcpStream::set_linger` and `TcpStream::linger`) - rust-lang#88501 (Use right span in prelude collision suggestions with macros. ) - rust-lang#88504 (Keep turbofish in prelude collision lint.) - rust-lang#88524 (Remove unnecessary `mut` from udp doctests) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
@@ -1307,6 +1307,33 @@ macro_rules! int_impl { | |||
(a as Self, b) | |||
} | |||
|
|||
/// Calculates `self + rhs + carry` without the ability to overflow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"ability to overflow" sounds like overflow is desirable and the lack of overflow is a limitation of this method. Is that what you meant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say
Extended summation of `self + rhs + carry`. The booleans are interpreted as a single bit
integer of value 0 or 1. If unsigned overflow occurs, then the boolean in the tuple returns 1. This
output carry can be chained into the input carry of another carrying add, which allows
for arbitrarily large additions to be calculated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: Since this PR is merged, this discussion should go in the tracking issue (#85532).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait pardon me this has already been merged. I will take discussion elsewhere or open my own PR.
#[must_use = "this returns the result of the operation, \ | ||
without modifying the original"] | ||
#[inline] | ||
pub const fn borrowing_sub(self, rhs: Self, borrow: bool) -> (Self, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To a Rustacean, "borrowing" sounds like it has to do with references. So I am very confused by the name of this function. Maybe the docs should explain it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word borrow here comes from the terminology for an full subtractor. I am thinking that maybe the borrowing_sub
function should be removed altogether. The same effect that borrowing_sub
has can be obtained from carrying_add
by making the first carrying_add
in the chain have a set carry
bit, and then bitnot every rhs
. This fact could be put in the documentation of carrying_add
.
Hi, I'm still not sure I understand this fully, but I have this test failing with the current code: assert_eq!(i8::MAX.carrying_add(i8::MIN, true), (0, false)); Playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=a161cf42872e087303d1435648e6eceb Is the test correct ? A carry did happen in an intermediate calculation, but should the overall calculation report a carry ? |
So, the carrying behaviour for signed integers is a bit disputed; this should definitely go in the tracking issue. |
@anisse A closed PR isn't a good place to have a conversation about stuff -- most people won't look at them -- I would suggest you open a new issue as a place to have discussion about that particular question (ccing the tracking issue #85532 from it). |
@scottmcm thanks, I'll open a new issue and cc the tracking issue. |
(Tracking issue: #85532)
This comes in part from my own attempts to make (crude) big integer implementations, and also due to the stalled discussion in RFC 2417. My understanding is that changes like these are best offered directly as code and then an RFC can be opened if there needs to be more discussion before stabilisation. Since all of these methods are unstable from the start, I figured I might as well offer them now.
I tried looking into intrinsics, messed around with a few different implementations, and ultimately concluded that these are "good enough" implementations for now to at least put up some code and maybe start bikeshedding on a proper API for these.
For the
carrying_add
andborrowing_sub
, I tried looking into potential architecture-specific code and realised that even using the LLVM intrinsics foraddcarry
andsubborrow
on x86 specifically, I was getting exactly the same assembly as the naive implementation usingoverflowing_add
andoverflowing_sub
, although the LLVM IR did differ because of the architecture-specific code. Longer-term I think that they would be best suited to specific intrinsics as that would make optimisations easier (instructions like add-carry tend to use implicit flags, and thus can only be optimised if they're done one-after-another, and thus it would make the most sense to have compact intrinsics that can be merged together easily).For
widening_mul
andcarrying_mul
, for now at least, I simply cast to the larger type and perform arithmetic that way, since we currently have no intrinsic that would work better for 128-bit integers. In the future, I also think that some form of intrinsic would work best to cover that case, but for now at least, I think that they're "good enough" for now.The main reasoning for offering these directly to the standard library even though they're relatively niche optimisations is to help ensure that the code generated for them is optimal. Plus, these operations alone aren't enough to create big integer implementations, although they could help simplify the code required to do so and make it a bit more accessible for the average implementor.
That said, I 100% understand if any or all of these methods are not desired simply because of how niche they are. Up to you. 🤷🏻