Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use less divisions in display u128/i128 #76017

Merged
merged 1 commit into from
Oct 4, 2020
Merged

Conversation

JulianKnodt
Copy link
Contributor

@JulianKnodt JulianKnodt commented Aug 28, 2020

This PR is an absolute mess, and I need to test if it improves the speed of fmt::Display for u128/i128, but I think it's correct.
It hopefully is more efficient by cutting u128 into at most 2 u64s, and also chunks by 1e16 instead of just 1e4.

Also I specialized the implementations for uints to always be non-false because it bothered me that it was checked at all

Do not merge until I benchmark it and also clean up the god awful mess of spaghetti.
Based on prior work in #44583

cc: @Dylan-DPC

Due to work on itoa and suggestion in original issue:
r? @dtolnay

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 28, 2020
}

fn udiv_1e9(n: u128) -> (u128, u64) {
const DIV: u64 = 1e19 as u64;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove this hard "as" cast from your code like this (needs const_int_pow feature):

const DIV: u64 = 10_u64.pow(19);

Copy link
Contributor Author

@JulianKnodt JulianKnodt Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean this specific one or all the ascasts? In the other cases, I inlined the others, and while technically it should be const-propagated, the cast to me should always occur at compile time, more so than doing .pow(..).

This is also a convention as both should be equivalent.

Copy link

@leonardo-m leonardo-m Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant this specific cast.

(I'd like to see Rustc devs and stdlib devs start getting into the habit of minimizing the number of "as" casts in the rustc codebase, because such casts are sharp blades that sometimes cut you.)

Regarding the performance, using a const like I've suggested is zero-cost at run-time (I think).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case can these casts cause unexpected behaviour? In this case the constant is definitely in the u64 range.

If it's defined as const VAR: u64 = { expr }, then there is definitely no runtime cost. Inlining it elsewhere might incur some runtime cost.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case can these casts cause unexpected behaviour? In this case the constant is definitely in the u64 range.<

(That specific "as" doesn't cause unexpected behaviour. But being willing to use "as" everywhere is like being against the presence of unsafe{} statement in Rust language on the base of the code inside a specific usage of unsafe{} in a program being evidently safe. Even if a specific usage of "as" is safe, it's a sharp tool, and it may lead to problems in other cases, so better to minimize its usage in a codebase. I can call this language blindness: what a language puts in front is visible and taken care of, what a language doesn't care about the programmer too doesn't care regardless the troubles it could cause. But eventually the unsafety of using "as" will come out in Rust culture. The recently introduced Rustc suggestions to use try_into are a step forward).

@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 28, 2020
@Dylan-DPC-zz
Copy link

marked this as waiting on author, when you are are done and ready for review, let me know :)

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to unsubscribe until this is ready to look at. Please re-request review when needed.

@dtolnay dtolnay removed their assignment Aug 28, 2020
@Dylan-DPC-zz
Copy link

r? @Dylan-DPC

(i'll keep myself as a placeholder till this is ready )

@JulianKnodt
Copy link
Contributor Author

Seems that this increases the speed of fmt noticeably, will clean up the code and prep for review.

@JulianKnodt JulianKnodt force-pushed the fmt_fast branch 4 times, most recently from d4632f8 to ed49201 Compare August 29, 2020 06:45
@JulianKnodt JulianKnodt marked this pull request as ready for review August 29, 2020 06:49
@leonardo-m
Copy link

leonardo-m commented Aug 29, 2020

Related: #39078

}

/// Partition of `n` into n > 1e19 and rem <= 1e19
fn udiv_1e9(n: u128) -> (u128, u64) {
Copy link
Member

@nagisa nagisa Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the name says 1e9, the implementation does 1e19.

I find it surprising that LLVM does not strength-reduce the division by constants for 128-bit integers into shr (mul $CONST1) $CONST2, like it does for 64-bit ones. I wonder if its actually something infeasible, or if its just them having a wrong conditional in a wrong place.

(If you wanted to look into this http://gmplib.org/~tege/divcnst-pldi94.pdf describes the mechanism to strength-reduce)

Copy link
Member

@nagisa nagisa Aug 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think the reason this happens is because this algorithm requires the upper half of the multiplication result, which for division by 128 bits means the upper 128-bits of the result when multiplying two 128-bit integers.

Since we don’t have a 256-bit multiplication algorithm, this is not something that's super feasible to implement here.

(It would still be faster than the iterative algorithm presented here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By this algorithm, you mean the division into u128 and u64?
I've taken that component from itoa, so I'm afraid my understanding of it it lacking.
After doing some googling to figure out what strength reduction, I'm not sure if this is the place to do such, as it would seem pretty one-off as you say, but I do wonder if it could be added somewhere upstream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the place to do such, as it would seem pretty one-off

My problem with this is that once this algorithm lands, it is exceedingly easy to forget to remove this function once something faster, better is implemented upstream. It would be less of an issue if this was as fast as theoretically possible, as even if the upstream (LLVM) is fixed, this code won’t pessimize "just" because it implements the algorithm manually.

Copy link
Contributor Author

@JulianKnodt JulianKnodt Aug 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be alright to update this, not resolve the issues, and leave a comment/marker on the corresponding issues to update in the future?

@Dylan-DPC-zz
Copy link

this is ready for review now

r? @dtolnay

@crlf0710 crlf0710 added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 18, 2020
@dtolnay
Copy link
Member

dtolnay commented Sep 19, 2020

I just had a lot of PRs stack up, so I need to reassign this in order to focus on ones that contain public API change.

r? @nagisa since you have already been looking into u128 division

@rust-highfive rust-highfive assigned nagisa and unassigned dtolnay Sep 19, 2020
@JulianKnodt
Copy link
Contributor Author

No worries, this is not important at all, just a minor optimization, so if there are more pressing things those should take priority.

@nagisa
Copy link
Member

nagisa commented Sep 26, 2020

Still somewhat uncomfortable with us encoding a division algorithm manually, but I guess we can remove it later once LLVM does the right thing.

@bors r+

@bors
Copy link
Contributor

bors commented Sep 26, 2020

📌 Commit 1087590 has been approved by nagisa

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 26, 2020
@jonas-schievink
Copy link
Contributor

@bors r- failed in #77240 (comment)

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Sep 26, 2020
@bors
Copy link
Contributor

bors commented Sep 28, 2020

☔ The latest upstream changes (presumably #77302) made this pull request unmergeable. Please resolve the merge conflicts.

Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:

@rustbot modify labels: +S-waiting-on-review -S-waiting-on-author

Add zero padding

Add benchmarks for fmt u128

This tests both when there is the max amount of work(all characters used)
And least amount of work(1 character used)
@nagisa
Copy link
Member

nagisa commented Oct 3, 2020

@bors r+ rollup=iffy

@bors
Copy link
Contributor

bors commented Oct 3, 2020

📌 Commit 3f1d2aa has been approved by nagisa

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 3, 2020
@bors
Copy link
Contributor

bors commented Oct 4, 2020

⌛ Testing commit 3f1d2aa with merge 4cf3dc1...

@bors
Copy link
Contributor

bors commented Oct 4, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: nagisa
Pushing 4cf3dc1 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 4, 2020
@bors bors merged commit 4cf3dc1 into rust-lang:master Oct 4, 2020
@rustbot rustbot added this to the 1.49.0 milestone Oct 4, 2020
}
doit! { i8 i16 i32 i64 i128 isize u8 u16 u32 u64 u128 usize }
macro_rules! impl_uint {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why two macros when they're doing the same thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, here I was thinking it made sense to explicitly separate the two cases, as I was explicitly changing to recognize ints separately from uints, but I think I removed that change before the final revision.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All's good. Just cause a minor confusion to guess the intention to duplicate here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.