Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast path for vec.clear/truncate #64375

Merged
merged 1 commit into from
Sep 14, 2019
Merged

Fast path for vec.clear/truncate #64375

merged 1 commit into from
Sep 14, 2019

Conversation

kornelski
Copy link
Contributor

For trivial types like u8, vec.truncate()/vec.clear() relies on the optimizer to remove the loop. This means more work in debug builds, and more work for the optimizer.

Avoiding this busywork is exactly what mem::needs_drop::<T>() is for.

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 11, 2019
@Centril
Copy link
Contributor

Centril commented Sep 11, 2019

Guarantee vec.clear/truncate is O(1) for trivial types

The wording here suggests we will make it a commitment that this will stay O(1) forever. However, needs_drop may not be relied upon to return false as it may return true in all cases. Can you adjust the title to clarify this?


@bors try

@bors
Copy link
Contributor

bors commented Sep 11, 2019

⌛ Trying commit 051cf46 with merge 3979ca4...

bors added a commit that referenced this pull request Sep 11, 2019
Guarantee vec.clear/truncate is O(1) for trivial types

For trivial types like `u8`, `vec.truncate()`/`vec.clear()` relies on the optimizer to remove the loop. This means more work in debug builds, and more work for the optimizer.

Avoiding this busywork is exactly what `mem::needs_drop::<T>()` is for.
@kornelski kornelski changed the title Guarantee vec.clear/truncate is O(1) for trivial types Fast path for vec.clear/truncate Sep 11, 2019
@kornelski
Copy link
Contributor Author

kornelski commented Sep 11, 2019

This change came from an internal discussion, where a developer has avoided use of Vec<u8> exactly because Rust's implementation appears to be O(n) and only relies on the optimizer to remove the loop.

Why mem::needs_drop can't guarantee to return false for u8?

@Centril
Copy link
Contributor

Centril commented Sep 11, 2019

Why mem::needs_drop can't guarantee to return false for u8?

I refer you to the documentation of needs_drop:

This is purely an optimization hint, and may be implemented conservatively: it may return true for types that don't actually need to be dropped. As such always returning true would be a valid implementation of this function. However if this function actually returns false, then you can be certain dropping T has no side effect.

@kornelski
Copy link
Contributor Author

I know the documentation, but it seems overly conservative. I wouldn't be surprised if it couldn't guarantee precision for trait objects or even structs, but for basic built-in types?

@Centril
Copy link
Contributor

Centril commented Sep 11, 2019

I think it becomes a question of "what is the criteria?" then... Maybe we could guarantee it for Copy types for which the user has already made a no-drop-glue commitment, but I think that's a larger discussion that needs to involve the language team and is best done in a less ephemeral place than this PR, e.g. an RFC.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2019-09-11T10:54:10.4487453Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-09-11T10:54:10.4678173Z ##[command]git config gc.auto 0
2019-09-11T10:54:10.4758466Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-09-11T10:54:10.4828475Z ##[command]git config --get-all http.proxy
2019-09-11T10:54:10.4977819Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/64375/merge:refs/remotes/pull/64375/merge
---
2019-09-11T12:01:56.6532614Z .................................................................................................... 1500/9008
2019-09-11T12:02:03.1009139Z .................................................................................................... 1600/9008
2019-09-11T12:02:16.9992147Z .........................................................i...............i.......................... 1700/9008
2019-09-11T12:02:25.5574822Z .................................................................................................... 1800/9008
2019-09-11T12:02:42.0547200Z ................................................iiiii............................................... 1900/9008
2019-09-11T12:02:54.9412447Z .................................................................................................... 2100/9008
2019-09-11T12:02:57.9941199Z .................................................................................................... 2200/9008
2019-09-11T12:03:02.1263020Z .................................................................................................... 2300/9008
2019-09-11T12:03:10.9371167Z .................................................................................................... 2400/9008
---
2019-09-11T12:06:28.9122540Z ...................................i...............i................................................ 4700/9008
2019-09-11T12:06:41.4342330Z .................................................................................................... 4800/9008
2019-09-11T12:06:48.7810679Z .................................................................................................... 4900/9008
2019-09-11T12:07:00.5689256Z .................................................................................................... 5000/9008
2019-09-11T12:07:07.3301465Z ..................ii.ii............................................................................. 5100/9008
2019-09-11T12:07:18.8147054Z .................................................................................................... 5300/9008
2019-09-11T12:07:30.0542225Z .................................................................................i.................. 5400/9008
2019-09-11T12:07:38.8019200Z .................................................................................................... 5500/9008
2019-09-11T12:07:45.3295255Z .................................................................................................... 5600/9008
2019-09-11T12:07:45.3295255Z .................................................................................................... 5600/9008
2019-09-11T12:07:56.8834085Z ...........................................................................ii...i..ii...........i... 5700/9008
2019-09-11T12:08:25.1905937Z .................................................................................................... 5900/9008
2019-09-11T12:08:36.1852468Z .................................................................................................... 6000/9008
2019-09-11T12:08:36.1852468Z .................................................................................................... 6000/9008
2019-09-11T12:08:44.9799863Z .............................................................................i..ii.................. 6100/9008
2019-09-11T12:09:18.1655625Z .................................................................................................... 6300/9008
2019-09-11T12:09:20.7342979Z ....................................i............................................................... 6400/9008
2019-09-11T12:09:23.7606154Z .................................................................................................... 6500/9008
2019-09-11T12:09:26.6753413Z ........i........................................................................................... 6600/9008
---
2019-09-11T12:14:26.8220172Z  finished in 5.544
2019-09-11T12:14:26.8424764Z Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-09-11T12:14:27.0397613Z 
2019-09-11T12:14:27.0400138Z running 150 tests
2019-09-11T12:14:30.6878428Z i....iii......iii..iiii....i.............................i..i..................i....i.........ii.i.i 100/150
2019-09-11T12:14:32.9224646Z ..iiii..............i.........iii.i.......iiF.....thread 'main' panicked at 'Some tests failed', src/tools/compiletest/src/main.rs:536:22
2019-09-11T12:14:32.9225225Z 
2019-09-11T12:14:32.9226568Z failures:
2019-09-11T12:14:32.9227043Z 
2019-09-11T12:14:32.9228112Z ---- [codegen] codegen/vec-clear.rs stdout ----
2019-09-11T12:14:32.9228112Z ---- [codegen] codegen/vec-clear.rs stdout ----
2019-09-11T12:14:32.9228542Z 
2019-09-11T12:14:32.9229119Z error: verification with 'FileCheck' failed
2019-09-11T12:14:32.9229803Z status: exit code: 1
2019-09-11T12:14:32.9230302Z command: "/usr/lib/llvm-6.0/bin/FileCheck" "--input-file" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/vec-clear/vec-clear.ll" "/checkout/src/test/codegen/vec-clear.rs"
2019-09-11T12:14:32.9232032Z ------------------------------------------
2019-09-11T12:14:32.9232139Z 
2019-09-11T12:14:32.9232388Z ------------------------------------------
2019-09-11T12:14:32.9232439Z stderr:
2019-09-11T12:14:32.9232439Z stderr:
2019-09-11T12:14:32.9232675Z ------------------------------------------
2019-09-11T12:14:32.9233001Z /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/vec-clear/vec-clear.ll:14:21: error: CHECK-NOT: string occurred!
2019-09-11T12:14:32.9233288Z  %current_len.i.i = load i64, i64* %0, align 8
2019-09-11T12:14:32.9233360Z                     ^
2019-09-11T12:14:32.9233664Z /checkout/src/test/codegen/vec-clear.rs:9:16: note: CHECK-NOT: pattern specified here
2019-09-11T12:14:32.9233866Z  // CHECK-NOT: load
2019-09-11T12:14:32.9233965Z 
2019-09-11T12:14:32.9234181Z ------------------------------------------
2019-09-11T12:14:32.9234213Z 
2019-09-11T12:14:32.9234256Z 
---
2019-09-11T12:14:32.9234912Z test result: FAILED. 118 passed; 1 failed; 31 ignored; 0 measured; 0 filtered out
2019-09-11T12:14:32.9234950Z 
2019-09-11T12:14:32.9234976Z 
2019-09-11T12:14:32.9235001Z 
2019-09-11T12:14:32.9244127Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/codegen" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "codegen" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-6.0/bin/FileCheck" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "6.0.0\n" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--llvm-cxxflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"
2019-09-11T12:14:32.9244488Z 
2019-09-11T12:14:32.9244538Z 
2019-09-11T12:14:32.9249738Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
2019-09-11T12:14:32.9250030Z Build completed unsuccessfully in 1:13:10
2019-09-11T12:14:32.9250030Z Build completed unsuccessfully in 1:13:10
2019-09-11T12:14:32.9318349Z == clock drift check ==
2019-09-11T12:14:32.9344921Z   local time: Wed Sep 11 12:14:32 UTC 2019
2019-09-11T12:14:33.0853604Z   network time: Wed, 11 Sep 2019 12:14:33 GMT
2019-09-11T12:14:33.0853727Z == end clock drift check ==
2019-09-11T12:14:36.6377750Z ##[error]Bash exited with code '1'.
2019-09-11T12:14:36.6425957Z ##[section]Starting: Checkout
2019-09-11T12:14:36.6428434Z ==============================================================================
2019-09-11T12:14:36.6428502Z Task         : Get sources
2019-09-11T12:14:36.6428548Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@bors
Copy link
Contributor

bors commented Sep 11, 2019

☀️ Try build successful - checks-azure
Build commit: 3979ca4

@kornelski
Copy link
Contributor Author

It's funny that LLVM optimized the whole loop better than my if len < self.len. I've changed it to if len <= self.len which again makes it unconditional for clear.

It's great that there are tests for it.

@Centril
Copy link
Contributor

Centril commented Sep 11, 2019

Heh... guess we need another builder for it again then:

@bors try

@bors
Copy link
Contributor

bors commented Sep 11, 2019

⌛ Trying commit 223600a with merge 3a66acf73fc62246576fedf73a80249f523d0e3d...

@bors
Copy link
Contributor

bors commented Sep 12, 2019

☀️ Try build successful - checks-azure
Build commit: 3a66acf73fc62246576fedf73a80249f523d0e3d

@Centril
Copy link
Contributor

Centril commented Sep 12, 2019

@rust-timer build 3a66acf73fc62246576fedf73a80249f523d0e3d

@rust-timer
Copy link
Collaborator

Success: Queued 3a66acf73fc62246576fedf73a80249f523d0e3d with parent fe6d05a, comparison URL.

}
} else if len <= self.len {
self.len = len;
Copy link
Contributor

@gnzlbg gnzlbg Sep 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this function just implemented as:

if len >= self.len { return; }
let s = &mut self[len..] as *mut [_];
self.len = len;
ptr::drop_in_place(&mut *s);

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know, maybe it’s older than drop_in_place?

One difference however, as hinted by the code comment, is that the explicit loop + SetLenOnDrop doesn’t leak other items when T::drop panics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know, maybe it’s older than drop_in_place?

Might be.

One difference however, as hinted by the code comment, is that the explicit loop + SetLenOnDrop doesn’t leak other items when T::drop panics.

Yes, I think this leaks as little as it possibly can - only some fields of the T for which T::drop panics might be leaked with the current implementation.

I don't see this guaranteed or documented anywhere, but it is a reasonable thing to do. We don't do this for, e.g., Vec<T>::drop, but observing a partially dropped Vec is hard, while if truncate panics, then maybe observing a partially truncated Vec is easier ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then maybe observing a partially truncated Vec is easier ?

Well yes, it is really easy:

let mut vec: Vec<T>;
catch_unwind(AssertUnwindSafe(|| vec.truncate(N)));
// partially-truncated vec easily observable

Does ptr::drop_in_place([T]) stop dropping elements the first time a panic happens ? Or does it continue dropping elements and re-raises the panic at the end?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does ptr::drop_in_place([T]) stop dropping elements the first time a panic happens ?

No.

Or does it continue dropping elements and re-raises the panic at the end?

It continues dropping elements, and re-raises the panic at the end, just like for all other aggregates. If a second element panics, then we have a double panic. See here.

So the main difference between this implementation and one using ptr::drop_in_place, is that this one stops dropping elements the moment there is a panic, and with ptr::drop_in_place, all elements are always dropped, except at most one if it panics, and if more than one panic the program aborts.

Copy link
Contributor

@gnzlbg gnzlbg Sep 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bjorn3 so that example works with the current implementation, but it wouldn't work if we were to use ptr::drop_in_place (it would abort due to a double panic).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn’t realize drop_in_place did that. So maybe it’s preferable to use here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific implementation has more instructions for clear, because it changes unconditional store 0 into a check. Changing it to if len > self.len { return; } generates the same code for u8. Drop cases are different of course.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Yes, with if len > self.len { return; } the assembly for Vec::clear looks much nicer: https://rust.godbolt.org/z/6pSj7K

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 3a66acf73fc62246576fedf73a80249f523d0e3d, comparison URL.

@Centril
Copy link
Contributor

Centril commented Sep 12, 2019

Perf looks like mostly noise -- might be worth other forms of benchmarks to show this is useful...

r? @rkruppe

@bjorn3
Copy link
Member

bjorn3 commented Sep 12, 2019

Perf looks like mostly noise -- might be worth other forms of benchmarks to show this is useful...

Yeah, I believe this mostly improves debug builds. Because the rustc used during benching by @rust-timer is build in release mode, it won't have much effect. I personally use https://github.com/ebobby/simple-raytracer to bench perf of cg_clif. In debug mode it takes ~15s to run, so you can easily run it many times for less noise using for example hyperfine.

@rust-timer
Copy link
Collaborator

Insufficient permissions to issue commands to rust-timer.

@hanna-kruppe
Copy link
Contributor

I wouldn't expect this to help at all when code is optimized, and perf runs a rustc compiled with optimization. I also don't need any benchmarks to be rather sure that this helps some programs compiled without optimizations. The only question is whether that is worth doing at all, which IIRC was a bit contentious in the past 🤷‍♀ Will wait a bit with r+ in case anyone objects on that basis (and to take some time to review the code properly) but let's hope not.

@kornelski
Copy link
Contributor Author

kornelski commented Sep 12, 2019

To me the main motivation is not O0 performance, but a guarantee that clear and truncate are O(1) for built-in integer types like u8. As Centril pointed out, needs_drop technically doesn't guarantee that, which is a shame.

Could this be solved by documenting that libstd commits to having clear and truncate O(1) for types without Drop? (or at least built-in integer types). I've noticed that there's a unit test verifying that, but that's technically "just" an implementation detail.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 12, 2019

@alexcrichton There is a different between "runtime improvements" and "algorithmic complexity improvements". Leaving "algorithmic complexity improvements" to LLVM feels risky, and having them depend on the optimization level makes the algorithms unusable for some.

In this particular case, there is a proposal that's simpler, shorter, more maintainable, and has better semantics and algorithmic complexity than the current code.

@alexcrichton
Copy link
Member

I won't pretend to speak for the libs team, but I would personally be ok with documenting that drop/clear/truncate/etc all are O(1) for u8 and other primitives like that. We rely on LLVM for that undocumented guarantee today, but if that were to ever regress we would indeed pull out all the stops to fix it.

@Centril
Copy link
Contributor

Centril commented Sep 13, 2019

Could this be solved by documenting that libstd commits to having clear and truncate O(1) for types without Drop? (or at least built-in integer types). I've noticed that there's a unit test verifying that, but that's technically "just" an implementation detail.

For types without a Drop impl it isn't correct as these types may have recursive drop glue due to fields having Drop impls (or at any level transitively). The Copy trait may be a reasonable set of types to guarantee this for however.

The standard library cannot commit to this without a corresponding language level guarantee. Or in other words, if the standard library does commit to it, you have also forced it on the language because it's not just the rustc implementation the guarantee is being made for, but for all implementations. If you want such a guarantee for needs_drop, then the trade-offs should be discussed in a T-Lang RFC. I would like any such guarantee to be phrased precisely and in a non-ad-hoc manner.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 13, 2019

@Centril IIUC the intent is to provide this optimizations for types that, when dropped, do not invoke any cleanup code. Providing this guarantee for Vec::truncate and how Vec::truncate is implemented are orthogonal issues. The standard library implementation can do things that Rust code cannot, so we don't necessarily need to modify the language to do so.

I do think that it is problematic that we don't even have a name for these types, much less facilities to detect them. I often see users confused about this - they think that's what Drop means - and we sometimes need to be very careful with the wording in the reference to account for these types. I think it is actually super rare to actually care whether a type implements Drop, as opposed to caring about whether any cleanup code will be invoked on drop.

So I would actually be very happy if the lang team could take a look at this issue. However, I would prefer to move the discussion about providing such guarantee somewhere else (maybe an issue the RFC repo?), and focus here on landing an incremental improvement instead.

@Centril
Copy link
Contributor

Centril commented Sep 13, 2019

The standard library implementation can do things that Rust code cannot, so we don't necessarily need to modify the language to do so.

If by "modify the language" you mean "change code in the compiler" then no, you don't necessarily have to modify anything. However, the guarantee the standard library forwards needs to be backed by something at the language level (in this case needs_drop) as opposed to e.g. "what LLVM does".

So I would actually be very happy if the lang team could take a look at this issue. However, I would prefer to move the discussion about providing such guarantee somewhere else (maybe an issue the RFC repo?), and focus here on landing an incremental improvement instead.

I agree (it's what I suggested in #64375 (comment)).

@hanna-kruppe
Copy link
Contributor

After considering all the points raised here yet again, I've decided I will go ahead and merge this as-is. There's been a bunch of different discussion streams and directions so here's my rationale for merging as-is with respect to each of them:

  • @alexcrichton is right that -O0/-O1 performance is across the board neglected and atrocious and trying to fix that would require a lot of work we're not ready to commit to. However, as with e.g. must_use-ification, I do not believe we need to block a couple of individually reasonable and well-motivated changes out of fear it'll turn into a flood. The specific change here is trivial, obviously correct, has neglegible code size and compile time overhead, and is motivated by a real situation rather than an abstract desire for good -O0 performance. The last bit is also roughly the criterium we've applied to must_use additions in the past, FWIW.
  • I see the appeal of giving nice bounds on algorithmic complexity, but as @Centril points out, wording that correctly is somewhat subtle and it conflicts with the current definition of needs_drop. I see no reason to block this PR on that whole discussion.
  • Drastically simplifying truncate by using drop_in_place::<[T]> (and getting the needs_drop check included for free) as brought up by @gnzlbg is interesting, but because it changes behavior w.r.t. Drop panics, I would prefer it be done in a separate PR (not reviewed by me tho, that is T-libs territory IMO). If that goes in, this PR is soon obsolete, but who cares, the code is written and reviewed, might as well merge it in case the drop_in_place change doesn't happen after all.

@bors r+

@bors
Copy link
Contributor

bors commented Sep 13, 2019

📌 Commit 223600a has been approved by rkruppe

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 13, 2019
Centril added a commit to Centril/rust that referenced this pull request Sep 13, 2019
Fast path for vec.clear/truncate

For trivial types like `u8`, `vec.truncate()`/`vec.clear()` relies on the optimizer to remove the loop. This means more work in debug builds, and more work for the optimizer.

Avoiding this busywork is exactly what `mem::needs_drop::<T>()` is for.
Centril added a commit to Centril/rust that referenced this pull request Sep 13, 2019
Fast path for vec.clear/truncate

For trivial types like `u8`, `vec.truncate()`/`vec.clear()` relies on the optimizer to remove the loop. This means more work in debug builds, and more work for the optimizer.

Avoiding this busywork is exactly what `mem::needs_drop::<T>()` is for.
Centril added a commit to Centril/rust that referenced this pull request Sep 14, 2019
Fast path for vec.clear/truncate

For trivial types like `u8`, `vec.truncate()`/`vec.clear()` relies on the optimizer to remove the loop. This means more work in debug builds, and more work for the optimizer.

Avoiding this busywork is exactly what `mem::needs_drop::<T>()` is for.
Centril added a commit to Centril/rust that referenced this pull request Sep 14, 2019
Fast path for vec.clear/truncate

For trivial types like `u8`, `vec.truncate()`/`vec.clear()` relies on the optimizer to remove the loop. This means more work in debug builds, and more work for the optimizer.

Avoiding this busywork is exactly what `mem::needs_drop::<T>()` is for.
Centril added a commit to Centril/rust that referenced this pull request Sep 14, 2019
Fast path for vec.clear/truncate

For trivial types like `u8`, `vec.truncate()`/`vec.clear()` relies on the optimizer to remove the loop. This means more work in debug builds, and more work for the optimizer.

Avoiding this busywork is exactly what `mem::needs_drop::<T>()` is for.
bors added a commit that referenced this pull request Sep 14, 2019
Rollup of 17 pull requests

Successful merges:

 - #63846 (Added table containing the system calls used by Instant and SystemTime.)
 - #64116 (Fix minor typo in docs.)
 - #64203 (A few cosmetic improvements to code & comments in liballoc and libcore)
 - #64302 (Shrink `ObligationCauseCode`)
 - #64372 (use randSecure and randABytes)
 - #64374 (Box `DiagnosticBuilder`.)
 - #64375 (Fast path for vec.clear/truncate )
 - #64378 (Fix inconsistent link formatting.)
 - #64384 (Trim rustc-workspace-hack)
 - #64393 ( declare EnvKey before use to fix build error)
 - #64420 (Inline `mark_neighbours_as_waiting_from`.)
 - #64422 (Remove raw string literal quotes from error index descriptions)
 - #64423 (Add self to .mailmap)
 - #64425 (typo fix)
 - #64431 (fn ptr is structural match)
 - #64435 (codegen: use "_N" (like for other locals) instead of "argN", for argument names.)
 - #64439 (fix #64430, confusing `owned_box` error message in no_std build)

Failed merges:

r? @ghost
@bors bors merged commit 223600a into rust-lang:master Sep 14, 2019
@therealprof
Copy link
Contributor

@alexcrichton

To me O0/O1 runtime improvements will be and endless game of whack-a-mole and I do not personally wish to participate nor do I want to see this game end up hurting the readability of the standard library. I am not worried about this PR alone but the cascade of PRs O0/O1 improvements might look like.

Halfway decent "optimisation" (rather a lack of pessimisation) at O0/O1 is very important for a quite a few cases with the most prominent (maybe) being embedded: If you want to debug on a microcontroller you're mostly limited to -O0 but the code generated by Rust is a true horror show; even the most simple programs (think blinky) can easily overflow the available amount of flash of simple MCUs. Also we're now regularly seeing problems that code paths are so pathologically slow that we're seeing hardware buffer overflows and thus dropped data with I/O, rendering applications completely unusable. It also is no fun stepping through hundreds of trivial libcore functions which tend to be put in the code path.

None of the Rust competitors do have this particular problem, so it's really a problem hurting the reputation of Rust as a viable systems programming language and thus should be addressed, even with some priority.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.