Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Branch on needs_drop in Vec::drop? #24280

Closed
SimonSapin opened this issue Apr 10, 2015 · 24 comments
Closed

Branch on needs_drop in Vec::drop? #24280

SimonSapin opened this issue Apr 10, 2015 · 24 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@SimonSapin
Copy link
Contributor

When a arena::TypedArena is dropped, it uses std::intrinsics::needs_drop to skip entirely the loop that reads items and runs their destructors.

Vec however does no such thing, and runs a similar loop unconditionally. Should it check needs_drop as well, or can we rely on the loop being optimized away entirely when it does nothing?

cc @pnkfelix, @gankro

@Gankra
Copy link
Contributor

Gankra commented Apr 10, 2015

First I've seen of needs_drop, but it seems to me that this is something best handled by general optimizations. especially for a case as simple as Vec::drop. Is it not being optimized away today?

@SimonSapin
Copy link
Contributor Author

Is it not being optimized away today?

I admit I haven’t bothered writing a benchmark to find out :]

@steveklabnik steveklabnik added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Apr 10, 2015
@SimonSapin
Copy link
Contributor Author

@steveklabnik, does "needstest" include comparative benchmarks as well?

@SimonSapin
Copy link
Contributor Author

Or is it possible to test directly the output of the optimizer to see that some bit of code that ends up doing nothing is actually optimized away?

@steveklabnik
Copy link
Member

I don't know off the top of my head, to be honest.

@pnkfelix pnkfelix added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Apr 13, 2015
@pnkfelix
Copy link
Member

(i don't think this qualifies as "needstest"; that is usually applied to issues, usually bugs, where the only remaining task post bug fix was to put in a regression test. The use of the word "test" in the title for this issue is referring to a conditional in the low-level code, not a regression test; not sure if that was part of confusion here or not. Anyway, removing the E-needstest label.)

@pnkfelix pnkfelix removed the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Apr 13, 2015
@pnkfelix
Copy link
Member

(i would not object, however, if we used a hypothetical B-datagathering label, as in "blocked on data gathering".)

@SimonSapin
Copy link
Contributor Author

Ah, yes, "test" in the title was meant as "use an if statement".

@steveklabnik
Copy link
Member

@SimonSapin ohhhhhhhhhhh yeah, I just had a parsing error 😉

@pnkfelix pnkfelix changed the title Test needs_drop in Vec::drop? Branch on needs_drop in Vec::drop? Apr 13, 2015
@abonander
Copy link
Contributor

Because it caught my attention, I spent most of the morning tinkering on this. I even built the exact revision of LLVM that Rust master is using so I could use the llvm-dis tool for the correct version of LLVM.

I believe I've come to the conclusion that the drop-loop is optimized out for Vec<T> where T: !Drop.

This is the code I used to test this:

fn main() {
    (0..256).collect::<Vec<_>>(); 
}

I then compiled it with rustc -O --emit=llvm-bc and disassembled the subsequent bitcode file with llvm-dis, which yielded this optimized IR.

Now, I can't really read LLVM IR, but I only see one loop-body section, and that looks to be part of the .collect() call. That's the only looping structure that I can see, scanning through this code. It actually looks like the Vec is unceremoniously deallocated immediately after the loop exits, at label then-block-1112.

But I could be wrong. This is half-gibberish to me.

@sfackler
Copy link
Member

Branching will make things easier in the unoptimized case for things like BufStream that have 2 64k Vec<u8>s.

@abonander
Copy link
Contributor

@sfackler I hadn't considered the unoptimized case. 👍 to branching from me.

@lilyball
Copy link
Contributor

@cybergeek94 Is llvm-dis actually doing anything useful there that you can't get with --emit=llvm-ir instead?

@abonander
Copy link
Contributor

@kballard Correct me if I'm wrong, but doesn't --emit=llvm-ir emit the IR before it's run through the optimizer? If not, I did a lot of work for nothing based on a faulty assumption. 😆

@lilyball
Copy link
Contributor

@cybergeek94 Ok: "you're wrong" 😉 If you pass the -O flag it emits optimized IR.

@abonander
Copy link
Contributor

Well, I learned how to build LLVM and work with IR, so maybe not a total waste.

@whitequark
Copy link
Member

I have some image processing code that has to display video. It meets timings with -O2, but that takes 20s to compile. With -O0, it takes 4s, which gives a more reasonable feedback loop, but then it calls drop_in_place for every u8 in a 12MB Vec.

I have a workaround, which is really terrible:

        unsafe {
            let mut data = <whatever we drop>;
            let raw_vec = alloc::raw_vec::RawVec::from_raw_parts(
                                data.as_mut_ptr(), data.capacity());
            mem::forget(data);
            mem::drop(raw_vec);
        }

@bluss
Copy link
Member

bluss commented Sep 20, 2015

By the way, vec.set_len(0) is a much better workaround if the elements don't need drop.

@whitequark
Copy link
Member

@bluss Thanks!

@SimonSapin
Copy link
Contributor Author

So even if a loop of no-op drops is optimized away, it sounds like skipping it entirely based on needs_drop would help non-optimized builds. It should be an easy patch.

@gankro does it sound reasonable to do across all collections?

@Gankra
Copy link
Contributor

Gankra commented Sep 21, 2015

@SimonSapin it's a bit of a bumber to have to do, but it seems reasonable, yeah.

bors added a commit that referenced this issue Sep 21, 2015
With -O2, LLVM's inliner can remove this code, but this does not happen
with -O1 and lower. As a result, dropping Vec<u8> was linear with length,
resulting in abysmal performance for large buffers.

See issue #24280.
@nagisa
Copy link
Member

nagisa commented Sep 28, 2015

Fixed by #28531?

There’s still no set_len(0) else branch, though. Not sure if it matters much though.

@SimonSapin
Copy link
Contributor Author

Yeah, I’d say this is fixed.

I don’t think set_len(0) would do anything here.

@bluss
Copy link
Member

bluss commented Sep 28, 2015

set_len was just a better alternative to @whitequark's workaround code (comment above)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one.
Projects
None yet
Development

No branches or pull requests

10 participants