Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent Vec::drain_filter from double dropping on panic #61224

Merged
merged 7 commits into from
Jul 9, 2019

Conversation

aloucks
Copy link
Contributor

@aloucks aloucks commented May 27, 2019

Fixes: #60977

The changes in this PR prevent leaking and double-panicking in addition to double-drop.

Tracking issue: #43244

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cramertj (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 27, 2019
@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:00cb62a0:start=1558926460815491735,finish=1558926461599598202,duration=784106467
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
$ export AWS_ACCESS_KEY_ID=AKIA46X5W6CZEJZ6XT55
---
[00:05:00] tidy error: /checkout/src/liballoc/vec.rs:2810: trailing whitespace
[00:05:05] some tidy checks failed
[00:05:05] 
[00:05:05] 
[00:05:05] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor" "--quiet"
[00:05:05] 
[00:05:05] 
[00:05:05] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:05:05] Build completed unsuccessfully in 0:01:17
[00:05:05] Build completed unsuccessfully in 0:01:17
[00:05:05] Makefile:67: recipe for target 'tidy' failed
[00:05:05] make: *** [tidy] Error 1
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:015996d7
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Mon May 27 03:12:58 UTC 2019
---
travis_time:end:21f6eb60:start=1558926779115216981,finish=1558926779121200896,duration=5983915
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:003e6c0a
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:007a9ce8
travis_time:start:007a9ce8
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:3497a48c
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Centril
Copy link
Contributor

Centril commented May 27, 2019

r? @RalfJung

@rust-highfive rust-highfive assigned RalfJung and unassigned cramertj May 27, 2019
src/liballoc/vec.rs Outdated Show resolved Hide resolved
@RalfJung
Copy link
Member

Cc @rust-lang/libs would appreciate if one of you could take over review, this is a bit outside my comfort zone.

let del = self.del;
let src: *const T = &v[i];
let dst: *mut T = &mut v[i - del];
// This is safe because self.vec has length 0
// thus its elements will not have Drop::drop
// called on them in the event of a panic.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment no longer true? Why is it safe instead then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the intent was to leak instead of double drop, but it didn't quite work. I've done some minor refactoring and added additional comments.

It's safe now because there are additional checks in DrainFilter::drop that perform the cleanup in the event of a panic in the filter predicate.

@jonas-schievink jonas-schievink added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jun 6, 2019
@ebkalderon
Copy link
Contributor

Is this PR still being worked on? I'd love to see this unsound behavior disappear.

@RalfJung
Copy link
Member

From what I can see, this is waiting for a reviewer. As mentioned above, I don't feel comfortable reviewing this. Cc @rust-lang/libs

@Centril
Copy link
Contributor

Centril commented Jun 25, 2019

r? @gankro

@rust-highfive rust-highfive assigned Gankra and unassigned RalfJung Jun 25, 2019
@Gankra
Copy link
Contributor

Gankra commented Jun 25, 2019

sigh fine
Screen Shot 2019-06-25 at 4 35 31 PM

@aloucks
Copy link
Contributor Author

aloucks commented Jul 1, 2019

@gankro Any feedback?

@Gankra
Copy link
Contributor

Gankra commented Jul 2, 2019

yeah sorry, i plan to review this this morning, just caught me coming off a very draining work trip and leading into a long weekend.

Copy link
Contributor

@Gankra Gankra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks good, assuming that panicking is guaranteed to behave as assumed here. (compiler devs have always been hand-wavy to me about this topic).

Mostly some cosmetic concerns, as "the right way to write unsafe code" has changed a fair bit since this code landed, and we should take the opportunity to bring this code up to snuff. In particular, creating a slice for the whole buffer when we're de-initializing parts of it is very dubious.

// but was not due to a panic in the filter predicate. This is
// implemented via drop so that it's guaranteed to run even in
// the event of a panic while consuming the remainder of the
// DrainFilter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(currently checking with the lang team that we guarantee that this works)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @rust-lang/lang @matthewjasper @arielb1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specifically: if we panic in a destructor and we weren't yet panicking:

  • we unwind the drop impl, dropping its locals
  • then drop_in_place self's fields as normal
  • then proceed into a normal panic, unwinding the stack

(the middle point I'm not sure is necessary here, but is just my understanding of what is supposed to happen.)

Copy link
Member

@RalfJung RalfJung Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* then drop_in_place `self`'s fields as normal

This is indeed what we're currently doing:

// START rustc.ptr-real_drop_in_place.std__vec__Vec_i32_.AddMovesForPackedDrops.before.mir
// bb0: {
// goto -> bb7;
// }
// bb1: {
// return;
// }
// bb2 (cleanup): {
// resume;
// }
// bb3: {
// goto -> bb1;
// }
// bb4 (cleanup): {
// goto -> bb2;
// }
// bb5 (cleanup): {
// drop(((*_1).0: alloc::raw_vec::RawVec<i32>)) -> bb4;
// }
// bb6: {
// drop(((*_1).0: alloc::raw_vec::RawVec<i32>)) -> [return: bb3, unwind: bb4];
// }
// bb7: {
// _2 = &mut (*_1);
// _3 = const <std::vec::Vec<i32> as std::ops::Drop>::drop(move _2) -> [return: bb6, unwind: bb5];
// }
// END rustc.ptr-real_drop_in_place.std__vec__Vec_i32_.AddMovesForPackedDrops.before.mir

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@gnzlbg gnzlbg Jul 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gankro your understanding is correct. If drop panics:

  • we unwind the drop impl, dropping its locals,
  • then drop_in_place self's fields as normal (where normal means in field declaration order - first field is dropped before second field is dropped),
  • then we continue unwinding Drop::drop's caller stack

If dropping a value in Drop::drop stack while its unwinding panics, or if dropping one of the type's fields panics, then we have a double-drop, and the process is guarantee to abort for all double drops.

Note also that the panic doesn't need to originate in the Drop::impl, this one can succeed, but then the fields are dropped in declaration order, and one of them can panic. When this happens, the remaining fields are still dropped in place, so if a second one of them panics, then you have a double drop. If the Drop::impl does not panic, but dropping one of the fields panics, and you don't get a double drop, you are guaranteed that the field that panicked is the only one that wasn't properly dropped.

EDIT: for all practical purposes, if dropping a value panics, you should treat that value as dropped, since "most" of the value will have been dropped in the process (at most one field wasn't dropped).

let dst: *mut T = &mut v[i - del];
ptr::copy_nonoverlapping(src, dst, 1);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this while loop exists. Surely this should just be:

if self.drain.idx < self.drain.old_len {
  // It looks like `pred` panicked, so we didn't process all the elements.
  // This is a pretty messed up state, and there isn't really an obviously right
  // thing to do (and we don't want to keep trying to execute `pred`). So we
  // just backshift all the unprocessed elements and tell the vec that they still
  // exist, hoping that doesn't mess up anyone further along in the panic.
  let idx = self.drain.idx;
  let num_deleted = self.drain.del;
  let tail_len = self.drain.old_len - idx;
  let ptr = self.drain.vec.as_mut_ptr();
  if num_deleted > 0 {
    ptr.add(idx).copy_to(ptr.add(idx - num_deleted), tail_len);
  }
}

self.drain.vec.set_len(self.drain.old_len - self.drain.del);

(Here I modernized the code a bit to use the newer raw pointer APIs, and some clearer names. It would be a nice cleanup to this code if you also did that to the Iterator's fields and its next impl as well, although not a blocker.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I incorporated this in the latest commit. I also consolidated the the self.drain.del check into the parent if and made the src and dst pointers explicit while still using the more modern pointer APIs.

@@ -2751,6 +2752,7 @@ pub struct DrainFilter<'a, T, F>
del: usize,
old_len: usize,
pred: F,
panic_flag: bool,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has grown enough fields that they should really be documented. Either individually here, or with a high-level description of what we're trying to do in the implementation of next.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added docs to each field in the latest commit.

if (self.pred)(&mut v[i]) {
self.panic_flag = true;
let drained = (self.pred)(&mut v[i]);
self.panic_flag = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a terrible compulsion to try to encode this state in some magic combination of old_len/idx/del, but this is probably clearest, and easiest for llvm to evaporate when it sees pred can't unwind.

Copy link
Contributor Author

@aloucks aloucks Jul 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that route initially but came to the conclusion that even if it's possible, the simple flag would be much easier to understand and maintain.

@Gankra

This comment has been minimized.

@Centril Centril closed this Jul 8, 2019
@Centril Centril reopened this Jul 8, 2019
@Centril
Copy link
Contributor

Centril commented Jul 8, 2019

@bors delegate=Gankro

@bors
Copy link
Contributor

bors commented Jul 8, 2019

✌️ @gankro can now approve this pull request

@Gankra
Copy link
Contributor

Gankra commented Jul 8, 2019

@bors r+

@bors
Copy link
Contributor

bors commented Jul 8, 2019

📌 Commit df5b32e has been approved by Gankro

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 8, 2019
@bors
Copy link
Contributor

bors commented Jul 8, 2019

⌛ Testing commit df5b32e with merge 53cb0008f8a448df6461773fd42a8c81a77637dd...

@bors
Copy link
Contributor

bors commented Jul 8, 2019

💔 Test failed - checks-azure

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 8, 2019
@aloucks
Copy link
Contributor Author

aloucks commented Jul 8, 2019

@gankro @Centril Retry?

macOS dist-x86_64-apple
Started: 7/8/2019, 11:40:37 AM
Agent: Hosted Agent
3h 22m 8s

Job issues
·
1 error
The agent: Hosted Agent lost communication with the server. Verify the machine is running and has a healthy network connection. For more information, see: https://go.microsoft.com/fwlink/?linkid=846610

@Centril
Copy link
Contributor

Centril commented Jul 8, 2019

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 8, 2019
@bors
Copy link
Contributor

bors commented Jul 8, 2019

⌛ Testing commit df5b32e with merge 09ab31b...

bors added a commit that referenced this pull request Jul 8, 2019
Prevent Vec::drain_filter from double dropping on panic

Fixes: #60977

The changes in this PR prevent leaking and double-panicking in addition to double-drop.

Tracking issue: #43244
@bors
Copy link
Contributor

bors commented Jul 9, 2019

☀️ Test successful - checks-azure, checks-travis, status-appveyor
Approved by: Gankro
Pushing 09ab31b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 9, 2019
@bors bors merged commit df5b32e into rust-lang:master Jul 9, 2019
Centril pushed a commit to Centril/rust that referenced this pull request Dec 15, 2019
This PR reverts rust-lang#48065, which aimed to optimize `Vec::retain` by
making use of `Vec::drain_filter`. Unfortunately at that time,
`drain_filter` was unsound.

The soundness hole in `Vec::drain_filter` was fixed in rust-lang#61224 by
guaranteeing that cleanup logic runs via a nested `Drop`, even in
the event of a panic. Implementing this nested drop affects codegen
(apparently?) and results in slower code.

Fixes rust-lang#65970
Centril added a commit to Centril/rust that referenced this pull request Dec 15, 2019
Restore original implementation of Vec::retain

This PR reverts rust-lang#48065, which aimed to optimize `Vec::retain` by making use of `Vec::drain_filter`. Unfortunately at that time, `drain_filter` was unsound.

The soundness hole in `Vec::drain_filter` was fixed in rust-lang#61224 by guaranteeing that cleanup logic runs via a nested `Drop`, even in the event of a panic. Implementing this nested drop affects codegen (apparently?) and results in slower code.

Fixes rust-lang#65970
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Double drop in Vec::drain_filter