Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unsound TrustedRandomAccess implementations #85874

Merged
merged 8 commits into from
Jul 29, 2021

Conversation

steffahn
Copy link
Member

@steffahn steffahn commented Jun 1, 2021

Removes the implementations that depend on the user-definable trait Copy.

Fixes #85873 in the most straightforward way.


Edit: This PR now contains additional trait infrastructure to avoid performance regressions around in-place collect, see the discussion in this thread starting from the codegen test failure at #85874 (comment).

With this PR, TrustedRandomAccess gains additional documentation that specifically allows for and specifies the safety conditions around subtype coercions – those coercions can happen in safe Rust code with the Zip API’s usage of TrustedRandomAccess. This PR introduces a new supertrait of TrustedRandomAccess(currently named TrustedRandomAccessNoCoerce) that doesn’t allow such coercions, which means it can be still be useful for optimizing cases such as in-place collect where no iterator is handed out to a user (who could do coercions) after a get_unchecked call; the benefit of the supertrait is that it doesn’t come with the additional safety conditions around supertraits either, so it can be implemented for more types than TrustedRandomAccess.

The TrustedRandomAccess implementations for vec::IntoIter, vec_deque::IntoIter, and array::IntoIter are removed as they don’t conform with the newly documented safety conditions, this way unsoundness is removed. But this PR in turn (re-)adds a TrustedRandomAccessNoCoerce implementation for vec::IntoIter to avoid performance regressions from stable in a case of in-place collecting of Vecs [the above-mentioned codegen test failure]. Re-introducing the (currently nightly+beta-only) impls for VecDeque’s and [T; N]’s iterators is technically possible, but goes beyond the scope of this PR (i.e. it can happen in a future PR).

@rust-highfive
Copy link
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 1, 2021
@steffahn
Copy link
Member Author

steffahn commented Jun 1, 2021

Maybe such a PR should also add some form of test that reproduces a setting similar to the unsoundness in the PR and asserts that things don’t behave incorrectly? If this seems appropriate / necessary, please say so.

This PR will most likely have negative performance impact since it removes certain optimizations. Thus it might be interesting to do a benchmark to get a better feeling of how much performance is lost. Since it’s fixing a soundness issue, losing some performance shouldn’t be too bad, especially considering that the relevant optimizations (involving TrustedRandomAccess for Zip) AFAIK aren’t around for too long yet anyways.

There is the future possibility to re-add part of the removed impls with a new bound that isn’t Copy. The Copy trait was a approximation for !DropGlue anyway, so maybe some unstable unsound trait SpecializableNoDropGlue that’s implemented by non-lifetime-dependent impls (i.e. impls that are soundly usable for specialization) for a lot of standard-library-types can help regaining the optimizations at least for common standard library types such as primitive types, etc… and even for more types that didn’t have the optimization before, like e.g. mutable references. Maybe the effort of building such a trait infrastructure isn’t worth the effort though, hence my mention of a benchmark being useful.

@steffahn
Copy link
Member Author

steffahn commented Jun 3, 2021

Added some update to the documentation of TrustedRandomAccess that gives new details regarding coercions to a subtype.
This includes conditions which clarify/imply that the impls for {array, vec, vec_deque}::IntoIter<T> this PR removes were unsound, to make sure that future TrustedRandomAccess don’t reintroduce the same problem.

@steffahn
Copy link
Member Author

steffahn commented Jun 3, 2021

Rendered (internal) documentation after changes from this commit.

Screenshot_20210603_183253

/// * `std::iter::Iterator::__iterator_get_unchecked()`
/// * `std::iter::TrustedRandomAccess::size()`
/// 3. After `self.__iterator_get_unchecked(idx)` has been called, then `self.next_back()` will
/// only be called at most `self.size() - idx - 1` times. If `Self: Clone` and `self` is cloned,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this assume that idx is incremented in forwards direction? I.e. next_back may not do the right thing if you're calling __iterator_get_unchecked(self.size() - 1)

Note that the trait is called TrusedRandomAccess, so theoretically some future new iterator method could take some shortcut by directly indexing into an iterator in a non-linear fashion.

Copy link
Member Author

@steffahn steffahn Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, first of all note that I didn’t write the original text, I just clarified it w.r.t. Clone, but second: if __iterator_get_unchecked(self.size() - 1), then the requirement will be that

self.next_back() will only be called at most self.size() - idx - 1 times

so…

idx == size-1

size - idx - 1 == size - (size - 1) - 1 == 0

There you go, now you’re officially no longer allowed to call next_back at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess for all of this to make just a little bit more sense, there should also be the requirement (currently undocumented) that calling next_back decreases the size of the iterator exactly by 1.

///
/// 1. `0 <= idx` and `idx < self.size()`.
/// 2. If `self: !Clone`, then `get_unchecked` is never called with the same
/// 2. If `Self: !Clone`, then `self.__iterator_get_unchecked(idx)` is never called with the same
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have never quite understood the way this is phrased. I mean of course an index will be accessed multiple times if an iterator gets cloned, but that's happens across the different clones.
What would be the reason of having an index accessed multiple times without actually cloning an iterator?

@steffahn
Copy link
Member Author

steffahn commented Jun 4, 2021

Expected merge conflict with my own PR #85888. I won't rebase for now as to not interfere with review, the rebase will be trivial though because the other PR was just a small fix for a typo on the documentation of TrustedRandomAccess, in fact the same typo is fixed in this PR as well.

@bors
Copy link
Contributor

bors commented Jun 4, 2021

☔ The latest upstream changes (presumably #85984) made this pull request unmergeable. Please resolve the merge conflicts.

@pnkfelix pnkfelix added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jun 10, 2021
@pnkfelix
Copy link
Member

nominating for (emergency) backport to 1.53-beta

@pnkfelix
Copy link
Member

pnkfelix commented Jun 11, 2021

Can (/should?) we beta-approve just the commit c5390e3adf2402368ddbd089983ef11397125627 that removes the unsound impls, without concerning ourselves with the documentation changes for the beta...?

@steffahn
Copy link
Member Author

@pnkfelix The documentation is completely hidden anyway, it's compiler internal / only visible of you manually run x.py doc with the right environment variable for documenting hidden items. So it doesn't really matter either way.

The main open question regarding this PR is whether the implementations that this PR removes are unsound or the usage of TrustedRandomAccess by Zip is unsound. This is a question of convention and of clarifying the safety conditions around the TrustedRandomAccess trait, not a question with a clear cut answer. Short term it's a question of whether we want to break the optimizations for owned array/vec/vecdeque iterators for all applications of TrustedRandomAccess (which is only Zip and one or two other use cases) or remove all TrustedRandomAccess related optimizations just for the Zip use case. And long-term its a question of how to restructure the TrustedRandomAccess trait (and perhaps additional traits) in order to maybe have all optimizations back except for just the owned array/etc iterators only in the Zip use case.

If a fix is wanted for 1.53, applying this PR is probably easier than any alternative because this PR already exists. Leaving out the documentation commit is probably easier because there's less potential for merge conflict and the internal documentation changes are irrelevant for a release version of the compiler.

The effects of this PR beyond fixing the unsound case are only affecting performance and any alternative approach that fixes the unsoundness, too, and reintroduces the removed impls in some way, can still be chosen by future PRs without any problems. So merging this PR into nightly doesn't come with any form of commitment / drawbacks long-term.

@Mark-Simulacrum Mark-Simulacrum removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jun 11, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 27, 2021
@JohnCSimon
Copy link
Member

ping from triage;
@steffahn - can you please address the merge conflicts?
Thank you.

@steffahn steffahn force-pushed the fix_unsound_zip_optimization branch from 328c35f to 99f95e5 Compare June 28, 2021 08:50
@steffahn
Copy link
Member Author

can you please address the merge conflicts?

done

@rustbot label S-waiting-on-review, -S-waiting-on-author

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 28, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Jun 30, 2021

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 30, 2021
@m-ou-se m-ou-se assigned m-ou-se and unassigned kennytm Jun 30, 2021
@bors
Copy link
Contributor

bors commented Jul 25, 2021

☔ The latest upstream changes (presumably #86595) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@yaahc yaahc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did my best to review the full backlog of comments in all the related issues and then review this PR and while I can't say I fully understand how these specializations work yet the explanations you gave do seem to make sense. Particularly the bit about adding the supertrait that only gets used in locations where we don't coerce to sub types which seems like a valid fix to the perf regression without breaking the Zip unsoundness fixes.

I'd love to sit down sometime with either @the8472 or @steffahn to more fully understand how this code all fits together but for now I'm satisfied with this fix. I have one question I'd like to answer before approving which hopefully will clarify much of the bits I don't currently understand then we should be able to get this merged.

{
// Safety: The TrustedRandomAccess contract requires that callers only pass an index
unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
// Safety: The TrustedRandomAccess contract requires that callers only pass an index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: I'm still unfamiliar with __iterator_get_unchecked and the related specializations so I'm going to be asking some probably silly questions:

Is this comment still relevant?

Also, why doesn't this need a where Self: TrustedRandomAccessNoCoerce bound?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment still relevant?

It's unsafe code so it should have a SAFETY comment. And it's still correct since this method should only be called by specializing impls relying on TrustedRandomAccess.

Also, why doesn't this need a where Self: TrustedRandomAccessNoCoerce bound?

TrustedRandomAccess is implemented unconditionally for this iterator so the bound would be always true.

But maybe it would still make sense for consistency or in case the trait impls are removed at some point. But I don't think it's needed for safety here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah, I see. I'm assuming that means that all the other __iterator_get_unchecked impls that leave of that bound also have it similarly implied? If that's the case I'd like to see those all have the bound added explicitly though that doesn't need to happen in this PR, but them being left off in some cases and not others seems somewhat confusing and could add to the maintenance burden.

Copy link
Member Author

@steffahn steffahn Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But maybe it would still make sense for consistency

Actually consistency is the reason I removed the bound. Initially I had left the Self: TrustedRandomAccess bounds in place and relied on type errors to point out all the places that needed changing (to Self: TrustedRandomAccessNoCoerce). Afterwards, grepping for and going through all the remaining Self: TrustedRandomAccess occurrences (just to be sure) I came acoss this case (and another one, too IIRC) that didn’t produce an error message: Surprised me as well, I eventually figured out that the bound was entirely useless anyways. Looking at other iterators in std, it seems to have be commonly done without any Self: TrustedRandomAccess bound before when those weren’t necessary. On all kinds of iterators, e.g. all the ones on slices (e.g. also the ones from .chunks(…) and similar). IIRC, with the changes of this PR all the superfluous bounds are consistently gone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case I'd like to see those all have the bound added explicitly though that doesn't need to happen in this PR, but them being left off in some cases and not others seems somewhat confusing and could add to the maintenance burden.

So as far as I know, after this PR only those implementations that need the bound will have it. Don’t quote me on that, I’d need to double check if this is indeed true. This very PR demonstrated that leaving the bounds off can actually help maintainance in some cases, like it helped me being able to use error messages to find all the places that needed to change, like I described above. The redundant Self: TrustedRandomAccess binding didn’t generate any error messages though, so their existence made the refactor a bit harder (since I needed to grep for things in order to fix those redundant bindings (for now, by removing them instead of changing them to Self: TrustedRandomAccesNoCoerce)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here’s a “grep” example. Looks like there are about 15 Iterators without the bound on the method (on the branch of this PR, so before this PR there were 13 Iterators without the bound)

~/forks/rust/library   fix_unsound_zip_optimization  rg 'fn __iterator_' -A3
core/src/iter/range.rs
677:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item
678-    where
679-        Self: TrustedRandomAccessNoCoerce,
680-    {

core/src/str/iter.rs
299:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> u8 {
300-        // SAFETY: the caller must uphold the safety contract
301-        // for `Iterator::__iterator_get_unchecked`.
302-        unsafe { self.0.__iterator_get_unchecked(idx) }

core/src/iter/adapters/fuse.rs
119:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item
120-    where
121-        Self: TrustedRandomAccessNoCoerce,
122-    {

core/src/iter/adapters/enumerate.rs
117:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> <Self as Iterator>::Item
118-    where
119-        Self: TrustedRandomAccessNoCoerce,
120-    {

core/src/iter/adapters/map.rs
128:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> B
129-    where
130-        Self: TrustedRandomAccessNoCoerce,
131-    {

core/src/iter/adapters/cloned.rs
64:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> T
65-    where
66-        Self: TrustedRandomAccessNoCoerce,
67-    {

core/src/slice/iter.rs
1268:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1269-        // SAFETY: since the caller guarantees that `i` is in bounds,
1270-        // which means that `i` cannot overflow an `isize`, and the
1271-        // slice created by `from_raw_parts` is a subslice of `self.v`
--
1423:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1424-        let start = idx * self.chunk_size;
1425-        let end = match start.checked_add(self.chunk_size) {
1426-            None => self.v.len(),
--
1588:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1589-        let start = idx * self.chunk_size;
1590-        let end = match start.checked_add(self.chunk_size) {
1591-            None => self.v.len(),
--
1759:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1760-        let start = idx * self.chunk_size;
1761-        // SAFETY: mostly identical to `Chunks::__iterator_get_unchecked`.
1762-        unsafe { from_raw_parts(self.v.as_ptr().add(start), self.chunk_size) }
--
1911:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
1912-        let start = idx * self.chunk_size;
1913-        // SAFETY: see comments for `ChunksMut::__iterator_get_unchecked`.
1914-        unsafe { from_raw_parts_mut(self.v.as_mut_ptr().add(start), self.chunk_size) }
--
2172:    unsafe fn __iterator_get_unchecked(&mut self, i: usize) -> &'a [T; N] {
2173-        // SAFETY: The safety guarantees of `__iterator_get_unchecked` are
2174-        // transferred to the caller.
2175-        unsafe { self.iter.__iterator_get_unchecked(i) }
--
2289:    unsafe fn __iterator_get_unchecked(&mut self, i: usize) -> &'a mut [T; N] {
2290-        // SAFETY: The safety guarantees of `__iterator_get_unchecked` are transferred to
2291-        // the caller.
2292-        unsafe { self.iter.__iterator_get_unchecked(i) }
--
2435:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2436-        let end = self.v.len() - idx * self.chunk_size;
2437-        let start = match end.checked_sub(self.chunk_size) {
2438-            None => 0,
--
2597:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2598-        let end = self.v.len() - idx * self.chunk_size;
2599-        let start = match end.checked_sub(self.chunk_size) {
2600-            None => 0,
--
2761:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2762-        let end = self.v.len() - idx * self.chunk_size;
2763-        let start = end - self.chunk_size;
2764-        // SAFETY:
--
2919:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
2920-        let end = self.v.len() - idx * self.chunk_size;
2921-        let start = end - self.chunk_size;
2922-        // SAFETY: see comments for `RChunksMut::__iterator_get_unchecked`.

core/src/iter/adapters/copied.rs
80:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> T
81-    where
82-        Self: TrustedRandomAccessNoCoerce,
83-    {

core/src/iter/adapters/zip.rs
92:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item
93-    where
94-        Self: TrustedRandomAccessNoCoerce,
95-    {

core/src/iter/traits/iterator.rs
3492:    unsafe fn __iterator_get_unchecked(&mut self, _idx: usize) -> Self::Item
3493-    where
3494-        Self: TrustedRandomAccessNoCoerce,
3495-    {

core/src/slice/iter/macros.rs
321:            unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
322-                // SAFETY: the caller must guarantee that `i` is in bounds of
323-                // the underlying slice, so `i` cannot overflow an `isize`, and
324-                // the returned references is guaranteed to refer to an element

alloc/src/collections/vec_deque/iter.rs
107:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
108-        // Safety: The TrustedRandomAccess contract requires that callers only pass an index
109-        // that is in bounds.
110-        unsafe {

alloc/src/collections/vec_deque/iter_mut.rs
93:    unsafe fn __iterator_get_unchecked(&mut self, idx: usize) -> Self::Item {
94-        // Safety: The TrustedRandomAccess contract requires that callers only pass an index
95-        // that is in bounds.
96-        unsafe {

alloc/src/vec/into_iter.rs
169:    unsafe fn __iterator_get_unchecked(&mut self, i: usize) -> Self::Item
170-    where
171-        Self: TrustedRandomAccessNoCoerce,
172-    {

@yaahc
Copy link
Member

yaahc commented Jul 27, 2021

I'm ready to merge this once the conflicts have been resolved.

@bors delegate+

@bors
Copy link
Contributor

bors commented Jul 27, 2021

✌️ @steffahn can now approve this pull request

@steffahn
Copy link
Member Author

I’ll handle the merge conflict tomorrow.

Removes the implementations that depend on the user-definable trait `Copy`.
Include new details regarding coercions to a subtype.
These conditions also explain why the previously removed implementations
for {array, vec, vec_deque}::IntoIter<T> were unsound, because they introduced
an extra `T: Clone` for the TrustedRandomAccess impl, even though their parameter T
is covariant.
…arantees about subtype coercions

Update all the TrustedRandomAccess impls to also implement the new supertrait
@steffahn steffahn force-pushed the fix_unsound_zip_optimization branch from 18a6d45 to 6d9c0a1 Compare July 28, 2021 12:37
@steffahn
Copy link
Member Author

Rebased the PR and resolved the conflict.

Thanks a lot for the final review, @yaahc!

@bors r=yaahc

@bors
Copy link
Contributor

bors commented Jul 28, 2021

📌 Commit 6d9c0a1 has been approved by yaahc

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 28, 2021
@bors
Copy link
Contributor

bors commented Jul 29, 2021

⌛ Testing commit 6d9c0a1 with merge 8523788...

@bors
Copy link
Contributor

bors commented Jul 29, 2021

☀️ Test successful - checks-actions
Approved by: yaahc
Pushing 8523788 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 29, 2021
@bors bors merged commit 8523788 into rust-lang:master Jul 29, 2021
@rustbot rustbot added this to the 1.56.0 milestone Jul 29, 2021
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Aug 7, 2021
…=dtolnay

Fix intra doc link in hidden doc of Iterator::__iterator_get_unchecked

Recently, I edited the import list of the `core::iter::traits::iterator` module (in rust-lang#85874). This results in a broken intra doc link in a hidden documentation with the effect that `RUSTDOCFLAGS='--document-private-items --document-hidden-items' x doc library/std` fails. (This can be worked around by adding `-Arustdoc::broken-intra-doc-links`; still, it’s a broken link so let’s fix it.)

`@rustbot` label C-cleanup, T-libs
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Aug 7, 2021
…=dtolnay

Fix intra doc link in hidden doc of Iterator::__iterator_get_unchecked

Recently, I edited the import list of the `core::iter::traits::iterator` module (in rust-lang#85874). This results in a broken intra doc link in a hidden documentation with the effect that `RUSTDOCFLAGS='--document-private-items --document-hidden-items' x doc library/std` fails. (This can be worked around by adding `-Arustdoc::broken-intra-doc-links`; still, it’s a broken link so let’s fix it.)

``@rustbot`` label C-cleanup, T-libs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet