-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stacked Borrows: How precise should UnsafeCell be tracked? #236
Comments
To be precise, here is currently how we compute the permissions associated with a shared reference
Every location not marked as writable in the above procedure is marked as read-only. |
For reasons similar to those brought up in #77, #84, and #253, I think the multiple variants case should consider the active variant, when determining runtime writability. I cannot think of a particular example where this distinction would have an effect (one would probably use an external |
FWIW, making |
In fact, even just making each field be in an So making use of the exception that is being added in rust-lang/rust#98017 is rather fragile currently, but would fairly very well if we don't attempt to track where inside a type the |
There is still a possible axiomatic alternative: treat padding even more weakly. All fields being I don't know exactly what this would look like, and it would probably break some important properties of Stacked Borrows — the simplest interpretation is probably to have a “padding” permission tag on the borrow stack for a byte reborrowed as padding, and then reading through a padding permission both returns an uninitialized byte and stores uninit. (Making a read perform a write is the certainly broken bit.) Ultimately I'm also in favor of the weakening applied by UnsafeCell (and by analogy, PhantomPinned) applying to the entire pointee if the pointee type contains any UnsafeCell. As chorman has pointed out, this greatly limits the reasoning the compiler1 can make about generic code, but giving that up this seems justified2 by the operational specification simplifications gained. What might be beneficial under such a model is splitting the Footnotes
|
We could do something operational where if all fields of a struct are entirely
Basically, separating its effect on protectors and permissions? Sure, that would be easy to represent. |
I actually think there might be a more insidious problem, if this includes the fact that you can get a
I'd be even fine with a more expansive restriction: Any contiguous extent of padding bytes that preceeds or follows an |
I don't see why that has to be the case, or why it would be the case in MIR+LLVM.
Even precisely defining the notion of "padding byte" is super hard (and subtle), on top of the already subtle rules for how far we traverse looking for |
On Tue, Jun 28, 2022 at 09:52 Ralf Jung ***@***.***> wrote:
If it's just mutability, then the loss in generic contexts actually
affects non-generic contexts as well,
I don't see why that has to be the case, or why it would be the case in
MIR.
In XIR, pointer types consistent of a number of validity attributes and
some definition attributes (pointer kind and syntax declaration kinds), and
pointee type. Different attributes forms different pointer types that are
incompatible with each other. An explicit conversion (called the derive
operation) is necessary. This is because an implicit conversion is
otherwise needed (else you can get a *T that is the same pointer as a
*unique T, and which allows both to be passed to the same function) and I
don't like doing non-trivial operations implicitly.
I'd be even fine with a more expansive restriction: Any contiguous extent
of padding bytes that preceeds or follows an UnsafeCell gets SRW as well.
Unlike actual fields, XIRs model doesn't let you just point to padding
bytes the same way as a subobject (I mean, you can have a pointer to it,
but not a pointer to it), which is where the problem would arrise.
Even precisely defining the notion of "padding byte" is super hard (and
subtle), on top of the already subtle rules for how far we traverse looking
for UnsafeCell, on top of the already super subtle aliasing rules... I
think this is way overspending our complexity budget.
Fair, though it encompasses the previous proposal. Since searching for
Freeze itself is structural on a type, though, couldn't it just use the
implicit Pad fields?
… —
Reply to this email directly, view it on GitHub
<#236 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABGLD26CJRDU24DNQR672N3VRL7RTANCNFSM4N4GIRZQ>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Can you explain this without going into the specifics of XIR? As in, is there actually something fundamental here, or just a design quirk of XIR?
There are no implicit Pad fields. Padding is the absence of a field at a given offset into a struct. (And something more complicated for enums.) At least that's how I prefer to think of it. |
It's fundamental, part of the type system. Especially for non-trivial validity, it changes how the pointers interact with certain operations, like being moved (particularily into a function, local, or a static, the same way that
Yeah, that's how I'd think of it as well. Parts of the object-representation that don't participate in determining the value. |
I don't mean fundamental to XIR. I mean fundamental in a way that does not depend on the details of the language this is formalized in. |
Context: I am mostly interested in XIR insofar as it teaches us general lessons that we might miss by taking a MIR/LLVM view, due to the quirks of those languages. I am much less interested in XIR itself. So if there is a lesson here that transcends XIR, I would like to learn it. |
Fair, it's not really fundmental in that way, It comes arround from the design of XIR and just not wanting to do highly complicated operations implicitly. I think you could argue that any time you've got a generic IR that doesn't have high-level source language queries, you'd have a similar problem though. Having it be part of the type system isn't but is that not what the type system is for, encoding information about the constraints on values? |
What I don't get is why you aren't willing to / think you can't handle making the operations explicit as part of lowering to XIR. This kind of elaboration is an intrinsic part of lowering to any IR.
This is in fact a pessimisation of what optimizations can be done on a generic IR (e.g. MIR, XIR), but it's not like this is a choice we're making without knowing its the case; we'd love to have more MIR opts in rustc as well. What I think you/XIR needs and doesn't have at the moment is to be able to apply the pointer (restricting) attributes to the types, with the semantics that this makes any pointer access to that type act as if it had the attribute. And if you're throwing away the required type information before the point where it would be required to provide/take advantage of pointer attributes on the pointee, then I think you'll just need to accept that you won't be able to optimize Rust code as well as otherwise. (An option to preserve doing the optimizations in generic paths relying on the lost attributes would be to assume a monomorphization adding the "usual" attributes and preprovide a partial morphization as such.) |
Yes, but then the question is what is the type system that actually encodes this properly. :) My proposal is to view interior mutability as a property of the reference, i.e., we have two kinds of shared references: interior mutable ones and regular frozen ones. The only purpose of Of course actually erasing it can only be done after monomorphization, so a compiler working on polymorphic IR has to consider that each |
OTOH, already the current status of UnsafeCell "infecting" surrounding enums is a problem for some possible approaches to ensuring that |
On Wed, 29 Jun 2022 at 19:12, Christopher Durham ***@***.***> wrote:
not wanting to do highly complicated operations implicitly.
What I don't get is why you aren't willing to / think you can't handle
making the operations explicit as part of lowering to XIR. This kind of
elaboration is an intrinsic part of lowering to *any* IR.
The issue here is something like function pointers. `fn<T>(&T)`
instantiated with `i32` has to be compatible with `fn(&i32)`.
fn(&i32) currently becomes `function(*readonly dereferenceable(4)
aligned(4) int(32))`, which requires being called with a single parameter
of type `*readonly dereferencable(4) aligned(4) int(32)`. A conversion does
exist (the `derive` instruction does pointer attribute conversion w/o
directly changing types), but the problem is that `fn foo<T>(&T)` under
these rules doesn't get `readonly`, but if I instantiate `foo<T>` with
`i32` and take an fn-ptr to it, then the resulting pointer must be the same
type as `fn(&i32)`.
Extending mutable similarily to UnsafeCell is a possibility but I run into
problems with optimizing C and C++ (specifically, non-mutable subobjects of
a const complete object with a mutable subobject can't assume
immutability), and if I'm allowed to get a `unique` interior pointer that
moving around the exterior pointer doesn't invalidate, then I'm fairly
certain I can get a `*unique` pointer using offsetting, due to lack of
reachability restrictions (which is basically banned by C and restrained by
C++) and a no-attribute pointer to have the same pointer value (which means
that both can be passed into a function, and still alias).
I think you could argue that any time you've got a generic IR that doesn't
have high-level source language queries, you'd have a similar problem
though.
This is in fact a pessimisation of what optimizations can be done on a
generic IR (e.g. MIR, XIR), but it's not like this is a choice we're making
without knowing its the case; we'd love to have more MIR opts in rustc as
well.
What I *think* you/XIR needs and doesn't have at the moment is to be able
to apply the pointer (restricting) attributes to the types, with the
semantics that this makes any pointer access to that type act as if it had
the attribute.
And if you're throwing away the required type information before the point
where it would be required to provide/take advantage of pointer attributes
on the pointee, then I think you'll just need to accept that you won't be
able to optimize Rust code as well as otherwise.
I had originally considered this, but I opted against it in favour of doing
the same for pointer types (`__lccc::xlang_pointer_attributes` adds pointer
attributes to a pointer type used in a field, there's also
`xlang_scalar_attributes` which is the same but for scalar types, and is
used by `NonZero*` types). One consideration is that the user could just
have plugins stop being run at any point in compilation (incl. before
generics are instantiated), which means any and *all* information has to be
encoded in the IR directly. Additionally, adding pointer attributes
implicitly is bad for two reasons:
* One, how would you distinguish between a pointer that can have it validly
added vs. one that can't (IE. `*const i32` vs. `&i32`). Both use the XIR
pointer type internally
* Two, Adding pointer attributes changes the behaviour of operations, hence
making it part of the pointer type. Further, the entire core of the
pointer model in XIR is based on this special behaviour - implicit derive
operations. Adding pointer attributes to pointees means that you come back
to the "operations implicitly changing behaviour"
The ability to remove pointer attributes, which would be better for the
first, was even more problematic, due to backwards and forwards
compatibility of the IR - adding an attribute to a type in the future
strictly increases the defined behaviour of a program.
… —
Reply to this email directly, view it on GitHub
<#236 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABGLD2YTRDGN7MAFTBVY56LVRTJ7BANCNFSM4N4GIRZQ>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
This came up in Zulip again because Tree Borrows, unlike Stacked Borrows, considers UnsafeCell to be "fully infectious" to its neighbors. Noteworthy examples:
|
Isn't there impl<T: ?Sized> Freeze for PhantomData<T>?
…On Sun, Aug 6, 2023 at 10:11 Ralf Jung ***@***.***> wrote:
Yes but UnsafeCell lacks several auto traits -- Sync and Freeze. You
cannot reasonably expect its Phantom version to only affect one of them.
—
Reply to this email directly, view it on GitHub
<#236 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABGLD2Y6ZTYUGULTQISFAXDXT6Q2TANCNFSM4N4GIRZQ>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Not that I know of. |
In any case,
You can use oibit impl ramblingWhen possible, I do think it's generally preferable to avoid exposing any types with explicit impls for autotraits, because the rustdoc render is preferable for the automatic trait impl than explicit ones, especially since I prefer to bound the impl "by example" but the |
I'd note that this also opts-out of And yes, please give actual auto-trait opt-outs. I work with a lot of code that is on older versions of rust, though, for various reasons. And in general, I think it's reasonable to expect pub struct Box<T>{
ptr: NonNull<T>,
phantom: PhantomData<T>
} |
The Footnotes
|
I don't quite agree. The docs say:
which I think would mean that the not-specifically-
Does this mean it only impacts static safety properties? Imo that's not what it says, but it could be read that way. |
But it is not, as I had argued over Zulip fn foo<'unbounded, T : ?Sized + 'unbounded>() {
let _: &'unbounded _ = &("not a zst", ::core::marker::PhantomData::<T>);
} says that there is an unconditional such impl of Since we already have |
Also says that |
Actually, I think that means that |
Looks like that would be a breaking change then. We should crater that. |
Actually, even if I remove the blanket Freeze for UnsafeCell, these tests all still pass... |
Wait, what? Shared reference to |
It took me a while, here's a breaking change: use std::cell::UnsafeCell;
use std::marker::PhantomData;
trait Trait {
const C: (u32, PhantomData<UnsafeCell<u32>>);
}
fn bar<T: Trait>() {
let x: &'static (u32, PhantomData<UnsafeCell<u32>>) = &T::C;
} |
TBH, this might have issues with the question presented here. |
Or hmm... Miri at least seems to think that mutating a promoted constant, even an |
Promoted constants are put in read-only memory, so yeah their mutation is definitely UB. That's a different "source of UB" that
Yeah there are definitely some interesting potential interactions. |
There are at least 2 creates that rely on This doesn't necessarily impact the aliasing rules, I think, but it is still certainly useful to know. |
…dead const-eval interning: accept interior mutable pointers in final value …but keep rejecting mutable references This fixes rust-lang#121610 by no longer firing the lint when there is a pointer with interior mutability in the final value of the constant. On stable, such pointers can be created with code like: ```rust pub enum JsValue { Undefined, Object(Cell<bool>), } impl Drop for JsValue { fn drop(&mut self) {} } // This does *not* get promoted since `JsValue` has a destructor. // However, the outer scope rule applies, still giving this 'static lifetime. const UNDEFINED: &JsValue = &JsValue::Undefined; ``` It's not great to accept such values since people *might* think that it is legal to mutate them with unsafe code. (This is related to how "infectious" `UnsafeCell` is, which is a [wide open question](rust-lang/unsafe-code-guidelines#236).) However, we [explicitly document](https://doc.rust-lang.org/reference/behavior-considered-undefined.html) that things created by `const` are immutable. Furthermore, we also accept the following even more questionable code without any lint today: ```rust let x: &'static Option<Cell<i32>> = &None; ``` This is even more questionable since it does *not* involve a `const`, and yet still puts the data into immutable memory. We could view this as promotion [potentially introducing UB](rust-lang/unsafe-code-guidelines#493). However, we've accepted this since ~forever and it's [too late to reject this now](rust-lang#122789); the pattern is just too useful. So basically, if you think that `UnsafeCell` should be tracked fully precisely, then you should want the lint we currently emit to be removed, which this PR does. If you think `UnsafeCell` should "infect" surrounding `enum`s, the big problem is really rust-lang/unsafe-code-guidelines#493 which does not trigger the lint -- the cases the lint triggers on are actually the "harmless" ones as there is an explicit surrounding `const` explaining why things end up being immutable. What all this goes to show is that the hard error added in rust-lang#118324 (later turned into the future-compat lint that I am now suggesting we remove) was based on some wrong assumptions, at least insofar as it concerns shared references. Furthermore, that lint does not help at all for the most problematic case here where the potential UB is completely implicit. (In fact, the lint is actively in the way of [my preferred long-term strategy](rust-lang/unsafe-code-guidelines#493 (comment)) for dealing with this UB.) So I think we should go back to square one and remove that error/lint for shared references. For mutable references, it does seem to work as intended, so we can keep it. Here it serves as a safety net in case the static checks that try to contain mutable references to the inside of a const initializer are not working as intended; I therefore made the check ICE to encourage users to tell us if that safety net is triggered. Closes rust-lang#122153 by removing the lint. Cc `@rust-lang/opsem` `@rust-lang/lang`
const-eval interning: accept interior mutable pointers in final value …but keep rejecting mutable references This fixes rust-lang/rust#121610 by no longer firing the lint when there is a pointer with interior mutability in the final value of the constant. On stable, such pointers can be created with code like: ```rust pub enum JsValue { Undefined, Object(Cell<bool>), } impl Drop for JsValue { fn drop(&mut self) {} } // This does *not* get promoted since `JsValue` has a destructor. // However, the outer scope rule applies, still giving this 'static lifetime. const UNDEFINED: &JsValue = &JsValue::Undefined; ``` It's not great to accept such values since people *might* think that it is legal to mutate them with unsafe code. (This is related to how "infectious" `UnsafeCell` is, which is a [wide open question](rust-lang/unsafe-code-guidelines#236).) However, we [explicitly document](https://doc.rust-lang.org/reference/behavior-considered-undefined.html) that things created by `const` are immutable. Furthermore, we also accept the following even more questionable code without any lint today: ```rust let x: &'static Option<Cell<i32>> = &None; ``` This is even more questionable since it does *not* involve a `const`, and yet still puts the data into immutable memory. We could view this as promotion [potentially introducing UB](rust-lang/unsafe-code-guidelines#493). However, we've accepted this since ~forever and it's [too late to reject this now](rust-lang/rust#122789); the pattern is just too useful. So basically, if you think that `UnsafeCell` should be tracked fully precisely, then you should want the lint we currently emit to be removed, which this PR does. If you think `UnsafeCell` should "infect" surrounding `enum`s, the big problem is really rust-lang/unsafe-code-guidelines#493 which does not trigger the lint -- the cases the lint triggers on are actually the "harmless" ones as there is an explicit surrounding `const` explaining why things end up being immutable. What all this goes to show is that the hard error added in rust-lang/rust#118324 (later turned into the future-compat lint that I am now suggesting we remove) was based on some wrong assumptions, at least insofar as it concerns shared references. Furthermore, that lint does not help at all for the most problematic case here where the potential UB is completely implicit. (In fact, the lint is actively in the way of [my preferred long-term strategy](rust-lang/unsafe-code-guidelines#493 (comment)) for dealing with this UB.) So I think we should go back to square one and remove that error/lint for shared references. For mutable references, it does seem to work as intended, so we can keep it. Here it serves as a safety net in case the static checks that try to contain mutable references to the inside of a const initializer are not working as intended; I therefore made the check ICE to encourage users to tell us if that safety net is triggered. Closes rust-lang/rust#122153 by removing the lint. Cc `@rust-lang/opsem` `@rust-lang/lang`
Currently, when looking for
UnsafeCell
behind shared references, Miri descends through arrays, structs and the like, but does not descend throughenum
s. Instead, when it sees anenum
, it checks ifT: Freeze
, and if not, treats the entireenum
as anUnsafeCell
.The benefit of this is that finding
UnsafeCell
does not require reading from memory (rust-lang/miri#931), which makes formal reasoning about Stacked Borrows a lot simpler. Accessing memory duringUnsafeCell
search opens all sorts of nasty questions, such as whether those accesses are themselves subject to Stacked Borrows rules or not (and if yes, which tag they use). Not reading enum discriminants also avoids potential confusion from Stacked Borrows partially checking the validity invariant of the referenced data.On the other hand, being more precise with
UnsafeCell
search could help optimizations. When a function works on&Result<Cell<i32>, i32>
, and the compiler somehow can know that theErr
variant is active, we would be able to rely on this shared reference being read-only -- currently, that is not an assumption that the compiler can make. (But note that once a shared reference gets created to thei32
in theErr
variant, that is already guaranteed to be read-only.)This is somewhat related to #204.
Some posts with useful datapoints (not exhaustive):
Other parts of this question:
T
?UnsafeCell
insidePhantomData
have any effect?The text was updated successfully, but these errors were encountered: