-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Stacked Borrows: raw pointers inherit the tag from their parent pointer #142170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@bors2 try |
Stacked Borrows: raw pointers inherit the tag from their parent pointer The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences: - Within a function, a `let mut x` and `&raw mut x` are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust. - Casting a `&mut T` to a `*const T` results in a read-only pointer, which regularly startles people. - On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for `Box`. TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB: ```rust fn foo(x: &mut [i32]) { let ptr = x.as_mut_ptr(); let len = x.len(); assert!(len > 0); ptr.write(0); } ``` Under SB, `x.len()` causes a read of the entire slice which invalidates `ptr` since that is derived from a child of `x` (created as part of the implicit retag when calling `as_mut_ptr`). So this experiment adds some special magic hackery to `len` to avoid `x.len()` from retagging anything. `len` is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking. This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on `len` *or* `as_mut_ptr` would fix the above example. TODO: - Check which further TB tests should be moved to "both borrows" now. - Entirely remove `RetagKind::Raw`. - Add mir-opt tests checking the MIR around `x.len()` looks the way we want it to. try-job: `*gnu*aux`
This comment has been minimized.
This comment has been minimized.
It seems to me like a cleaner way to handle this would be to add an attribute that can be used to indicate that a given function argument should not be retagged (and thus also should lose |
ba46876
to
7816394
Compare
That's the longer-term solution I mentioned. I don't know how to implement the caller-side behavior of this, since the retags come from the reborrows that are inserted during MIR building. (The callee-side retags of function arguments are easy to suppress, comparatively.) Another form this attribute could take is that we could have functions that take a raw pointer receiver, and we arrange things so that they get called even if the receiver is actually a reference, implicitly coercing that to a raw pointer. That would neatly prevent the implicit reborrows during MIR building (I hope), but it needs cooperation of method resolution which I really don't want to touch.^^ This restricts the attribute to functions that can safely work on raw pointers, but that does not seem entirely unreasonable -- it definitely applies to our main candidates ( |
💔 Test failed
|
That failure was interesting. Previously,we happened to accept code that does something like this on an interior mutable local variable
This used to work thanks to the "quirk" where invalidating a mutable reference with a "foreign read" in SB does not necessarily invalidate all its derived raw pointers. With this PR, that no longer works since the raw pointer is identical to the mutable reference, so the code needs to be adjusted a little. Arguably this code was highly sketchy before... @bors2 try |
Stacked Borrows: raw pointers inherit the tag from their parent pointer The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences: - Within a function, a `let mut x` and `&raw mut x` are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust. - Casting a `&mut T` to a `*const T` results in a read-only pointer, which regularly startles people. - On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for `Box`. TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB: ```rust fn foo(x: &mut [i32]) { let ptr = x.as_mut_ptr(); let len = x.len(); assert!(len > 0); ptr.write(0); } ``` Under SB, `x.len()` causes a read of the entire slice which invalidates `ptr` since that is derived from a child of `x` (created as part of the implicit retag when calling `as_mut_ptr`). So this experiment adds some special magic hackery to `len` to avoid `x.len()` from retagging anything. `len` is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking. We also do one further change to the SB access logic: writing to a `Unique` does not invalidate SRW above this item. This, too, matches Tree Borrows. It does not affect local reasoning since owners of a `Unique` know whether any SRW have been pushed on top. This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on `len` *or* `as_mut_ptr` would fix the above example. The main potential downside of this PR is that it will lead Miri to accept code like `let x = 5; (&raw const x).cast_mut().write(0);` by default. This is discussed in rust-lang/unsafe-code-guidelines#400. Depending on how likely we are to make this UB in the future, we may want Miri to keep people away from those patterns. However, rejecting such code in an operational way is non-trivial; the only proposal we have is [arguably a hack](rust-lang/unsafe-code-guidelines#400 (comment)). Also, note that destructors will receive an `&mut self` pointing to such an immutable local variable, so operationally, it seems a bit futile to pretend that they are immutable. TODO: - Check which further TB tests should be moved to "both borrows" now. - Entirely remove `RetagKind::Raw`. - Add mir-opt tests checking the MIR around `x.len()` looks the way we want it to. try-job: `*gnu*aux`
Amazingly, it seems like we can even entirely get rid of the "quirk" with this approach. This feels like SB is finally getting into the form it was meant to have. :D @bors2 try |
Stacked Borrows: raw pointers inherit the tag from their parent pointer The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences: - Within a function, a `let mut x` and `&raw mut x` are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust. - Casting a `&mut T` to a `*const T` results in a read-only pointer, which regularly startles people. - On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for `Box`. TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB: ```rust fn foo(x: &mut [i32]) { let ptr = x.as_mut_ptr(); let len = x.len(); assert!(len > 0); ptr.write(0); } ``` Under SB, `x.len()` causes a read of the entire slice which invalidates `ptr` since that is derived from a child of `x` (created as part of the implicit retag when calling `as_mut_ptr`). So this experiment adds some special magic hackery to `len` to avoid `x.len()` from retagging anything. `len` is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking. We also do one further change to the SB access logic: writing to a `Unique` does not invalidate SRW above this item. This, too, matches Tree Borrows. It does not affect local reasoning since owners of a `Unique` know whether any SRW have been pushed on top. This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on `len` *or* `as_mut_ptr` would fix the above example. The main potential downside of this PR is that it will lead Miri to accept code like `let x = 5; (&raw const x).cast_mut().write(0);` by default. This is discussed in rust-lang/unsafe-code-guidelines#400. Depending on how likely we are to make this UB in the future, we may want Miri to keep people away from those patterns. However, rejecting such code in an operational way is non-trivial; the only proposal we have is [arguably a hack](rust-lang/unsafe-code-guidelines#400 (comment)). Also, note that destructors will receive an `&mut self` pointing to such an immutable local variable, so operationally, it seems a bit futile to pretend that they are immutable. TODO: - Check which further TB tests should be moved to "both borrows" now. - Entirely remove `RetagKind::Raw`. - Add mir-opt tests checking the MIR around `x.len()` looks the way we want it to. try-job: `*gnu*aux`
@saethlin how hard would it be to use your infrastructure to compare the Miri from this branch with the Miri we currently ship on some of the most widely used crates, to see if there's any surprises there? |
@bors2 try jobs= |
Stacked Borrows: raw pointers inherit the tag from their parent pointer The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences: - Within a function, a `let mut x` and `&raw mut x` are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust. - Casting a `&mut T` to a `*const T` results in a read-only pointer, which regularly startles people. - On the implementation side, this means we have to recognize all places where "safe" pointers turn into raw pointers, which turns out to be tricky for `Box`. TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB: ```rust fn foo(x: &mut [i32]) { let ptr = x.as_mut_ptr(); let len = x.len(); assert!(len > 0); ptr.write(0); } ``` Under SB, `x.len()` causes a read of the entire slice which invalidates `ptr` since that is derived from a child of `x` (created as part of the implicit retag when calling `as_mut_ptr`). So this experiment adds some special magic hackery to `len` to avoid `x.len()` from retagging anything. `len` is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking. We also do one further change to the SB access logic: writing to a `Unique` does not invalidate SRW above this item. This, too, matches Tree Borrows. It does not affect local reasoning since owners of a `Unique` know whether any SRW have been pushed on top. SRW are now only created for `UnsafeCell`, so this allows aliasing of an `&UnsafeCell` with the `&mut UnsafeCell` from which it was derived (making it equivalent to a `*mut UnsafeCell` derived from that mutable reference). This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on `len` *or* `as_mut_ptr` would fix the above example. The main potential downside of this PR is that it will lead Miri to accept code like `let x = 5; (&raw const x).cast_mut().write(0);` by default. This is discussed in rust-lang/unsafe-code-guidelines#400. Depending on how likely we are to make this UB in the future, we may want Miri to keep people away from those patterns. However, rejecting such code in an operational way is non-trivial; the only proposal we have is [arguably a hack](rust-lang/unsafe-code-guidelines#400 (comment)). Also, note that destructors will receive an `&mut self` pointing to such an immutable local variable, so operationally, it seems a bit futile to pretend that they are immutable. WIP: This also gets rid of the "quirk" in Stacked Borrows, where reading from some pointer disabled the `Unique` above that pointer, but did *not* touch SRW/SRO derived from that `Unique` pointer. This quirk was always quite unfortunate since it violates the stack discipline and can be used to write code that is accepted by SB but rejected by TB. The changes described above make the quirk a lot less effective since it no longer applies to raw pointers (as those no longer get a separate item in the stack); rather than having a quirk that only applies to particular situations involving `UnsafeCell`, IMO we should get rid of it entirely. TODO: - Check which further TB tests should be moved to "both borrows" now. - Entirely remove `RetagKind::Raw`. - Add mir-opt tests checking the MIR around `x.len()` looks the way we want it to. - If we can indeed get rid of the quirk: remove leftovers of "disable `Unique`" infrastructure (including removing `Permission::Disabled`). try-job: `*gnu*aux` try-job: `dist-x86_64-linux`
I'm doing the top 1000 crates now. |
This does fix some crates, but it breaks more. This jumps out at me from
Here are your short lists of crates that were fixed:
And crates that were broken:
It's likely that some of these are duplicates, because they were fixed or broken through a dependency. |
Thanks! Breaking arrayvec and smallvec is not great... do you have the error message for smallvec as well? That arrayvec issue arises because Very odd that hasbrown appears under "fixed", that repo has Miri under CI so it should already work before this PR. |
My tool enables all features that are enabled in The smallvec error is:
|
Okay so the smallvec erorr is basically the same as the arrayvec error, another Bummer, I think that kills the approach of special-casing |
I see no other way to adjust SB to move it closer to TB, so I think my path forward here will be to find ways to make TB stricter to move it closer to SB, until the remaining gap is uncontroversial and we can drop SB. I opened rust-lang/unsafe-code-guidelines#573 to track that. Thanks all for the help and feedback here! |
The biggest design mistake of SB in my opinion was the decision to have raw pointers be distinct from the reference they are derived from. This has multiple undesirable consequences:
let mut x
and&raw mut x
are not allowed to be used in an interleaving way, which is quite surprising and requires dedicated work-arounds e.g. in c2rust.&mut T
to a*const T
results in a read-only pointer, which regularly startles people.Box
.TB fully mitigates this by having a raw pointer inherit the tag from its parent pointer. I think we should do the same for SB. However, this requires a bit of hackery to prevent code like this from becoming UB:
Under SB,
x.len()
causes a read of the entire slice which invalidatesptr
since that is derived from a child ofx
(created as part of the implicit retag when callingas_mut_ptr
).So this experiment adds some special magic hackery to
len
to avoidx.len()
from retagging anything.len
is already special in invisible ways by having the function call be replaced by a MIR op; this extends that magic for the purposes of our alias tracking.We also do one further change to the SB access logic: writing to a
Unique
does not invalidate SRW above this item. This, too, matches Tree Borrows. It does not affect local reasoning since owners of aUnique
know whether any SRW have been pushed on top. SRW are now only created forUnsafeCell
, so this allows aliasing of an&UnsafeCell
with the&mut UnsafeCell
from which it was derived (making it equivalent to a*mut UnsafeCell
derived from that mutable reference).This is definitely not the final answer, but it is a minimal step that lets us make SB slightly cleaner, thus unlocking some patterns in the ecosystem that Miri has so far rejected, and some cleanup in the compiler. Longer-term plans include a new attribute one can put on functions to avoid retags -- putting that attribute on
len
oras_mut_ptr
would fix the above example.The main potential downside of this PR is that it will lead Miri to accept code like
let x = 5; (&raw const x).cast_mut().write(0);
by default. This is discussed in rust-lang/unsafe-code-guidelines#400. Depending on how likely we are to make this UB in the future, we may want Miri to keep people away from those patterns. However, rejecting such code in an operational way is non-trivial; the only proposal we have is arguably a hack. Also, note that destructors will receive an&mut self
pointing to such an immutable local variable, so operationally, it seems a bit futile to pretend that they are immutable.WIP: This also gets rid of the "quirk" in Stacked Borrows, where reading from some pointer disabled the
Unique
above that pointer, but did not touch SRW/SRO derived from thatUnique
pointer. This quirk was always quite unfortunate since it violates the stack discipline and can be used to write code that is accepted by SB but rejected by TB. The changes described above make the quirk a lot less effective since it no longer applies to raw pointers (as those no longer get a separate item in the stack); rather than having a quirk that only applies to particular situations involvingUnsafeCell
, IMO we should get rid of it entirely.TODO:
RetagKind::Raw
.x.len()
looks the way we want it to.Unique
" infrastructure (including removingPermission::Disabled
).try-job:
*gnu*aux