
Commit 00e77d7

author
Andrew Brinker
committed
Rewrote "How Safe and Unsafe Interact" Nomicon chapter.
The previous version of the chapter covered a lot of ground, but was a little meandering and hard to follow at times. This draft is intended to be clearer and more direct, while still providing the same information as the previous version.
1 parent 97e3a24 commit 00e77d7

1 file changed

+118
-137
lines changed

@@ -1,150 +1,131 @@
11
% How Safe and Unsafe Interact
22

3-
So what's the relationship between Safe and Unsafe Rust? How do they interact?
4-
5-
Rust models the separation between Safe and Unsafe Rust with the `unsafe`
6-
keyword, which can be thought as a sort of *foreign function interface* (FFI)
7-
between Safe and Unsafe Rust. This is the magic behind why we can say Safe Rust
8-
is a safe language: all the scary unsafe bits are relegated exclusively to FFI
9-
*just like every other safe language*.
10-
11-
However because one language is a subset of the other, the two can be cleanly
12-
intermixed as long as the boundary between Safe and Unsafe Rust is denoted with
13-
the `unsafe` keyword. No need to write headers, initialize runtimes, or any of
14-
that other FFI boiler-plate.
15-
16-
There are several places `unsafe` can appear in Rust today, which can largely be
17-
grouped into two categories:
18-
19-
* There are unchecked contracts here. To declare you understand this, I require
20-
you to write `unsafe` elsewhere:
21-
* On functions, `unsafe` is declaring the function to be unsafe to call.
22-
Users of the function must check the documentation to determine what this
23-
means, and then have to write `unsafe` somewhere to identify that they're
24-
aware of the danger.
25-
* On trait declarations, `unsafe` is declaring that *implementing* the trait
26-
is an unsafe operation, as it has contracts that other unsafe code is free
27-
to trust blindly. (More on this below.)
28-
29-
* I am declaring that I have, to the best of my knowledge, adhered to the
30-
unchecked contracts:
31-
* On trait implementations, `unsafe` is declaring that the contract of the
32-
`unsafe` trait has been upheld.
33-
* On blocks, `unsafe` is declaring any unsafety from an unsafe
34-
operation within to be handled, and therefore the parent function is safe.
35-
36-
There is also `#[unsafe_no_drop_flag]`, which is a special case that exists for
37-
historical reasons and is in the process of being phased out. See the section on
38-
[drop flags] for details.
39-
40-
Some examples of unsafe functions:
41-
42-
* `slice::get_unchecked` will perform unchecked indexing, allowing memory
43-
safety to be freely violated.
44-
* every raw pointer to sized type has intrinsic `offset` method that invokes
45-
Undefined Behavior if it is not "in bounds" as defined by LLVM.
46-
* `mem::transmute` reinterprets some value as having the given type,
47-
bypassing type safety in arbitrary ways. (see [conversions] for details)
48-
* All FFI functions are `unsafe` because they can do arbitrary things.
49-
C being an obvious culprit, but generally any language can do something
50-
that Rust isn't happy about.
3+
What's the relationship between Safe Rust and Unsafe Rust? How do they
4+
interact?
5+
6+
The separation between Safe Rust and Unsafe Rust is controlled with the
7+
`unsafe` keyword, which acts as a sort of *foreign function interface*
8+
from one to the other. This boundary is why we can say Safe Rust is a
9+
safe language: all the unsafe parts are kept exclusively behind the FFI
10+
boundary, *just like any other safe language*. Best of all, because Safe
11+
Rust is a subset of Unsafe Rust, the two can be cleanly intermixed,
12+
without headers, runtimes, or any other FFI boilerplate.
13+
14+
The `unsafe` keyword has dual purposes: to declare the existence of
15+
contracts the compiler can't check, and to declare that the adherence
16+
of some code to those contracts has been checked by the programmer,
17+
and the code can therefore be trusted.
18+
19+
You can use `unsafe` to indicate the existence of unchecked contracts on
20+
_functions_ and on _trait declarations_. On functions, `unsafe` means that
21+
users of the function must check that function's documentation to ensure
22+
they are using it in a way that maintains the contracts the function
23+
requires. On trait declarations, `unsafe` means that implementors of the
24+
trait must check the trait documentation to ensure their implementation
25+
maintains the contracts the trait requires.
26+
27+
You can use `unsafe` on a block to declare that all constraints required
28+
by an unsafe function within the block have been adhered to, and the code
29+
can therefore be trusted. You can use `unsafe` on a trait implementation
30+
to declare that the implementation of that trait has adhered to whatever
31+
contracts the trait's documentation requires.
32+
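The four positions of `unsafe` described above can be sketched together in one small program. This is only an illustration: `read_at`, `first`, `SmallIndex`, `Two`, and `lookup` are hypothetical names invented here and are not part of the chapter or the standard library.

```rust
/// Declares an unchecked contract on a function.
///
/// # Safety
/// The caller must guarantee that `index < data.len()`.
unsafe fn read_at(data: &[u32], index: usize) -> u32 {
    // SAFETY: the bounds requirement is forwarded to our caller.
    unsafe { *data.get_unchecked(index) }
}

/// A safe wrapper: the `unsafe` block states that the contract was checked.
fn first(data: &[u32]) -> Option<u32> {
    if data.is_empty() {
        None
    } else {
        // SAFETY: the slice is non-empty, so index 0 is in bounds.
        Some(unsafe { read_at(data, 0) })
    }
}

/// Declares an unchecked contract on a trait (hypothetical).
///
/// # Safety
/// Implementors must guarantee that `index()` always returns a value below 4.
unsafe trait SmallIndex {
    fn index(&self) -> usize;
}

struct Two;

// SAFETY: `index()` always returns 2, which is below 4.
unsafe impl SmallIndex for Two {
    fn index(&self) -> usize {
        2
    }
}

/// Unsafe code may rely on the contract of an `unsafe` trait.
fn lookup<T: SmallIndex>(table: &[u32; 4], key: &T) -> u32 {
    // SAFETY: `SmallIndex` promises `index() < 4`, the length of `table`.
    unsafe { read_at(table, key.index()) }
}

fn main() {
    println!("{:?}", first(&[7, 8]));
    println!("{}", lookup(&[10, 20, 30, 40], &Two));
}
```

Note that `first` and `lookup` are safe to call: the obligation to check the contracts stays with whoever wrote the `unsafe` block or the `unsafe impl`.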
33+
There is also the `#[unsafe_no_drop_flag]` attribute, which exists for
34+
historic reasons and is being phased out. See the section on [drop flags]
35+
for details.
36+
37+
The standard library has a number of unsafe functions, including:
38+
39+
* `slice::get_unchecked`, which performs unchecked indexing, allowing
40+
memory safety to be freely violated.
41+
* `mem::transmute`, which reinterprets some value as having a given type, bypassing
42+
type safety in arbitrary ways (see [conversions] for details).
43+
* Every raw pointer to a sized type has an intrinsic `offset` method that
44+
invokes Undefined Behavior if the passed offset is not "in bounds" as
45+
defined by LLVM.
46+
* All FFI functions are `unsafe` because the other language can do arbitrary
47+
operations that the Rust compiler can't check.
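As a minimal sketch of how two of the functions listed above are typically wrapped, the following checks the relevant bounds in safe code before calling the unsafe function. The helper names `sum_first_n` and `last_via_pointer` are made up for this example.

```rust
/// Sums the first `n` elements, or returns `None` if `n` is too large.
fn sum_first_n(values: &[u64], n: usize) -> Option<u64> {
    if n > values.len() {
        return None;
    }
    let mut total = 0;
    for i in 0..n {
        // SAFETY: `i < n` and `n <= values.len()`, so `i` is in bounds.
        total += unsafe { *values.get_unchecked(i) };
    }
    Some(total)
}

/// Reads the last element through a raw pointer and `offset`.
fn last_via_pointer(values: &[u64]) -> Option<u64> {
    if values.is_empty() {
        return None;
    }
    let start = values.as_ptr();
    let last_index = (values.len() - 1) as isize;
    // SAFETY: `last_index` is within the slice, so the offset pointer
    // stays in bounds of the same allocation and is valid to read.
    Some(unsafe { *start.offset(last_index) })
}

fn main() {
    println!("{:?}", sum_first_n(&[1, 2, 3, 4], 3));
    println!("{:?}", last_via_pointer(&[5, 6, 7]));
}
```

Skipping either bounds check would make the `unsafe` blocks unsound, which is exactly the contract the documentation of these functions asks callers to uphold.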
5148

5249
As of Rust 1.0 there are exactly two unsafe traits:
5350

54-
* `Send` is a marker trait (it has no actual API) that promises implementors
55-
are safe to send (move) to another thread.
56-
* `Sync` is a marker trait that promises that threads can safely share
57-
implementors through a shared reference.
58-
59-
The need for unsafe traits boils down to the fundamental property of safe code:
60-
61-
**No matter how completely awful Safe code is, it can't cause Undefined
62-
Behavior.**
63-
64-
This means that Unsafe Rust, **the royal vanguard of Undefined Behavior**, has to be
65-
*super paranoid* about generic safe code. To be clear, Unsafe Rust is totally free to trust
66-
specific safe code. Anything else would degenerate into infinite spirals of
67-
paranoid despair. In particular it's generally regarded as ok to trust the standard library
68-
to be correct. `std` is effectively an extension of the language, and you
69-
really just have to trust the language. If `std` fails to uphold the
70-
guarantees it declares, then it's basically a language bug.
71-
72-
That said, it would be best to minimize *needlessly* relying on properties of
73-
concrete safe code. Bugs happen! Of course, I must reinforce that this is only
74-
a concern for Unsafe code. Safe code can blindly trust anyone and everyone
75-
as far as basic memory-safety is concerned.
76-
77-
On the other hand, safe traits are free to declare arbitrary contracts, but because
78-
implementing them is safe, unsafe code can't trust those contracts to actually
79-
be upheld. This is different from the concrete case because *anyone* can
80-
randomly implement the interface. There is something fundamentally different
81-
about trusting a particular piece of code to be correct, and trusting *all the
82-
code that will ever be written* to be correct.
83-
84-
For instance Rust has `PartialOrd` and `Ord` traits to try to differentiate
85-
between types which can "just" be compared, and those that actually implement a
86-
total ordering. Pretty much every API that wants to work with data that can be
87-
compared wants Ord data. For instance, a sorted map like BTreeMap
88-
*doesn't even make sense* for partially ordered types. If you claim to implement
89-
Ord for a type, but don't actually provide a proper total ordering, BTreeMap will
90-
get *really confused* and start making a total mess of itself. Data that is
91-
inserted may be impossible to find!
92-
93-
But that's okay. BTreeMap is safe, so it guarantees that even if you give it a
94-
completely garbage Ord implementation, it will still do something *safe*. You
95-
won't start reading uninitialized or unallocated memory. In fact, BTreeMap
96-
manages to not actually lose any of your data. When the map is dropped, all the
97-
destructors will be successfully called! Hooray!
98-
99-
However BTreeMap is implemented using a modest spoonful of Unsafe Rust (most collections
100-
are). That means that it's not necessarily *trivially true* that a bad Ord
101-
implementation will make BTreeMap behave safely. BTreeMap must be sure not to rely
102-
on Ord *where safety is at stake*. Ord is provided by safe code, and safety is not
103-
safe code's responsibility to uphold.
104-
105-
But wouldn't it be grand if there was some way for Unsafe to trust some trait
106-
contracts *somewhere*? This is the problem that unsafe traits tackle: by marking
107-
*the trait itself* as unsafe to implement, unsafe code can trust the implementation
108-
to uphold the trait's contract. Although the trait implementation may be
109-
incorrect in arbitrary other ways.
110-
111-
For instance, given a hypothetical UnsafeOrd trait, this is technically a valid
112-
implementation:
51+
* `Send` is a marker trait (a trait with no API) that promises implementors are
52+
safe to send (move) to another thread.
53+
* `Sync` is a marker trait that promises threads can safely share implementors
54+
through a shared reference.
55+
56+
Much of the Rust standard library also uses Unsafe Rust internally, although
57+
these implementations are rigorously manually checked, and the Safe Rust
58+
interfaces provided on top of these implementations can be assumed to be safe.
59+
60+
The need for all of this separation boils down to a single fundamental property
61+
of Safe Rust:
62+
63+
**No matter what, Safe Rust can't cause Undefined Behavior.**
64+
65+
The design of the safe/unsafe split means that Safe Rust inherently has to
66+
trust that any Unsafe Rust it touches has been written correctly (meaning
67+
the Unsafe Rust actually maintains whatever contracts it is supposed to
68+
maintain). On the other hand, Unsafe Rust has to be very careful about
69+
trusting Safe Rust.
70+
71+
As an example, Rust has the `PartialOrd` and `Ord` traits to differentiate
72+
between types which can "just" be compared, and those that provide a total
73+
ordering (where every value of the type is either equal to, greater than,
74+
or less than any other value of the same type). The sorted map type
75+
`BTreeMap` doesn't make sense for partially-ordered types, and so it
76+
requires that any key type for it implements the `Ord` trait. However,
77+
`BTreeMap` has Unsafe Rust code inside of its implementation, and this
78+
Unsafe Rust code cannot assume that any `Ord` implementation it gets makes
79+
sense. The unsafe portions of `BTreeMap`'s internals have to be careful to
80+
maintain all necessary contracts, even if a key type's `Ord` implementation
81+
does not implement a total ordering.
82+
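To make this concrete, here is a small sketch of a key type whose `Ord` implementation is nonsense, written entirely in Safe Rust. The type `Confused` and the function `build_map` are hypothetical names used only for illustration. The map may behave in surprising ways, but it must not cause Undefined Behavior.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// A key type whose ordering claims that every value equals every other.
#[derive(Debug, PartialEq, Eq)]
struct Confused(u32);

impl PartialOrd for Confused {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Confused {
    // Deliberately wrong: disagrees with the derived `PartialEq`.
    fn cmp(&self, _other: &Self) -> Ordering {
        Ordering::Equal
    }
}

/// Inserts three distinct keys using only safe code.
fn build_map() -> BTreeMap<Confused, &'static str> {
    let mut map = BTreeMap::new();
    map.insert(Confused(1), "one");
    map.insert(Confused(2), "two");
    map.insert(Confused(3), "three");
    map
}

fn main() {
    let map = build_map();
    // The contents may not be what the caller intended, but every access
    // here is still memory safe.
    println!("entries: {}", map.len());
    println!("lookup: {:?}", map.get(&Confused(99)));
}
```

The caller gets a confused map, not a memory safety bug: that is the guarantee `BTreeMap`'s internals have to preserve no matter what `Ord` implementation they are handed.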
83+
Unsafe Rust cannot automatically trust Safe Rust. When writing Unsafe Rust,
84+
you must be careful to only rely on specific Safe Rust code, and not make
85+
assumptions about potential future Safe Rust code providing the same
86+
guarantees.
87+
88+
This is the problem that `unsafe` traits exist to resolve. The `BTreeMap`
89+
type could theoretically require that keys implement a new trait called
90+
`UnsafeOrd`, rather than `Ord`, that might look like this:
11391

11492
```rust
115-
# use std::cmp::Ordering;
116-
# struct MyType;
117-
# unsafe trait UnsafeOrd { fn cmp(&self, other: &Self) -> Ordering; }
118-
unsafe impl UnsafeOrd for MyType {
119-
fn cmp(&self, other: &Self) -> Ordering {
120-
Ordering::Equal
121-
}
93+
use std::cmp::Ordering;
94+
95+
unsafe trait UnsafeOrd {
96+
fn cmp(&self, other: &Self) -> Ordering;
12297
}
12398
```
12499

125-
But it's probably not the implementation you want.
126-
127-
Rust has traditionally avoided making traits unsafe because it makes Unsafe
128-
pervasive, which is not desirable. The reason Send and Sync are unsafe is because thread
129-
safety is a *fundamental property* that unsafe code cannot possibly hope to defend
130-
against in the same way it would defend against a bad Ord implementation. The
131-
only way to possibly defend against thread-unsafety would be to *not use
132-
threading at all*. Making every load and store atomic isn't even sufficient,
133-
because it's possible for complex invariants to exist between disjoint locations
134-
in memory. For instance, the pointer and capacity of a Vec must be in sync.
135-
136-
Even concurrent paradigms that are traditionally regarded as Totally Safe like
137-
message passing implicitly rely on some notion of thread safety -- are you
138-
really message-passing if you pass a pointer? Send and Sync therefore require
139-
some fundamental level of trust that Safe code can't provide, so they must be
140-
unsafe to implement. To help obviate the pervasive unsafety that this would
141-
introduce, Send (resp. Sync) is automatically derived for all types composed only
142-
of Send (resp. Sync) values. 99% of types are Send and Sync, and 99% of those
143-
never actually say it (the remaining 1% is overwhelmingly synchronization
144-
primitives).
145-
146-
147-
100+
Then, a type would use `unsafe` to implement `UnsafeOrd`, indicating that
101+
its author has ensured the implementation maintains whatever contracts the
102+
trait expects. In this situation, the Unsafe Rust in the internals of
103+
`BTreeMap` could trust that the key type's `UnsafeOrd` implementation is
104+
correct. If it isn't, it's the fault of the unsafe trait implementation
105+
code, which is consistent with Rust's safety guarantees.
106+
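A possible implementation of the hypothetical `UnsafeOrd` trait might look like the sketch below. The `Meters` type and the `largest` function are invented for this example; the trait definition repeats the one shown earlier so the snippet compiles on its own.

```rust
use std::cmp::Ordering;

/// # Safety
/// Implementors must guarantee that `cmp` defines a total ordering.
unsafe trait UnsafeOrd {
    fn cmp(&self, other: &Self) -> Ordering;
}

struct Meters(u32);

// SAFETY: delegates to the `Ord` implementation of `u32`, which is a
// total ordering.
unsafe impl UnsafeOrd for Meters {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.cmp(&other.0)
    }
}

/// Generic code bounded on `UnsafeOrd` can assume the ordering is total.
fn largest<T: UnsafeOrd>(items: &[T]) -> Option<&T> {
    let mut best = items.first()?;
    for item in &items[1..] {
        if UnsafeOrd::cmp(item, best) == Ordering::Greater {
            best = item;
        }
    }
    Some(best)
}

fn main() {
    let lengths = [Meters(3), Meters(9), Meters(4)];
    if let Some(m) = largest(&lengths) {
        println!("largest: {} m", m.0);
    }
}
```

The `largest` function here happens to need no unsafe code, but any unsafe code with the same bound would be entitled to rely on the ordering being total, because the `unsafe impl` puts responsibility for that claim on the implementor.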
107+
The decision of whether to mark a trait `unsafe` is an API design choice.
108+
Rust has traditionally avoided marking traits unsafe because it makes Unsafe
109+
Rust pervasive, which is not desirable. `Send` and `Sync` are marked unsafe
110+
because thread safety is a *fundamental property* that unsafe code can't
111+
possibly hope to defend against in the way it could defend against a bad
112+
`Ord` implementation. The decision of whether to mark your own traits `unsafe`
113+
depends on the same sort of consideration. If `unsafe` code cannot reasonably
114+
expect to defend against a bad implementation of the trait, then marking the
115+
trait `unsafe` is a reasonable choice.
116+
117+
As an aside, while `Send` and `Sync` are `unsafe` traits, they are
118+
automatically implemented for types when such derivations are provably safe
119+
to do. `Send` is automatically derived for all types composed only of values
120+
whose types also implement `Send`. `Sync` is automatically derived for all
121+
types composed only of values whose types also implement `Sync`.
122+
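The automatic derivation can be observed with a small compile-time check. In this sketch, `assert_send`, `assert_sync`, `Point`, `Handle`, and `check_on_thread` are hypothetical names. The `unsafe impl` is justified only by an assumption made for the example: the wrapped pointer is never dereferenced.

```rust
use std::thread;

// These compile only if `T` implements the named trait.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

/// Composed only of `Send + Sync` fields, so both traits are derived
/// automatically and nothing needs to be written.
#[allow(dead_code)]
struct Point {
    x: i32,
    y: i32,
}

/// Raw pointers are neither `Send` nor `Sync`, so this type gets no
/// automatic implementation.
struct Handle(*mut u8);

impl Handle {
    fn is_null(&self) -> bool {
        self.0.is_null()
    }
}

// SAFETY (assumption for this example): the pointer is only ever compared
// against null and never dereferenced, so moving it across threads is fine.
unsafe impl Send for Handle {}

/// Moves a `Handle` to another thread, which requires `Handle: Send`.
fn check_on_thread() -> bool {
    let handle = Handle(std::ptr::null_mut());
    thread::spawn(move || handle.is_null())
        .join()
        .expect("worker thread panicked")
}

fn main() {
    assert_send::<Point>();
    assert_sync::<Point>();
    assert_send::<Handle>();
    // assert_send::<std::rc::Rc<i32>>(); // would fail to compile
    println!("handle is null: {}", check_on_thread());
}
```

Removing the `unsafe impl Send for Handle {}` line makes both `assert_send::<Handle>()` and the `thread::spawn` call fail to compile, which shows that the opt-in is a promise made by the programmer, not something the compiler can verify.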
123+
This is the dance of Safe Rust and Unsafe Rust. It is designed to make using
124+
Safe Rust as ergonomic as possible, but requires extra effort and care when
125+
writing Unsafe Rust. The rest of the book is largely a discussion of the sort
126+
of care that must be taken, and what contracts Unsafe Rust is expected
127+
to uphold.
148128

149129
[drop flags]: drop-flags.html
150130
[conversions]: conversions.html
131+
