Skip to content

Commit 689b66c

Browse files
authored
Initial stab at documenting the kind system (ocaml-flambda#3559)
Initial kind docs
1 parent 0f1d0aa commit 689b66c

File tree

6 files changed

+1043
-339
lines changed

6 files changed

+1043
-339
lines changed

jane/doc/extensions/kinds/index.md

+362
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,362 @@
1+
# The kind system
2+
3+
OxCaml includes a system of _kinds_, which classify types. Roughly, the kind
4+
system adds an extra level to the type system: terms are classified by types,
5+
and types are classified by kinds. This page provides an overview of the kind
6+
system, but several other pages flesh out the details:
7+
8+
* [Syntax for kind annotations](syntax.md)
9+
* [The non-modal bounds](non-modal.md)
10+
* [How to compute the kinds of a type](types.md)
11+
12+
Kinds capture properties of types along
13+
several different dimensions. For example, kinds can be used to identify which
14+
types have values that are passed in floating point registers, or are safely
15+
ignored by the garbage collector.
16+
17+
The kind of a type has four components: the _layout_, the _modal bounds_, the
18+
_with-bounds_, and the _non-modal bounds_. The layout describes the shape of the
19+
data at runtime, and is used to support unboxed types. The modal bounds describe
20+
how different types interact with our mode system. In particular, some types
21+
don't have interesting interactions with some modes, so values of these types
22+
can safely ignore those modal axes. However, container types (among other
23+
parameterized types) have modal bounds that depend on the bounds of the element
24+
type; this dependency is captured in the with-bounds. The non-modal bounds
25+
capture a grab-bag of other properties.
26+
27+
Kinds are related by a _sub-kinding_ relation, described in more detail
28+
below. This allows to use a type with a more precise kind where a type with a
29+
less precise kind is expected.
30+
31+
This page describes the kind system at a high level, and contains complete
32+
details for the non-modal bounds. It does not exhaustively describe the possible
33+
layouts (which are documented on the [unboxed types
34+
page](../unboxed-types/index.md)) or the modal axes (which are documented on the
35+
[modes page]()), but does explain how those components appear in kinds,
36+
including how the modal bounds are affected by the with-bounds.
37+
38+
CR ccasinghino: add links to modes documentation after moving it here.
39+
40+
CR reisenberg: Where should/do we document mode crossing? There is some text
41+
in proposals/unboxed-types/kinds.md that might be useful.
42+
43+
# The basic structure of kinds
44+
45+
Basic kinds have the form:
46+
47+
```
48+
<layout> mod <bounds>
49+
```
50+
51+
For example, the type `(int -> int) option` has the kind `value mod contended
52+
immutable non_float`. Here, the layout `value` indicates that members of this
53+
type have the shape of normal OCaml data, and the
54+
bounds `contended immutable` indicates that they can safely ignore the
55+
*contention* (any value of this type can be treated as if it were created in the
56+
current thread) and *visibility* axes (no part of this type is mutable, and so
57+
read/write protections do not matter). The `non_float` bound says that no value
58+
that has type `(int -> int) option` is a pointer to a floating-point number
59+
block.
60+
61+
As we see in this example, the *bounds* portion of the kind often has multiple
62+
components, and includes both modal and non-modal bounds. The type `int` has all
63+
the bounds:
64+
65+
```
66+
value mod global contended portable aliased many unyielding immutable stateless
67+
non_float external_
68+
```
69+
70+
This kind indicates that `int` mode crosses on all eight of our modal axes. In
71+
addition, `int`s are not `float`s and they do not need to be garbage-collected
72+
(they are `external_` to the garbage collector).
73+
74+
# The meaning of kinds
75+
76+
In additional to `value`, OxCaml supports layouts like `float64` (unboxed
77+
floating point numbers that are passed in SIMD registers), `bits64`
78+
and `bits32` (for types represented by unboxed/untagged integers) and product
79+
layouts like `float64 & bits32` (an unboxed pair that is passed in two
80+
registers). More detail on layouts and the unboxed types language feature can be
81+
found [here](../unboxed-types/index.md).
82+
83+
Modal bounds all correspond to modal axes, which are described in more detail in
84+
the [modes documentation](). The logic for which types can cross on which axes
85+
is specific to each axis, often involving both the semantic meaning of the mode
86+
and details of the implementation of related features in the OxCaml runtime. See
87+
the documentation for each mode to understand which types cross on its axis.
88+
89+
Formally, these are called modal _bounds_ because the represent upper or lower
90+
bounds on the appropriate modal axes. For _future_ modal axes (like portability
91+
and linearity), the kind records an upper bound on the mode of values of this
92+
type. For example, `int` is `mod portable` because if you have an `int` that is
93+
`nonportable`, it's safe to treat it as `portable`. For _past_ modal axes
94+
(like contention and uniqueness), the kind records a lower bound on
95+
_expectations_. For example, `int` is `mod contended` because in a place where
96+
an `uncontended` value is expected, it's still safe to use a `contended` int.
97+
98+
Why do past and future modal axes get different treatment in kinds? This is
99+
covered in the "Advanced Topics" section below, but isn't essential to
100+
understand for day-to-day use of the system.
101+
102+
The non-modal bounds encode several different properties; see the section
103+
"Non-modal bounds" below. They are called bounds because each non-modal axis
104+
still supports a sub-kinding relationship, where a type of a more specific kind
105+
can be used in place of a variable of a less specific kind.
106+
107+
# Subkinding
108+
109+
There is a partial order on kinds, which we'll write `k1 <= k2`. The
110+
relationship `k1 <= k2` holds when `k2` tells us less about a type than `k1`.
111+
Thus, it is always safe to use a type of kind `k1` where a type of kind `k2` is
112+
expected. There is a maximum kind, written `any`: this is the maximum layout
113+
but with no bounds.
114+
115+
As an example, `value mod portable <= value`. This means that if we know a type
116+
is represented as a normal OCaml value and mode crosses on the portability axis,
117+
it's safe to use the type where we just need a regular OCaml type but don't care
118+
how it interacts with portability. This relation forms a partial order because
119+
some kinds are unrelated, like `float64 mod contended` and `bits64 mod
120+
contended`, or `value mod portable` and `value mod aliased`.
121+
122+
Adding bounds to a kind always makes the kind more specific, or lower. That is,
123+
for any kind `k`, `k mod <bounds> <= k`.
124+
125+
Along the future modal axes, a lower mode leads to a lower kind. So `value mod
126+
stateless <= value mod observing`, and bounding by the maximum mode has no
127+
effect. However, along the past modal axes, a _higher_ mode leads to a lower
128+
kind. So `value mod contended <= value mod sharing` and bounding by the minimum
129+
mode has no effect. We can think of the past axes as flipped, when used in a
130+
kind. This is because `value mod contended` is more restrictive than `value mod
131+
sharing` (the former contains types that do not care at all about the value of
132+
the contention axis, while the latter contains types that still care about the
133+
distinction between `contended` and `sharing`/`uncontended`), and so we must
134+
flip these past axes (somewhat unfortunately).
135+
136+
If you want to get nerdy about it, each individual piece of a kind (the layout
137+
and each possible axis of bounds) is a join semilattice, and the order we're
138+
talking about here is the one corresponding to the product join semilattice that
139+
combines all these components. The order for each component is described in the
140+
documentation for that component.
141+
142+
# Inclusion and variance
143+
144+
This is accepted:
145+
146+
```ocaml
147+
module M1 : sig
148+
type t : value
149+
end = struct
150+
type t = int
151+
end
152+
```
153+
154+
This makes sense because the kind of `int` is `immediate`, which is a subkind of
155+
`value`. Even though users of `M1.t` will be expecting a `value`, the
156+
`immediate` they get works great. Thus, the kinds of type declarations are
157+
*covariant* in the module inclusion check: a module type `S1` is included in
158+
`S2` when the kind of a type `t` in `S1` is included in the kind of `t` in `S2`.
159+
160+
Similarly, this is accepted:
161+
162+
```ocaml
163+
module M2 : sig
164+
type ('a : immediate) t
165+
end = struct
166+
type ('a : value) t
167+
end
168+
```
169+
170+
This makes sense because users of `M2.t` are required to supply an `immediate`;
171+
even though the definition of `M2.t` expects a `value`, the `immediate` it gets
172+
works great. Thus, the kinds of type declaration arguments are *contravariant*
173+
in the module inclusion check: a module type `S1` is included in `S2` when the
174+
kind of the argument to a type `t` in `S2` is included in the kind of that
175+
argument in `S1`.
176+
177+
Contravariance in type arguments allows us to have
178+
179+
```ocaml
180+
module Array : sig
181+
type ('a : any) t = 'a array
182+
(* ... *)
183+
end
184+
```
185+
186+
and still pass `Array` to functors expecting a `type 'a t`, which assumes `('a :
187+
value)`.
188+
189+
Relatedly, a `with type` constraint on a signature can fill in a type with one
190+
that has a more specific kind. For example, this is legal:
191+
```ocaml
192+
module type S_any = sig
193+
type t : any
194+
end
195+
196+
module type S_imm = S_any with type t = int
197+
```
198+
199+
This can be particularly useful for common signatures that might be implemented
200+
by types with any kind (e.g., `Sexpable`).
201+
202+
# With-bounds
203+
204+
Sometimes the kind of a type constructor depends on the kinds of the types that
205+
instantiate its parameters. For example, the type `'a list` can mode cross on
206+
the portability axis if `'a` does.
207+
208+
We could have a `list` whose kind is restricted to work on types that more cross
209+
on the portability axis, to record this fact, as in:
210+
```ocaml
211+
type ('a : value mod portable) portable_list : value mod portable
212+
```
213+
But it would be annoying to have many different list types, and we certainly
214+
don't want to restrict the normal `list` type to work only on a subset of
215+
`value`s.
216+
217+
The solution to this problem is the with-bounds: a with-bound in a kind of a
218+
type makes that type not mode-cross whenever the with-bound also does not
219+
mode-cross. Add a bound to the `mod` section always *lowers* a kind, while
220+
adding a with-bound always *raises* a kind.
221+
222+
Here is the full kind of `'a list`:
223+
224+
```ocaml
225+
type 'a list
226+
: value mod contended portable many immutable stateless unyielding with 'a
227+
```
228+
229+
also written (see our [syntax page](syntax.md) for details on `immutable_data`)
230+
231+
```ocaml
232+
type 'a list : immutable_data with 'a
233+
```
234+
235+
We can think of this as saying that the `list` type itself is safe to cross all
236+
those axes, but still contains data of type `'a`. Because modes are deep,
237+
allowing `'a list` to mode-cross just because the `list` structure itself can
238+
mode-cross would be wrong: the elements would cross along with the list! We thus
239+
state in the with-bounds that `'a list` contains `'a` -- that's the intuition
240+
behind the `with` syntax.
241+
242+
Looking at examples of `list`, we would have `int list : immutable_data`
243+
(because `int` mode-crosses everything) but `(int -> int) ref list : value`,
244+
because `(int -> int) ref` mode-crosses nothing.
245+
246+
## Modalities in with-bounds
247+
248+
Fields in a record or constructor can contain *modalities*, as described in
249+
our [modes documentation](). To get maximal mode-crossing, these modalities need
250+
to be reflected in the with-bounds as well. For example, we can have
251+
252+
```ocaml
253+
type 'a portended = Portend of 'a @@ portable contended
254+
```
255+
256+
The kind of this is `immutable_data with 'a @@ portable contended`. The
257+
modalities in the with-bound indicate that we know `'a` must be portable and
258+
contended, and thus it does not matter whether the type substituted for `'a`
259+
crosses these modes. So, for example, `int ref portended` still crosses to
260+
`contended`, because the `@@ contended` protects the `with int ref @@ portable
261+
contended` from affecting the contention axis.
262+
263+
## Kind equivalence
264+
265+
Now that we have with-bounds, we can see that there is a rich equivalence
266+
relation on kinds. For example, `value mod portable with int` is equivalent to
267+
`value mod portable`, because `int` mode-crosses portability. Similarly, `value
268+
mod portable with (int -> int) @@ portable` is also equivalent to `value mod
269+
portable`, because the `@@ portable` says that the with-bound cannot affect
270+
portability. Another example is that `value mod portable with (int -> int)` is
271+
equivalent to `value`, because `int -> int` does *not* cross portability.
272+
273+
The OxCaml compiler aggressively *normalizes* kinds to find a minimal kind that
274+
is equivalent to an original one. This normalization procedure is also used
275+
during kind-checking to tell whether one kind is a subkind of another.
276+
277+
An interesting case in normalization is around abstract types. If we have `type
278+
'a abstract` and we have a kind `value mod portable with (int -> int) abstract`,
279+
can we normalize to `value`? We cannot know without knowing what `abstract`
280+
might become. Perhaps we substitute it for a type that ignores its type
281+
argument; then `value mod portable with (int -> int) abstract` is equivalent to
282+
`value mod portable`. On the other hand, perhaps we substitute it for a type
283+
that stores its argument. Then `value mod portable with (int -> int) abstract`
284+
is equivalent to `value`. Accordingly, you might see abstract types appear in
285+
with-bounds, because we are unable to normalize them away until we actually
286+
learn what type is being used.
287+
288+
# Advanced topics
289+
290+
## Why are past and future modal bounds different?
291+
292+
Historically, we thought of modal bounds in kinds as being upper bounds on the
293+
mode of values of the type. This turns out to be the correct meaning only for
294+
the future axes. For past axes, modal bounds in kinds are instead _lower bounds_
295+
on _expected modes_.
296+
297+
This distinction only matters for modal axes with at least three values.
298+
Consider the contention axis, where `uncontended < shared < contended`. What
299+
does a kind like `value mod shared` mean?
300+
301+
Our answer is that it's the kind of a type like:
302+
```ocaml
303+
type 'a t = { shared : 'a @@ shared } [@@unboxed]
304+
```
305+
That is, the meaning of `value mod shared` is related to types using `shared` as
306+
a modality.
307+
308+
The type `t` is a "box" containing something we know is shared. How do you use
309+
this type? In short, this is a box that knows that its contents are `shared`,
310+
even when the box itself is `uncontended`. If we have a `t @ uncontended`, the
311+
mode of `t.shared` is `shared`. But if we have a `t @ contended`, the mode of
312+
`t.shared` must still be `contended` - this `t` may have come from another
313+
thread, and for thread safety its contents must now be treated as contended
314+
regardless of their mode when they went into the box.
315+
316+
So, for this type, we don't care about the difference between `uncontended` and
317+
`shared`, but we do care about the difference between `shared` and `contended`.
318+
If we thought of `mod shared` as representing a upper bound on the mode, that
319+
would suggest we can treat `contended` things as `shared`, which we've just seen
320+
is wrong.
321+
322+
Instead, for past axes, a modal bound is _lower bound_ on expectations: `mod
323+
shared` means that even if you're expecting to get an `uncontended` thing of
324+
this type, a `shared` thing is just as good.
325+
326+
## Existential types in with-bounds
327+
328+
Consider the type
329+
330+
```ocaml
331+
type t = K : 'a * ('a -> int) -> t
332+
```
333+
334+
What should the kind of this type be? Naively, we just put all the fields
335+
in a type in its with-bounds, meaning we'd infer
336+
337+
```ocaml
338+
type t : immutable_data with 'a with ('a -> int)
339+
```
340+
341+
But this would be bad: that `'a` is not in scope. Instead, we recognize that
342+
we will never learn more about `'a` (e.g. by substitutions), and thus that
343+
the kind of `t` should just consider `'a` to be any old `value`. In this case,
344+
then, we can produce the kind `value` for t.
345+
346+
However, what about this case:
347+
348+
```ocaml
349+
type 'a abstract
350+
type t = K : 'a abstract -> t
351+
```
352+
353+
Now what can we do? We want to say that `'a` is like any old `value`, but what
354+
concretely does that look like? Our solution is to invent a way to say "any old
355+
`value`", which we write `(type : value)`. So the kind of this `t` is
356+
`immutable_data with (type : value) abstract`. This way, if we later learn that
357+
`abstract` ignores its argument, we can get a kind of `immutable_data`. If
358+
`abstract` stores its argument, we can get a kind of `value`.
359+
360+
The type `(type : <<kind>>)` can actually be used anywhere a type can be
361+
written, but the type is uninhabited and useful only in the context of a
362+
with-bound.

0 commit comments

Comments
 (0)