Skip to content

Clean up skolemization with placeholder #217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 25, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/appendix/glossary.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ rib | a data structure in the name resolver that keeps trac
sess | the compiler session, which stores global data used throughout compilation
side tables | because the AST and HIR are immutable once created, we often carry extra information about them in the form of hashtables, indexed by the id of a particular node.
sigil | like a keyword but composed entirely of non-alphanumeric tokens. For example, `&` is a sigil for references.
skolemization | a way of handling subtyping around "for-all" types (e.g., `for<'a> fn(&'a u32)`) as well as solving higher-ranked trait bounds (e.g., `for<'a> T: Trait<'a>`). See [the chapter on skolemization and universes](../borrow_check/region_inference.html#skol) for more details.
placeholder | **NOTE: skolemization is deprecated by placeholder** a way of handling subtyping around "for-all" types (e.g., `for<'a> fn(&'a u32)`) as well as solving higher-ranked trait bounds (e.g., `for<'a> T: Trait<'a>`). See [the chapter on placeholder and universes](../borrow_check/region_inference.html#placeholder) for more details.
soundness | soundness is a technical term in type theory. Roughly, if a type system is sound, then if a program type-checks, it is type-safe; i.e. I can never (in safe rust) force a value into a variable of the wrong type. (see "completeness").
span | a location in the user's source code, used for error reporting primarily. These are like a file-name/line-number/column tuple on steroids: they carry a start/end point, and also track macro expansions and compiler desugaring. All while being packed into a few bytes (really, it's an index into a table). See the Span datatype for more.
substs | the substitutions for a given generic type or item (e.g. the `i32`, `u32` in `HashMap<i32, u32>`)
Expand Down
83 changes: 41 additions & 42 deletions src/borrow_check/region_inference.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,19 +78,19 @@ The kinds of region elements are as follows:
etc) control-flow graph.
- Similarly, there is an element denoted `end('static)` corresponding
to the remainder of program execution after this function returns.
- There is an element `!1` for each skolemized region `!1`. This
- There is an element `!1` for each placeholder region `!1`. This
corresponds (intuitively) to some unknown set of other elements –
for details on skolemization, see the section
[skolemization and universes](#skol).
for details on placeholders, see the section
[placeholders and universes](#placeholder).

## Causal tracking

*to be written* – describe how we can extend the values of a variable
with causal tracking etc

<a name="skol"></a>
<a name="placeholder"></a>

## Skolemization and universes
## Placeholders and universes

(This section describes ongoing work that hasn't landed yet.)

Expand All @@ -117,7 +117,7 @@ for its argument, and `bar` wants to be given a function that that
accepts **any** reference (so it can call it with something on its
stack, for example). But *how* do we reject it and *why*?

### Subtyping and skolemization
### Subtyping and Placeholders

When we type-check `main`, and in particular the call `bar(foo)`, we
are going to wind up with a subtyping relationship like this one:
Expand All @@ -129,10 +129,9 @@ the type of `foo` the type `bar` expects
```

We handle this sort of subtyping by taking the variables that are
bound in the supertype and **skolemizing** them: this means that we
replace them with
bound in the supertype and replacing them with
[universally quantified](../appendix/background.html#quantified)
representatives, written like `!1`. We call these regions "skolemized
representatives, written like `!1`. We call these regions "placeholder
regions" – they represent, basically, "some unknown region".

Once we've done that replacement, we have the following relation:
Expand Down Expand Up @@ -163,7 +162,7 @@ should yield up an error (eventually).

### What is a universe

In the previous section, we introduced the idea of a skolemized
In the previous section, we introduced the idea of a placeholder
region, and we denoted it `!1`. We call this number `1` the **universe
index**. The idea of a "universe" is that it is a set of names that
are in scope within some type or at some point. Universes are formed
Expand Down Expand Up @@ -198,7 +197,7 @@ fn bar<'a, T>(t: &'a T) {
```

Here, the name `'b` is not part of the root universe. Instead, when we
"enter" into this `for<'b>` (e.g., by skolemizing it), we will create
"enter" into this `for<'b>` (e.g., by replacing it with a placeholder), we will create
a child universe of the root, let's call it U1:

```text
Expand Down Expand Up @@ -274,25 +273,25 @@ Here, the only way for the two foralls to interact would be through X,
but neither Y nor Z are in scope when X is declared, so its value
cannot reference either of them.

### Universes and skolemized region elements
### Universes and placeholder region elements

But where does that error come from? The way it happens is like this.
When we are constructing the region inference context, we can tell
from the type inference context how many skolemized variables exist
from the type inference context how many placeholder variables exist
(the `InferCtxt` has an internal counter). For each of those, we
create a corresponding universal region variable `!n` and a "region
element" `skol(n)`. This corresponds to "some unknown set of other
elements". The value of `!n` is `{skol(n)}`.
element" `placeholder(n)`. This corresponds to "some unknown set of other
elements". The value of `!n` is `{placeholder(n)}`.

At the same time, we also give each existential variable a
**universe** (also taken from the `InferCtxt`). This universe
determines which skolemized elements may appear in its value: For
example, a variable in universe U3 may name `skol(1)`, `skol(2)`, and
`skol(3)`, but not `skol(4)`. Note that the universe of an inference
determines which placeholder elements may appear in its value: For
example, a variable in universe U3 may name `placeholder(1)`, `placeholder(2)`, and
`placeholder(3)`, but not `placeholder(4)`. Note that the universe of an inference
variable controls what region elements **can** appear in its value; it
does not say region elements **will** appear.

### Skolemization and outlives constraints
### Placeholders and outlives constraints

In the region inference engine, outlives constraints have the form:

Expand All @@ -313,44 +312,44 @@ are present in `value(V2)` and we add those nodes to `value(V1)`. If
we reach a return point, we add in any `end(X)` elements. That part
remains unchanged.

But then *after that* we want to iterate over the skolemized `skol(x)`
But then *after that* we want to iterate over the placeholder `placeholder(x)`
elements in V2 (each of those must be visible to `U(V2)`, but we
should be able to just assume that is true, we don't have to check
it). We have to ensure that `value(V1)` outlives each of those
skolemized elements.
placeholder elements.

Now there are two ways that could happen. First, if `U(V1)` can see
the universe `x` (i.e., `x <= U(V1)`), then we can just add `skol(x)`
the universe `x` (i.e., `x <= U(V1)`), then we can just add `placeholder(x)`
to `value(V1)` and be done. But if not, then we have to approximate:
we may not know what set of elements `skol(x)` represents, but we
we may not know what set of elements `placeholder(x)` represents, but we
should be able to compute some sort of **upper bound** B for it –
some region B that outlives `skol(x)`. For now, we'll just use
some region B that outlives `placeholder(x)`. For now, we'll just use
`'static` for that (since it outlives everything) – in the future, we
can sometimes be smarter here (and in fact we have code for doing this
already in other contexts). Moreover, since `'static` is in the root
universe U0, we know that all variables can see it – so basically if
we find that `value(V2)` contains `skol(x)` for some universe `x`
we find that `value(V2)` contains `placeholder(x)` for some universe `x`
that `V1` can't see, then we force `V1` to `'static`.

### Extending the "universal regions" check

After all constraints have been propagated, the NLL region inference
has one final check, where it goes over the values that wound up being
computed for each universal region and checks that they did not get
'too large'. In our case, we will go through each skolemized region
and check that it contains *only* the `skol(u)` element it is known to
'too large'. In our case, we will go through each placeholder region
and check that it contains *only* the `placeholder(u)` element it is known to
outlive. (Later, we might be able to know that there are relationships
between two skolemized regions and take those into account, as we do
between two placeholder regions and take those into account, as we do
for universal regions from the fn signature.)

Put another way, the "universal regions" check can be considered to be
checking constraints like:

```text
{skol(1)}: V1
{placeholder(1)}: V1
```

where `{skol(1)}` is like a constant set, and V1 is the variable we
where `{placeholder(1)}` is like a constant set, and V1 is the variable we
made to represent the `!1` region.

## Back to our example
Expand All @@ -365,7 +364,7 @@ fn(&'static u32) <: fn(&'!1 u32) @ P // this point P is not imp't here
The region inference engine will create a region element domain like this:

```text
{ CFG; end('static); skol(1) }
{ CFG; end('static); placeholder(1) }
--- ------------ ------- from the universe `!1`
| 'static is always in scope
all points in the CFG; not especially relevant here
Expand All @@ -377,7 +376,7 @@ will have initial values like so:

```text
Vs = { CFG; end('static) } // it is in U0, so can't name anything else
V1 = { skol(1) }
V1 = { placeholder(1) }
```

From the subtyping constraint above, we would have an outlives constraint like
Expand All @@ -390,7 +389,7 @@ To process this, we would grow the value of V1 to include all of Vs:

```text
Vs = { CFG; end('static) }
V1 = { CFG; end('static), skol(1) }
V1 = { CFG; end('static), placeholder(1) }
```

At that point, constraint propagation is complete, because all the
Expand All @@ -411,7 +410,7 @@ for<'a> fn(&'a u32, &'a u32)
for<'b, 'c> fn(&'b u32, &'c u32)
```

Here we would skolemize the supertype, as before, yielding:
Here we would replace the bound region in the supertype with a placeholder, as before, yielding:

```text
for<'a> fn(&'a u32, &'a u32)
Expand Down Expand Up @@ -476,7 +475,7 @@ an error. That's good: the problem is that we've gone from a fn that promises
to return one of its two arguments, to a fn that is promising to return the
first one. That is unsound. Let's see how it plays out.

First, we skolemize the supertype:
First, we replace the bound region in the supertype with a placeholder:

```text
for<'a> fn(&'a u32, &'a u32) -> &'a u32
Expand Down Expand Up @@ -512,26 +511,26 @@ V3: V1
Those variables will have these initial values:

```text
V1 in U1 = {skol(1)}
V2 in U2 = {skol(2)}
V1 in U1 = {placeholder(1)}
V2 in U2 = {placeholder(2)}
V3 in U2 = {}
```

Now because of the `V3: V1` constraint, we have to add `skol(1)` into `V3` (and
Now because of the `V3: V1` constraint, we have to add `placeholder(1)` into `V3` (and
indeed it is visible from `V3`), so we get:

```text
V3 in U2 = {skol(1)}
V3 in U2 = {placeholder(1)}
```

then we have this constraint `V2: V3`, so we wind up having to enlarge
`V2` to include `skol(1)` (which it can also see):
`V2` to include `placeholder(1)` (which it can also see):

```text
V2 in U2 = {skol(1), skol(2)}
V2 in U2 = {placeholder(1), placeholder(2)}
```

Now constraint propagation is done, but when we check the outlives
relationships, we find that `V2` includes this new element `skol(1)`,
relationships, we find that `V2` includes this new element `placeholder(1)`,
so we report an error.

2 changes: 1 addition & 1 deletion src/traits/associated-types.md
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ The key point is that, on success, unification can also give back to
us a set of subgoals that still remain to be proven (it can also give
back region constraints, but those are not relevant here).

Whenever unification encounters an (unskolemized!) associated type
Whenever unification encounters an (un-placeholder!) associated type
projection P being equated with some other type T, it always succeeds,
but it produces a subgoal `ProjectionEq(P = T)` that is propagated
back up. Thus it falls to the ordinary workings of the trait system
Expand Down
6 changes: 3 additions & 3 deletions src/traits/caching.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,10 +11,10 @@ but *replay* its effects on the type variables.
## An example

The high-level idea of how the cache works is that we first replace
all unbound inference variables with skolemized versions. Therefore,
all unbound inference variables with placeholder versions. Therefore,
if we had a trait reference `usize : Foo<$t>`, where `$t` is an unbound
inference variable, we might replace it with `usize : Foo<$0>`, where
`$0` is a skolemized type. We would then look this up in the cache.
`$0` is a placeholder type. We would then look this up in the cache.

If we found a hit, the hit would tell us the immediate next step to
take in the selection process (e.g. apply impl #22, or apply where
Expand All @@ -37,7 +37,7 @@ we would [confirm] `ImplCandidate(22)`, which would (as a side-effect) unify
[confirm]: ./resolution.html#confirmation

Now, at some later time, we might come along and see a `usize :
Foo<$u>`. When skolemized, this would yield `usize : Foo<$0>`, just as
Foo<$u>`. When replaced with a placeholder, this would yield `usize : Foo<$0>`, just as
before, and hence the cache lookup would succeed, yielding
`ImplCandidate(22)`. We would confirm `ImplCandidate(22)` which would
(as a side-effect) unify `$u` with `isize`.
Expand Down
32 changes: 16 additions & 16 deletions src/traits/hrtb.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ bounds*. An example of such a bound is `for<'a> MyTrait<&'a isize>`.
Let's walk through how selection on higher-ranked trait references
works.

## Basic matching and skolemization leaks
## Basic matching and placeholder leaks

Suppose we have a trait `Foo`:

Expand Down Expand Up @@ -36,20 +36,20 @@ to the subtyping for higher-ranked types (which is described [here][hrsubtype]
and also in a [paper by SPJ]. If you wish to understand higher-ranked
subtyping, we recommend you read the paper). There are a few parts:

**TODO**: We should define _skolemize_.

1. _Skolemize_ the obligation.
2. Match the impl against the skolemized obligation.
3. Check for _skolemization leaks_.
1. Replace bound regions in the obligation with placeholders.
2. Match the impl against the [placeholder] obligation.
3. Check for _placeholder leaks_.

[placeholder]: ../appendix/glossary.html#appendix-c-glossary
[hrsubtype]: https://github.com/rust-lang/rust/tree/master/src/librustc/infer/higher_ranked/README.md
[paper by SPJ]: http://research.microsoft.com/en-us/um/people/simonpj/papers/higher-rank/

So let's work through our example.

1. The first thing we would do is to
skolemize the obligation, yielding `AnyInt : Foo<&'0 isize>` (here `'0`
represents skolemized region #0). Note that we now have no quantifiers;
replace the bound region in the obligation with a placeholder, yielding
`AnyInt : Foo<&'0 isize>` (here `'0` represents placeholder region #0).
Note that we now have no quantifiers;
in terms of the compiler type, this changes from a `ty::PolyTraitRef`
to a `TraitRef`. We would then create the `TraitRef` from the impl,
using fresh variables for it's bound regions (and thus getting
Expand All @@ -59,10 +59,10 @@ using fresh variables for it's bound regions (and thus getting
we relate the two trait refs, yielding a graph with the constraint
that `'0 == '$a`.

3. Finally, we check for skolemization "leaks" – a
leak is basically any attempt to relate a skolemized region to another
skolemized region, or to any region that pre-existed the impl match.
The leak check is done by searching from the skolemized region to find
3. Finally, we check for placeholder "leaks" – a
leak is basically any attempt to relate a placeholder region to another
placeholder region, or to any region that pre-existed the impl match.
The leak check is done by searching from the placeholder region to find
the set of regions that it is related to in any way. This is called
the "taint" set. To pass the check, that set must consist *solely* of
itself and region variables from the impl. If the taint set includes
Expand All @@ -78,7 +78,7 @@ impl Foo<&'static isize> for StaticInt;

We want the obligation `StaticInt : for<'a> Foo<&'a isize>` to be
considered unsatisfied. The check begins just as before. `'a` is
skolemized to `'0` and the impl trait reference is instantiated to
replaced with a placeholder `'0` and the impl trait reference is instantiated to
`Foo<&'static isize>`. When we relate those two, we get a constraint
like `'static == '0`. This means that the taint set for `'0` is `{'0,
'static}`, which fails the leak check.
Expand Down Expand Up @@ -111,16 +111,16 @@ Now let's say we have a obligation `Baz: for<'a> Foo<&'a isize>` and we match
this impl. What obligation is generated as a result? We want to get
`Baz: for<'a> Bar<&'a isize>`, but how does that happen?

After the matching, we are in a position where we have a skolemized
After the matching, we are in a position where we have a placeholder
substitution like `X => &'0 isize`. If we apply this substitution to the
impl obligations, we get `F : Bar<&'0 isize>`. Obviously this is not
directly usable because the skolemized region `'0` cannot leak out of
directly usable because the placeholder region `'0` cannot leak out of
our computation.

What we do is to create an inverse mapping from the taint set of `'0`
back to the original bound region (`'a`, here) that `'0` resulted
from. (This is done in `higher_ranked::plug_leaks`). We know that the
leak check passed, so this taint set consists solely of the skolemized
leak check passed, so this taint set consists solely of the placeholder
region itself plus various intermediate region variables. We then walk
the trait-reference and convert every region in that taint set back to
a late-bound region, so in this case we'd wind up with
Expand Down