Skip to content

Path Dependent Types

Michael Coblenz edited this page Jun 19, 2017 · 8 revisions

Some sort of path-dependent type (whereby values can have type members) seems empirically useful in contracts. A general case where this issue arises is when a contract C defines some resource R; the simplest example of this pattern is a Money/Treasury contract, whereby a Treasury distributes currency (in practice, presumably with some underlying value).

owned contract Money {
  int value;
  /* assume that the ability to call this is restricted to the Treasury */
  protected Money(int v) {
    value = v;
  }

  /* when called on [m1], modifies [m1] to [m1'] and returns [m2],
   * such that [m2.value == amount] and [m1.value == m1'.value + m2.value].
   * If [amount > value], throws an exception. */
  transaction split(int amount) returns Money { ... }

  /* when called on [m1], consumes [m2] and modifies [m1] to [m1'],
   * such that [m1'.value == m1.value + m2.value]. */
  transaction merge(Money m2) { ... }
}

shared contract Treasury {
  Money reserves;

  /* the treasury will mint new money if it's out of money */
  transaction mint_new_money() {
    if (reserves.value == 0) {
      reserves.merge(new Money(10000));
    }
  }

  /* The treasury will give 1 Money to anyone who asks */
  transaction distribute_money() returns Money {
    return reserves.split(1);
  }
}

The above example is subtly problematic, however, because, although the Treasury contract can be constructed multiple times, all Treasury contracts share the same Money type. This defies what one would intuitively expect, namely that each instance of Treasury has its own type of Money.

There are several high-level options for adding support for this in the language.

Option 1: Manual Dynamic Checks

In the absence of any language specific mechanism for value-specific types, the programmer can still distinguish at run-time whether two instances of Money are the same by storing a reference back to their Treasury:

owned contract Money {
  int value;
  Treasury treasury;
  protected Money(int v, Treasury t) {
    value = v;
    treasury = t;
  }

  ...

  transaction merge(Money m2) {
    if (this.treasury != m2.treasury) throw;
    ...  
  }
}

Option 2: Compiler-Generated Dynamic Checks

First of all, to generate checks automatically, it must be known to the compiler which types are related in this way. One way to do this simply with nesting:

contract Treasury {
  contract Money {
    ...
  }
  ...
}

The desired semantics of this is that each instance of t : Treasury has its own distinct type t.Money (even though the implementation for t1.Money and t2.Money may be the same).

We can make two reasonable assumptions here that allow us to generate checks automatically:

  1. Assume recursive references to other members of Money inside the definition of Money must be associated with the same Treasury instance as this.
  2. Assume any reference to Money inside the definition of Treasury refers to this.Money.

Assumption (1) would allow roughly the following check to be generated in the Money contract:

contract Money {
  ...
  transaction merge(Money m2) {
    ...
  }
}

==>

contract Money {
  Treasury t;
  ...
  transaction merge(Money m2) {
    if (this.treasury != m2.treasury) throw;
    ...  
  }
}

On the other hand, assumption (2) would allow roughly the following:

contract Treasury {
  contract Money { ... }
  transaction take_back_money(Money m) { ... }
  ...
}

==>

contract Treasury {
  contract Money {
    Treasury t;
    ...
  }
  transaction take_back_money(Money m) {
    if (m.t != this) throw;
    ...
  }
  ...
}

In practice, these assumptions likely hold true for most contracts. In the presumably rare case where these assumptions don't hold, a user can define the contracts separately and manually add checks (Option 1).

With these generated checks, the Money and Treasury contracts are insulated from mistypings, in the sense that dynamic errors are transparent to these contracts, and are only seen by callers when they pass incorrectly typed arguments. For other contracts that depend on Money/Treasury, this is perhaps not the case:

contract C {
  state No_Money {
    transaction add_money(Money m) { ->One_Money with {m1 = m}; }
  }
  state One_Money {
    Money m1;
    transaction add_money(Money m) { ->Two_Money with {m2 = m}; }
  }
  state Two_Money {
    Money m1, m2;
    transaction combine_money() { ->One_Money with {m1 = m1.merge(m2)}; }
  }
}

In the state Two_Money, the values of m1 and m2 may be different Money types, but it's not until they are merged in the combine_money transaction that this will actually manifest as an error. Indeed, if this is the case, the contract becomes stuck in the Two_Money state, which is non-obvious from looking at the code itself.

A slight improvement to this scheme might allow a contract to specify statically which instance of Money they mean (perhaps at deployment time):

/* at deployment time, this might be disambiguated to be specifically [t.Money] for some [t] */
import Money;

contract C {
  transaction t(Money m) {
    /* we can generate checks here that [m] is type [t.Money], much like inside of [Treasury] */
    ...
  }
}

Option 3: Statically Checked Types

Up until now, we have been referring to types of the form t.Money where t : Treasury only in the description of the options, rather than in the language itself. We can also have these in the language, however, and allow the programmer to use these types explicitly.

This doesn't realistically change anything for the Money and Treasury contracts when compared with Option 2: we can simply allow the type Money in these contracts to refer to the type of this and the type this.Money, respectively. However, the programmer can also refer to another instance of Money by explicitly using the type t_other.Money.

The same sort of elision can be made when another contract uses Money, but statically disambiguates which Treasury instance the type comes from. In other cases, however, the contract programmer will be required to specify t.Money for some value t. Using an example from before:

contract C {
  Treasury t;
  state No_Money {
    transaction add_money(t.Money m) { ->One_Money with {m1 = m}; }
  }
  state One_Money {
    t.Money m1;
    transaction add_money(t.Money m) { ->Two_Money with {m2 = m}; }
  }
  state Two_Money {
    t.Money m1, m2;
    transaction combine_money() { ->One_Money with {m1 = m1.merge(m2)}; }
  }
}

Since m1 and m2 are guaranteed to be the same type in the state Two_Money, the combine_money transaction must succeed.

Care must be taken that t does not change when a type predicated on t is in scope, although the object referred to by t may be mutated. This seems reasonably easy to statically check.

Some sort of local alias analysis could be used to infer whether two types t1.Money and t2.Money are the same, despite having different names. For example:

contract C {
  state S1 {
    transaction do_stuff(Treasury t, t.Money m) {
      /* although it's not immediately clear that [m] is of type [treasury.Money],
       * this can be inferred in this case by noting that [treasury == t] */
      ->S2 with { treasury = t, money = m };
    }
  }
  state S2 {
    Treasury treasury;
    treasury.Money money;
  }
}

Options

There are various design questions to consider here.

Subtyping

contract Client {
    Bank b1 = new Bank();
    Bank b2 = new Bank();
    b1.Money m1 = b1.withdraw(1); // OK
    Money m = m1; // *Is this allowed? Is Money a supertype of b1.Money?*
    b2.deposit(m1); // COMPILE ERROR: b1.Money is not convertible to b2.Money
    b2.deposit(m); // COMPILE ERROR: Money is not convertible to b2.Money
}

Ownership

The examples above use scoping to denote path dependency. I (mcoblenz) prefer to do this more explicitly so that inner contracts can be used for types that are not path-dependent.

Here, resource contract A owned by B specifies that each B has a distinct type A.

resource contract Money owned by Bank;
contract Bank {
    transaction deposit (Money m);
    transaction withdraw (int amount) returns Money;
}

The approach below specifies the path dependency at the use of the type, admitting non-path-dependent uses of the Money type:

resource contract Money;
contract Bank {
    dependent type BankMoney is kind of Money;

    transaction deposit (BankMoney m);
    transaction withdraw (int amount) returns BankMoney;
    transaction exchange (Money m) returns BankMoney; // takes any kind of money and converts it to BankMoney
}

Orthogonal Issues/Extensions

  • contracts could have arbitrary type members, not just ones that are defined inside the contract. For example:
contract C {
  type T;
  T t;
  C(Treasury t) {
    T = t.Money;
    t = t.distribute_money();
  }
}

Indeed, this may be necessary in order to faithfully represent a contract via an interface.

  • If the cases where we want path dependent types coincide with the cases where we want to limit construction, we may combine these two concepts. For example:
contract C {
  contract R { ... }
  ...
}

The above code might make the precise type of R dependent on C, while also disallowing construction of c.R outside of transactions on c. Circumstantially, it seems like these two concepts often align, in that the motivation for making R dependent on the value of c : C seems primarily to prevent another instance c' : C from forging instances of c.R: but if c.R could be constructed freely anyways (i.e. if construction wasn't limited to the transactions of c) then the restrictions introduced by dependently typing R wouldn't make any substantial difference.

Informal Evaluation

Option 2 is strictly more capable than Option 1, so there's little reason to select Option 1, unless we find that the assumptions made in Option 2 to generate checks are confusing for users.

Option 3 adds some additional annotation burden to the mix, but the kind of dynamic errors possible in Option 2 would likely go overlooked, especially by new users. To some extent, there are orthogonal factors that determine how usable Option 3 will be (in particular, the thoroughness of the local aliasing analysis that we employ).

Option 3 has other issues as well. In some sense, it breaks encapsulation, since arguments and return values might refer to a specific value. This is okay from the contract's perspective:

contract C {
  Treasury t;
  transaction do_stuff(t.Money m) returns t.Money { return m; }
}

but, when viewed from an interface:

interface C {
  /* [t.Money] doesn't make sense here without defining [t],
   * but defining fields in an interface seems wrong */
  Treasury t;
  transaction do_stuff(t.Money m) returns t.Money;
}

interface C {
  /* We could abstract away the actual field [t], but then we have types
   * dependent on entire expressions, which seems problematic in the general
   * case: e.g. it's not clear from the definition below that [get_treasury]
   * will return the same thing every time */
  transaction do_stuff(get_treasury().Money m) returns get_treasury().Money;
  function get_treasury() returns Treasury;
}

interface C {
  /* the particular type [t.Money] could also be made a sort of type member,
   * but this requires support for arbitrary type members, and a way of saying
   * that a type member satisfies the interface  */
  type t_Money implements Money;
  transaction do_stuff(t_Money m) returns t_Money;
}

An additional concern: although soundness doesn't appear to be an issue for Option 3 at the moment, it may become an issue if we add other complex language features like inheritance, etc.