Imp: types

This is a part of a series - start at the beginning.

Before looking at the interpreter, I'm going to introduce a very simple type system. The type system has two jobs.

It rules out sets that contain tuples of varied lengths. The denotational semantics allow sets like (1 x 2) | 3, but it's confusing in practice and banning it allows us to make several simplifying assumptions in the interpreter.
It tracks constructive proofs that a value is finite and disallows certain operations on possibly-infinite values. This both guides the evaluation strategy and allows the programmer to reason about what kinds of expressions can be evaluated without having to look at the entire program.

I'll define everything formally first, but feel free to skip to the interactive examples at the bottom.

Semantics

The type system is really just an abstract interpreter. Every type corresponds to a set of possible values.

\[ \gdef\D#1{\llbracket #1 \rrbracket} \gdef\type#1{\mathsf{type}(#1)} \gdef\typeenv#1{\mathsf{type}_{\env}(#1)} \gdef\env{\mathsf{env}} \gdef\values#1{\llbracket #1 \rrbracket ^ T} \gdef\Expression{\mathsf{Expression}} \gdef\Name{\mathsf{Name}} \gdef\ValueType{\mathsf{ValueType}} \gdef\Value{\mathsf{Value}} \gdef\error{\mathsf{error}} \gdef\ScalarType{\mathsf{ScalarType}} \gdef\code#1{\textcolor{darkblue}{\texttt{#1}}} \gdef\match{\mathsf{match}} \gdef\indent{\space\space\space\space\space\space\space\space} \]

\[ \begin{aligned} \ScalarType =&\space \code{any} \\ \ValueType =&\space \code{none} \\|&\space \code{maybe} \\|&\space \ScalarType \code{ x } \ValueType \\|&\space \ScalarType \code{ -> } \ValueType \end{aligned} \\ \space\\ \llbracket \cdot \rrbracket ^ T:\space \ValueType \rightarrow 2^\Value \\ \space\\ \begin{aligned} \values{\code{none}} &= \{ \{\} \} \\ \values{\code{maybe}} &= \{ \{\} , \{()\} \} \\ \values{\code{any x T}} &= \{v \mid v \in \Value, \\ &\space\space\space\space\space\space\space\space\space\space\space\space\space \forall t \in v \ldotp (\mathsf{length}(t) \geq 1), \\ &\space\space\space\space\space\space\space\space\space\space\space\space\space \forall s \in \mathsf{Scalar} \ldotp \mathsf{project}(v, s) \in \values{\code{T}} \}, \\ &\space\space\space\space\space\space\space\space\space\space\space\space\space \mathsf{is\_finite}(\mathsf{first}(v)) \} \\ \values{\code{any -> T}} &= \{v \mid v \in \Value, \\ &\space\space\space\space\space\space\space\space\space\space\space\space\space \forall t \in v \ldotp (\mathsf{length}(t) \geq 1), \\ &\space\space\space\space\space\space\space\space\space\space\space\space\space \forall s \in \mathsf{Scalar} \ldotp \mathsf{project}(v, s) \in \values{\code{T}} \} \\ \space\\ \mathsf{first}(v) &= \{ t[1] \mid t \in v \} \\ \mathsf{project}(v, s) &= \{ t[2..] \mid t \in v, t[1] = s \} \\ \end{aligned}\\ \]

If the type system is correct, we should find that the value of every expression is contained in the type of that expression.

\[ \forall \code{e} \ldotp \D{\code{e}} \in \values{\type{\code{e}}} \]

Inference

The type system is simple enough to define it as a function rather than as a set of inference rules.

I'll use a magical pseudo-Rust that can quasiquote imp expressions like expr!(a | b) and imp types like typ!(any x t), and where foo? is the standard Rust syntax which propagates errors to the calling function - it expands to roughly:

match foo {
  Ok(value) => value,
  Err(err) => return Err(err),
}

Without further ado:

fn typ(env: Fn(Name) -> ValueType, expr: Expression) -> Result<ValueType, Error> {
    Ok(match expr {
        expr!(none) => typ!(none),
        expr!(some) => typ!(maybe),
        expr!(s) if is_scalar(s) => typ!(any x maybe),
        expr!(a | b) => typ_union(typ(env, a)?, typ(env, b)?)?,
        expr!(a & b) => typ_intersect(typ(env, a)?, typ(env, b)?)?,
        expr!(a x b) => typ_product(typ(env, a)?, typ(env, b)?),
        expr!(!a) => {
            match typ(env, a)? {
              typ!(_ -> _) => Err(..)?,
              _ => typ!(maybe),
            }
        }
        expr!(a = b) => {
            match (typ(env, a)?, typ(env, b)?) {
              (typ!(_ -> _), _) => Err(..)?,
              (_, typ!(_ -> _)) => Err(..)?
              _ => typ!(maybe),
            }
        }
        expr!(n) if is_name(n) => env(n),
        expr!(let n = a in b) => {
            let ta = typ(env, a)?;
            let env2 = |n| if n == n { ta } else { env(n) };
            typ(env2, expr!(b))
        }
        expr!(a b) => typ_apply(typ(env, a)?, typ(env, b)?)?,
        expr!(n -> a) => {
            let ta = typ(env, a)?;
            match ta {
              typ!(none) -> typ!(none),
              _ => typ!(any -> ta),
            }
        }
    })
}

fn typ_union(ta: ValueType, tb: ValueType) -> Result<ValueType, Error> {
    Ok(match (ta, tb) {
        (typ!(none), tb) => tb,
        (ta, typ!(none)) => ta,
        (typ!(maybe), typ!(maybe)) => typ!(maybe),
        // union of finite values is finite
        (typ!(any x ta2), typ!(any x tb2)) => {
            let t2 = typ_union(ta2, tb2)?;
            typ!(any x t2)
        }
        // union of one or more maybe-infinite values is maybe-infinite
        (typ!(any -> ta2), typ!(any x tb2))
        | (typ!(any x ta2), typ!(any -> tb2))
        | (typ!(any -> ta2), typ!(any -> tb2)) => {
            let t2 = typ_union(ta2, tb2)?;
            typ!(any -> t2)
        }
        // mismatched arities result in type error
        _ => Err(..)?,
    })
}

fn typ_intersect(ta: ValueType, tb: ValueType) -> Result<ValueType, Error> {
    Ok(match (ta, tb) {
        (typ!(none), _) | (_, typ!(none)) => typ!(none),
        (typ!(maybe), typ!(maybe)) => typ!(maybe),
        // intersection of finite values is finite
        (typ!(any x ta2), typ!(any x tb2)) => {
            let t2 = typ_intersect(ta2, tb2)?;
            typ!(any x t2)
        }
        // intersection of one or more maybe-infinite values is maybe-infinite
        (typ!(any -> ta2), typ!(any x tb2))
        | (typ!(any x ta2), typ!(any -> tb2))
        | (typ!(any -> ta2), typ!(any -> tb2)) => {
            let t2 = typ_intersect(ta2, tb2)?;
            typ!(any -> t2)
        }
        // mismatched arities result in type error
        _ => Err(..)?,
    })
}

fn typ_product(ta: ValueType, tb: ValueType) -> ValueType {
    match ta {
        typ!(none) => typ!(none),
        typ!(maybe) => tb,
        typ!(any x ta2) => {
            let t2 = typ_product(ta2, tb);
            match t2 {
                typ!(none) => typ!(none),
                // surprising normalization rule!
                typ!(_ -> _) => typ!(any -> t2),
                _ => typ!(any x t2),
            }
        }
        typ!(any -> ta2) => {
            let t2 = typ_product(ta2, tb);
            match t2 {
                typ!(none) => typ!(none),
                _ => typ!(any -> t2),
            }
        }
    }
}

fn typ_apply(ta: ValueType, tb: ValueType) -> Result<ValueType, Error> {
    Ok(match (ta, tb) {
        (typ!(none), _) | (_, typ!(none)) => typ!(none),
        (ta, typ!(maybe)) => ta,
        (typ!(maybe), tb) => tb,
        (typ!(any x ta2), typ!(any x tb2))
        | (typ!(any -> ta2), typ!(any x tb2))
        | (typ!(any x ta2), typ!(any -> tb2)) => typ_apply(ta2, tb2),
        // apply fails if we can't prove that one side is finite
        (typ!(any -> _), typ!(any -> _)) => Err(..)?,
    })
}

Examples

Let's see this in action.

imp requires javascript!

The type of an expression tells you the arity of the value you will get on evaluating it.

Mixed-arity expressions are not allowed.

It isn't clear without context what the arity of a none should be, so it gets its own unique type.

Whereas all sets that definitely have arity 0 get the type maybe, even if they might evaluate to the same value as none.

Application chops off the common prefix of both types, eventually resulting in maybe:

The only way the -> type gets introduced is via the -> expression.

The type above tells you that this may be an infinite set, but when applied to a single scalar will definitely produce a finite set.

Applying maybe-infinite sets to finite sets, or vice versa, is fine, but applying a maybe-infinite set to another maybe-infinite set is a type error.

Other operations on maybe-infinite sets produce more maybe-infinite sets.

Only product has a surprising inference rule:

We could give this expression a narrower type (any x any -> any x maybe), but we're going to use types to determine how the interpreter represents values, and allowing x to the left of -> in types would require a more complex representation.

All of this will be a little less abstract in the next post, where we'll see how the interpreter makes use of types.