Imp: core language

This is a part of a series - start at the beginning.

What do I want out of the core language?

simple denotational semantics
no evaluation order
logical data independence
amenable to parallel evaluation
amenable to incremental maintenance

The datalog family of languages fits the bill, but they struggle to express and compress many kinds of patterns that we take for granted in other languages. So let's start in that vicinity and see how much expressiveness we can claw back without sacrificing the goals above.

This page is running a (very naive) imp interpreter. You'll need a reasonably modern browser with javascript enabled to see the examples.

imp requires javascript!

All the examples are editable - just click on the input text and type.

Sets

The value of an imp expression is always a set of tuples of scalar values (integers, strings etc).

If you just type in a single scalar value, you get back a set containing one tuple containing that scalar value.

The repl displays a set of tuples as a table, where each row is one tuple.

We can build more complex sets by using union (|), product (x) and intersect (&).

We can name things.

Names are lexically scoped, can shadow older names and do not allow recursive definitions.

There are two important sets that have their own names - none is the empty set and some is the set containing a single empty tuple.

Coincidentally, these sets behave like booleans, where none is falsey and some is truthy.

We can check if a set is empty or not using negate (!).

We can use none and some to imitate conditionals.

This is a bit hard to follow though, so we'll add some sugar.

Lastly, the equals (=) operator allows testing if two sets are equal.

Apply

So far, pretty simple. The next operator is a weird one. Here is the pseudocode.

fn apply(set_a, set_b) {
  let mut result = Set::new();
  for tuple_a in set_a {
    for tuple_b in set_b {
      let n = min(tuple_a.len(), tuple_b.len());
      if tuple_a[0..n] == tuple_b[0..n] {
        result.insert(
          product(
            tuple_a[n..],
            tuple_b[n..],
          )
        )
      }
    }
  }
  result
}

It will make more sense with some examples.

Applying a scalar to a scalar effectively just tests if they are equal.

(Remember that we think none is falsey and some is truthy.)

Applying a set to a scalar behaves like a lookup.

Applying a set to another set lets us lookup multiple values at once.

We can keep on applying values to test if a particular tuple is in a set.

Apply works in either direction, so we can also use it to chain lookups.

(Apply is left-associative, so "apples" colors fancy means ("apples" colors) fancy.)

Apply is a pretty flexible operator. Maybe too flexible. We'll see.

Abstract

The abstract operator creates functions.

Hang on...

Isn't every expression supposed to return a set?

Well, the value of inc is a set of tuples. It just happens to be an infinite set, containing the tuples 0 x 1 | 1 x 2 | 2 x 3 ... and so on. When we say inc 0 we're just looking up the value 0 in that infinite set and finding a 1.

Infinite sets aren't very space efficient though, so it happens that the interpreter stores inc as a closure.

But that doesn't change the important fact that we have a clearly defined meaning for inc. Which means that if someone writes something crazy like...

...we have a way to decide whether or not that was the correct result. Or at least we will in the next post when I actually write down the denotational semantics.

In the meantime, let's see what functions are good for.

We can project or rearrange the values within a set.

That last two might be surprising.

What does fruit -> some mean? It's a set where for any value we look up, we got some ie that value is in the set. So fruit -> some must be the set of all possible tuples of length 1.

Similarly fruit -> none is a set where for any value we look up, we got none ie that value is not in the set. So fruit -> none must be the empty set.

~~The interpreter is not smart enough to calculate that directly though. Not yet.~~

We can also use functions to define sql-style queries.

SELECT name, salary
FROM employees, departments, salaries
WHERE employees.name = departments.name
AND empoyees.name = salaries.name
AND departments.department = 'marketing'

Here is the direct translation.

Not bad, but we can do better. This isn't datalog, after all.

That's it for now. Come back next time for semantics and interpreters.