Skip to content
Viktor Palmkvist edited this page Mar 23, 2020 · 2 revisions

Notation

I'll be using the notation term[focus -> a] to refer to the type term, except where every occurrence of focus is replaced by a. For example, given a type Stm:

syn Stm =
| If {condition : Exp, then : Stm, else : Stm}
| Block {statements : [Stm]}
| VarDecls {declarations : [(Identifier, Exp)]}

We would write Stm[Exp -> a] as:

syn Stm[Exp -> a] =
| If {condition : a, then : Stm, else : Stm}
| Block {statements : [Stm]}
| VarDecls {declarations : [(Identifier, a)]}

and Stm[Stm -> a] as:

syn Stm[Stm -> a] =
| If {condition : Exp, then : a, else : a}
| Block {statements : [a]}
| VarDecls {declarations : [(Identifier, Exp)]}

Finally, in the interest of brevity I'll be using some functional psuedo-code for the examples (something close to OCaml I guess) as opposed to mcore, since it's a bit verbose for pattern matching especially.

The basic components

smap : (a -> b) -> term[focus -> a] -> term[focus -> b] is for cases where you want to apply exactly the same function to every occurrence of focus within term, shallowly.

sfold : (a -> b -> a) -> a -> term[focus -> b] -> a is for cases where you want to combine values at every focused position, but you don't really care about the internal structure of the term.

Possible addition/replacement: saccumMap that maps over the focused terms while threading an accumulator. More on this at the end of this page if you're interested.

Code written using smap and sfold can thus more easily omit the things that don't matter, the boilerplate. Typically you'd write code as normal for the interesting cases, whatever they may be, then use some combination of smap and sfold for the rest, most commonly as a single case in a match.

As a corollary, if you need to do different things in every case you probably won't get all that much benefit from using smap and sfold.

Concretely in this project smap and sfold are implemented manually as semantic functions with names of the form smap_term_focus and sfold_term_focus (e.g. smap_Expr_Expr and sfold_Stm_Expr). Later on we will want them to be automatically generated and to use some form of ad-hoc polymorphism/overloading.

Concrete examples

This section lists a couple of examples where smap and sfold are useful, to use as inspiration for solving similar problems. I have attempted to make each example brief, so please add an issue if something is too brief to be understandable.

De Bruijn indices

Here we are traversing an AST replacing names (strings) with integers. We need an environment containing a mapping from names to ints, which is modified by parent AST nodes, i.e., we're passing an environment down the tree. Most nodes will be unchanged, only those pertaining to name binding should be affected. As such the code becomes:

let de_bruijn env tm = match tm with
  | Lambda {name, body} =
    let body' = de_bruijn (add_name name env) body
    in Lambda {name = name, body = body'}
  | Var {name} =
    Var {name = name, dbidx = lookup_dbidx env name}
  | _ = smap (de_bruijn env) tm

Note that all cases that do not affect binding are handled by the last case.

List of all nodes

Here we want to collect all the nodes of an AST in a list. We don't really care about the internal structure of a node here, only about adding it to a list, and also getting a corresponding list from each child, thus we can use sfold directly. We need no additional information while processing a node so we can also use smap directly, thus:

let allNodesPreorder tm =
  sfold concat [tm] (smap allNodesPreorder tm)

let allNodesPostorder tm =
  concat
    (sfold concat [] (smap allNodesPostorder tm))
    [tm]

Note that neither of these use any particulars of the AST, i.e., they work for any AST. For the same reason they would most likely also need a type annotation or two to clarify which types we are recursing into in a system with type inference.

Desugaring of let-expressions

Here we want to find every occurrence of a let in an AST and replace it with a lambda application, while leaving every other expression unchanged.

let desugarLetOnce tm = match tm with
  | Let {name, value, body} = App
    { lhs = Lambda {name = name, body = body}
    , rhs = value }
  | _ = tm

let bottomUp f a = f (smap (bottomUp f) a)

let desugarLet = bottomUp desugarLetOnce

Note that we use the fact that the desugaring won't introduce new opportunities for desugaring, i.e., it is enough to do a single bottom-up pass. Note also that bottomUp is entirely generic, and that topDown is similarly easy to define:

let topDown f a = smap (topDown f) (f a)

Constant-folding

Here we want to express a few optimizations, then apply them exhaustively to ensure that there are no optimization opportunities left:

let optimizeOnce tm = match tm with
  | Add {lhs = Number{value = 0}, rhs} = Some rhs
  | Add {lhs, rhs = Number{value = 0}} = Some lhs
  | _ = None

let innermost f a1 =
  let a2 = smap (innermost f) a1 in
  match f a2 with
  | Some a3 = innermost f a3
  | None = a2

let constantfold = innermost optimizeOnce

Count nodes satisfying a predicate

Assuming the predicate requires no contextual information, we can write this as:

let isLambda tm = match tm with
  | Lambda _ = True
  | _ = False

let countPred p tm =
  let childResults = sfold (+) 0 (smap (countPred p) tm)
  in if p tm then 1 + childResults else childResults

let numLambdas = countPred isLambda

straverse/saccumMap

There are cases where the recursion of a function is mostly boilerplate, but we still need something more/else than smap and sfold, namely when we want to pass information between adjacent children. For example, if we need to generate new names and we want to make sure that they're unique across the entire tree, then we would want to track previously generated names.

In Haskell this would be done by putting such state in a monad, then using it to thread information around. Thus we would want a shallow traverse, i.e., a mapping function that uses a monadic/applicative action a -> m b instead of a normal function a -> b. The type signature would thus be straverse : Applicative m => (a -> m b) -> term[focus -> a] -> m term[focus -> b].

An alternative would be to have a function saccumMap : (a -> b -> (a, c)) -> a -> term[focus -> b] -> (a, term[focus -> c]). This function is equivalent with straverse in terms of expressibility, either can be use to implement the other, and either can be used to implement both smap and sfold. I expect that in most cases it is more pleasant to use smap and sfold directly though.

Finally, straverse and saccumMap might not be needed depending on what we settle on for effects, thus those are not included at the moment.