term-algebra.tex

\subsection{Term Algebras}
\begin{comment}
We differentiate between $4$ different forms of term languages that differ in their expressive power, as shown in Figure~\ref{fig:term-lang-4-forms}. 
\begin{figure}
    \input{figures/term-lang-4-forms.tex}
    \caption{The term language of \lstmath{Monoid} expressed in $4$ different ways.}
    \label{fig:term-lang-4-forms}
\end{figure}

The \lstmath{Basic} term language defines expressions created using the function symbols of the theory. At this level of abstraction, referring to elements of the carrier is not possible.
Considering, for example, the binary operation of the basic \lstmath{Monoid} term language in Figure~\ref{fig:term-lang-4-forms}, \lstmath{opL}. Its arguments are either the constant \lstmath{eL} or another call for \lstmath{opL}. 
\lstmath{Closed} term languages solve this problem by providing the \lstmath{sing} constructor, abbreviation for \emph{singleton}, that lifts element of type \lstmath{A} into instances of the closed term language. Assuming that the carrier is the type of natural numbers \lstmath{Nat}, a possible term in the language would be \lstmath{opCl (sing (suc zero)) eCl}. 

\lstmath{Open} term languages provides extra constructor to represent variables. A convenient representation of variables is using the \lstmath{Fin} type. 
The two open term languages are shown on the right hand side of Figure~\ref{fig:term-lang-4-forms}.  
\end{comment} 
We capture the $4$ forms by the type \lstmath{Term}. 
\begin{hscode}
data Term = Basic
          | Closed CarrierName
          | BasicOpen NumOfVars
          | Open NumOfVars CarrierName
\end{hscode}
\noindent The arguments to the constructors reflect the arguments of the term language in every case. For example, the type \lstmath{OpMonoidTerm2} in Figure~\ref{fig:term-lang-4-forms} has the type \lstmath{Open "n" "A"}. 

In Section~\ref{subsec:term_lang} we discuss the generation of the $4$ different forms of term languages. We also generate some functions related to the term languages;  functions for simplifying terms of the language, evaluating them, constructing the induction principles, and constructing the staged version of the term language. 
To generate these functions, we need to generate their types and the definitions which consists of patterns and expressions evaluating the value of the functions at the given pattern. The types of the functions are generated by implementing a \lstmath{typeSig} function for each of them. Each of their declarations has the form: 
\begin{hscode}
FunDef Name [Pattern] FunDefBody
\end{hscode}
\noindent The patterns and expressions of every declaration is defined using the \lstmath{patternsExprs} function. In cases when there is more than one argument, some adjustments to the patterns and/or expressions may be needed which are defined within the \lstmath{adjustPatterns} or \lstmath{adjustFunCalls} functions. Finally, each one of the $4$ forms of a term language will have its own \lstmath{oneX} function that generates the function \lstmath{X}. These functions serve as the interface for defining functions on the term language. We describe each one of these functions in Sections~\ref{sec:generation:simplifier} --~\ref{sec:generation:staged}. 
We also generate a staged version of the term language based on a representation type, which we discuss in Section~\ref{sec:generation:tagless}.  

\subsubsection{Term Language}
\label{subsec:term_lang}
A term language represents the type of terms described by the theory. We use the \lstmath{TermLang} type to represent term languages.  
\begin{hscode}
data TermLang = TermLang {
  termTy  :: Term,    -- One of the 4 forms 
  tname   :: Name_,   -- The name of the term language 
  params  :: Params,  -- The parameters of the type 
  cons    :: [Constr] -- The constructors of the type 
}
\end{hscode}
Starting from an \lstmath{EqTheory}, the function \lstmath{tlang} generates a \lstmath{TermLang}. The parameters are decided depending on the value of \lstmath{termTy}. The constructors of the type are declared based on both the type of the term and the fields of the theory. A \lstmath{Closed} or \lstmath{Open} term language would have constructors for constants
\begin{hscode} 
Constr (mkName singConstrNm)
    (Fun (App [mkArg carrierNm]) declType)
\end{hscode} 
\noindent A \lstmath{BasicOpen} and \lstmath{Open} term languages would have constructors for variables. 
\begin{hscode} 
 let fin = App [mkArg "Fin", mkArg natVarNm]
 in Constr (mkName vconstrNm) (Fun fin expr) 
\end{hscode} 
\noindent In all cases, a constructor is generated for every function symbol of the theory 
\begin{hscode} 
constrs = map (constructorsHelper ~$\$$~ termType thryNm t) cs
\end{hscode} 
\noindent where \lstmath{cs} is the list of fields of the theory, 
 \lstmath{termTyp} generates the type of one argument of the function by calling \lstmath{liftType'} that applies the name of the type to its arguments.
\lstmath{constructorsHelper} repeats this type for as many times as needed for the type of the constructor. 

\subsubsection{Simplifiers}
\label{sec:generation:simplifier}
Some simplification rules can be generated from theory presentations based on the axioms, i.e. rewriting some terms into simpler forms based on equality axioms. For every term language of a theory, we generate a simplification function based on this idea. 
%While the simplification rules reflected by these functions are the same, their types are different. 

For a term language \lstmath{L}, the simplifier has the type \lstmath{L $\;\to\;$ L}. In cases when \lstmath{L} is parametrized, the type is preceded by the bindings. The bindings and the type expressions are computed by calling \lstmath{tlangInstance} which uses the \lstmath{tinstance} function described in Section~\ref{subsec:generation:datatypes}. The construction of the type aftewards is straightforward as follows 
\begin{hscode} 
 (_,binds,typApp) = tlangInstance tl
 typeExpr Basic = Fun typApp typApp
 typeExpr _     = Pi (Tel binds) (Fun typApp typApp)
\end{hscode} 

The simplification rules are then generated by the \lstmath{simpRules} function. For every equation \lstmath{t$_1$ = t$_2$}, the simplifier need to decide if any of the two terms of the equations is simpler than the other. For this purpose, a well-founded ordering relation is needed. We choose a very simple relation that produces a basic simplifer, i.e. we do not guarantee to reach the simplest form of the term. The relation we use is the length of the term in the sense of its number of literals, computed using the following expression: 
\begin{hscode} 
explength :: Expr -> Int
explength e = everything (+) (mkQ 0 ~$\$$~ \ (Name ~$\_$~) -> 1) e
\end{hscode} 
The function \lstmath{minMax} make sure the expressions are oriented in the right way: 
\begin{hscode}
minMax :: Expr -> Expr -> Maybe (Expr,Expr)
minMax e1 e2 =
  if (explength e1 == explength e2) then Nothing
  else if explength e1 < explength e2 then Just (e1,e2)
  else Just (e2,e1) 
\end{hscode} 
The longer term is converted to an element of type \lstmath{Pattern} and used as input to the simplifier that maps to the shorter term.  
\begin{hscode} 
simpRules :: EqTheory -> Term -> [(Pattern,Expr)]
simpRules thry term =
 let mpng = Map.toList ~$\$$~ mapping thry term
     axms = map (foldrenConstrs mpng) (thry ^. axioms) 
 in mapMaybe simplify axms
\end{hscode} 
The choice of the simplification function can be changed to reflect a more complex relation, by providing an alternative function that has the same type as \lstmath{minMax}. 

For simplification to be effective, one needs to traverse the expression looking for subexpressions that can be simplified. The declarations that does the traversals is generated by \lstmath{simpDecls}. For each constructor, the function generate a pattern using \lstmath{mkPattern} and a term using the \lstmath{fapp} function as discussed in Section~\ref{subsec:generation:functions}. The recursive calls on the arguments of the expression is generated by calling \lstmath{adjustFuncCalls}. 
\begin{hscode}
simpDecls :: Term -> [Constr] -> [(Pattern,Expr)]
simpDecls term ftyps =
   zipWith ((,)) patterns fundefs
   where patterns = map mkPattern ftyps
         fundefs = map (functor' (adjustFuncCalls term) . fappExpr) ftyps
\end{hscode}

Lastly, if there are singleton or variable constructors they need to be returned as is using the \lstmath{simpVarsConsts} function. 
\begin{hscode} 
simpVarsConsts :: [Constr] -> [(Pattern,Expr)]
simpVarsConsts cs =
  zipWith ((,)) (map mkPattern cs) (map fappExpr cs)
\end{hscode}  

The declarations of the simplifier is the result of concatenating all these declarations as follows 
\begin{hscode} 
 simpRules thry term
 ++ simpDecls term (filter (not . isConstOrVar) cs)
 ++ simpVarsConsts (filter isConstOrVar cs) 
\end{hscode} 

Note that we do not generate the simplification functions for the \lstmath{Basic} term language. In some cases, like in \lstmath{Magma}, the \lstmath{Basic} term language does not have a base case, and therefore a termination proof of the simplification function is not trivial. Some theorem provers, like Lean, would not accept this definition. 

\subsubsection{Evaluators}
\label{sec:generation:evaluator}
The evaluator generates $4$ functions, one for every term language. In the simplest case, the \lstmath{Basic} term language, the evaluation function for \lstmath{Monoid} will have the following type 
\begin{agdacode}
 evalB : {A  : Set}  ~$\to$~ Monoid A  ~$\to$~ MonoidTerm  ~$\to$~ A
\end{agdacode}
An expression of type \lstmath{MonoidTerm} is evaluated to an element of a carrier \lstmath{A} on which a monoid structure exists. The constructors of the language is mapped to operations of the theory, in a way opposite to what was done to generate the term language. Therefore, the function that generates the evaluator needs to deal with both the equational theory and the term language. The types of the evaluator functions are generated by the \lstmath{ftype} function. The first step is to generate the definition of instances of both 
\begin{hscode}
(eqbind,eqinst) = eqInstance thry Nothing
(tbind,tinst) = tinstance (tlToDecl termlang) Nothing
newBinds = unionBindings eqbind tbind
\end{hscode}
The functions \lstmath{eqInstance} and \lstmath{tinstance} generate instances of the theory and the term languages, as explained in Sections~\ref{subsec:generation:eqTheories} and~\ref{subsec:generation:datatypes}. Both instances might be parameterized, in which case some bindings need to be defined before they can be declared. Those bindings are defined in \lstmath{eqbind} and \lstmath{tbind}. The bindings of the function are the union of the two bindings. 

Function declarations are defined for variables, constants and function symbols. In case of variables, a call to the \lstmath{lookup'} function is performed as follows 
\begin{hscode}
[FunDef (mkName ~$\$$~ evalFuncName term) -- call for vars
        (concatMap (cpattern instName term) vs)
        (lookup' envName) 
        | not (null vs)] 
\end{hscode}
\noindent where \lstmath{vs} is the list of variable declarations. The function \lstmath{cpattern} creates the pattern for a constructor. \lstmath{lookup'} creates a call to the \lstmath{lookup} function. Creating the function declaration for constants look very similar, except it returns the constant itself. 
\begin{hscode}
[FunDef (mkName ~$\$$~ evalFuncName term) -- call for constants 
         (concatMap (cpattern instName term) constants)
         constFunc 
         | not (null constants)]    
\end{hscode}
For every other constructor in the type, a pattern of it is created using \lstmath{cpattern} and assigned to one of the declarations of the theory using \lstmath{funcDef}. 
\begin{hscode}
zipWith (FunDef (mkName ~$\$$~  evalFuncName term)) 
        (map (cpattern instName term) tDecls)
        (map (funcDef eq instName term) eqDecls)
\end{hscode}
The value of the expression at each constructor is mapped to the corresponding function symbol in the same order. This make sense as we deal with theories as telescopes, and so order matters. When we generate the term language we do not change the order. Now that we are assigning back those declarations, the order is used to map them back. 


\subsubsection{Induction Principle}
\label{sec:generation:induction}
Induction principles are defined over sets with well-founded relations. In the case of structural induction, they are based on the subterm relation. Structural induction requires a base case, a constant or variable symbol in the language. In cases like the \lstmath{Basic} term language of \lstmath{magma}, the variables \lstmath{x} and \lstmath{y} of an expression \lstmath{op x y} can only be substituted by other \lstmath{op} expressions and therefore never gets smaller. One can argue that in this case the induction principle is not defined. Despite that, we run the following experiment in Coq, which automatically generates the induction principle for types declared as \lstmath{Inductive}. 
 \begin{minted}{coq}
Inductive magma : Set := op : magma -> magma -> magma.
Check magma_ind.
\end{minted}
The following induction principle is generated 
\begin{minted}{coq}
magma_ind : forall P : magma -> Prop,
 (forall m : magma, P m ->
    forall m0 : magma, P m0 -> P (op m m0)) ->
 forall m : magma, P m
\end{minted}
\noindent A possible explanation is that the type \lstmath{magma} is empty, and therefore it's fine to have its induction principle generated. Based on this observation, 
we decided to generate the induction principle for all term languages. 

For every constructor of the term language, we use \lstmath{fapp} to generate the term resulting from applying this function symbol to some bindings, along with these bindings. To generate the induction principle for a predicate \lstmath{P}, we need to generate a type stating that given proofs of \lstmath{P} applied to the bindings, we can induce the proof of \lstmath{P} applied to the term. For every constructor the function \lstmath{typeFun} does that. 
\begin{hscode} 
 if null binds then applyPred fexpr
 else Pi (Tel binds) ~$\$$~
         curryExpr ~$\$$~ concatMap applyPredToBindings binds 
                     ++ [applyPred fexpr]
\end{hscode}
In cases when the term language has a singleton or variable constructor, those ones also need to be included in the type, but their construction is straightforward.  

\subsubsection{Staged Term Languages}
\label{sec:generation:staged}
Systems that support multi-stage programming (MSP) enables staging the evaluation of expressions between a current (\lstmath{Now}) stage and a future (\lstmath{Later}) one. An expression that is staged for a \lstmath{Later} stage, is dealt with as \lstmath{Code}. The details of MSP is discussed in Section~\ref{sec:background:msp}. 
\begin{comment}
Our framework provides some prelude definitions that are used to define the staged term language. First, we define two stages; The code of an expression that lives in stage \lstmath{s0} belong to stage \lstmath{s1}.   
\begin{togcode}
data Stage  : Set where
  s0 : Stage 
  s1 : Stage  
\end{togcode}
switching between stages is done via the \lstmath{code} and \lstmath{uncode} functions. 
\begin{togcode}
code : {A  : Set }  ~$\to$~ CodeRep A s0 ~$\to$~ CodeRep A s1 
code x  = (Q x )    
uncode : {A  : Set }  ~$\to$~ CodeRep A s1 ~$\to$~ CodeRep A s0
uncode (Q x )  = x 
\end{togcode}
where \lstmath{CodeRep} is a type for switching between levels 
\begin{togcode}
data Wrap (A  : Set )  : Set where
  Q : (A  ~$\to$~ (Wrap A ))     
CodeRep : (A  : Set) (s : Stage) ~$\to$~ Set
CodeRep A s0  = A 
CodeRep A s1  = (Wrap (CodeRep A s0 ) )
\end{togcode}
The evaluation of a quoted expression is done via the \lstmath{run} function 
\begin{togcode}
run : {A  : Set }  ~$\to$~ CodeRep A s1  ~$\to$~ A
run (Q x)  = x 
\end{togcode}
Now we can define the \lstmath{Staged} type 
\begin{togcode}
data Staged (A  : Set )  : Set where
  Now : A ~$\to$~ Staged A
  Later : Comp A s1 ~$\to$~ Staged A 
\end{togcode}
\end{comment}
%with functions that lifts an expression of type \lstmath{A} into expressions of type \lstmath{CodeRep A s1} and other that lifts \lstmath{CodeRep A s1} into \lstmath{Staged A}. All this is part of the prelude of tog and is used by the generator to create the staged type for term language of a theory \lstmath{t}. 

We generate functions to automatically add staging annotations to terms of the term language of the theory, as follows
\begin{itemize}
\item constants (whether elements of the carrier or $0$-ary functions) has values at the current stage. 
\item variables do not have values until runtime. 
\item A function symbol can be computed if all its parameters have values at compile time. 
\end{itemize}
The generator depends on functions \lstmath{stage1}, and \lstmath{stage2} that provides the lifting of unary and binary expressions based on the status of their arguments. In cases when the expression are annotated for a \lstmath{Later} stage, one need to be able to talk about their \lstmath{Code} as expressions instead of their values. Therefore, we need functions \lstmath{codeLift1} and \lstmath{codeLift2} to lift an expression to its \lstmath{Code} version. To give more clarity, the type of stage and codeLift function for unary operations are 
\begin{togcode} 
codeLift1 : {A B  : Set }  ~$\to$~ (A  ~$\to$~ B )  ~$\to$~ 
                       (CodeRep A s1 ~$\to$~ CodeRep B s1)
codeLift1 f (Q x )  = Q (f x )

stage1 : {A B  : Set } ~$\to$~ (A ~$\to$~ B ) ~$\to$~ 
                       (CodeRep A s1 ~$\to$~ CodeRep B s1 ) ~$\to$~ 
                       Staged A ~$\to$~ Staged B
stage1 f g (Now x)  = Now (f x )
stage1 f g (Later (Computation _ x ))  =  Later (Computation Expr (g x))
\end{togcode} 

The \lstmath{codeLift} functions expects functions of specific arities. We have those functions as declarations within a record, instead of function definitions within the module. Tog does not treat them the same way; therefore we had to generate function declarations for each constructor of the term langauge, in order to pass them to the theory. 
%The function \lstmath{opDeclToFunc} is able to do that. 

The \lstmath{codeLift} and \lstmath{stage} functions interplay together as follows: 
\begin{hscode} 
case exprArity expr of
  0 -> App [mkArg "Now",mkArg ~$\$$~ n ^. name]
  1 -> stageH "stage1" "codeLift1"
  2 -> stageH "stage2" "codeLift2"
  _ -> error "Cannot stage term, provide a staging function" 
\end{hscode} 

\begin{comment}
Constants and variable are staged with The automatic annotation is done by Lifting constants from term language to staged term language is done by adding the \lstmath{Now} constructor of the \lstmath{Staged} before the constant symbol 
\begin{hscode} 
liftConstant :: Constr -> (Pattern,Expr) 
liftConstant c = (mkPattern c, App [mkArg "Now",Arg ~$\$$~ fappExpr c])
\end{hscode} 
In a similar way, a variable of the form \lstmath{v x} is lifted to the \lstmath{Staged} expression \\
\lstmath{(Later (Computation Const x))}.
\end{comment}

\subsubsection{Representation types}
\label{sec:generation:tagless}
Inspied by the tagless language embedding technique~\cite{carette2009finally}, we use representation types to abstract over stages. Consider the following type based on the term language of \lstmath{Monoid} 
\begin{togcode}
record StagedRepr (A : Set) (Repr : Set ~$\to$~ Set) : Set where
 constructor repr
 field
   opT : Repr A ~$\to$~ Repr A ~$\to$~ Repr A
   eT  : Repr A
\end{togcode}
By instantiating \lstmath{Repr} type with \lstmath{Staged}, we can get the staged type for the terms of \lstmath{Monoid} as 
\begin{togcode}
taglsMon : StagedRepr MonoidTerm Staged 
taglsMon = record {eT = Now e ; opT = stage2 op (codeLift2 op)}
\end{togcode}

The type \lstmath{Repr} is defined internally as:  
\begin{hscode} 
Bind [mkArg reprTypeName] ~$\$$~ Fun (App [mkArg "Set"]) (App [mkArg "Set"])]
\end{hscode} 
The fields of the record are all generated by the following expression 
\begin{hscode} 
map (liftConstr reprTypeName) fdecls
\end{hscode} 
\noindent where \lstmath{liftConstr} would lift a type \lstmath{A} into \lstmath{Repr A}.