diff --git a/tesi/gabriel/part4.pdf b/tesi/gabriel/part4.pdf index a91659e..fc83632 100644 Binary files a/tesi/gabriel/part4.pdf and b/tesi/gabriel/part4.pdf differ diff --git a/tesi/gabriel/part4.tex b/tesi/gabriel/part4.tex index 017c251..59affe7 100644 --- a/tesi/gabriel/part4.tex +++ b/tesi/gabriel/part4.tex @@ -1,4 +1,4 @@ -% Created 2020-04-11 Sat 00:31 +% Created 2020-04-12 Sun 22:50 % Intended LaTeX compiler: pdflatex \documentclass[11pt]{article} \usepackage[utf8]{inputenc} @@ -64,11 +64,11 @@ xleftmargin=1em, \begin{document} \section{Translation validation of the Pattern Matching Compiler} -\label{sec:org9d32dd4} +\label{sec:org02550ae} \subsection{Source program} -\label{sec:org0032637} -The algorithm takes as its input a source program and translates it +\label{sec:org8c14d2a} +Our algorithm takes as its input a source program and translates it into an algebraic data structure which type we call \emph{decision\_tree}. \begin{verbatim} @@ -88,15 +88,15 @@ We distinguish \item Leaf: a value matching this part results into the evaluation of a source black box of code \end{itemize} - -The algorithm doesn't support type-declaration-based analysis +Our algorithm doesn't support type-declaration-based analysis to know the list of constructors at a given type. Let's consider some trivial examples: -\begin{verbatim} -function true -> 1 -\end{verbatim} - +\begin{center} +\begin{tabular}{l} +function true -> 1\\ +\end{tabular} +\end{center} is translated to \begin{center} \begin{tabular}{l} @@ -104,30 +104,38 @@ Switch ([(true, Leaf 1)], Failure)\\ \end{tabular} \end{center} while -\begin{verbatim} -function -true -> 1 -| false -> 2 -\end{verbatim} +\begin{center} +\begin{tabular}{l} +function\\ +\(\vert{}\) true -> 1\\ +\(\vert{}\) false -> 2\\ +\end{tabular} +\end{center} will be translated to + \begin{center} \begin{tabular}{l} Switch ([(true, Leaf 1); (false, Leaf 2)])\\ \end{tabular} \end{center} - It is possible to produce Unreachable examples by using refutation clauses (a "dot" in the right-hand-side) -\begin{verbatim} -function -true -> 1 -| false -> 2 -| _ -> . -\end{verbatim} +\begin{center} +\begin{tabular}{l} +function\\ +\(\vert{}\) true -> 1\\ +\(\vert{}\) false -> 2\\ +\(\vert{}\) \_ -> .\\ +\end{tabular} +\end{center} that gets translated into -Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable) +\begin{center} +\begin{tabular}{l} +Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable)\\ +\end{tabular} +\end{center} -We trust this annotation, which is reasonable as the type-checker +We trust this annotation, which is reasonable as the type-checker verifies that it indeed holds. Guard nodes of the tree are emitted whenever a guard is found. Guards @@ -139,7 +147,6 @@ true. \begin{comment} [ ] Finisci con Switch [ ] Spiega del fallback -[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le Lambda \end{comment} The source code of a pattern matching function has the @@ -184,8 +191,7 @@ pattern1 ::= "\(\vert{}\)" clause\\ clause ::= lexpr "->" rexpr\\ lexpr ::= rule ($\varepsilon$\(\vert{}\)condition)\\ rexpr ::= \_code ;; arbitrary code\\ -rule ::= wildcard\(\vert{}\)variable\(\vert{}\)constructor\_pattern\(\vert{}\)\{\}or\_pattern ;;\\ -;; rules\\ +rule ::= wildcard\(\vert{}\)variable\(\vert{}\)constructor\_pattern\(\vert{}\) or\_pattern ;;\\ wildcard ::= "\_"\\ variable ::= identifier\\ constructor\_pattern ::= constructor (rule\(\vert{}\)$\varepsilon$) (assignment\(\vert{}\)$\varepsilon$)\\ @@ -196,59 +202,155 @@ assignment ::= "as" id\\ bexpr ::= \_code ;; arbitrary code\\ \end{tabular} \end{center} - -A source program t$_S$ is a collection of pattern clauses pointing to -\emph{bb} terms. Running a program t$_S$ against an input value v$_S$ produces as -a result \emph{r}: +The source program is parsed using the ocaml-compiler-libs library. +The result of parsing, when successful, results in a list of clauses +and a list of type declarations. +Every clause consists of three objects: a left-hand-side that is the +kind of pattern expressed, an option guard and a right-hand-side expression. +Patterns are encoded in the following way: \begin{center} -\begin{tabular}{l} -t$_S$ ::= (p $\to$ bb)\(^{\text{i$\in $I}}\)\\ -p ::= \(\vert{}\) K(p$_i$)$^i$, i $\in $ I \(\vert{}\) (p\(\vert{}\)q) \(\vert{}\) n $\in $ $\mathbb{N}$\\ -r ::= guard list * (Match bb \(\vert{}\) NoMatch \(\vert{}\) Absurd)\\ -t$_S$(v$_S$) $\to$ r\\ +\begin{tabular}{ll} +pattern & type\\ +\hline +\_ & Wildcard\\ +p$_1$ as x & Assignment\\ +c(p$_1$,p$_2$,\ldots{},p$_n$) & Constructor\\ +(p$_1$\(\vert{}\) p$_2$) & Orpat\\ \end{tabular} \end{center} -TODO: argument on what it means to run a source program -\emph{guard} and \emph{bb} expressions are treated as blackboxes of OCaml code. -A sound approach for treating these blackboxes would be to inspect the -OCaml compiler during translation to Lambda code and extract the -blackboxes compiled in their Lambda representation. -This would allow to test for equality with the respective blackbox at -the target level. -Given that this level of introspection is currently not possibile, we -decided to restrict the structure of blackboxes to the following (valid) OCaml -code: +Once parsed, the type declarations and the list of clauses are encoded in the form of a matrix +that is later evaluated using a matrix decomposition algorithm. -\begin{verbatim} -external guard : 'a -> 'b = "guard" -external observe : 'a -> 'b = "observe" -\end{verbatim} +Patterns are of the form +\begin{center} +\begin{tabular}{ll} +pattern & type of pattern\\ +\hline +\_ & wildcard\\ +x & variable\\ +c(p$_1$,p$_2$,\ldots{},p$_n$) & constructor pattern\\ +(p$_1$\(\vert{}\) p$_2$) & or-pattern\\ +\end{tabular} +\end{center} -We assume these two external functions \emph{guard} and \emph{observe} with a valid -type that lets the user pass any number of arguments to them. -All the guards are of the form \texttt{guard }, where the - are expressed using the OCaml pattern matching language. -Similarly, all the right-hand-side expressions are of the form -\texttt{observe ...} with the same constraints on arguments. +The pattern \emph{p} matches a value \emph{v}, written as p $\preccurlyeq$ v, when one of the +following rules apply -\begin{verbatim} -type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *) +\begin{center} +\begin{tabular}{llll} +\hline +\_ & $\preccurlyeq$ & v & $\forall$v\\ +x & $\preccurlyeq$ & v & $\forall$v\\ +(p$_1$ \(\vert{}\) p$_2$) & $\preccurlyeq$ & v & iff p$_1$ $\preccurlyeq$ v or p$_2$ $\preccurlyeq$ v\\ +c(p$_1$, p$_2$, \ldots{}, p$_a$) & $\preccurlyeq$ & c(v$_1$, v$_2$, \ldots{}, v$_a$) & iff (p$_1$, p$_2$, \ldots{}, p$_a$) $\preccurlyeq$ (v$_1$, v$_2$, \ldots{}, v$_a$)\\ +(p$_1$, p$_2$, \ldots{}, p$_a$) & $\preccurlyeq$ & (v$_1$, v$_2$, \ldots{}, v$_a$) & iff p$_i$ $\preccurlyeq$ v$_i$ $\forall$i $\in $ [1..a]\\ +\hline +\end{tabular} +\end{center} -let _ = function - | K1 -> observe 0 - | K2 K1 -> observe 1 - | K2 x when guard x -> observe 2 - | K2 (K2 x) as y when guard x y -> observe 3 - | K2 _ -> observe 4 -\end{verbatim} +When a value \emph{v} matches pattern \emph{p} we say that \emph{v} is an \emph{instance} of \emph{p}. +During compilation by the translators, expressions at the +right-hand-side are compiled into +Lambda code and are referred as lambda code actions l$_i$. + +We define the \emph{pattern matrix} P as the matrix |m x n| where m bigger +or equal than the number of clauses in the source program and n is +equal to the arity of the constructor with the gratest arity. +\begin{equation*} +P = +\begin{pmatrix} +p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\ +p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\ +\vdots & \vdots & \ddots & \vdots \\ +p_{m,1} & p_{m,2} & \cdots & p_{m,n} ) +\end{pmatrix} +\end{equation*} +every row of \emph{P} is called a pattern vector + \(\vec{p_i}\) = (p$_1$, p$_2$, \ldots{}, p$_n$); In every instance of P pattern +vectors appear normalized on the length of the longest pattern vector. +Considering the pattern matrix P we say that the value vector +\(\vec{v}\) = (v$_1$, v$_2$, \ldots{}, v$_i$) matches the pattern vector p$_i$ in P if and only if the following two +conditions are satisfied: +\begin{itemize} +\item p\(_{\text{i,1}}\), p\(_{\text{i,2}}\), \(\cdots{}\), p\(_{\text{i,n}}\) $\preccurlyeq$ (v$_1$, v$_2$, \ldots{}, v$_i$) +\item $\forall$j < i p\(_{\text{j,1}}\), p\(_{\text{j,2}}\), \(\cdots{}\), p\(_{\text{j,n}}\) $\npreceq$ (v$_1$, v$_2$, \ldots{}, v$_i$) +\end{itemize} + +We can define the following three relations with respect to patterns: +\begin{itemize} +\item Pattern p is less precise than pattern q, written p $\preccurlyeq$ q, when all +instances of q are instances of p +\item Pattern p and q are equivalent, written p $\equiv$ q, when their instances +are the same +\item Patterns p and q are compatible when they share a common instance +\end{itemize} + +Wit the support of two auxiliary functions +\begin{itemize} +\item tail of an ordered family +\begin{center} +\begin{tabular}{l} +tail((x$_i$)\(^{\text{i $\in $ I}}\)) := (x$_i$)\(^{\text{i $\ne$ min(I)}}\)\\ +\end{tabular} +\end{center} +\item first non-$\bot$ element of an ordered family +\begin{center} +\begin{tabular}{l} +First((x$_i$)$^i$) := $\bot$ \quad \quad \quad if $\forall$i, x$_i$ = $\bot$\\ +First((x$_i$)$^i$) := x\_min\{i \(\vert{}\) x$_i$ $\ne$ $\bot$\} \quad if $\exists$i, x$_i$ $\ne$ $\bot$\\ +\end{tabular} +\end{center} +\end{itemize} +we now define what it means to run a pattern row against a value +vector of the same length, that is (p$_i$)$^i$(v$_i$)$^i$ +\begin{center} +\begin{tabular}{lll} +p$_i$ & v$_i$ & result\(_{\text{pat}}\)\\ +\hline +$\varnothing$ & ($\varnothing$) & []\\ +(\_, tail(p$_i$)$^i$) & (v$_i$) & tail(p$_i$)$^i$(tail(v$_i$)$^i$)\\ +(x, tail(p$_i$)$^i$) & (v$_i$) & $\sigma$[x$\mapsto$v$_0$] if tail(p$_i$)$^i$(tail(v$_i$)$^i$) = $\sigma$\\ +(K(q$_j$)$^j$, tail(p$_i$)$^i$) & (K(v'$_j$)$^j$,tail(v$_j$)$^j$) & ((q$_j$)$^j$ \sout{+} tail(p$_i$)$^i$)((v'$_j$)$^j$ \sout{+} tail(v$_i$)$^i$)\\ +(K(q$_j$)$^j$, tail(p$_i$)$^i$) & (K'(v'$_l$)$^l$,tail(v$_j$)$^j$) & $\bot$ if K $\ne$ K'\\ +(q$_1$\(\vert{}\)q$_2$, tail(p$_i$)$^i$) & (v$_i$)$^i$ & First((q$_1$,tail(p$_i$)$^i$)(v$_i$)$^i$, (q$_2$,tail(p$_i$)$^i$)(v$_i$)$^i$)\\ +\end{tabular} +\end{center} + +A source program t$_S$ is a collection of pattern clauses pointing to +\emph{bb} terms. Running a program t$_S$ against an input value v$_S$, written +t$_S$(v$_S$) produces a result \emph{r} belonging to the following grammar: +\begin{center} +\begin{tabular}{l} +t$_S$ ::= (p $\to$ bb)\(^{\text{i$\in $I}}\)\\ +t$_S$(v$_S$) $\to$ r\\ +r ::= guard list * (Match bb \(\vert{}\) NoMatch \(\vert{}\) Absurd)\\ +\end{tabular} +\end{center} +\begin{comment} +TODO: understand how to explain guard +\end{comment} +We can define what it means to run an input value v$_S$ against a source +program t$_S$: +\begin{center} +\begin{tabular}{l} +t$_S$(v$_S$) := $\bot$ \quad \quad \quad if $\forall$i, p$_i$(v$_S$) = $\bot$\\ +First((x$_i$)$^i$) := x\_min\{i \(\vert{}\) x$_i$ $\ne$ $\bot$\} \quad if $\exists$i, x$_i$ $\ne$ $\bot$\\ +t$_S$(v$_S$) = Absurd if bb\_min\{p$_i$ $\to$ bb$_i$ \(\vert{}\) p$_i$(v$_S$) $\ne$ $\bot$\} = \emph{refutation clause}\\ +t$_S$(v$_S$) = Match bb\_min\{p$_i$ $\to$ bb$_i$ \(\vert{}\) p$_i$(v$_S$) $\ne$ $\bot$\}\\ +\end{tabular} +\end{center} + +[ \ldots{} ] Big part that I think doesn't need revision from you [ \ldots{} ] In our prototype we make use of accessors to encode stored values. -\begin{minipage}{0.2\linewidth} + +\begin{minipage}{0.6\linewidth} \begin{verbatim} + let value = 1 :: 2 :: 3 :: [] (* that can also be written *) let value = [] @@ -351,49 +453,86 @@ because a source matrix in the case of empty rows returns the first expression and (Leaf bb)(v) := Match bb \item\relax [| (a$_j$)$^j$, $\varnothing$ |] $\equiv$ Failure \end{enumerate} - Regarding non base cases: -Let's first define +Let's first define some auxiliary functions +\begin{itemize} +\item The index family of a constructor \begin{center} \begin{tabular}{l} -let Idx(k) := [0; arity(k)[\\ -let First($\varnothing$) := $\bot$\\ -let First((a$_j$)$^j$) := a\(_{\text{min(j$\in $J$\ne$$\varnothing$)}}\)\\ +Idx(K) := [0; arity(K)[\\ \end{tabular} \end{center} -\[ -m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij}) -\] -\[ - (k_k)^k := headconstructor(p_{i0})^i -\] -\begin{equation} - Groups(m) := ( k_k \to ((a)_{0.l})^{l \in Idx(k_k)} +++ (a_i)^{i \in I\DZ}), \\ - ( if p_{0j} is k(q_l) then \\ - (q_l)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ - if p_{0j} is \_ then \\ - (\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ - else \bot )^j ), \\ - ((a_i)^{i \in I\DZ}, ((p_{ij})^{i \in I\DZ} \to e_j if p_{0j} is \_ else \bot)^{j \in J}) -\end{equation} - -Groups(m) is an auxiliary function that source a matrix m into +\item head of an ordered family (we write x for any object here, value, pattern etc.) +\begin{center} +\begin{tabular}{l} +head((x$_i$)\(^{\text{i $\in $ I}}\)) = x\_min(I)\\ +\end{tabular} +\end{center} +\item tail of an ordered family +\begin{center} +\begin{tabular}{l} +tail((x$_i$)\(^{\text{i $\in $ I}}\)) := (x$_i$)\(^{\text{i $\ne$ min(I)}}\)\\ +\end{tabular} +\end{center} +\item head constructor of a value or pattern +\begin{center} +\begin{tabular}{l} +constr(K(x$_i$)$^i$) = K\\ +constr(\_) = $\bot$\\ +constr(x) = $\bot$\\ +\end{tabular} +\end{center} +\item first non-$\bot$ element of an ordered family +\begin{center} +\begin{tabular}{l} +First((x$_i$)$^i$) := $\bot$ \quad \quad \quad if $\forall$i, x$_i$ = $\bot$\\ +First((x$_i$)$^i$) := x\_min\{i \(\vert{}\) x$_i$ $\ne$ $\bot$\} \quad if $\exists$i, x$_i$ $\ne$ $\bot$\\ +\end{tabular} +\end{center} +\item definition of group decomposition: +\end{itemize} +\begin{center} +\begin{tabular}{l} +let constrs((p$_i$)\(^{\text{i $\in $ I}}\)) = \{ K \(\vert{}\) K = constr(p$_i$), i $\in $ I \}\\ +let Groups(m) where m = ((a$_i$)$^i$ ((p$_i$$_j$)$^i$ $\to$ e$_j$)$^i$$^j$) =\\ +\quad \quad let (K$_k$)$^k$ = constrs(p$_i$$_0$)$^i$ in\\ +\quad \quad ( K$_k$ $\to$\\ +\quad \quad \quad \quad ((a$_0$.$_l$)$^l$ \sout{+} tail(a$_i$)$^i$)\\ +\quad \quad \quad \quad (\\ +\quad \quad \quad \quad if p$_o$$_j$ is K$_k$(q$_l$) then\\ +\quad \quad \quad \quad \quad \quad (q$_l$)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\ +\quad \quad \quad \quad if p$_o$$_j$ is \_ then\\ +\quad \quad \quad \quad \quad \quad (\_)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\ +\quad \quad \quad \quad else $\bot$\\ +\quad \quad \quad \quad )$^j$\\ +\quad \quad ), (\\ +\quad \quad \quad \quad tail(a$_i$)$^i$, (tail(p$_i$$_j$)$^i$ $\to$ e$_j$ if p$_0$$_j$ is \_ else $\bot$)$^j$\\ +\quad \quad )\\ +\end{tabular} +\end{center} +Groups(m) is an auxiliary function that decomposes a matrix m into submatrices, according to the head constructor of their first pattern. -Groups(m) returns one submatrix m\_r for each head constructor k that +Groups(m) returns one submatrix m\_r for each head constructor K that occurs on the first row of m, plus one "wildcard submatrix" m\(_{\text{wild}}\) that matches on all values that do not start with one of those head constructors. - Intuitively, m is equivalent to its decomposition in the following -sense: if the first pattern of an input vector (v$_i$)$^i$ starts with one -of the head constructors k, then running (v$_i$)$^i$ against m is the same -as running it against the submatrix m$_k$; otherwise (its head -constructor is none of the k) it is equivalent to running it against +sense: if the first pattern of an input vector (v\_i)\^{}i starts with one +of the head constructors K$_k$, then running (v\_i)\^{}i against m is the same +as running it against the submatrix m\(_{\text{K$_k$}}\); otherwise (its head +constructor $\notin$ (K$_k$)$^k$) it is equivalent to running it against the wildcard submatrix. -We formalize this intuition as follows: -Lemma (Groups): -Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\]. +We formalize this intuition as follows +\begin{enumerate} +\item Lemma (Groups): +\label{sec:org3d5a08a} +Let \emph{m} be a matrix with +\begin{center} +\begin{tabular}{l} +Groups(m) = (k\_r \(\to\) m\_r)\^{}k, m\(_{\text{wild}}\)\\ +\end{tabular} +\end{center} For any value vector \((v_i)^l\) such that \(v_0 = k(v'_l)^l\) for some constructor k, we have: @@ -410,69 +549,99 @@ if k = k$_k$ \text{ for some k then}\\ TODO: fix \{0} \end{comment} -\begin{enumerate} \item Proof: -\label{sec:orgb73fb4f} -Let \(m\) be a matrix with \[Group(m) = (k_r \to m_r)^k, m_{wild}\] -Let \((v_i)^i\) be an input matrix with \(v_0 = k(v'_l)^l\) for some k. -We proceed by case analysis: +\label{sec:org416e0ec} +Let \emph{m} be a matrix ((a$_i$)$^i$, ((p$_i$$_j$)$^i$ $\to$ e$_j$)$^j$) with +\begin{center} +\begin{tabular}{l} +Groups(m) = (K$_k$ $\to$ m$_k$)$^k$, m\(_{\text{wild}}\)\\ +\end{tabular} +\end{center} +Below we are going to assume that m is a simplified matrix such that +the first row does not contain an or-pattern or a binding to a +variable. + +Let (v$_i$)$^i$ be an input matrix with v$_0$ = K$_v$(v'\(_{\text{l}}\))$^l$ for some constructor K$_v$. +We have to show that: \begin{itemize} -\item either k is one of the k$_k$ for some k -\item or k is none of the (k$_k$)$^k$ +\item if K$_k$ = K$_v$ for some K$_k$ $\in $ constrs(p$_0$$_j$)$^j$, then +m(v$_i$)$^i$ = m$_k$((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$) +\item otherwise +m(v$_i$)$^i$ = m\(_{\text{wild}}\)(tail(v$_i$)$^i$) \end{itemize} +Let us call (r$_k$$_j$) the j-th row of the submatrix m$_k$, and r$_j$\(_{\text{wild}}\) +the j-th row of the wildcard submatrix m\(_{\text{wild}}\). -Both m(v$_i$)$^i$ and m$_k$(v$_k$)$^k$ are defined as the first matching result of -a family over each row r$_j$ of a matrix - -We know, from the definition of -Groups(m), that m$_k$ is +Our goal contains same-behavior equalities between matrices, for +a fixed input vector (v$_i$)$^i$. It suffices to show same-behavior +equalities between each row of the matrices for this input +vector. We show that for any j, +\begin{itemize} +\item if K$_k$ = K$_v$ for some K$_k$ $\in $ constrs(p$_0$$_j$)$^j$, then \begin{center} \begin{tabular}{l} -((a)\{0.l\})\(^{\text{l$\in $Idx(k$_k$)}}\) \sout{+} (a$_i$)\(^{\text{i$\in $I}\DZ}\)),\\ -(\\ -\quad if p\(_{\text{0j}}\) is k(q$_l$) then\\ -\quad \quad (q$_l$)$^l$ \sout{+} (p\(_{\text{ij}}\))\(^{\text{i$\in $I}\DZ\ }\) $\to$ e$_j$\\ -\quad if p\(_{\text{0j}}\) is \_ then\\ -\quad \quad (\_)$^l$ \sout{+} (p\(_{\text{ij}}\))\(^{\text{i$\in $I}\DZ}\) $\to$ e$_j$\\ -\quad else $\bot$\\ -)\(^{\text{j$\in $J}}\)\\ +(p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_k$$_j$((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$\\ +\end{tabular} +\end{center} +\item otherwise +\begin{center} +\begin{tabular}{l} +(p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_j$\(_{\text{wild}}\) tail(v$_i$)$^i$\\ +\end{tabular} +\end{center} +\end{itemize} +In the first case (K$_v$ is K$_k$ for some K$_k$ $\in $ constrs(p$_0$$_j$)$^j$), we +have to prove that +\begin{center} +\begin{tabular}{l} +(p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_k$$_j$((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$\\ +\end{tabular} +\end{center} +By definition of m$_k$ we know that r$_k$$_j$ is equal to +\begin{center} +\begin{tabular}{l} +if p$_o$$_j$ is K$_k$(q$_l$) then\\ +\quad (q$_l$)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\ +if p$_o$$_j$ is \_ then\\ +\quad (\_)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\ +else $\bot$\\ +\end{tabular} +\end{center} +\begin{comment} +Maybe better as a table? +| p$_o$$_j$ | r$_k$$_j$ | +|--------+---------------------------| +| K$_k$(q$_l$) | (q$_l$)$^l$ +++ tail(p$_i$$_j$)$^i$ $\to$ e$_j$ | +| _ | (_)$^l$ +++ tail(p$_i$$_j$)$^i$ $\to$ e$_j$ | +| else | $\bot$ | +\end{comment} +By definition of (p$_i$)$^i$(v$_i$)$^i$ we know that (p$_i$$_j$)$^i$(v$_i$) is equal to +\begin{center} +\begin{tabular}{l} +(K(q$_j$)$^j$, tail(p$_i$$_j$)$^i$) (K(v'$_l$)$^l$,tail(v$_i$)$^i$) := ((q$_j$)$^j$ \sout{+} tail(p$_i$$_j$)$^i$)((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$)\\ +(\_, tail(p$_i$$_j$)$^i$) (v$_i$) := tail(p$_i$$_j$)$^i$(tail(v$_i$)$^i$)\\ +(K(q$_j$)$^j$, tail(p$_i$$_j$)$^i$) (K'(v'$_l$)$^l$,tail(v$_j$)$^j$) := $\bot$ if K $\ne$ K'\\ \end{tabular} \end{center} -By definition, m(v$_i$)$^i$ is +We prove this first case by a second case analysis on p$_0$$_j$. + +TODO + +In the second case (K$_v$ is distinct from K$_k$ for all K$_k$ $\in $ constrs(p$_o$$_j$)$^j$), +we have to prove that \begin{center} \begin{tabular}{l} -m(v$_i$)$^i$ = First(r$_j$(v$_i$)$^i$)$^j$ for m = ((a$_i$)$^i$, (r$_j$)$^j$)\\ -(p$_i$)$^i$ (v$_i$)$^i$ = \{\\ -\quad if p$_0$ = k(q$_l$)$^l$, v$_0$ = k'(v'$_k$)$^k$, k=Idx(k') and l=Idx(k)\\ -\quad \quad if k $\ne$ k' then $\bot$\\ -if k = k' then ((q$_l$)$^l$ \sout{+} (p$_i$)\(^{\text{i$\in $I}\DZ}\)) ((v'$_k$)$^k$ \sout{+} (v$_i$)\(^{\text{i$\in $I}\DZ}\) )\\ -if p$_0$ = (q$_1$\(\vert{}\)q$_2$) then\\ -First( (q$_1$p$_i$\(^{\text{i$\in $I}\DZ}\)) v$_i$\(^{\text{i$\in $I}\DZ}\), (q$_2$p$_i$\(^{\text{i$\in $I }\DZ}\)) v$_i$\(^{\text{i$\in $I}\DZ}\))\}\\ +(p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_j$\(_{\text{wild}}\) tail(v$_i$)$^i$\\ \end{tabular} \end{center} -For this reason, if we can prove that -\begin{center} -\begin{tabular}{l} -$\forall$j, r$_j$(v$_i$)$^i$ = r'$_j$((v'$_k$)$^k$ ++ (v$_i$)$^i$)\\ -\end{tabular} -\end{center} -it follows that -\begin{center} -\begin{tabular}{l} -m(v$_i$)$^i$ = m$_k$((v'$_k$)$^k$ ++ (v$_i$)$^i$)\\ -\end{tabular} -\end{center} -from the above definition. - -We can also show that a$_i$ = (a\(_{\text{0.l}}\))$^l$ \sout{+} a\(_{\text{i$\in $I}\DZ}\) because v(a$_0$) = K(v(a)\{0.l\})$^l$) +TODO \end{enumerate} - \subsection{Target translation} -\label{sec:org0337fa8} +\label{sec:org7394f38} The target program of the following general form is parsed using a parser generated by Menhir, a LR(1) parser generator for the OCaml programming language. @@ -560,12 +729,12 @@ $\forall$v$_T$, t$_T$(v$_T$) = $\llbracket$t$_T$$\rrbracket$$_T$(v$_T$)\\ \subsection{Equivalence checking} -\label{sec:org7268798} +\label{sec:org7194f33} The equivalence checking algorithm takes as input a domain of possible values \emph{S} and a pair of source and target decision trees and in case the two trees are not equivalent it returns a counter example. -The algorithm respects the following correctness statement: +Our algorithm respects the following correctness statement: \begin{comment} TODO: we have to define what \coversTEX mean for readers to understand the specifications @@ -699,22 +868,21 @@ We say that C$_T$ covers the input space \emph{S}, written \emph{covers(C$_T$, S)} when every value v$_S$$\in $S is a valid input to the decision tree C$_T$. (TODO: rephrase) Given an input space \emph{S} and a couple of decision trees, where -the target decision tree C$_T$ covers the input space \emph{S}, we say that -the two decision trees are equivalent when: +the target decision tree C$_T$ covers the input space \emph{S} we can define equivalence: \begin{center} \begin{tabular}{l} equiv(S, C$_S$, C$_T$, gs) = Yes $\wedge$ covers(C$_T$, S) $\to$ $\forall$v$_S$$\simeq$v$_T$ $\in $ S, C$_S$(v$_S$) $\simeq$gs C$_T$(v$_T$)\\ \end{tabular} \end{center} Similarly we say that a couple of decision trees in the presence of -an input space \emph{S} are \emph{not} equivalent when: +an input space \emph{S} are \emph{not} equivalent in the following way: \begin{center} \begin{tabular}{l} equiv(S, C$_S$, C$_T$, gs) = No(v$_S$,v$_T$) $\wedge$ covers(C$_T$, S) $\to$ v$_S$$\simeq$v$_T$ $\in $ S $\wedge$ C$_S$(v$_S$) $\ne$gs C$_T$(v$_T$)\\ \end{tabular} \end{center} Corollary: For a full input space \emph{S}, that is the universe of the -target program we say: +target program: \begin{center} \begin{tabular}{l} equiv(S, $\llbracket$t$_S$$\rrbracket$$_S$, $\llbracket$t$_T$$\rrbracket$$_T$, $\varnothing$) = Yes $\Leftrightarrow$ t$_S$ $\simeq$ t$_T$\\ @@ -891,7 +1059,7 @@ equiv(S, Failure as C$_S$, Switch(a, ($\pi$$_i$ $\to$ C$_T$$_i$)\(^{\text{i$\in equiv(S, Leaf bb$_S$ as C$_S$, Switch(a, ($\pi$$_i$ $\to$ C$_T$$_i$)\(^{\text{i$\in $I}}\))) := Forall(equiv( S$\cap$a$\to$$\pi$(k$_i$)), C$_S$, C$_T$$_i$)\(^{\text{i$\in $I}}\))\\ \end{tabular} \end{center} -The algorithm either returns Yes for every sub-input space S$_i$ := S$\cap$(a$\to$$\pi$(k$_i$)) and +Our algorithm either returns Yes for every sub-input space S$_i$ := S$\cap$(a$\to$$\pi$(k$_i$)) and subtree C$_T$$_i$ \begin{center} \begin{tabular}{l} @@ -924,7 +1092,7 @@ values not covered but the union of the constructors k$_i$ $\pi$\(_{\text{fallback}}\) = ¬\bigcup\limits\(_{\text{i$\in $I}}\)$\pi$(k$_i$)\\ \end{tabular} \end{center} -The algorithm proceeds by trimming +Our algorithm proceeds by trimming \begin{center} \begin{tabular}{l} equiv(S, Switch(a, (k$_i$ $\to$ C$_S$$_i$)\(^{\text{i$\in $I}}\), C\(_{\text{sf}}\)), C$_T$) :=\\ diff --git a/tesi/gabriel/part4_unicode.org b/tesi/gabriel/part4_unicode.org index 5083f82..2017ccb 100644 --- a/tesi/gabriel/part4_unicode.org +++ b/tesi/gabriel/part4_unicode.org @@ -46,7 +46,7 @@ * Translation validation of the Pattern Matching Compiler ** Source program -The algorithm takes as its input a source program and translates it +Our algorithm takes as its input a source program and translates it into an algebraic data structure which type we call /decision_tree/. #+BEGIN_SRC @@ -64,38 +64,30 @@ We distinguish - Failure: a value matching this part results in an error - Leaf: a value matching this part results into the evaluation of a source black box of code - -The algorithm doesn't support type-declaration-based analysis +Our algorithm doesn't support type-declaration-based analysis to know the list of constructors at a given type. Let's consider some trivial examples: -#+BEGIN_SRC -function true -> 1 -#+END_SRC - +| function true -> 1 is translated to |Switch ([(true, Leaf 1)], Failure) while -#+BEGIN_SRC -function -true -> 1 -| false -> 2 -#+END_SRC +| function +| \vert{} true -> 1 +| \vert{} false -> 2 will be translated to -|Switch ([(true, Leaf 1); (false, Leaf 2)]) +|Switch ([(true, Leaf 1); (false, Leaf 2)]) It is possible to produce Unreachable examples by using refutation clauses (a "dot" in the right-hand-side) -#+BEGIN_SRC -function -true -> 1 -| false -> 2 -| _ -> . -#+END_SRC +|function +|\vert{} true -> 1 +|\vert{} false -> 2 +|\vert{} _ -> . that gets translated into -Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable) +| Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable) -We trust this annotation, which is reasonable as the type-checker +We trust this annotation, which is reasonable as the type-checker verifies that it indeed holds. Guard nodes of the tree are emitted whenever a guard is found. Guards @@ -107,7 +99,6 @@ true. \begin{comment} [ ] Finisci con Switch [ ] Spiega del fallback -[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le Lambda \end{comment} The source code of a pattern matching function has the @@ -142,8 +133,7 @@ This BNF grammar describes formally the grammar of the source program: | clause ::= lexpr "->" rexpr | lexpr ::= rule (ε\vert{}condition) | rexpr ::= _code ;; arbitrary code -| rule ::= wildcard\vert{}variable\vert{}constructor_pattern\vert{}or_pattern ;; -| ;; rules +| rule ::= wildcard\vert{}variable\vert{}constructor_pattern\vert or_pattern ;; | wildcard ::= "_" | variable ::= identifier | constructor_pattern ::= constructor (rule\vert{}ε) (assignment\vert{}ε) @@ -152,55 +142,117 @@ This BNF grammar describes formally the grammar of the source program: | condition ::= "when" bexpr | assignment ::= "as" id | bexpr ::= _code ;; arbitrary code +The source program is parsed using the ocaml-compiler-libs library. +The result of parsing, when successful, results in a list of clauses +and a list of type declarations. +Every clause consists of three objects: a left-hand-side that is the +kind of pattern expressed, an option guard and a right-hand-side expression. +Patterns are encoded in the following way: +| pattern | type | +|-----------------+-------------| +| _ | Wildcard | +| p₁ as x | Assignment | +| c(p₁,p₂,...,pₙ) | Constructor | +| (p₁\vert p₂) | Orpat | + + +Once parsed, the type declarations and the list of clauses are encoded in the form of a matrix +that is later evaluated using a matrix decomposition algorithm. + +Patterns are of the form +| pattern | type of pattern | +|-----------------+---------------------| +| _ | wildcard | +| x | variable | +| c(p₁,p₂,...,pₙ) | constructor pattern | +| (p₁\vert p₂) | or-pattern | + +The pattern /p/ matches a value /v/, written as p ≼ v, when one of the +following rules apply + +|--------------------+---+--------------------+-------------------------------------------| +| _ | ≼ | v | ∀v | +| x | ≼ | v | ∀v | +| (p₁ \vert p₂) | ≼ | v | iff p₁ ≼ v or p₂ ≼ v | +| c(p₁, p₂, ..., pₐ) | ≼ | c(v₁, v₂, ..., vₐ) | iff (p₁, p₂, ..., pₐ) ≼ (v₁, v₂, ..., vₐ) | +| (p₁, p₂, ..., pₐ) | ≼ | (v₁, v₂, ..., vₐ) | iff pᵢ ≼ vᵢ ∀i ∈ [1..a] | +|--------------------+---+--------------------+-------------------------------------------| + +When a value /v/ matches pattern /p/ we say that /v/ is an /instance/ of /p/. + + +During compilation by the translators, expressions at the +right-hand-side are compiled into +Lambda code and are referred as lambda code actions lᵢ. + +We define the /pattern matrix/ P as the matrix |m x n| where m bigger +or equal than the number of clauses in the source program and n is +equal to the arity of the constructor with the gratest arity. +\begin{equation*} +P = +\begin{pmatrix} +p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\ +p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\ +\vdots & \vdots & \ddots & \vdots \\ +p_{m,1} & p_{m,2} & \cdots & p_{m,n} ) +\end{pmatrix} +\end{equation*} +every row of /P/ is called a pattern vector + $\vec{p_i}$ = (p₁, p₂, ..., pₙ); In every instance of P pattern +vectors appear normalized on the length of the longest pattern vector. +Considering the pattern matrix P we say that the value vector +$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the pattern vector pᵢ in P if and only if the following two +conditions are satisfied: +- p_{i,1}, p_{i,2}, \cdots, p_{i,n} ≼ (v₁, v₂, ..., vᵢ) +- ∀j < i p_{j,1}, p_{j,2}, \cdots, p_{j,n} ⋠ (v₁, v₂, ..., vᵢ) + +We can define the following three relations with respect to patterns: +- Pattern p is less precise than pattern q, written p ≼ q, when all + instances of q are instances of p +- Pattern p and q are equivalent, written p ≡ q, when their instances + are the same +- Patterns p and q are compatible when they share a common instance + +Wit the support of two auxiliary functions +- tail of an ordered family + | tail((xᵢ)^{i ∈ I}) := (xᵢ)^{i ≠ min(I)} +- first non-⊥ element of an ordered family + | First((xᵢ)ⁱ) := ⊥ \quad \quad \quad if ∀i, xᵢ = ⊥ + | First((xᵢ)ⁱ) := x_min{i \vert{} xᵢ ≠ ⊥} \quad if ∃i, xᵢ ≠ ⊥ +we now define what it means to run a pattern row against a value +vector of the same length, that is (pᵢ)ⁱ(vᵢ)ⁱ +| pᵢ | vᵢ | result_{pat} | +|--------------------------+----------------------+-------------------------------------------------| +| ∅ | (∅) | [] | +| (_, tail(pᵢ)ⁱ) | (vᵢ) | tail(pᵢ)ⁱ(tail(vᵢ)ⁱ) | +| (x, tail(pᵢ)ⁱ) | (vᵢ) | σ[x↦v₀] if tail(pᵢ)ⁱ(tail(vᵢ)ⁱ) = σ | +| (K(qⱼ)ʲ, tail(pᵢ)ⁱ) | (K(v'ⱼ)ʲ,tail(vⱼ)ʲ) | ((qⱼ)ʲ +++ tail(pᵢ)ⁱ)((v'ⱼ)ʲ +++ tail(vᵢ)ⁱ) | +| (K(qⱼ)ʲ, tail(pᵢ)ⁱ) | (K'(v'ₗ)ˡ,tail(vⱼ)ʲ) | ⊥ if K ≠ K' | +| (q₁\vert{}q₂, tail(pᵢ)ⁱ) | (vᵢ)ⁱ | First((q₁,tail(pᵢ)ⁱ)(vᵢ)ⁱ, (q₂,tail(pᵢ)ⁱ)(vᵢ)ⁱ) | A source program tₛ is a collection of pattern clauses pointing to -/bb/ terms. Running a program tₛ against an input value vₛ produces as -a result /r/: +/bb/ terms. Running a program tₛ against an input value vₛ, written +tₛ(vₛ) produces a result /r/ belonging to the following grammar: | tₛ ::= (p → bb)^{i∈I} -| p ::= \vert K(pᵢ)ⁱ, i ∈ I \vert (p\vert{}q) \vert n ∈ ℕ -| r ::= guard list * (Match bb \vert{} NoMatch \vert{} Absurd) | tₛ(vₛ) → r +| r ::= guard list * (Match bb \vert{} NoMatch \vert{} Absurd) +\begin{comment} +TODO: understand how to explain guard +\end{comment} +We can define what it means to run an input value vₛ against a source +program tₛ: +| tₛ(vₛ) := ⊥ \quad \quad \quad if ∀i, pᵢ(vₛ) = ⊥ +| First((xᵢ)ⁱ) := x_min{i \vert{} xᵢ ≠ ⊥} \quad if ∃i, xᵢ ≠ ⊥ +| tₛ(vₛ) = Absurd if bb_min{pᵢ → bbᵢ \vert{} pᵢ(vₛ) ≠ ⊥} = /refutation clause/ +| tₛ(vₛ) = Match bb_min{pᵢ → bbᵢ \vert{} pᵢ(vₛ) ≠ ⊥} -TODO: argument on what it means to run a source program - -/guard/ and /bb/ expressions are treated as blackboxes of OCaml code. -A sound approach for treating these blackboxes would be to inspect the -OCaml compiler during translation to Lambda code and extract the -blackboxes compiled in their Lambda representation. -This would allow to test for equality with the respective blackbox at -the target level. -Given that this level of introspection is currently not possibile, we -decided to restrict the structure of blackboxes to the following (valid) OCaml -code: - -#+BEGIN_SRC -external guard : 'a -> 'b = "guard" -external observe : 'a -> 'b = "observe" -#+END_SRC - -We assume these two external functions /guard/ and /observe/ with a valid -type that lets the user pass any number of arguments to them. -All the guards are of the form \texttt{guard }, where the - are expressed using the OCaml pattern matching language. -Similarly, all the right-hand-side expressions are of the form -\texttt{observe ...} with the same constraints on arguments. - -#+BEGIN_SRC -type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *) - -let _ = function - | K1 -> observe 0 - | K2 K1 -> observe 1 - | K2 x when guard x -> observe 2 - | K2 (K2 x) as y when guard x y -> observe 3 - | K2 _ -> observe 4 -#+END_SRC - - +[ ... ] Big part that I think doesn't need revision from you [ ... ] In our prototype we make use of accessors to encode stored values. -\begin{minipage}{0.2\linewidth} + +\begin{minipage}{0.6\linewidth} \begin{verbatim} + let value = 1 :: 2 :: 3 :: [] (* that can also be written *) let value = [] @@ -277,95 +329,121 @@ Base cases: because a source matrix in the case of empty rows returns the first expression and (Leaf bb)(v) := Match bb 2. [| (aⱼ)ʲ, ∅ |] ≡ Failure - Regarding non base cases: -Let's first define -| let Idx(k) := [0; arity(k)[ -| let First(∅) := ⊥ -| let First((aⱼ)ʲ) := a_{min(j∈J≠∅)} -\[ -m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij}) -\] -\[ - (k_k)^k := headconstructor(p_{i0})^i -\] -\begin{equation} - Groups(m) := ( k_k \to ((a)_{0.l})^{l \in Idx(k_k)} +++ (a_i)^{i \in I\DZ}), \\ - ( if p_{0j} is k(q_l) then \\ - (qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ - if p_{0j} is \_ then \\ - (\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ - else \bot )^j ), \\ - ((a_i)^{i \in I\DZ}, ((p_{ij})^{i \in I\DZ} \to eⱼ if p_{0j} is \_ else \bot)^{j \in J}) -\end{equation} - -Groups(m) is an auxiliary function that source a matrix m into +Let's first define some auxiliary functions +- The index family of a constructor + | Idx(K) := [0; arity(K)[ +- head of an ordered family (we write x for any object here, value, pattern etc.) + | head((xᵢ)^{i ∈ I}) = x_min(I) +- tail of an ordered family + | tail((xᵢ)^{i ∈ I}) := (xᵢ)^{i ≠ min(I)} +- head constructor of a value or pattern + | constr(K(xᵢ)ⁱ) = K + | constr(_) = ⊥ + | constr(x) = ⊥ +- first non-⊥ element of an ordered family + | First((xᵢ)ⁱ) := ⊥ \quad \quad \quad if ∀i, xᵢ = ⊥ + | First((xᵢ)ⁱ) := x_min{i \vert{} xᵢ ≠ ⊥} \quad if ∃i, xᵢ ≠ ⊥ +- definition of group decomposition: +| let constrs((pᵢ)^{i ∈ I}) = { K \vert{} K = constr(pᵢ), i ∈ I } +| let Groups(m) where m = ((aᵢ)ⁱ ((pᵢⱼ)ⁱ → eⱼ)ⁱʲ) = +| \quad \quad let (Kₖ)ᵏ = constrs(pᵢ₀)ⁱ in +| \quad \quad ( Kₖ → +| \quad \quad \quad \quad ((a₀.ₗ)ˡ +++ tail(aᵢ)ⁱ) +| \quad \quad \quad \quad ( +| \quad \quad \quad \quad if pₒⱼ is Kₖ(qₗ) then +| \quad \quad \quad \quad \quad \quad (qₗ)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ +| \quad \quad \quad \quad if pₒⱼ is _ then +| \quad \quad \quad \quad \quad \quad (_)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ +| \quad \quad \quad \quad else ⊥ +| \quad \quad \quad \quad )ʲ +| \quad \quad ), ( +| \quad \quad \quad \quad tail(aᵢ)ⁱ, (tail(pᵢⱼ)ⁱ → eⱼ if p₀ⱼ is _ else ⊥)ʲ +| \quad \quad ) +Groups(m) is an auxiliary function that decomposes a matrix m into submatrices, according to the head constructor of their first pattern. -Groups(m) returns one submatrix m_r for each head constructor k that +Groups(m) returns one submatrix m_r for each head constructor K that occurs on the first row of m, plus one "wildcard submatrix" m_{wild} that matches on all values that do not start with one of those head constructors. - Intuitively, m is equivalent to its decomposition in the following -sense: if the first pattern of an input vector (vᵢ)ⁱ starts with one -of the head constructors k, then running (vᵢ)ⁱ against m is the same -as running it against the submatrix mₖ; otherwise (its head -constructor is none of the k) it is equivalent to running it against +sense: if the first pattern of an input vector (v_i)^i starts with one +of the head constructors Kₖ, then running (v_i)^i against m is the same +as running it against the submatrix m_{Kₖ}; otherwise (its head +constructor ∉ (Kₖ)ᵏ) it is equivalent to running it against the wildcard submatrix. -We formalize this intuition as follows: -Lemma (Groups): -Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\]. -For any value vector $(v_i)^l$ such that $v_0 = k(v'_l)^l$ for some -constructor k, -we have: -| if k = kₖ \text{ for some k then} -| \quad m(vᵢ)ⁱ = mₖ((v_{l}')ˡ +++ (v_{i})^{i∈I\DZ}) -| \text{else} -| \quad m(vᵢ)ⁱ = m_{wild}(vᵢ)^{i∈I\DZ} +We formalize this intuition as follows +*** Lemma (Groups): + Let /m/ be a matrix with + | Groups(m) = (k_r \to m_r)^k, m_{wild} + For any value vector $(v_i)^l$ such that $v_0 = k(v'_l)^l$ for some + constructor k, + we have: + | if k = kₖ \text{ for some k then} + | \quad m(vᵢ)ⁱ = mₖ((v_{l}')ˡ +++ (v_{i})^{i∈I\DZ}) + | \text{else} + | \quad m(vᵢ)ⁱ = m_{wild}(vᵢ)^{i∈I\DZ} \begin{comment} TODO: fix \{0} \end{comment} *** Proof: -Let $m$ be a matrix with \[Group(m) = (k_r \to m_r)^k, m_{wild}\] -Let $(v_i)^i$ be an input matrix with $v_0 = k(v'_l)^l$ for some k. -We proceed by case analysis: -- either k is one of the kₖ for some k -- or k is none of the (kₖ)ᵏ +Let /m/ be a matrix ((aᵢ)ⁱ, ((pᵢⱼ)ⁱ → eⱼ)ʲ) with +| Groups(m) = (Kₖ → mₖ)ᵏ, m_{wild} +Below we are going to assume that m is a simplified matrix such that +the first row does not contain an or-pattern or a binding to a +variable. -Both m(vᵢ)ⁱ and mₖ(vₖ)ᵏ are defined as the first matching result of -a family over each row rⱼ of a matrix +Let (vᵢ)ⁱ be an input matrix with v₀ = Kᵥ(v'_{l})ˡ for some constructor Kᵥ. +We have to show that: +- if Kₖ = Kᵥ for some Kₖ ∈ constrs(p₀ⱼ)ʲ, then + m(vᵢ)ⁱ = mₖ((v'ₗ)ˡ +++ tail(vᵢ)ⁱ) +- otherwise + m(vᵢ)ⁱ = m_{wild}(tail(vᵢ)ⁱ) +Let us call (rₖⱼ) the j-th row of the submatrix mₖ, and rⱼ_{wild} +the j-th row of the wildcard submatrix m_{wild}. -We know, from the definition of -Groups(m), that mₖ is -| ((a){0.l})^{l∈Idx(kₖ)} +++ (aᵢ)^{i∈I\DZ}), -| ( -| \quad if p_{0j} is k(qₗ) then -| \quad \quad (qₗ)ˡ +++ (p_{ij})^{i∈I\DZ } → eⱼ -| \quad if p_{0j} is _ then -| \quad \quad (_)ˡ +++ (p_{ij})^{i∈I\DZ} → eⱼ -| \quad else ⊥ -| )^{j∈J} +Our goal contains same-behavior equalities between matrices, for +a fixed input vector (vᵢ)ⁱ. It suffices to show same-behavior +equalities between each row of the matrices for this input +vector. We show that for any j, +- if Kₖ = Kᵥ for some Kₖ ∈ constrs(p₀ⱼ)ʲ, then + | (pᵢⱼ)ⁱ(vᵢ)ⁱ = rₖⱼ((v'ₗ)ˡ +++ tail(vᵢ)ⁱ +- otherwise + | (pᵢⱼ)ⁱ(vᵢ)ⁱ = rⱼ_{wild} tail(vᵢ)ⁱ +In the first case (Kᵥ is Kₖ for some Kₖ ∈ constrs(p₀ⱼ)ʲ), we +have to prove that +| (pᵢⱼ)ⁱ(vᵢ)ⁱ = rₖⱼ((v'ₗ)ˡ +++ tail(vᵢ)ⁱ +By definition of mₖ we know that rₖⱼ is equal to +| if pₒⱼ is Kₖ(qₗ) then +| \quad (qₗ)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ +| if pₒⱼ is _ then +| \quad (_)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ +| else ⊥ +\begin{comment} +Maybe better as a table? +| pₒⱼ | rₖⱼ | +|--------+---------------------------| +| Kₖ(qₗ) | (qₗ)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ | +| _ | (_)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ | +| else | ⊥ | +\end{comment} +By definition of (pᵢ)ⁱ(vᵢ)ⁱ we know that (pᵢⱼ)ⁱ(vᵢ) is equal to +| (K(qⱼ)ʲ, tail(pᵢⱼ)ⁱ) (K(v'ₗ)ˡ,tail(vᵢ)ⁱ) := ((qⱼ)ʲ +++ tail(pᵢⱼ)ⁱ)((v'ₗ)ˡ +++ tail(vᵢ)ⁱ) +| (_, tail(pᵢⱼ)ⁱ) (vᵢ) := tail(pᵢⱼ)ⁱ(tail(vᵢ)ⁱ) +| (K(qⱼ)ʲ, tail(pᵢⱼ)ⁱ) (K'(v'ₗ)ˡ,tail(vⱼ)ʲ) := ⊥ if K ≠ K' -By definition, m(vᵢ)ⁱ is -| m(vᵢ)ⁱ = First(rⱼ(vᵢ)ⁱ)ʲ for m = ((aᵢ)ⁱ, (rⱼ)ʲ) -| (pᵢ)ⁱ (vᵢ)ⁱ = { -| \quad if p₀ = k(qₗ)ˡ, v₀ = k'(v'ₖ)ᵏ, k=Idx(k') and l=Idx(k) -| \quad \quad if k ≠ k' then ⊥ -| if k = k' then ((qₗ)ˡ +++ (pᵢ)^{i∈I\DZ}) ((v'ₖ)ᵏ +++ (vᵢ)^{i∈I\DZ} ) -| if p₀ = (q₁\vert{}q₂) then -| First( (q₁pᵢ^{i∈I\DZ}) vᵢ^{i∈I\DZ}, (q₂pᵢ^{i∈I \DZ}) vᵢ^{i∈I\DZ})} +We prove this first case by a second case analysis on p₀ⱼ. -For this reason, if we can prove that -| ∀j, rⱼ(vᵢ)ⁱ = r'ⱼ((v'ₖ)ᵏ ++ (vᵢ)ⁱ) -it follows that -| m(vᵢ)ⁱ = mₖ((v'ₖ)ᵏ ++ (vᵢ)ⁱ) -from the above definition. +TODO -We can also show that aᵢ = (a_{0.l})ˡ +++ a_{i∈I\DZ} because v(a₀) = K(v(a){0.l})ˡ) +In the second case (Kᵥ is distinct from Kₖ for all Kₖ ∈ constrs(pₒⱼ)ʲ), +we have to prove that +| (pᵢⱼ)ⁱ(vᵢ)ⁱ = rⱼ_{wild} tail(vᵢ)ⁱ +TODO ** Target translation @@ -436,7 +514,7 @@ respective decision tree produces the same result The equivalence checking algorithm takes as input a domain of possible values \emph{S} and a pair of source and target decision trees and in case the two trees are not equivalent it returns a counter example. -The algorithm respects the following correctness statement: +Our algorithm respects the following correctness statement: \begin{comment} TODO: we have to define what \coversTEX mean for readers to understand the specifications @@ -554,14 +632,13 @@ We say that Cₜ covers the input space /S/, written /covers(Cₜ, S)/ when every value vₛ∈S is a valid input to the decision tree Cₜ. (TODO: rephrase) Given an input space /S/ and a couple of decision trees, where -the target decision tree Cₜ covers the input space /S/, we say that -the two decision trees are equivalent when: +the target decision tree Cₜ covers the input space /S/ we can define equivalence: | equiv(S, Cₛ, Cₜ, gs) = Yes ∧ covers(Cₜ, S) → ∀vₛ≃vₜ ∈ S, Cₛ(vₛ) ≃gs Cₜ(vₜ) Similarly we say that a couple of decision trees in the presence of -an input space /S/ are /not/ equivalent when: +an input space /S/ are /not/ equivalent in the following way: | equiv(S, Cₛ, Cₜ, gs) = No(vₛ,vₜ) ∧ covers(Cₜ, S) → vₛ≃vₜ ∈ S ∧ Cₛ(vₛ) ≠gs Cₜ(vₜ) Corollary: For a full input space /S/, that is the universe of the -target program we say: +target program: | equiv(S, 〚tₛ〛ₛ, 〚tₜ〛ₜ, ∅) = Yes ⇔ tₛ ≃ tₜ @@ -663,7 +740,7 @@ I think the unreachable case should go at the end. 3. When we have a Leaf or a Failure at the left side: | equiv(S, Failure as Cₛ, Switch(a, (πᵢ → Cₜᵢ)^{i∈I})) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)^{i∈I}) | equiv(S, Leaf bbₛ as Cₛ, Switch(a, (πᵢ → Cₜᵢ)^{i∈I})) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)^{i∈I}) - The algorithm either returns Yes for every sub-input space Sᵢ := S∩(a→π(kᵢ)) and + Our algorithm either returns Yes for every sub-input space Sᵢ := S∩(a→π(kᵢ)) and subtree Cₜᵢ | equiv(Sᵢ, Cₛ, Cₜᵢ) = Yes ∀i or we have a counter example vₛ, vₜ for which @@ -676,7 +753,7 @@ I think the unreachable case should go at the end. 4. When we have a Switch on the right we define π_{fallback} as the domain of values not covered but the union of the constructors kᵢ | π_{fallback} = ¬\bigcup\limits_{i∈I}π(kᵢ) - The algorithm proceeds by trimming + Our algorithm proceeds by trimming | equiv(S, Switch(a, (kᵢ → Cₛᵢ)^{i∈I}, C_{sf}), Cₜ) := | Forall(equiv( S∩(a→π(kᵢ)^{i∈I}), Cₛᵢ, C_{t/a→π(kᵢ)})^{i∈I} +++ equiv(S∩(a→πₙ), Cₛ, C_{a→π_{fallback}})) The statement still holds and we show this by first analyzing the