correzioni Gabriel #2

This commit is contained in:
Francesco Mecca 2020-04-12 23:20:37 +02:00
parent e692378874
commit d2c73080a1
3 changed files with 536 additions and 291 deletions

Binary file not shown.

View file

@ -1,4 +1,4 @@
% Created 2020-04-11 Sat 00:31 % Created 2020-04-12 Sun 22:50
% Intended LaTeX compiler: pdflatex % Intended LaTeX compiler: pdflatex
\documentclass[11pt]{article} \documentclass[11pt]{article}
\usepackage[utf8]{inputenc} \usepackage[utf8]{inputenc}
@ -64,11 +64,11 @@ xleftmargin=1em,
\begin{document} \begin{document}
\section{Translation validation of the Pattern Matching Compiler} \section{Translation validation of the Pattern Matching Compiler}
\label{sec:org9d32dd4} \label{sec:org02550ae}
\subsection{Source program} \subsection{Source program}
\label{sec:org0032637} \label{sec:org8c14d2a}
The algorithm takes as its input a source program and translates it Our algorithm takes as its input a source program and translates it
into an algebraic data structure which type we call \emph{decision\_tree}. into an algebraic data structure which type we call \emph{decision\_tree}.
\begin{verbatim} \begin{verbatim}
@ -88,15 +88,15 @@ We distinguish
\item Leaf: a value matching this part results into the evaluation of a \item Leaf: a value matching this part results into the evaluation of a
source black box of code source black box of code
\end{itemize} \end{itemize}
Our algorithm doesn't support type-declaration-based analysis
The algorithm doesn't support type-declaration-based analysis
to know the list of constructors at a given type. to know the list of constructors at a given type.
Let's consider some trivial examples: Let's consider some trivial examples:
\begin{verbatim} \begin{center}
function true -> 1 \begin{tabular}{l}
\end{verbatim} function true -> 1\\
\end{tabular}
\end{center}
is translated to is translated to
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
@ -104,28 +104,36 @@ Switch ([(true, Leaf 1)], Failure)\\
\end{tabular} \end{tabular}
\end{center} \end{center}
while while
\begin{verbatim} \begin{center}
function \begin{tabular}{l}
true -> 1 function\\
| false -> 2 \(\vert{}\) true -> 1\\
\end{verbatim} \(\vert{}\) false -> 2\\
\end{tabular}
\end{center}
will be translated to will be translated to
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
Switch ([(true, Leaf 1); (false, Leaf 2)])\\ Switch ([(true, Leaf 1); (false, Leaf 2)])\\
\end{tabular} \end{tabular}
\end{center} \end{center}
It is possible to produce Unreachable examples by using It is possible to produce Unreachable examples by using
refutation clauses (a "dot" in the right-hand-side) refutation clauses (a "dot" in the right-hand-side)
\begin{verbatim} \begin{center}
function \begin{tabular}{l}
true -> 1 function\\
| false -> 2 \(\vert{}\) true -> 1\\
| _ -> . \(\vert{}\) false -> 2\\
\end{verbatim} \(\vert{}\) \_ -> .\\
\end{tabular}
\end{center}
that gets translated into that gets translated into
Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable) \begin{center}
\begin{tabular}{l}
Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable)\\
\end{tabular}
\end{center}
We trust this annotation, which is reasonable as the type-checker We trust this annotation, which is reasonable as the type-checker
verifies that it indeed holds. verifies that it indeed holds.
@ -139,7 +147,6 @@ true.
\begin{comment} \begin{comment}
[ ] Finisci con Switch [ ] Finisci con Switch
[ ] Spiega del fallback [ ] Spiega del fallback
[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le Lambda
\end{comment} \end{comment}
The source code of a pattern matching function has the The source code of a pattern matching function has the
@ -184,8 +191,7 @@ pattern1 ::= "\(\vert{}\)" clause\\
clause ::= lexpr "->" rexpr\\ clause ::= lexpr "->" rexpr\\
lexpr ::= rule ($\varepsilon$\(\vert{}\)condition)\\ lexpr ::= rule ($\varepsilon$\(\vert{}\)condition)\\
rexpr ::= \_code ;; arbitrary code\\ rexpr ::= \_code ;; arbitrary code\\
rule ::= wildcard\(\vert{}\)variable\(\vert{}\)constructor\_pattern\(\vert{}\)\{\}or\_pattern ;;\\ rule ::= wildcard\(\vert{}\)variable\(\vert{}\)constructor\_pattern\(\vert{}\) or\_pattern ;;\\
;; rules\\
wildcard ::= "\_"\\ wildcard ::= "\_"\\
variable ::= identifier\\ variable ::= identifier\\
constructor\_pattern ::= constructor (rule\(\vert{}\)$\varepsilon$) (assignment\(\vert{}\)$\varepsilon$)\\ constructor\_pattern ::= constructor (rule\(\vert{}\)$\varepsilon$) (assignment\(\vert{}\)$\varepsilon$)\\
@ -196,59 +202,155 @@ assignment ::= "as" id\\
bexpr ::= \_code ;; arbitrary code\\ bexpr ::= \_code ;; arbitrary code\\
\end{tabular} \end{tabular}
\end{center} \end{center}
The source program is parsed using the ocaml-compiler-libs library.
A source program t$_S$ is a collection of pattern clauses pointing to The result of parsing, when successful, results in a list of clauses
\emph{bb} terms. Running a program t$_S$ against an input value v$_S$ produces as and a list of type declarations.
a result \emph{r}: Every clause consists of three objects: a left-hand-side that is the
kind of pattern expressed, an option guard and a right-hand-side expression.
Patterns are encoded in the following way:
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{ll}
t$_S$ ::= (p $\to$ bb)\(^{\text{i$\in $I}}\)\\ pattern & type\\
p ::= \(\vert{}\) K(p$_i$)$^i$, i $\in $ I \(\vert{}\) (p\(\vert{}\)q) \(\vert{}\) n $\in $ $\mathbb{N}$\\ \hline
r ::= guard list * (Match bb \(\vert{}\) NoMatch \(\vert{}\) Absurd)\\ \_ & Wildcard\\
t$_S$(v$_S$) $\to$ r\\ p$_1$ as x & Assignment\\
c(p$_1$,p$_2$,\ldots{},p$_n$) & Constructor\\
(p$_1$\(\vert{}\) p$_2$) & Orpat\\
\end{tabular} \end{tabular}
\end{center} \end{center}
TODO: argument on what it means to run a source program
\emph{guard} and \emph{bb} expressions are treated as blackboxes of OCaml code. Once parsed, the type declarations and the list of clauses are encoded in the form of a matrix
A sound approach for treating these blackboxes would be to inspect the that is later evaluated using a matrix decomposition algorithm.
OCaml compiler during translation to Lambda code and extract the
blackboxes compiled in their Lambda representation.
This would allow to test for equality with the respective blackbox at
the target level.
Given that this level of introspection is currently not possibile, we
decided to restrict the structure of blackboxes to the following (valid) OCaml
code:
\begin{verbatim} Patterns are of the form
external guard : 'a -> 'b = "guard" \begin{center}
external observe : 'a -> 'b = "observe" \begin{tabular}{ll}
\end{verbatim} pattern & type of pattern\\
\hline
\_ & wildcard\\
x & variable\\
c(p$_1$,p$_2$,\ldots{},p$_n$) & constructor pattern\\
(p$_1$\(\vert{}\) p$_2$) & or-pattern\\
\end{tabular}
\end{center}
We assume these two external functions \emph{guard} and \emph{observe} with a valid The pattern \emph{p} matches a value \emph{v}, written as p $\preccurlyeq$ v, when one of the
type that lets the user pass any number of arguments to them. following rules apply
All the guards are of the form \texttt{guard <arg> <arg> <arg>}, where the
<arg> are expressed using the OCaml pattern matching language.
Similarly, all the right-hand-side expressions are of the form
\texttt{observe <arg> <arg> ...} with the same constraints on arguments.
\begin{verbatim} \begin{center}
type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *) \begin{tabular}{llll}
\hline
\_ & $\preccurlyeq$ & v & $\forall$v\\
x & $\preccurlyeq$ & v & $\forall$v\\
(p$_1$ \(\vert{}\) p$_2$) & $\preccurlyeq$ & v & iff p$_1$ $\preccurlyeq$ v or p$_2$ $\preccurlyeq$ v\\
c(p$_1$, p$_2$, \ldots{}, p$_a$) & $\preccurlyeq$ & c(v$_1$, v$_2$, \ldots{}, v$_a$) & iff (p$_1$, p$_2$, \ldots{}, p$_a$) $\preccurlyeq$ (v$_1$, v$_2$, \ldots{}, v$_a$)\\
(p$_1$, p$_2$, \ldots{}, p$_a$) & $\preccurlyeq$ & (v$_1$, v$_2$, \ldots{}, v$_a$) & iff p$_i$ $\preccurlyeq$ v$_i$ $\forall$i $\in $ [1..a]\\
\hline
\end{tabular}
\end{center}
let _ = function When a value \emph{v} matches pattern \emph{p} we say that \emph{v} is an \emph{instance} of \emph{p}.
| K1 -> observe 0
| K2 K1 -> observe 1
| K2 x when guard x -> observe 2
| K2 (K2 x) as y when guard x y -> observe 3
| K2 _ -> observe 4
\end{verbatim}
During compilation by the translators, expressions at the
right-hand-side are compiled into
Lambda code and are referred as lambda code actions l$_i$.
We define the \emph{pattern matrix} P as the matrix |m x n| where m bigger
or equal than the number of clauses in the source program and n is
equal to the arity of the constructor with the gratest arity.
\begin{equation*}
P =
\begin{pmatrix}
p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\
p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
p_{m,1} & p_{m,2} & \cdots & p_{m,n} )
\end{pmatrix}
\end{equation*}
every row of \emph{P} is called a pattern vector
\(\vec{p_i}\) = (p$_1$, p$_2$, \ldots{}, p$_n$); In every instance of P pattern
vectors appear normalized on the length of the longest pattern vector.
Considering the pattern matrix P we say that the value vector
\(\vec{v}\) = (v$_1$, v$_2$, \ldots{}, v$_i$) matches the pattern vector p$_i$ in P if and only if the following two
conditions are satisfied:
\begin{itemize}
\item p\(_{\text{i,1}}\), p\(_{\text{i,2}}\), \(\cdots{}\), p\(_{\text{i,n}}\) $\preccurlyeq$ (v$_1$, v$_2$, \ldots{}, v$_i$)
\item $\forall$j < i p\(_{\text{j,1}}\), p\(_{\text{j,2}}\), \(\cdots{}\), p\(_{\text{j,n}}\) $\npreceq$ (v$_1$, v$_2$, \ldots{}, v$_i$)
\end{itemize}
We can define the following three relations with respect to patterns:
\begin{itemize}
\item Pattern p is less precise than pattern q, written p $\preccurlyeq$ q, when all
instances of q are instances of p
\item Pattern p and q are equivalent, written p $\equiv$ q, when their instances
are the same
\item Patterns p and q are compatible when they share a common instance
\end{itemize}
Wit the support of two auxiliary functions
\begin{itemize}
\item tail of an ordered family
\begin{center}
\begin{tabular}{l}
tail((x$_i$)\(^{\text{i $\in $ I}}\)) := (x$_i$)\(^{\text{i $\ne$ min(I)}}\)\\
\end{tabular}
\end{center}
\item first non-$\bot$ element of an ordered family
\begin{center}
\begin{tabular}{l}
First((x$_i$)$^i$) := $\bot$ \quad \quad \quad if $\forall$i, x$_i$ = $\bot$\\
First((x$_i$)$^i$) := x\_min\{i \(\vert{}\) x$_i$ $\ne$ $\bot$\} \quad if $\exists$i, x$_i$ $\ne$ $\bot$\\
\end{tabular}
\end{center}
\end{itemize}
we now define what it means to run a pattern row against a value
vector of the same length, that is (p$_i$)$^i$(v$_i$)$^i$
\begin{center}
\begin{tabular}{lll}
p$_i$ & v$_i$ & result\(_{\text{pat}}\)\\
\hline
$\varnothing$ & ($\varnothing$) & []\\
(\_, tail(p$_i$)$^i$) & (v$_i$) & tail(p$_i$)$^i$(tail(v$_i$)$^i$)\\
(x, tail(p$_i$)$^i$) & (v$_i$) & $\sigma$[x$\mapsto$v$_0$] if tail(p$_i$)$^i$(tail(v$_i$)$^i$) = $\sigma$\\
(K(q$_j$)$^j$, tail(p$_i$)$^i$) & (K(v'$_j$)$^j$,tail(v$_j$)$^j$) & ((q$_j$)$^j$ \sout{+} tail(p$_i$)$^i$)((v'$_j$)$^j$ \sout{+} tail(v$_i$)$^i$)\\
(K(q$_j$)$^j$, tail(p$_i$)$^i$) & (K'(v'$_l$)$^l$,tail(v$_j$)$^j$) & $\bot$ if K $\ne$ K'\\
(q$_1$\(\vert{}\)q$_2$, tail(p$_i$)$^i$) & (v$_i$)$^i$ & First((q$_1$,tail(p$_i$)$^i$)(v$_i$)$^i$, (q$_2$,tail(p$_i$)$^i$)(v$_i$)$^i$)\\
\end{tabular}
\end{center}
A source program t$_S$ is a collection of pattern clauses pointing to
\emph{bb} terms. Running a program t$_S$ against an input value v$_S$, written
t$_S$(v$_S$) produces a result \emph{r} belonging to the following grammar:
\begin{center}
\begin{tabular}{l}
t$_S$ ::= (p $\to$ bb)\(^{\text{i$\in $I}}\)\\
t$_S$(v$_S$) $\to$ r\\
r ::= guard list * (Match bb \(\vert{}\) NoMatch \(\vert{}\) Absurd)\\
\end{tabular}
\end{center}
\begin{comment}
TODO: understand how to explain guard
\end{comment}
We can define what it means to run an input value v$_S$ against a source
program t$_S$:
\begin{center}
\begin{tabular}{l}
t$_S$(v$_S$) := $\bot$ \quad \quad \quad if $\forall$i, p$_i$(v$_S$) = $\bot$\\
First((x$_i$)$^i$) := x\_min\{i \(\vert{}\) x$_i$ $\ne$ $\bot$\} \quad if $\exists$i, x$_i$ $\ne$ $\bot$\\
t$_S$(v$_S$) = Absurd if bb\_min\{p$_i$ $\to$ bb$_i$ \(\vert{}\) p$_i$(v$_S$) $\ne$ $\bot$\} = \emph{refutation clause}\\
t$_S$(v$_S$) = Match bb\_min\{p$_i$ $\to$ bb$_i$ \(\vert{}\) p$_i$(v$_S$) $\ne$ $\bot$\}\\
\end{tabular}
\end{center}
[ \ldots{} ] Big part that I think doesn't need revision from you [ \ldots{} ]
In our prototype we make use of accessors to encode stored values. In our prototype we make use of accessors to encode stored values.
\begin{minipage}{0.2\linewidth}
\begin{minipage}{0.6\linewidth}
\begin{verbatim} \begin{verbatim}
let value = 1 :: 2 :: 3 :: [] let value = 1 :: 2 :: 3 :: []
(* that can also be written *) (* that can also be written *)
let value = [] let value = []
@ -351,49 +453,86 @@ because a source matrix in the case of empty rows returns
the first expression and (Leaf bb)(v) := Match bb the first expression and (Leaf bb)(v) := Match bb
\item\relax [| (a$_j$)$^j$, $\varnothing$ |] $\equiv$ Failure \item\relax [| (a$_j$)$^j$, $\varnothing$ |] $\equiv$ Failure
\end{enumerate} \end{enumerate}
Regarding non base cases: Regarding non base cases:
Let's first define Let's first define some auxiliary functions
\begin{itemize}
\item The index family of a constructor
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
let Idx(k) := [0; arity(k)[\\ Idx(K) := [0; arity(K)[\\
let First($\varnothing$) := $\bot$\\
let First((a$_j$)$^j$) := a\(_{\text{min(j$\in $J$\ne$$\varnothing$)}}\)\\
\end{tabular} \end{tabular}
\end{center} \end{center}
\[ \item head of an ordered family (we write x for any object here, value, pattern etc.)
m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij}) \begin{center}
\] \begin{tabular}{l}
\[ head((x$_i$)\(^{\text{i $\in $ I}}\)) = x\_min(I)\\
(k_k)^k := headconstructor(p_{i0})^i \end{tabular}
\] \end{center}
\begin{equation} \item tail of an ordered family
Groups(m) := ( k_k \to ((a)_{0.l})^{l \in Idx(k_k)} +++ (a_i)^{i \in I\DZ}), \\ \begin{center}
( if p_{0j} is k(q_l) then \\ \begin{tabular}{l}
(q_l)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ tail((x$_i$)\(^{\text{i $\in $ I}}\)) := (x$_i$)\(^{\text{i $\ne$ min(I)}}\)\\
if p_{0j} is \_ then \\ \end{tabular}
(\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ \end{center}
else \bot )^j ), \\ \item head constructor of a value or pattern
((a_i)^{i \in I\DZ}, ((p_{ij})^{i \in I\DZ} \to e_j if p_{0j} is \_ else \bot)^{j \in J}) \begin{center}
\end{equation} \begin{tabular}{l}
constr(K(x$_i$)$^i$) = K\\
Groups(m) is an auxiliary function that source a matrix m into constr(\_) = $\bot$\\
constr(x) = $\bot$\\
\end{tabular}
\end{center}
\item first non-$\bot$ element of an ordered family
\begin{center}
\begin{tabular}{l}
First((x$_i$)$^i$) := $\bot$ \quad \quad \quad if $\forall$i, x$_i$ = $\bot$\\
First((x$_i$)$^i$) := x\_min\{i \(\vert{}\) x$_i$ $\ne$ $\bot$\} \quad if $\exists$i, x$_i$ $\ne$ $\bot$\\
\end{tabular}
\end{center}
\item definition of group decomposition:
\end{itemize}
\begin{center}
\begin{tabular}{l}
let constrs((p$_i$)\(^{\text{i $\in $ I}}\)) = \{ K \(\vert{}\) K = constr(p$_i$), i $\in $ I \}\\
let Groups(m) where m = ((a$_i$)$^i$ ((p$_i$$_j$)$^i$ $\to$ e$_j$)$^i$$^j$) =\\
\quad \quad let (K$_k$)$^k$ = constrs(p$_i$$_0$)$^i$ in\\
\quad \quad ( K$_k$ $\to$\\
\quad \quad \quad \quad ((a$_0$.$_l$)$^l$ \sout{+} tail(a$_i$)$^i$)\\
\quad \quad \quad \quad (\\
\quad \quad \quad \quad if p$_o$$_j$ is K$_k$(q$_l$) then\\
\quad \quad \quad \quad \quad \quad (q$_l$)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\
\quad \quad \quad \quad if p$_o$$_j$ is \_ then\\
\quad \quad \quad \quad \quad \quad (\_)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\
\quad \quad \quad \quad else $\bot$\\
\quad \quad \quad \quad )$^j$\\
\quad \quad ), (\\
\quad \quad \quad \quad tail(a$_i$)$^i$, (tail(p$_i$$_j$)$^i$ $\to$ e$_j$ if p$_0$$_j$ is \_ else $\bot$)$^j$\\
\quad \quad )\\
\end{tabular}
\end{center}
Groups(m) is an auxiliary function that decomposes a matrix m into
submatrices, according to the head constructor of their first pattern. submatrices, according to the head constructor of their first pattern.
Groups(m) returns one submatrix m\_r for each head constructor k that Groups(m) returns one submatrix m\_r for each head constructor K that
occurs on the first row of m, plus one "wildcard submatrix" m\(_{\text{wild}}\) occurs on the first row of m, plus one "wildcard submatrix" m\(_{\text{wild}}\)
that matches on all values that do not start with one of those head that matches on all values that do not start with one of those head
constructors. constructors.
Intuitively, m is equivalent to its decomposition in the following Intuitively, m is equivalent to its decomposition in the following
sense: if the first pattern of an input vector (v$_i$)$^i$ starts with one sense: if the first pattern of an input vector (v\_i)\^{}i starts with one
of the head constructors k, then running (v$_i$)$^i$ against m is the same of the head constructors K$_k$, then running (v\_i)\^{}i against m is the same
as running it against the submatrix m$_k$; otherwise (its head as running it against the submatrix m\(_{\text{K$_k$}}\); otherwise (its head
constructor is none of the k) it is equivalent to running it against constructor $\notin$ (K$_k$)$^k$) it is equivalent to running it against
the wildcard submatrix. the wildcard submatrix.
We formalize this intuition as follows: We formalize this intuition as follows
Lemma (Groups): \begin{enumerate}
Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\]. \item Lemma (Groups):
\label{sec:org3d5a08a}
Let \emph{m} be a matrix with
\begin{center}
\begin{tabular}{l}
Groups(m) = (k\_r \(\to\) m\_r)\^{}k, m\(_{\text{wild}}\)\\
\end{tabular}
\end{center}
For any value vector \((v_i)^l\) such that \(v_0 = k(v'_l)^l\) for some For any value vector \((v_i)^l\) such that \(v_0 = k(v'_l)^l\) for some
constructor k, constructor k,
we have: we have:
@ -410,69 +549,99 @@ if k = k$_k$ \text{ for some k then}\\
TODO: fix \{0} TODO: fix \{0}
\end{comment} \end{comment}
\begin{enumerate}
\item Proof: \item Proof:
\label{sec:orgb73fb4f} \label{sec:org416e0ec}
Let \(m\) be a matrix with \[Group(m) = (k_r \to m_r)^k, m_{wild}\] Let \emph{m} be a matrix ((a$_i$)$^i$, ((p$_i$$_j$)$^i$ $\to$ e$_j$)$^j$) with
Let \((v_i)^i\) be an input matrix with \(v_0 = k(v'_l)^l\) for some k. \begin{center}
We proceed by case analysis: \begin{tabular}{l}
Groups(m) = (K$_k$ $\to$ m$_k$)$^k$, m\(_{\text{wild}}\)\\
\end{tabular}
\end{center}
Below we are going to assume that m is a simplified matrix such that
the first row does not contain an or-pattern or a binding to a
variable.
Let (v$_i$)$^i$ be an input matrix with v$_0$ = K$_v$(v'\(_{\text{l}}\))$^l$ for some constructor K$_v$.
We have to show that:
\begin{itemize} \begin{itemize}
\item either k is one of the k$_k$ for some k \item if K$_k$ = K$_v$ for some K$_k$ $\in $ constrs(p$_0$$_j$)$^j$, then
\item or k is none of the (k$_k$)$^k$ m(v$_i$)$^i$ = m$_k$((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$)
\item otherwise
m(v$_i$)$^i$ = m\(_{\text{wild}}\)(tail(v$_i$)$^i$)
\end{itemize} \end{itemize}
Let us call (r$_k$$_j$) the j-th row of the submatrix m$_k$, and r$_j$\(_{\text{wild}}\)
the j-th row of the wildcard submatrix m\(_{\text{wild}}\).
Both m(v$_i$)$^i$ and m$_k$(v$_k$)$^k$ are defined as the first matching result of Our goal contains same-behavior equalities between matrices, for
a family over each row r$_j$ of a matrix a fixed input vector (v$_i$)$^i$. It suffices to show same-behavior
equalities between each row of the matrices for this input
We know, from the definition of vector. We show that for any j,
Groups(m), that m$_k$ is \begin{itemize}
\item if K$_k$ = K$_v$ for some K$_k$ $\in $ constrs(p$_0$$_j$)$^j$, then
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
((a)\{0.l\})\(^{\text{l$\in $Idx(k$_k$)}}\) \sout{+} (a$_i$)\(^{\text{i$\in $I}\DZ}\)),\\ (p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_k$$_j$((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$\\
(\\ \end{tabular}
\quad if p\(_{\text{0j}}\) is k(q$_l$) then\\ \end{center}
\quad \quad (q$_l$)$^l$ \sout{+} (p\(_{\text{ij}}\))\(^{\text{i$\in $I}\DZ\ }\) $\to$ e$_j$\\ \item otherwise
\quad if p\(_{\text{0j}}\) is \_ then\\ \begin{center}
\quad \quad (\_)$^l$ \sout{+} (p\(_{\text{ij}}\))\(^{\text{i$\in $I}\DZ}\) $\to$ e$_j$\\ \begin{tabular}{l}
\quad else $\bot$\\ (p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_j$\(_{\text{wild}}\) tail(v$_i$)$^i$\\
)\(^{\text{j$\in $J}}\)\\ \end{tabular}
\end{center}
\end{itemize}
In the first case (K$_v$ is K$_k$ for some K$_k$ $\in $ constrs(p$_0$$_j$)$^j$), we
have to prove that
\begin{center}
\begin{tabular}{l}
(p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_k$$_j$((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$\\
\end{tabular}
\end{center}
By definition of m$_k$ we know that r$_k$$_j$ is equal to
\begin{center}
\begin{tabular}{l}
if p$_o$$_j$ is K$_k$(q$_l$) then\\
\quad (q$_l$)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\
if p$_o$$_j$ is \_ then\\
\quad (\_)$^l$ \sout{+} tail(p$_i$$_j$)$^i$ $\to$ e$_j$\\
else $\bot$\\
\end{tabular}
\end{center}
\begin{comment}
Maybe better as a table?
| p$_o$$_j$ | r$_k$$_j$ |
|--------+---------------------------|
| K$_k$(q$_l$) | (q$_l$)$^l$ +++ tail(p$_i$$_j$)$^i$ $\to$ e$_j$ |
| _ | (_)$^l$ +++ tail(p$_i$$_j$)$^i$ $\to$ e$_j$ |
| else | $\bot$ |
\end{comment}
By definition of (p$_i$)$^i$(v$_i$)$^i$ we know that (p$_i$$_j$)$^i$(v$_i$) is equal to
\begin{center}
\begin{tabular}{l}
(K(q$_j$)$^j$, tail(p$_i$$_j$)$^i$) (K(v'$_l$)$^l$,tail(v$_i$)$^i$) := ((q$_j$)$^j$ \sout{+} tail(p$_i$$_j$)$^i$)((v'$_l$)$^l$ \sout{+} tail(v$_i$)$^i$)\\
(\_, tail(p$_i$$_j$)$^i$) (v$_i$) := tail(p$_i$$_j$)$^i$(tail(v$_i$)$^i$)\\
(K(q$_j$)$^j$, tail(p$_i$$_j$)$^i$) (K'(v'$_l$)$^l$,tail(v$_j$)$^j$) := $\bot$ if K $\ne$ K'\\
\end{tabular} \end{tabular}
\end{center} \end{center}
By definition, m(v$_i$)$^i$ is We prove this first case by a second case analysis on p$_0$$_j$.
TODO
In the second case (K$_v$ is distinct from K$_k$ for all K$_k$ $\in $ constrs(p$_o$$_j$)$^j$),
we have to prove that
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
m(v$_i$)$^i$ = First(r$_j$(v$_i$)$^i$)$^j$ for m = ((a$_i$)$^i$, (r$_j$)$^j$)\\ (p$_i$$_j$)$^i$(v$_i$)$^i$ = r$_j$\(_{\text{wild}}\) tail(v$_i$)$^i$\\
(p$_i$)$^i$ (v$_i$)$^i$ = \{\\
\quad if p$_0$ = k(q$_l$)$^l$, v$_0$ = k'(v'$_k$)$^k$, k=Idx(k') and l=Idx(k)\\
\quad \quad if k $\ne$ k' then $\bot$\\
if k = k' then ((q$_l$)$^l$ \sout{+} (p$_i$)\(^{\text{i$\in $I}\DZ}\)) ((v'$_k$)$^k$ \sout{+} (v$_i$)\(^{\text{i$\in $I}\DZ}\) )\\
if p$_0$ = (q$_1$\(\vert{}\)q$_2$) then\\
First( (q$_1$p$_i$\(^{\text{i$\in $I}\DZ}\)) v$_i$\(^{\text{i$\in $I}\DZ}\), (q$_2$p$_i$\(^{\text{i$\in $I }\DZ}\)) v$_i$\(^{\text{i$\in $I}\DZ}\))\}\\
\end{tabular} \end{tabular}
\end{center} \end{center}
For this reason, if we can prove that TODO
\begin{center}
\begin{tabular}{l}
$\forall$j, r$_j$(v$_i$)$^i$ = r'$_j$((v'$_k$)$^k$ ++ (v$_i$)$^i$)\\
\end{tabular}
\end{center}
it follows that
\begin{center}
\begin{tabular}{l}
m(v$_i$)$^i$ = m$_k$((v'$_k$)$^k$ ++ (v$_i$)$^i$)\\
\end{tabular}
\end{center}
from the above definition.
We can also show that a$_i$ = (a\(_{\text{0.l}}\))$^l$ \sout{+} a\(_{\text{i$\in $I}\DZ}\) because v(a$_0$) = K(v(a)\{0.l\})$^l$)
\end{enumerate} \end{enumerate}
\subsection{Target translation} \subsection{Target translation}
\label{sec:org0337fa8} \label{sec:org7394f38}
The target program of the following general form is parsed using a parser The target program of the following general form is parsed using a parser
generated by Menhir, a LR(1) parser generator for the OCaml programming language. generated by Menhir, a LR(1) parser generator for the OCaml programming language.
@ -560,12 +729,12 @@ $\forall$v$_T$, t$_T$(v$_T$) = $\llbracket$t$_T$$\rrbracket$$_T$(v$_T$)\\
\subsection{Equivalence checking} \subsection{Equivalence checking}
\label{sec:org7268798} \label{sec:org7194f33}
The equivalence checking algorithm takes as input a domain of The equivalence checking algorithm takes as input a domain of
possible values \emph{S} and a pair of source and target decision trees and possible values \emph{S} and a pair of source and target decision trees and
in case the two trees are not equivalent it returns a counter example. in case the two trees are not equivalent it returns a counter example.
The algorithm respects the following correctness statement: Our algorithm respects the following correctness statement:
\begin{comment} \begin{comment}
TODO: we have to define what \coversTEX mean for readers to understand the specifications TODO: we have to define what \coversTEX mean for readers to understand the specifications
@ -699,22 +868,21 @@ We say that C$_T$ covers the input space \emph{S}, written
\emph{covers(C$_T$, S)} when every value v$_S$$\in $S is a valid input to the \emph{covers(C$_T$, S)} when every value v$_S$$\in $S is a valid input to the
decision tree C$_T$. (TODO: rephrase) decision tree C$_T$. (TODO: rephrase)
Given an input space \emph{S} and a couple of decision trees, where Given an input space \emph{S} and a couple of decision trees, where
the target decision tree C$_T$ covers the input space \emph{S}, we say that the target decision tree C$_T$ covers the input space \emph{S} we can define equivalence:
the two decision trees are equivalent when:
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
equiv(S, C$_S$, C$_T$, gs) = Yes $\wedge$ covers(C$_T$, S) $\to$ $\forall$v$_S$$\simeq$v$_T$ $\in $ S, C$_S$(v$_S$) $\simeq$gs C$_T$(v$_T$)\\ equiv(S, C$_S$, C$_T$, gs) = Yes $\wedge$ covers(C$_T$, S) $\to$ $\forall$v$_S$$\simeq$v$_T$ $\in $ S, C$_S$(v$_S$) $\simeq$gs C$_T$(v$_T$)\\
\end{tabular} \end{tabular}
\end{center} \end{center}
Similarly we say that a couple of decision trees in the presence of Similarly we say that a couple of decision trees in the presence of
an input space \emph{S} are \emph{not} equivalent when: an input space \emph{S} are \emph{not} equivalent in the following way:
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
equiv(S, C$_S$, C$_T$, gs) = No(v$_S$,v$_T$) $\wedge$ covers(C$_T$, S) $\to$ v$_S$$\simeq$v$_T$ $\in $ S $\wedge$ C$_S$(v$_S$) $\ne$gs C$_T$(v$_T$)\\ equiv(S, C$_S$, C$_T$, gs) = No(v$_S$,v$_T$) $\wedge$ covers(C$_T$, S) $\to$ v$_S$$\simeq$v$_T$ $\in $ S $\wedge$ C$_S$(v$_S$) $\ne$gs C$_T$(v$_T$)\\
\end{tabular} \end{tabular}
\end{center} \end{center}
Corollary: For a full input space \emph{S}, that is the universe of the Corollary: For a full input space \emph{S}, that is the universe of the
target program we say: target program:
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
equiv(S, $\llbracket$t$_S$$\rrbracket$$_S$, $\llbracket$t$_T$$\rrbracket$$_T$, $\varnothing$) = Yes $\Leftrightarrow$ t$_S$ $\simeq$ t$_T$\\ equiv(S, $\llbracket$t$_S$$\rrbracket$$_S$, $\llbracket$t$_T$$\rrbracket$$_T$, $\varnothing$) = Yes $\Leftrightarrow$ t$_S$ $\simeq$ t$_T$\\
@ -891,7 +1059,7 @@ equiv(S, Failure as C$_S$, Switch(a, ($\pi$$_i$ $\to$ C$_T$$_i$)\(^{\text{i$\in
equiv(S, Leaf bb$_S$ as C$_S$, Switch(a, ($\pi$$_i$ $\to$ C$_T$$_i$)\(^{\text{i$\in $I}}\))) := Forall(equiv( S$\cap$a$\to$$\pi$(k$_i$)), C$_S$, C$_T$$_i$)\(^{\text{i$\in $I}}\))\\ equiv(S, Leaf bb$_S$ as C$_S$, Switch(a, ($\pi$$_i$ $\to$ C$_T$$_i$)\(^{\text{i$\in $I}}\))) := Forall(equiv( S$\cap$a$\to$$\pi$(k$_i$)), C$_S$, C$_T$$_i$)\(^{\text{i$\in $I}}\))\\
\end{tabular} \end{tabular}
\end{center} \end{center}
The algorithm either returns Yes for every sub-input space S$_i$ := S$\cap$(a$\to$$\pi$(k$_i$)) and Our algorithm either returns Yes for every sub-input space S$_i$ := S$\cap$(a$\to$$\pi$(k$_i$)) and
subtree C$_T$$_i$ subtree C$_T$$_i$
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
@ -924,7 +1092,7 @@ values not covered but the union of the constructors k$_i$
$\pi$\(_{\text{fallback}}\) = ¬\bigcup\limits\(_{\text{i$\in $I}}\)$\pi$(k$_i$)\\ $\pi$\(_{\text{fallback}}\) = ¬\bigcup\limits\(_{\text{i$\in $I}}\)$\pi$(k$_i$)\\
\end{tabular} \end{tabular}
\end{center} \end{center}
The algorithm proceeds by trimming Our algorithm proceeds by trimming
\begin{center} \begin{center}
\begin{tabular}{l} \begin{tabular}{l}
equiv(S, Switch(a, (k$_i$ $\to$ C$_S$$_i$)\(^{\text{i$\in $I}}\), C\(_{\text{sf}}\)), C$_T$) :=\\ equiv(S, Switch(a, (k$_i$ $\to$ C$_S$$_i$)\(^{\text{i$\in $I}}\), C\(_{\text{sf}}\)), C$_T$) :=\\

View file

@ -46,7 +46,7 @@
* Translation validation of the Pattern Matching Compiler * Translation validation of the Pattern Matching Compiler
** Source program ** Source program
The algorithm takes as its input a source program and translates it Our algorithm takes as its input a source program and translates it
into an algebraic data structure which type we call /decision_tree/. into an algebraic data structure which type we call /decision_tree/.
#+BEGIN_SRC #+BEGIN_SRC
@ -64,36 +64,28 @@ We distinguish
- Failure: a value matching this part results in an error - Failure: a value matching this part results in an error
- Leaf: a value matching this part results into the evaluation of a - Leaf: a value matching this part results into the evaluation of a
source black box of code source black box of code
Our algorithm doesn't support type-declaration-based analysis
The algorithm doesn't support type-declaration-based analysis
to know the list of constructors at a given type. to know the list of constructors at a given type.
Let's consider some trivial examples: Let's consider some trivial examples:
#+BEGIN_SRC | function true -> 1
function true -> 1
#+END_SRC
is translated to is translated to
|Switch ([(true, Leaf 1)], Failure) |Switch ([(true, Leaf 1)], Failure)
while while
#+BEGIN_SRC | function
function | \vert{} true -> 1
true -> 1 | \vert{} false -> 2
| false -> 2
#+END_SRC
will be translated to will be translated to
|Switch ([(true, Leaf 1); (false, Leaf 2)])
|Switch ([(true, Leaf 1); (false, Leaf 2)])
It is possible to produce Unreachable examples by using It is possible to produce Unreachable examples by using
refutation clauses (a "dot" in the right-hand-side) refutation clauses (a "dot" in the right-hand-side)
#+BEGIN_SRC |function
function |\vert{} true -> 1
true -> 1 |\vert{} false -> 2
| false -> 2 |\vert{} _ -> .
| _ -> .
#+END_SRC
that gets translated into that gets translated into
Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable) | Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable)
We trust this annotation, which is reasonable as the type-checker We trust this annotation, which is reasonable as the type-checker
verifies that it indeed holds. verifies that it indeed holds.
@ -107,7 +99,6 @@ true.
\begin{comment} \begin{comment}
[ ] Finisci con Switch [ ] Finisci con Switch
[ ] Spiega del fallback [ ] Spiega del fallback
[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le Lambda
\end{comment} \end{comment}
The source code of a pattern matching function has the The source code of a pattern matching function has the
@ -142,8 +133,7 @@ This BNF grammar describes formally the grammar of the source program:
| clause ::= lexpr "->" rexpr | clause ::= lexpr "->" rexpr
| lexpr ::= rule (ε\vert{}condition) | lexpr ::= rule (ε\vert{}condition)
| rexpr ::= _code ;; arbitrary code | rexpr ::= _code ;; arbitrary code
| rule ::= wildcard\vert{}variable\vert{}constructor_pattern\vert{}or_pattern ;; | rule ::= wildcard\vert{}variable\vert{}constructor_pattern\vert or_pattern ;;
| ;; rules
| wildcard ::= "_" | wildcard ::= "_"
| variable ::= identifier | variable ::= identifier
| constructor_pattern ::= constructor (rule\vert{}ε) (assignment\vert{}ε) | constructor_pattern ::= constructor (rule\vert{}ε) (assignment\vert{}ε)
@ -152,55 +142,117 @@ This BNF grammar describes formally the grammar of the source program:
| condition ::= "when" bexpr | condition ::= "when" bexpr
| assignment ::= "as" id | assignment ::= "as" id
| bexpr ::= _code ;; arbitrary code | bexpr ::= _code ;; arbitrary code
The source program is parsed using the ocaml-compiler-libs library.
The result of parsing, when successful, results in a list of clauses
and a list of type declarations.
Every clause consists of three objects: a left-hand-side that is the
kind of pattern expressed, an option guard and a right-hand-side expression.
Patterns are encoded in the following way:
| pattern | type |
|-----------------+-------------|
| _ | Wildcard |
| p₁ as x | Assignment |
| c(p₁,p₂,...,pₙ) | Constructor |
| (p₁\vert p₂) | Orpat |
Once parsed, the type declarations and the list of clauses are encoded in the form of a matrix
that is later evaluated using a matrix decomposition algorithm.
Patterns are of the form
| pattern | type of pattern |
|-----------------+---------------------|
| _ | wildcard |
| x | variable |
| c(p₁,p₂,...,pₙ) | constructor pattern |
| (p₁\vert p₂) | or-pattern |
The pattern /p/ matches a value /v/, written as p ≼ v, when one of the
following rules apply
|--------------------+---+--------------------+-------------------------------------------|
| _ | ≼ | v | ∀v |
| x | ≼ | v | ∀v |
| (p₁ \vert p₂) | ≼ | v | iff p₁ ≼ v or p₂ ≼ v |
| c(p₁, p₂, ..., pₐ) | ≼ | c(v₁, v₂, ..., vₐ) | iff (p₁, p₂, ..., pₐ) ≼ (v₁, v₂, ..., vₐ) |
| (p₁, p₂, ..., pₐ) | ≼ | (v₁, v₂, ..., vₐ) | iff pᵢ ≼ vᵢ ∀i ∈ [1..a] |
|--------------------+---+--------------------+-------------------------------------------|
When a value /v/ matches pattern /p/ we say that /v/ is an /instance/ of /p/.
During compilation by the translators, expressions at the
right-hand-side are compiled into
Lambda code and are referred as lambda code actions lᵢ.
We define the /pattern matrix/ P as the matrix |m x n| where m bigger
or equal than the number of clauses in the source program and n is
equal to the arity of the constructor with the gratest arity.
\begin{equation*}
P =
\begin{pmatrix}
p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\
p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
p_{m,1} & p_{m,2} & \cdots & p_{m,n} )
\end{pmatrix}
\end{equation*}
every row of /P/ is called a pattern vector
$\vec{p_i}$ = (p₁, p₂, ..., pₙ); In every instance of P pattern
vectors appear normalized on the length of the longest pattern vector.
Considering the pattern matrix P we say that the value vector
$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the pattern vector pᵢ in P if and only if the following two
conditions are satisfied:
- p_{i,1}, p_{i,2}, \cdots, p_{i,n} ≼ (v₁, v₂, ..., vᵢ)
- ∀j < i p_{j,1}, p_{j,2}, \cdots, p_{j,n} ⋠ (v₁, v₂, ..., vᵢ)
We can define the following three relations with respect to patterns:
- Pattern p is less precise than pattern q, written p ≼ q, when all
instances of q are instances of p
- Pattern p and q are equivalent, written p ≡ q, when their instances
are the same
- Patterns p and q are compatible when they share a common instance
Wit the support of two auxiliary functions
- tail of an ordered family
| tail((xᵢ)^{i ∈ I}) := (xᵢ)^{i ≠ min(I)}
- first non-⊥ element of an ordered family
| First((xᵢ)ⁱ) := ⊥ \quad \quad \quad if ∀i, xᵢ =
| First((xᵢ)ⁱ) := x_min{i \vert{} xᵢ ≠ ⊥} \quad if ∃i, xᵢ ≠ ⊥
we now define what it means to run a pattern row against a value
vector of the same length, that is (pᵢ)ⁱ(vᵢ)ⁱ
| pᵢ | vᵢ | result_{pat} |
|--------------------------+----------------------+-------------------------------------------------|
| ∅ | (∅) | [] |
| (_, tail(pᵢ)ⁱ) | (vᵢ) | tail(pᵢ)ⁱ(tail(vᵢ)ⁱ) |
| (x, tail(pᵢ)ⁱ) | (vᵢ) | σ[x↦v₀] if tail(pᵢ)ⁱ(tail(vᵢ)ⁱ) = σ |
| (K(qⱼ)ʲ, tail(pᵢ)ⁱ) | (K(v'ⱼ)ʲ,tail(vⱼ)ʲ) | ((qⱼ)ʲ +++ tail(pᵢ)ⁱ)((v'ⱼ)ʲ +++ tail(vᵢ)ⁱ) |
| (K(qⱼ)ʲ, tail(pᵢ)ⁱ) | (K'(v'ₗ)ˡ,tail(vⱼ)ʲ) | ⊥ if K ≠ K' |
| (q₁\vert{}q₂, tail(pᵢ)ⁱ) | (vᵢ)ⁱ | First((q₁,tail(pᵢ)ⁱ)(vᵢ)ⁱ, (q₂,tail(pᵢ)ⁱ)(vᵢ)ⁱ) |
A source program tₛ is a collection of pattern clauses pointing to A source program tₛ is a collection of pattern clauses pointing to
/bb/ terms. Running a program tₛ against an input value vₛ produces as /bb/ terms. Running a program tₛ against an input value vₛ, written
a result /r/: tₛ(vₛ) produces a result /r/ belonging to the following grammar:
| tₛ ::= (p → bb)^{i∈I} | tₛ ::= (p → bb)^{i∈I}
| p ::= \vert K(pᵢ)ⁱ, i ∈ I \vert (p\vert{}q) \vert n ∈
| r ::= guard list * (Match bb \vert{} NoMatch \vert{} Absurd)
| tₛ(vₛ) → r | tₛ(vₛ) → r
| r ::= guard list * (Match bb \vert{} NoMatch \vert{} Absurd)
\begin{comment}
TODO: understand how to explain guard
\end{comment}
We can define what it means to run an input value vₛ against a source
program tₛ:
| tₛ(vₛ) := ⊥ \quad \quad \quad if ∀i, pᵢ(vₛ) =
| First((xᵢ)ⁱ) := x_min{i \vert{} xᵢ ≠ ⊥} \quad if ∃i, xᵢ ≠ ⊥
| tₛ(vₛ) = Absurd if bb_min{pᵢ → bbᵢ \vert{} pᵢ(vₛ) ≠ ⊥} = /refutation clause/
| tₛ(vₛ) = Match bb_min{pᵢ → bbᵢ \vert{} pᵢ(vₛ) ≠ ⊥}
TODO: argument on what it means to run a source program [ ... ] Big part that I think doesn't need revision from you [ ... ]
/guard/ and /bb/ expressions are treated as blackboxes of OCaml code.
A sound approach for treating these blackboxes would be to inspect the
OCaml compiler during translation to Lambda code and extract the
blackboxes compiled in their Lambda representation.
This would allow to test for equality with the respective blackbox at
the target level.
Given that this level of introspection is currently not possibile, we
decided to restrict the structure of blackboxes to the following (valid) OCaml
code:
#+BEGIN_SRC
external guard : 'a -> 'b = "guard"
external observe : 'a -> 'b = "observe"
#+END_SRC
We assume these two external functions /guard/ and /observe/ with a valid
type that lets the user pass any number of arguments to them.
All the guards are of the form \texttt{guard <arg> <arg> <arg>}, where the
<arg> are expressed using the OCaml pattern matching language.
Similarly, all the right-hand-side expressions are of the form
\texttt{observe <arg> <arg> ...} with the same constraints on arguments.
#+BEGIN_SRC
type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *)
let _ = function
| K1 -> observe 0
| K2 K1 -> observe 1
| K2 x when guard x -> observe 2
| K2 (K2 x) as y when guard x y -> observe 3
| K2 _ -> observe 4
#+END_SRC
In our prototype we make use of accessors to encode stored values. In our prototype we make use of accessors to encode stored values.
\begin{minipage}{0.2\linewidth}
\begin{minipage}{0.6\linewidth}
\begin{verbatim} \begin{verbatim}
let value = 1 :: 2 :: 3 :: [] let value = 1 :: 2 :: 3 :: []
(* that can also be written *) (* that can also be written *)
let value = [] let value = []
@ -277,45 +329,54 @@ Base cases:
because a source matrix in the case of empty rows returns because a source matrix in the case of empty rows returns
the first expression and (Leaf bb)(v) := Match bb the first expression and (Leaf bb)(v) := Match bb
2. [| (aⱼ)ʲ, ∅ |] ≡ Failure 2. [| (aⱼ)ʲ, ∅ |] ≡ Failure
Regarding non base cases: Regarding non base cases:
Let's first define Let's first define some auxiliary functions
| let Idx(k) := [0; arity(k)[ - The index family of a constructor
| let First(∅) := ⊥ | Idx(K) := [0; arity(K)[
| let First((aⱼ)ʲ) := a_{min(j∈J≠∅)} - head of an ordered family (we write x for any object here, value, pattern etc.)
\[ | head((xᵢ)^{i ∈ I}) = x_min(I)
m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij}) - tail of an ordered family
\] | tail((xᵢ)^{i ∈ I}) := (xᵢ)^{i ≠ min(I)}
\[ - head constructor of a value or pattern
(k_k)^k := headconstructor(p_{i0})^i | constr(K(xᵢ)ⁱ) = K
\] | constr(_) = ⊥
\begin{equation} | constr(x) = ⊥
Groups(m) := ( k_k \to ((a)_{0.l})^{l \in Idx(k_k)} +++ (a_i)^{i \in I\DZ}), \\ - first non-⊥ element of an ordered family
( if p_{0j} is k(q_l) then \\ | First((xᵢ)ⁱ) := ⊥ \quad \quad \quad if ∀i, xᵢ =
(qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ | First((xᵢ)ⁱ) := x_min{i \vert{} xᵢ ≠ ⊥} \quad if ∃i, xᵢ ≠ ⊥
if p_{0j} is \_ then \\ - definition of group decomposition:
(\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\ | let constrs((pᵢ)^{i ∈ I}) = { K \vert{} K = constr(pᵢ), i ∈ I }
else \bot )^j ), \\ | let Groups(m) where m = ((aᵢ)ⁱ ((pᵢⱼ)ⁱ → eⱼ)ⁱʲ) =
((a_i)^{i \in I\DZ}, ((p_{ij})^{i \in I\DZ} \to eⱼ if p_{0j} is \_ else \bot)^{j \in J}) | \quad \quad let (Kₖ)ᵏ = constrs(pᵢ₀)ⁱ in
\end{equation} | \quad \quad ( Kₖ →
| \quad \quad \quad \quad ((a₀.ₗ)ˡ +++ tail(aᵢ)ⁱ)
Groups(m) is an auxiliary function that source a matrix m into | \quad \quad \quad \quad (
| \quad \quad \quad \quad if pₒⱼ is Kₖ(qₗ) then
| \quad \quad \quad \quad \quad \quad (qₗ)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ
| \quad \quad \quad \quad if pₒⱼ is _ then
| \quad \quad \quad \quad \quad \quad (_)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ
| \quad \quad \quad \quad else ⊥
| \quad \quad \quad \quad )ʲ
| \quad \quad ), (
| \quad \quad \quad \quad tail(aᵢ)ⁱ, (tail(pᵢⱼ)ⁱ → eⱼ if p₀ⱼ is _ else ⊥)ʲ
| \quad \quad )
Groups(m) is an auxiliary function that decomposes a matrix m into
submatrices, according to the head constructor of their first pattern. submatrices, according to the head constructor of their first pattern.
Groups(m) returns one submatrix m_r for each head constructor k that Groups(m) returns one submatrix m_r for each head constructor K that
occurs on the first row of m, plus one "wildcard submatrix" m_{wild} occurs on the first row of m, plus one "wildcard submatrix" m_{wild}
that matches on all values that do not start with one of those head that matches on all values that do not start with one of those head
constructors. constructors.
Intuitively, m is equivalent to its decomposition in the following Intuitively, m is equivalent to its decomposition in the following
sense: if the first pattern of an input vector (vᵢ)ⁱ starts with one sense: if the first pattern of an input vector (v_i)^i starts with one
of the head constructors k, then running (vᵢ)ⁱ against m is the same of the head constructors Kₖ, then running (v_i)^i against m is the same
as running it against the submatrix mₖ; otherwise (its head as running it against the submatrix m_{K}; otherwise (its head
constructor is none of the k) it is equivalent to running it against constructor ∉ (Kₖ)ᵏ) it is equivalent to running it against
the wildcard submatrix. the wildcard submatrix.
We formalize this intuition as follows: We formalize this intuition as follows
Lemma (Groups): *** Lemma (Groups):
Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\]. Let /m/ be a matrix with
| Groups(m) = (k_r \to m_r)^k, m_{wild}
For any value vector $(v_i)^l$ such that $v_0 = k(v'_l)^l$ for some For any value vector $(v_i)^l$ such that $v_0 = k(v'_l)^l$ for some
constructor k, constructor k,
we have: we have:
@ -329,43 +390,60 @@ TODO: fix \{0}
\end{comment} \end{comment}
*** Proof: *** Proof:
Let $m$ be a matrix with \[Group(m) = (k_r \to m_r)^k, m_{wild}\] Let /m/ be a matrix ((aᵢ)ⁱ, ((pᵢⱼ)ⁱ → eⱼ)ʲ) with
Let $(v_i)^i$ be an input matrix with $v_0 = k(v'_l)^l$ for some k. | Groups(m) = (Kₖ → mₖ)ᵏ, m_{wild}
We proceed by case analysis: Below we are going to assume that m is a simplified matrix such that
- either k is one of the kₖ for some k the first row does not contain an or-pattern or a binding to a
- or k is none of the (kₖ)ᵏ variable.
Both m(vᵢ)ⁱ and mₖ(vₖ)ᵏ are defined as the first matching result of Let (vᵢ)ⁱ be an input matrix with v₀ = Kᵥ(v'_{l})ˡ for some constructor Kᵥ.
a family over each row rⱼ of a matrix We have to show that:
- if Kₖ = Kᵥ for some Kₖ ∈ constrs(p₀ⱼ)ʲ, then
m(vᵢ)ⁱ = mₖ((v'ₗ)ˡ +++ tail(vᵢ)ⁱ)
- otherwise
m(vᵢ)ⁱ = m_{wild}(tail(vᵢ)ⁱ)
Let us call (rₖⱼ) the j-th row of the submatrix mₖ, and rⱼ_{wild}
the j-th row of the wildcard submatrix m_{wild}.
We know, from the definition of Our goal contains same-behavior equalities between matrices, for
Groups(m), that mₖ is a fixed input vector (vᵢ)ⁱ. It suffices to show same-behavior
| ((a){0.l})^{l∈Idx(kₖ)} +++ (aᵢ)^{i∈I\DZ}), equalities between each row of the matrices for this input
| ( vector. We show that for any j,
| \quad if p_{0j} is k(qₗ) then - if Kₖ = Kᵥ for some Kₖ ∈ constrs(p₀ⱼ)ʲ, then
| \quad \quad (qₗ)ˡ +++ (p_{ij})^{i∈I\DZ } → eⱼ | (pᵢⱼ)ⁱ(vᵢ)ⁱ = rₖⱼ((v'ₗ)ˡ +++ tail(vᵢ)ⁱ
| \quad if p_{0j} is _ then - otherwise
| \quad \quad (_)ˡ +++ (p_{ij})^{i∈I\DZ} → eⱼ | (pᵢⱼ)ⁱ(vᵢ)ⁱ = rⱼ_{wild} tail(vᵢ)ⁱ
| \quad else ⊥ In the first case (Kᵥ is Kₖ for some Kₖ ∈ constrs(p₀ⱼ)ʲ), we
| )^{j∈J} have to prove that
| (pᵢⱼ)ⁱ(vᵢ)ⁱ = rₖⱼ((v'ₗ)ˡ +++ tail(vᵢ)ⁱ
By definition of mₖ we know that rₖⱼ is equal to
| if pₒⱼ is Kₖ(qₗ) then
| \quad (qₗ)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ
| if pₒⱼ is _ then
| \quad (_)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ
| else ⊥
\begin{comment}
Maybe better as a table?
| pₒⱼ | rₖⱼ |
|--------+---------------------------|
| Kₖ(qₗ) | (qₗ)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ |
| _ | (_)ˡ +++ tail(pᵢⱼ)ⁱ → eⱼ |
| else | ⊥ |
\end{comment}
By definition of (pᵢ)ⁱ(vᵢ)ⁱ we know that (pᵢⱼ)ⁱ(vᵢ) is equal to
| (K(qⱼ)ʲ, tail(pᵢⱼ)ⁱ) (K(v'ₗ)ˡ,tail(vᵢ)ⁱ) := ((qⱼ)ʲ +++ tail(pᵢⱼ)ⁱ)((v'ₗ)ˡ +++ tail(vᵢ)ⁱ)
| (_, tail(pᵢⱼ)ⁱ) (vᵢ) := tail(pᵢⱼ)ⁱ(tail(vᵢ)ⁱ)
| (K(qⱼ)ʲ, tail(pᵢⱼ)ⁱ) (K'(v'ₗ)ˡ,tail(vⱼ)ʲ) := ⊥ if K ≠ K'
By definition, m(vᵢ)ⁱ is We prove this first case by a second case analysis on p₀ⱼ.
| m(vᵢ)ⁱ = First(rⱼ(vᵢ)ⁱ)ʲ for m = ((aᵢ)ⁱ, (rⱼ)ʲ)
| (pᵢ)ⁱ (vᵢ)ⁱ = {
| \quad if p₀ = k(qₗ)ˡ, v₀ = k'(v'ₖ)ᵏ, k=Idx(k') and l=Idx(k)
| \quad \quad if k ≠ k' then ⊥
| if k = k' then ((qₗ)ˡ +++ (pᵢ)^{i∈I\DZ}) ((v'ₖ)ᵏ +++ (vᵢ)^{i∈I\DZ} )
| if p₀ = (q₁\vert{}q₂) then
| First( (q₁pᵢ^{i∈I\DZ}) vᵢ^{i∈I\DZ}, (q₂pᵢ^{i∈I \DZ}) vᵢ^{i∈I\DZ})}
For this reason, if we can prove that TODO
| ∀j, rⱼ(vᵢ)ⁱ = r'ⱼ((v'ₖ)ᵏ ++ (vᵢ)ⁱ)
it follows that
| m(vᵢ)ⁱ = mₖ((v'ₖ)ᵏ ++ (vᵢ)ⁱ)
from the above definition.
We can also show that aᵢ = (a_{0.l})ˡ +++ a_{i∈I\DZ} because v(a₀) = K(v(a){0.l})ˡ) In the second case (Kᵥ is distinct from Kₖ for all Kₖ ∈ constrs(pₒⱼ)ʲ),
we have to prove that
| (pᵢⱼ)ⁱ(vᵢ)ⁱ = rⱼ_{wild} tail(vᵢ)ⁱ
TODO
** Target translation ** Target translation
@ -436,7 +514,7 @@ respective decision tree produces the same result
The equivalence checking algorithm takes as input a domain of The equivalence checking algorithm takes as input a domain of
possible values \emph{S} and a pair of source and target decision trees and possible values \emph{S} and a pair of source and target decision trees and
in case the two trees are not equivalent it returns a counter example. in case the two trees are not equivalent it returns a counter example.
The algorithm respects the following correctness statement: Our algorithm respects the following correctness statement:
\begin{comment} \begin{comment}
TODO: we have to define what \coversTEX mean for readers to understand the specifications TODO: we have to define what \coversTEX mean for readers to understand the specifications
@ -554,14 +632,13 @@ We say that Cₜ covers the input space /S/, written
/covers(Cₜ, S)/ when every value vₛ∈S is a valid input to the /covers(Cₜ, S)/ when every value vₛ∈S is a valid input to the
decision tree Cₜ. (TODO: rephrase) decision tree Cₜ. (TODO: rephrase)
Given an input space /S/ and a couple of decision trees, where Given an input space /S/ and a couple of decision trees, where
the target decision tree Cₜ covers the input space /S/, we say that the target decision tree Cₜ covers the input space /S/ we can define equivalence:
the two decision trees are equivalent when:
| equiv(S, Cₛ, Cₜ, gs) = Yes ∧ covers(Cₜ, S) → ∀vₛ≃vₜ ∈ S, Cₛ(vₛ) ≃gs Cₜ(vₜ) | equiv(S, Cₛ, Cₜ, gs) = Yes ∧ covers(Cₜ, S) → ∀vₛ≃vₜ ∈ S, Cₛ(vₛ) ≃gs Cₜ(vₜ)
Similarly we say that a couple of decision trees in the presence of Similarly we say that a couple of decision trees in the presence of
an input space /S/ are /not/ equivalent when: an input space /S/ are /not/ equivalent in the following way:
| equiv(S, Cₛ, Cₜ, gs) = No(vₛ,vₜ) ∧ covers(Cₜ, S) → vₛ≃vₜ ∈ S ∧ Cₛ(vₛ) ≠gs Cₜ(vₜ) | equiv(S, Cₛ, Cₜ, gs) = No(vₛ,vₜ) ∧ covers(Cₜ, S) → vₛ≃vₜ ∈ S ∧ Cₛ(vₛ) ≠gs Cₜ(vₜ)
Corollary: For a full input space /S/, that is the universe of the Corollary: For a full input space /S/, that is the universe of the
target program we say: target program:
| equiv(S, 〚tₛ〛ₛ, 〚tₜ〛ₜ, ∅) = Yes ⇔ tₛ ≃ tₜ | equiv(S, 〚tₛ〛ₛ, 〚tₜ〛ₜ, ∅) = Yes ⇔ tₛ ≃ tₜ
@ -663,7 +740,7 @@ I think the unreachable case should go at the end.
3. When we have a Leaf or a Failure at the left side: 3. When we have a Leaf or a Failure at the left side:
| equiv(S, Failure as Cₛ, Switch(a, (πᵢ → Cₜᵢ)^{i∈I})) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)^{i∈I}) | equiv(S, Failure as Cₛ, Switch(a, (πᵢ → Cₜᵢ)^{i∈I})) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)^{i∈I})
| equiv(S, Leaf bbₛ as Cₛ, Switch(a, (πᵢ → Cₜᵢ)^{i∈I})) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)^{i∈I}) | equiv(S, Leaf bbₛ as Cₛ, Switch(a, (πᵢ → Cₜᵢ)^{i∈I})) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)^{i∈I})
The algorithm either returns Yes for every sub-input space Sᵢ := S∩(a→π(kᵢ)) and Our algorithm either returns Yes for every sub-input space Sᵢ := S∩(a→π(kᵢ)) and
subtree Cₜᵢ subtree Cₜᵢ
| equiv(Sᵢ, Cₛ, Cₜᵢ) = Yes ∀i | equiv(Sᵢ, Cₛ, Cₜᵢ) = Yes ∀i
or we have a counter example vₛ, vₜ for which or we have a counter example vₛ, vₜ for which
@ -676,7 +753,7 @@ I think the unreachable case should go at the end.
4. When we have a Switch on the right we define π_{fallback} as the domain of 4. When we have a Switch on the right we define π_{fallback} as the domain of
values not covered but the union of the constructors kᵢ values not covered but the union of the constructors kᵢ
| π_{fallback} = ¬\bigcup\limits_{i∈I}π(kᵢ) | π_{fallback} = ¬\bigcup\limits_{i∈I}π(kᵢ)
The algorithm proceeds by trimming Our algorithm proceeds by trimming
| equiv(S, Switch(a, (kᵢ → Cₛᵢ)^{i∈I}, C_{sf}), Cₜ) := | equiv(S, Switch(a, (kᵢ → Cₛᵢ)^{i∈I}, C_{sf}), Cₜ) :=
| Forall(equiv( S∩(a→π(kᵢ)^{i∈I}), Cₛᵢ, C_{t/a→π(kᵢ)})^{i∈I} +++ equiv(S∩(a→πₙ), Cₛ, C_{a→π_{fallback}})) | Forall(equiv( S∩(a→π(kᵢ)^{i∈I}), Cₛᵢ, C_{t/a→π(kᵢ)})^{i∈I} +++ equiv(S∩(a→πₙ), Cₛ, C_{a→π_{fallback}}))
The statement still holds and we show this by first analyzing the The statement still holds and we show this by first analyzing the