correzioni coppo #1 vero
This commit is contained in:
parent
f464d23c8f
commit
14cf53ee8d
3 changed files with 142 additions and 57 deletions
|
@ -7,7 +7,7 @@ try:
|
|||
except:
|
||||
allsymbols = json.load(open('../unicode-latex.json'))
|
||||
|
||||
mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀', 'ε', '₀', '₂', '₁', '₃', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮', 'ₛ', 'ₜ', '≃', '⇔', '∧', '∅', 'ℕ', 'ⱼ', 'ʲ', '⊥', 'π', 'α', 'β', '∞', 'σ', '≤', '⊈', '∧', '∨', '∃', '⇒', '∩', '∉', '⋃', 'ᵏ', 'ₗ', 'ˡ', 'ₒ', 'ᵣ', 'ᴵ', '≈', '⊆' ]
|
||||
mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀', 'ε', '₀', '₂', '₁', '₃', '₄', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮', 'ₛ', 'ₜ', '≃', '⇔', '∧', '∅', 'ℕ', 'ⱼ', 'ʲ', '⊥', 'π', 'α', 'β', '∞', 'σ', '≤', '⊈', '∧', '∨', '∃', '⇒', '∩', '∉', '⋃', 'ᵏ', 'ₗ', 'ˡ', 'ₒ', 'ᵣ', 'ᴵ', '≈', '⊆' ]
|
||||
extrasymbols = {'〚': '\llbracket', '〛': r'\rrbracket', '̸': '\neg', '¬̸': '\neg', '∈': '\in ', 'ₛ': '_S', 'ₜ': '_T'}
|
||||
|
||||
symbols = {s: allsymbols[s] for s in mysymbols}
|
||||
|
|
BIN
tesi/tesi.pdf
BIN
tesi/tesi.pdf
Binary file not shown.
|
@ -106,7 +106,7 @@ by applying symbolic execution.
|
|||
A pattern matching compiler turns a series of pattern matching clauses
|
||||
into simple control flow structures such as \texttt{if, switch}, for example:
|
||||
\begin{lstlisting}
|
||||
match x with
|
||||
match scrutinee with
|
||||
| [] -> (0, None)
|
||||
| x::[] -> (1, Some x)
|
||||
| _::y::_ -> (2, Some y)
|
||||
|
@ -115,17 +115,25 @@ Given as input to the pattern matching compiler, this snippet of code
|
|||
gets translated into the Lambda intermediate representation of
|
||||
the OCaml compiler. The Lambda representation of a program is shown by
|
||||
calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
|
||||
In this example we renamed the variables assigned in order to ease the
|
||||
understanding of the tests that are performed when the code is
|
||||
translated into the Lambda form.
|
||||
code phase.
|
||||
\begin{lstlisting}
|
||||
(if scrutinee
|
||||
(let (field_1 =a (field 1 scrutinee))
|
||||
(if field_1
|
||||
(let
|
||||
(field_1_1 =a (field 1 field_1)
|
||||
x =a (field 0 field_1))
|
||||
(makeblock 0 2 (makeblock 0 x)))
|
||||
(let (y =a (field 0 scrutinee))
|
||||
(makeblock 0 1 (makeblock 0 y)))))
|
||||
[0: 0 0a])
|
||||
(function scrutinee
|
||||
(if scrutinee ;;; true when scrutinee (list) not empty
|
||||
(let (tail =a (field 1 scrutinee/81)) ;;; assignment
|
||||
(if tail
|
||||
(let
|
||||
y =a (field 0 tail))
|
||||
;;; y is the first element of the tail
|
||||
(makeblock 0 2 (makeblock 0 y)))
|
||||
;;; allocate memory for tuple (2, Some y)
|
||||
(let (x =a (field 0 scrutinee))
|
||||
;;; x is the head of the scrutinee
|
||||
(makeblock 0 1 (makeblock 0 x)))))
|
||||
;;; allocate memory for tuple (1, Some x)
|
||||
[0: 0 0a]))) ;;; low level representatio of (0, None)
|
||||
\end{lstlisting}
|
||||
|
||||
The OCaml pattern matching compiler is a critical part of the OCaml compiler
|
||||
|
@ -215,10 +223,28 @@ reduced tree is equivalent to $C_i$.
|
|||
\subsection{From source programs to decision trees}
|
||||
Our source language supports integers, lists, tuples and all algebraic
|
||||
datatypes. Patterns support wildcards, constructors and literals,
|
||||
Or-patterns such as $(p_1 | p_2)$ and pattern variables. We also support
|
||||
\texttt{when} guards, which are interesting as they introduce the
|
||||
evaluation of expressions during matching. Decision trees have nodes
|
||||
of the form:
|
||||
Or-patterns such as $(p_1 | p_2)$ and pattern variables.
|
||||
In particular Or-patterns provide a more compact way to group patterns
|
||||
that point to the same expression.
|
||||
|
||||
\begin{minipage}{0.4\linewidth}
|
||||
\begin{lstlisting}
|
||||
match w with
|
||||
| p₁ -> expr
|
||||
| p₂ -> expr
|
||||
| p₃ -> expr
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
\begin{minipage}{0.6\linewidth}
|
||||
\begin{lstlisting}
|
||||
match w with
|
||||
| p₁|p₂|p₃ -> expr
|
||||
\end{lstlisting}
|
||||
\end{minipage}
|
||||
We also support \texttt{when} guards, which are interesting as they
|
||||
introduce the evaluation of expressions during matching.
|
||||
This is the type definition of decision tree as they are used in the
|
||||
prototype implementation:
|
||||
\begin{lstlisting}
|
||||
type decision_tree =
|
||||
| Unreachable
|
||||
|
@ -236,15 +262,18 @@ matched by the source clauses.
|
|||
\texttt{Unreachable} is used when we statically know that no value
|
||||
can flow to that subtree.
|
||||
|
||||
We write 〚tₛ〛ₛ for the decision tree of the source program
|
||||
t_S, computed by a matrix decomposition algorithm (each column
|
||||
We write 〚tₛ〛ₛ to denote the translation of the source program (the
|
||||
set of pattern matching clauses) into a decision tree, computed by a matrix decomposition algorithm (each column
|
||||
decomposition step gives a \texttt{Switch} node).
|
||||
It satisfies the following correctness statement:
|
||||
\[
|
||||
\forall t_s, \forall v_s, \quad t_s(v_s) = \semTEX{t_s}_s(v_s)
|
||||
\]
|
||||
Running any source value $v_S$ against the source program gives the
|
||||
same result as running it against the decision tree.
|
||||
The correctness statement intuitively states that for every source
|
||||
program, for every source value that is well-formed input to a source
|
||||
program, running the program tₛ against the input value vₛ is the same
|
||||
as running the compiled source program 〚tₛ〛 (that is a decision tree) against the same input
|
||||
value vₛ".
|
||||
|
||||
\subsection{From target programs to decision trees}
|
||||
The target programs include the following Lambda constructs:
|
||||
|
@ -259,8 +288,10 @@ nodes are never emitted.
|
|||
Guards result in branching. In comparison with the source decision
|
||||
trees, \texttt{Unreachable} nodes are never emitted.
|
||||
|
||||
We write $\semTEX{t_T}_T$ for the decision tree of the target program
|
||||
$t_T$, satisfying the following correctness statement:
|
||||
We write $\semTEX{t_T}_T$ to denote the translation of a target
|
||||
program tₜ into a decision tree of the target program
|
||||
$t_T$, satisfying the following correctness statement that is
|
||||
simmetric to the correctness statement for the translation of source programs:
|
||||
\[
|
||||
\forall t_T, \forall v_T, \quad t_T(v_T) = \semTEX{t_T}_T(v_T)
|
||||
\]
|
||||
|
@ -819,7 +850,7 @@ existing translators that consists of taking the source and the target
|
|||
* Translation validation of the Pattern Matching Compiler
|
||||
|
||||
** Source program
|
||||
The algorithm takes as its input a source program and translates it
|
||||
Our algorithm takes as its input a source program and translates it
|
||||
into an algebraic data structure which type we call /decision_tree/.
|
||||
|
||||
#+BEGIN_SRC
|
||||
|
@ -957,18 +988,32 @@ All the guards are of the form \texttt{guard <arg> <arg> <arg>}, where the
|
|||
<arg> are expressed using the OCaml pattern matching language.
|
||||
Similarly, all the right-hand-side expressions are of the form
|
||||
\texttt{observe <arg> <arg> ...} with the same constraints on arguments.
|
||||
|
||||
#+BEGIN_SRC
|
||||
type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *)
|
||||
|
||||
let _ = function
|
||||
| K1 -> observe 0
|
||||
| K2 K1 -> observe 1
|
||||
| K2 x when guard x -> observe 2
|
||||
| K2 x when guard x -> observe 2 (* guard inspects the x variable *)
|
||||
| K2 (K2 x) as y when guard x y -> observe 3
|
||||
| K2 _ -> observe 4
|
||||
#+END_SRC
|
||||
|
||||
We note that the right hand side of /observe/ is just an arbitrary
|
||||
value and in this case just enumerates the order in which expressions
|
||||
appear.
|
||||
Even if this is an oversimplification of the problem for the
|
||||
prototype, it must be noted that at the compiler level we have the
|
||||
possibility to compile the pattern clauses in two separate steps so
|
||||
that the guards and right-hand-side expressions are semantically equal
|
||||
to their counterparts at the target program level.
|
||||
\begin{lstlisting}
|
||||
let _ = function
|
||||
| K1 -> lambda₀
|
||||
| K2 K1 -> lambda₁
|
||||
| K2 x when lambda-guard₀ -> lambda₂
|
||||
| K2 (K2 x) as y when lambda-guard₁ -> lambda₃
|
||||
| K2 _ -> lambda₄
|
||||
\end{lstlisting}
|
||||
The source program is parsed using the ocaml-compiler-libs library.
|
||||
The result of parsing, when successful, results in a list of clauses
|
||||
and a list of type declarations.
|
||||
|
@ -1008,24 +1053,27 @@ following rules apply
|
|||
When a value /v/ matches pattern /p/ we say that /v/ is an /instance/ of /p/.
|
||||
|
||||
|
||||
During compilation by the translators, expressions are compiled into
|
||||
During compilation by the translators, expressions at the
|
||||
right-hand-side are compiled into
|
||||
Lambda code and are referred as lambda code actions lᵢ.
|
||||
|
||||
The entire pattern matching code is represented as a clause matrix
|
||||
that associates rows of patterns (p_{i,1}, p_{i,2}, ..., p_{i,n}) to
|
||||
lambda code action lⁱ
|
||||
We define the /pattern matrix/ P as the matrix |m x n| where m bigger
|
||||
or equal than the number of clauses in the source program and n is
|
||||
equal to the arity of the constructor with the gratest arity.
|
||||
\begin{equation*}
|
||||
(P → L) =
|
||||
P =
|
||||
\begin{pmatrix}
|
||||
p_{1,1} & p_{1,2} & \cdots & p_{1,n} & → l₁ \\
|
||||
p_{2,1} & p_{2,2} & \cdots & p_{2,n} & → l₂ \\
|
||||
\vdots & \vdots & \ddots & \vdots & → \vdots \\
|
||||
p_{m,1} & p_{m,2} & \cdots & p_{m,n} & → lₘ
|
||||
p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\
|
||||
p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\
|
||||
\vdots & \vdots & \ddots & \vdots \\
|
||||
p_{m,1} & p_{m,2} & \cdots & p_{m,n} )
|
||||
\end{pmatrix}
|
||||
\end{equation*}
|
||||
|
||||
every row of /P/ is called a pattern vector
|
||||
$\vec{p_i}$ = (p₁, p₂, ..., pₙ); In every instance of P pattern
|
||||
vectors appear normalized on the length of the longest pattern vector.
|
||||
Considering the pattern matrix P we say that the value vector
|
||||
$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the line number i in P if and only if the following two
|
||||
$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the pattern vector pᵢ in P if and only if the following two
|
||||
conditions are satisfied:
|
||||
- p_{i,1}, p_{i,2}, \cdots, p_{i,n} ≼ (v₁, v₂, ..., vᵢ)
|
||||
- ∀j < i p_{j,1}, p_{j,2}, \cdots, p_{j,n} ⋠ (v₁, v₂, ..., vᵢ)
|
||||
|
@ -1038,10 +1086,20 @@ We can define the following three relations with respect to patterns:
|
|||
- Patterns p and q are compatible when they share a common instance
|
||||
|
||||
\subsubsection{Matrix decomposition of pattern clauses}
|
||||
|
||||
We define a new object, the /clause matrix/ P → L of size |m x n+1| that associates
|
||||
pattern vectors $\vec{p_i}$ to lambda code action lᵢ.
|
||||
\begin{equation*}
|
||||
P → L =
|
||||
\begin{pmatrix}
|
||||
p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
|
||||
p_{2,1} & p_{2,2} & \cdots & p_{2,n} → l₂ \\
|
||||
\vdots & \vdots & \ddots & \vdots → \vdots \\
|
||||
p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ
|
||||
\end{pmatrix}
|
||||
\end{equation*}
|
||||
The initial input of the decomposition algorithm C consists of a vector of variables
|
||||
$\vec{x}$ = (x₁, x₂, ..., xₙ) of size /n/ where /n/ is the arity of
|
||||
the type of /x/ and a clause matrix P → L of width n and height m.
|
||||
the type of /x/ and the /clause matrix/ P → L.
|
||||
That is:
|
||||
|
||||
\begin{equation*}
|
||||
|
@ -1050,12 +1108,12 @@ C((\vec{x} = (x₁, x₂, ..., xₙ),
|
|||
p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
|
||||
p_{2,1} & p_{2,2} & \cdots & p_{2,n} → l₂ \\
|
||||
\vdots & \vdots & \ddots & \vdots → \vdots \\
|
||||
p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ)
|
||||
\end{pmatrix}
|
||||
p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ
|
||||
\end{pmatrix})
|
||||
\end{equation*}
|
||||
|
||||
The base case C₀ of the algorithm is the case in which the $\vec{x}$
|
||||
is empty and the result of the compilation
|
||||
is an empty sequence and the result of the compilation
|
||||
C₀ is l₁
|
||||
\begin{equation*}
|
||||
C₀((),
|
||||
|
@ -1091,7 +1149,7 @@ following four rules:
|
|||
for wildcard patterns and the lambda action lᵢ remains unchanged.
|
||||
|
||||
2) Constructor rule: if all patterns in the first column of P are
|
||||
constructors patterns of the form k(q₁, q₂, ..., qₙ) we define a
|
||||
constructors patterns of the form k(q₁, q₂, ..., q_{n'}) we define a
|
||||
new matrix, the specialized clause matrix S, by applying the
|
||||
following transformation on every row /p/:
|
||||
\begin{lstlisting}[mathescape,columns=fullflexible,basicstyle=\fontfamily{lmvtt}\selectfont,]
|
||||
|
@ -1147,11 +1205,40 @@ following four rules:
|
|||
largest prefix matrix for which one of the three previous rules
|
||||
apply, and P₂ → L₂ containing the remaining rows. The algorithm is
|
||||
applied to both matrices.
|
||||
|
||||
|
||||
It is important to note that the application of the decomposition
|
||||
algorithm converges. This intuition can be verified by defining the
|
||||
size of the clause matrix P → L as equal to the length of the longest
|
||||
pattern vector $\vec{p_i}$ and the length of a pattern vector as the
|
||||
number of symbols that appear in the clause.
|
||||
While it is very easy to see that the application of rules 1) and 4)
|
||||
produces new matrices of length equal or smaller than the original
|
||||
clause matrix, we can show that:
|
||||
- with the application of the constructor rule the pattern vector $\vec{p_i}$ loses one
|
||||
symbol after its decomposition:
|
||||
| \vert{}(p_{i,1} (q₁, q₂, ..., q_{n'}), p_{i,2}, p_{i,3}, ..., p_{i,n})\vert{} = n + n'
|
||||
| \vert{}(q_{i,1}, q_{i,2}, ..., q_{i,n'}, p_{i,2}, p_{i,3}, ..., p_{i,n})\vert{} = n + n' - 1
|
||||
- with the application of the orpat rule, we add one row to the clause
|
||||
matrix P → L but the length of a row containing an
|
||||
Or-pattern decreases
|
||||
\begin{equation*}
|
||||
\vert{}P → L\vert{} = \big\lvert
|
||||
\begin{pmatrix}
|
||||
(p_{1,1}\vert{}q_{1,1}) & p_{1,2} & \cdots & p_{1,n} → l₁ \\
|
||||
\vdots & \vdots & \ddots & \vdots → \vdots \\
|
||||
\end{pmatrix}\big\rvert = n + 1
|
||||
\end{equation*}
|
||||
\begin{equation*}
|
||||
\vert{}P' → L'\vert{} = \big\lvert
|
||||
\begin{pmatrix}
|
||||
p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
|
||||
q_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
|
||||
\vdots & \vdots & \ddots & \vdots → \vdots \\
|
||||
\end{pmatrix}\big\rvert = n
|
||||
\end{equation*}
|
||||
In our prototype we make use of accessors to encode stored values.
|
||||
\begin{minipage}{0.2\linewidth}
|
||||
\begin{verbatim}
|
||||
|
||||
let value = 1 :: 2 :: 3 :: []
|
||||
(* that can also be written *)
|
||||
let value = []
|
||||
|
@ -1228,7 +1315,6 @@ Base cases:
|
|||
because a source matrix in the case of empty rows returns
|
||||
the first expression and (Leaf bb)(v) := Match bb
|
||||
2. [| (aⱼ)ʲ, ∅ |] ≡ Failure
|
||||
|
||||
Regarding non base cases:
|
||||
Let's first define
|
||||
| let Idx(k) := [0; arity(k)[
|
||||
|
@ -1240,16 +1326,15 @@ m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij})
|
|||
\[
|
||||
(k_k)^k := headconstructor(p_{i0})^i
|
||||
\]
|
||||
\begin{equation}
|
||||
Groups(m) := ( k_k \to ((a)_{0.l})^{l \in Idx(k_k)} +++ (a_i)^{i \in I\DZ}), \\
|
||||
( if p_{0j} is k(q_l) then \\
|
||||
(qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\
|
||||
if p_{0j} is \_ then \\
|
||||
(\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\
|
||||
else \bot )^j ), \\
|
||||
((a_i)^{i \in I\DZ}, ((p_{ij})^{i \in I\DZ} \to eⱼ if p_{0j} is \_ else \bot)^{j \in J})
|
||||
\end{equation}
|
||||
|
||||
| Groups(m) := (k_{k} \to ((a)_{0.l})^{l \in Idx(k_{k})} +++ (a_{i})^{i\in{}I\DZ}),
|
||||
|(
|
||||
| \quad if p_{0j} is k(q_{l}) then
|
||||
| \quad \quad \quad (qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i\in{}I\DZ} \to e_{j}
|
||||
| \quad if p_{0j} is "_" then
|
||||
| \quad \quad \quad ("_")^{l \in Idx(k_{k})} +++ (p_{ij})^{i\in{}I\DZ} \to e_{j}
|
||||
| \quad else \bot
|
||||
|)^{j ∈ J},
|
||||
| ((a_{i})^{i\in{}I\DZ}, ((p_{ij})^{i\in{}I\DZ} \to eⱼ if p_{0j} is \_ else \bot)^{j\in{}J})
|
||||
Groups(m) is an auxiliary function that source a matrix m into
|
||||
submatrices, according to the head constructor of their first pattern.
|
||||
Groups(m) returns one submatrix m_r for each head constructor k that
|
||||
|
@ -1266,7 +1351,7 @@ the wildcard submatrix.
|
|||
|
||||
We formalize this intuition as follows:
|
||||
Lemma (Groups):
|
||||
Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\].
|
||||
Let /m/ be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\].
|
||||
For any value vector $(v_i)^l$ such that $v_0 = k(v'_l)^l$ for some
|
||||
constructor k,
|
||||
we have:
|
||||
|
|
Loading…
Reference in a new issue