correzioni coppo vero

This commit is contained in:
Francesco Mecca 2020-04-11 17:56:40 +02:00
parent f464d23c8f
commit 14cf53ee8d
3 changed files with 142 additions and 57 deletions

View file

@ -7,7 +7,7 @@ try:
except:
allsymbols = json.load(open('../unicode-latex.json'))
mysymbols = ['', '', '', '', '', '', '', '', '', 'ε', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'ʲ', '', 'π', 'α', 'β', '', 'σ', '', '', '', '', '', '', '', '', '', '', '', 'ˡ', '', '', '', '', '' ]
mysymbols = ['', '', '', '', '', '', '', '', '', 'ε', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'ʲ', '', 'π', 'α', 'β', '', 'σ', '', '', '', '', '', '', '', '', '', '', '', 'ˡ', '', '', '', '', '' ]
extrasymbols = {'': '\llbracket', '': r'\rrbracket', '̸': '\neg', '¬̸': '\neg', '': '\in ', '': '_S', '': '_T'}
symbols = {s: allsymbols[s] for s in mysymbols}

Binary file not shown.

View file

@ -106,7 +106,7 @@ by applying symbolic execution.
A pattern matching compiler turns a series of pattern matching clauses
into simple control flow structures such as \texttt{if, switch}, for example:
\begin{lstlisting}
match x with
match scrutinee with
| [] -> (0, None)
| x::[] -> (1, Some x)
| _::y::_ -> (2, Some y)
@ -115,17 +115,25 @@ Given as input to the pattern matching compiler, this snippet of code
gets translated into the Lambda intermediate representation of
the OCaml compiler. The Lambda representation of a program is shown by
calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
In this example we renamed the variables assigned in order to ease the
understanding of the tests that are performed when the code is
translated into the Lambda form.
code phase.
\begin{lstlisting}
(if scrutinee
(let (field_1 =a (field 1 scrutinee))
(if field_1
(function scrutinee
(if scrutinee ;;; true when scrutinee (list) not empty
(let (tail =a (field 1 scrutinee/81)) ;;; assignment
(if tail
(let
(field_1_1 =a (field 1 field_1)
x =a (field 0 field_1))
(makeblock 0 2 (makeblock 0 x)))
(let (y =a (field 0 scrutinee))
(makeblock 0 1 (makeblock 0 y)))))
[0: 0 0a])
y =a (field 0 tail))
;;; y is the first element of the tail
(makeblock 0 2 (makeblock 0 y)))
;;; allocate memory for tuple (2, Some y)
(let (x =a (field 0 scrutinee))
;;; x is the head of the scrutinee
(makeblock 0 1 (makeblock 0 x)))))
;;; allocate memory for tuple (1, Some x)
[0: 0 0a]))) ;;; low level representatio of (0, None)
\end{lstlisting}
The OCaml pattern matching compiler is a critical part of the OCaml compiler
@ -215,10 +223,28 @@ reduced tree is equivalent to $C_i$.
\subsection{From source programs to decision trees}
Our source language supports integers, lists, tuples and all algebraic
datatypes. Patterns support wildcards, constructors and literals,
Or-patterns such as $(p_1 | p_2)$ and pattern variables. We also support
\texttt{when} guards, which are interesting as they introduce the
evaluation of expressions during matching. Decision trees have nodes
of the form:
Or-patterns such as $(p_1 | p_2)$ and pattern variables.
In particular Or-patterns provide a more compact way to group patterns
that point to the same expression.
\begin{minipage}{0.4\linewidth}
\begin{lstlisting}
match w with
| p₁ -> expr
| p₂ -> expr
| p₃ -> expr
\end{lstlisting}
\end{minipage}
\begin{minipage}{0.6\linewidth}
\begin{lstlisting}
match w with
| p₁|p₂|p₃ -> expr
\end{lstlisting}
\end{minipage}
We also support \texttt{when} guards, which are interesting as they
introduce the evaluation of expressions during matching.
This is the type definition of decision tree as they are used in the
prototype implementation:
\begin{lstlisting}
type decision_tree =
| Unreachable
@ -236,15 +262,18 @@ matched by the source clauses.
\texttt{Unreachable} is used when we statically know that no value
can flow to that subtree.
We write 〚tₛ〛ₛ for the decision tree of the source program
t_S, computed by a matrix decomposition algorithm (each column
We write 〚tₛ〛ₛ to denote the translation of the source program (the
set of pattern matching clauses) into a decision tree, computed by a matrix decomposition algorithm (each column
decomposition step gives a \texttt{Switch} node).
It satisfies the following correctness statement:
\[
\forall t_s, \forall v_s, \quad t_s(v_s) = \semTEX{t_s}_s(v_s)
\]
Running any source value $v_S$ against the source program gives the
same result as running it against the decision tree.
The correctness statement intuitively states that for every source
program, for every source value that is well-formed input to a source
program, running the program tₛ against the input value vₛ is the same
as running the compiled source program 〚tₛ〛 (that is a decision tree) against the same input
value vₛ".
\subsection{From target programs to decision trees}
The target programs include the following Lambda constructs:
@ -259,8 +288,10 @@ nodes are never emitted.
Guards result in branching. In comparison with the source decision
trees, \texttt{Unreachable} nodes are never emitted.
We write $\semTEX{t_T}_T$ for the decision tree of the target program
$t_T$, satisfying the following correctness statement:
We write $\semTEX{t_T}_T$ to denote the translation of a target
program tₜ into a decision tree of the target program
$t_T$, satisfying the following correctness statement that is
simmetric to the correctness statement for the translation of source programs:
\[
\forall t_T, \forall v_T, \quad t_T(v_T) = \semTEX{t_T}_T(v_T)
\]
@ -819,7 +850,7 @@ existing translators that consists of taking the source and the target
* Translation validation of the Pattern Matching Compiler
** Source program
The algorithm takes as its input a source program and translates it
Our algorithm takes as its input a source program and translates it
into an algebraic data structure which type we call /decision_tree/.
#+BEGIN_SRC
@ -957,18 +988,32 @@ All the guards are of the form \texttt{guard <arg> <arg> <arg>}, where the
<arg> are expressed using the OCaml pattern matching language.
Similarly, all the right-hand-side expressions are of the form
\texttt{observe <arg> <arg> ...} with the same constraints on arguments.
#+BEGIN_SRC
type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *)
let _ = function
| K1 -> observe 0
| K2 K1 -> observe 1
| K2 x when guard x -> observe 2
| K2 x when guard x -> observe 2 (* guard inspects the x variable *)
| K2 (K2 x) as y when guard x y -> observe 3
| K2 _ -> observe 4
#+END_SRC
We note that the right hand side of /observe/ is just an arbitrary
value and in this case just enumerates the order in which expressions
appear.
Even if this is an oversimplification of the problem for the
prototype, it must be noted that at the compiler level we have the
possibility to compile the pattern clauses in two separate steps so
that the guards and right-hand-side expressions are semantically equal
to their counterparts at the target program level.
\begin{lstlisting}
let _ = function
| K1 -> lambda₀
| K2 K1 -> lambda₁
| K2 x when lambda-guard₀ -> lambda₂
| K2 (K2 x) as y when lambda-guard₁ -> lambda₃
| K2 _ -> lambda₄
\end{lstlisting}
The source program is parsed using the ocaml-compiler-libs library.
The result of parsing, when successful, results in a list of clauses
and a list of type declarations.
@ -1008,24 +1053,27 @@ following rules apply
When a value /v/ matches pattern /p/ we say that /v/ is an /instance/ of /p/.
During compilation by the translators, expressions are compiled into
During compilation by the translators, expressions at the
right-hand-side are compiled into
Lambda code and are referred as lambda code actions lᵢ.
The entire pattern matching code is represented as a clause matrix
that associates rows of patterns (p_{i,1}, p_{i,2}, ..., p_{i,n}) to
lambda code action lⁱ
We define the /pattern matrix/ P as the matrix |m x n| where m bigger
or equal than the number of clauses in the source program and n is
equal to the arity of the constructor with the gratest arity.
\begin{equation*}
(P → L) =
P =
\begin{pmatrix}
p_{1,1} & p_{1,2} & \cdots & p_{1,n} & → l₁ \\
p_{2,1} & p_{2,2} & \cdots & p_{2,n} & → l₂ \\
\vdots & \vdots & \ddots & \vdots & → \vdots \\
p_{m,1} & p_{m,2} & \cdots & p_{m,n} & → lₘ
p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\
p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
p_{m,1} & p_{m,2} & \cdots & p_{m,n} )
\end{pmatrix}
\end{equation*}
every row of /P/ is called a pattern vector
$\vec{p_i}$ = (p₁, p₂, ..., pₙ); In every instance of P pattern
vectors appear normalized on the length of the longest pattern vector.
Considering the pattern matrix P we say that the value vector
$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the line number i in P if and only if the following two
$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the pattern vector pᵢ in P if and only if the following two
conditions are satisfied:
- p_{i,1}, p_{i,2}, \cdots, p_{i,n} ≼ (v₁, v₂, ..., vᵢ)
- ∀j < i p_{j,1}, p_{j,2}, \cdots, p_{j,n} ⋠ (v₁, v₂, ..., vᵢ)
@ -1038,10 +1086,20 @@ We can define the following three relations with respect to patterns:
- Patterns p and q are compatible when they share a common instance
\subsubsection{Matrix decomposition of pattern clauses}
We define a new object, the /clause matrix/ P → L of size |m x n+1| that associates
pattern vectors $\vec{p_i}$ to lambda code action lᵢ.
\begin{equation*}
P → L =
\begin{pmatrix}
p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
p_{2,1} & p_{2,2} & \cdots & p_{2,n} → l₂ \\
\vdots & \vdots & \ddots & \vdots → \vdots \\
p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ
\end{pmatrix}
\end{equation*}
The initial input of the decomposition algorithm C consists of a vector of variables
$\vec{x}$ = (x₁, x₂, ..., xₙ) of size /n/ where /n/ is the arity of
the type of /x/ and a clause matrix P → L of width n and height m.
the type of /x/ and the /clause matrix/ P → L.
That is:
\begin{equation*}
@ -1050,12 +1108,12 @@ C((\vec{x} = (x₁, x₂, ..., xₙ),
p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
p_{2,1} & p_{2,2} & \cdots & p_{2,n} → l₂ \\
\vdots & \vdots & \ddots & \vdots → \vdots \\
p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ)
\end{pmatrix}
p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ
\end{pmatrix})
\end{equation*}
The base case C₀ of the algorithm is the case in which the $\vec{x}$
is empty and the result of the compilation
is an empty sequence and the result of the compilation
C₀ is l₁
\begin{equation*}
C₀((),
@ -1091,7 +1149,7 @@ following four rules:
for wildcard patterns and the lambda action lᵢ remains unchanged.
2) Constructor rule: if all patterns in the first column of P are
constructors patterns of the form k(q₁, q₂, ..., q) we define a
constructors patterns of the form k(q₁, q₂, ..., q_{n'}) we define a
new matrix, the specialized clause matrix S, by applying the
following transformation on every row /p/:
\begin{lstlisting}[mathescape,columns=fullflexible,basicstyle=\fontfamily{lmvtt}\selectfont,]
@ -1147,11 +1205,40 @@ following four rules:
largest prefix matrix for which one of the three previous rules
apply, and P₂ → L₂ containing the remaining rows. The algorithm is
applied to both matrices.
It is important to note that the application of the decomposition
algorithm converges. This intuition can be verified by defining the
size of the clause matrix P → L as equal to the length of the longest
pattern vector $\vec{p_i}$ and the length of a pattern vector as the
number of symbols that appear in the clause.
While it is very easy to see that the application of rules 1) and 4)
produces new matrices of length equal or smaller than the original
clause matrix, we can show that:
- with the application of the constructor rule the pattern vector $\vec{p_i}$ loses one
symbol after its decomposition:
| \vert{}(p_{i,1} (q₁, q₂, ..., q_{n'}), p_{i,2}, p_{i,3}, ..., p_{i,n})\vert{} = n + n'
| \vert{}(q_{i,1}, q_{i,2}, ..., q_{i,n'}, p_{i,2}, p_{i,3}, ..., p_{i,n})\vert{} = n + n' - 1
- with the application of the orpat rule, we add one row to the clause
matrix P → L but the length of a row containing an
Or-pattern decreases
\begin{equation*}
\vert{}P → L\vert{} = \big\lvert
\begin{pmatrix}
(p_{1,1}\vert{}q_{1,1}) & p_{1,2} & \cdots & p_{1,n} → l₁ \\
\vdots & \vdots & \ddots & \vdots → \vdots \\
\end{pmatrix}\big\rvert = n + 1
\end{equation*}
\begin{equation*}
\vert{}P' → L'\vert{} = \big\lvert
\begin{pmatrix}
p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
q_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
\vdots & \vdots & \ddots & \vdots → \vdots \\
\end{pmatrix}\big\rvert = n
\end{equation*}
In our prototype we make use of accessors to encode stored values.
\begin{minipage}{0.2\linewidth}
\begin{verbatim}
let value = 1 :: 2 :: 3 :: []
(* that can also be written *)
let value = []
@ -1228,7 +1315,6 @@ Base cases:
because a source matrix in the case of empty rows returns
the first expression and (Leaf bb)(v) := Match bb
2. [| (aⱼ)ʲ, ∅ |] ≡ Failure
Regarding non base cases:
Let's first define
| let Idx(k) := [0; arity(k)[
@ -1240,16 +1326,15 @@ m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij})
\[
(k_k)^k := headconstructor(p_{i0})^i
\]
\begin{equation}
Groups(m) := ( k_k \to ((a)_{0.l})^{l \in Idx(k_k)} +++ (a_i)^{i \in I\DZ}), \\
( if p_{0j} is k(q_l) then \\
(qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\
if p_{0j} is \_ then \\
(\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\
else \bot )^j ), \\
((a_i)^{i \in I\DZ}, ((p_{ij})^{i \in I\DZ} \to eⱼ if p_{0j} is \_ else \bot)^{j \in J})
\end{equation}
| Groups(m) := (k_{k} \to ((a)_{0.l})^{l \in Idx(k_{k})} +++ (a_{i})^{i\in{}I\DZ}),
|(
| \quad if p_{0j} is k(q_{l}) then
| \quad \quad \quad (qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i\in{}I\DZ} \to e_{j}
| \quad if p_{0j} is "_" then
| \quad \quad \quad ("_")^{l \in Idx(k_{k})} +++ (p_{ij})^{i\in{}I\DZ} \to e_{j}
| \quad else \bot
|)^{j ∈ J},
| ((a_{i})^{i\in{}I\DZ}, ((p_{ij})^{i\in{}I\DZ} \to eⱼ if p_{0j} is \_ else \bot)^{j\in{}J})
Groups(m) is an auxiliary function that source a matrix m into
submatrices, according to the head constructor of their first pattern.
Groups(m) returns one submatrix m_r for each head constructor k that
@ -1266,7 +1351,7 @@ the wildcard submatrix.
We formalize this intuition as follows:
Lemma (Groups):
Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\].
Let /m/ be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\].
For any value vector $(v_i)^l$ such that $v_0 = k(v'_l)^l$ for some
constructor k,
we have: