correzioni coppo #1 vero

2020-04-11 17:56:40 +02:00 · 2020-04-11 17:56:40 +02:00 · 14cf53ee8d
commit 14cf53ee8d
parent f464d23c8f
3 changed files with 142 additions and 57 deletions
--- a/tesi/conv.py
+++ b/tesi/conv.py
@ -7,7 +7,7 @@ try:
 except:
    allsymbols = json.load(open('../unicode-latex.json'))
-mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀',  'ε', '₀', '₂', '₁', '₃', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮', 'ₛ', 'ₜ', '≃', '⇔', '∧', '∅', 'ℕ', 'ⱼ', 'ʲ', '⊥', 'π', 'α', 'β', '∞', 'σ', '≤', '⊈', '∧', '∨', '∃', '⇒', '∩', '∉', '⋃', 'ᵏ', 'ₗ', 'ˡ', 'ₒ', 'ᵣ', 'ᴵ', '≈', '⊆' ]
+mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀',  'ε', '₀', '₂', '₁', '₃', '₄', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮', 'ₛ', 'ₜ', '≃', '⇔', '∧', '∅', 'ℕ', 'ⱼ', 'ʲ', '⊥', 'π', 'α', 'β', '∞', 'σ', '≤', '⊈', '∧', '∨', '∃', '⇒', '∩', '∉', '⋃', 'ᵏ', 'ₗ', 'ˡ', 'ₒ', 'ᵣ', 'ᴵ', '≈', '⊆' ]
 extrasymbols = {'〚': '\llbracket', '〛': r'\rrbracket', '̸': '\neg', '¬̸': '\neg', '∈': '\in ', 'ₛ': '_S', 'ₜ': '_T'}
 symbols = {s: allsymbols[s] for s in mysymbols}
--- a/tesi/tesi.pdf
+++ b/tesi/tesi.pdf
--- a/tesi/tesi_unicode.org
+++ b/tesi/tesi_unicode.org
@ -106,7 +106,7 @@ by applying symbolic execution.
 A pattern matching compiler turns a series of pattern matching clauses
 into simple control flow structures such as \texttt{if, switch}, for example:
 \begin{lstlisting}
-  match x with
+  match scrutinee with
  | [] -> (0, None)
  | x::[] -> (1, Some x)
  | _::y::_ -> (2, Some y)
@ -115,17 +115,25 @@ Given as input to the pattern matching compiler, this snippet of code
 gets translated into the Lambda intermediate representation of
 the OCaml compiler. The Lambda representation of a program is shown by
 calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
 In this example we renamed the variables assigned in order to ease the
 understanding of the tests that are performed when the code is
 translated into the Lambda form.
 code phase.
 \begin{lstlisting}
-(if scrutinee
+(function scrutinee
-    (let (field_1 =a (field 1 scrutinee))
+    (if scrutinee    ;;; true when scrutinee (list) not empty
-        (if field_1
+    (let (tail =a (field 1 scrutinee/81))    ;;; assignment
-            (let
+        (if tail
-                (field_1_1 =a (field 1 field_1)
+        (let
-                 x =a (field 0 field_1))
+            y =a (field 0 tail)) 
-                (makeblock 0 2 (makeblock 0 x)))
+            ;;; y is the first element of the tail
-            (let (y =a (field 0 scrutinee))
+            (makeblock 0 2 (makeblock 0 y)))
-                (makeblock 0 1 (makeblock 0 y)))))
+            ;;; allocate memory for tuple (2, Some y)
-    [0: 0 0a])
+        (let (x =a (field 0 scrutinee))      
            ;;; x is the head of the scrutinee
            (makeblock 0 1 (makeblock 0 x))))) 
            ;;; allocate memory for tuple (1, Some x)
    [0: 0 0a])))  ;;; low level representatio of (0, None)
 \end{lstlisting}
 The OCaml pattern matching compiler is a critical part of the OCaml compiler
@ -215,10 +223,28 @@ reduced tree is equivalent to $C_i$.
 \subsection{From source programs to decision trees}
 Our source language supports integers, lists, tuples and all algebraic
 datatypes. Patterns support wildcards, constructors and literals,
-Or-patterns such as $(p_1 | p_2)$ and pattern variables.  We also support
+Or-patterns such as $(p_1 | p_2)$ and pattern variables.
-\texttt{when} guards, which are interesting as they introduce the
+In particular Or-patterns provide a more compact way to group patterns
-evaluation of expressions during matching.  Decision trees have nodes
+that point to the same expression.
-of the form:
+
 \begin{minipage}{0.4\linewidth}
 \begin{lstlisting}
 match w with
 | p₁ -> expr
 | p₂ -> expr
 | p₃ -> expr
 \end{lstlisting}
 \end{minipage}
 \begin{minipage}{0.6\linewidth}
 \begin{lstlisting}
 match w with
 | p₁|p₂|p₃ -> expr
 \end{lstlisting}
 \end{minipage}
 We also support \texttt{when} guards, which are interesting as they
 introduce the evaluation of expressions during matching. 
 This is the type definition of decision tree as they are used in the
 prototype implementation:
 \begin{lstlisting}
 type decision_tree =
  | Unreachable
@ -236,15 +262,18 @@ matched by the source clauses.
 \texttt{Unreachable} is used when we statically know that no value
 can flow to that subtree.
-We write 〚tₛ〛ₛ for the decision tree of the source program
+We write 〚tₛ〛ₛ to denote the translation of the source program (the
-t_S, computed by a matrix decomposition algorithm (each column
+set of pattern matching clauses) into a decision tree, computed by a matrix decomposition algorithm (each column
 decomposition step gives a \texttt{Switch} node).
 It satisfies the following correctness statement:
 \[
 \forall t_s, \forall v_s, \quad t_s(v_s) = \semTEX{t_s}_s(v_s)
 \]
-Running any source value $v_S$ against the source program gives the
+The correctness statement intuitively states that for every source
-same result as running it against the decision tree.
+program, for every source value that is well-formed input to a source
 program, running the program tₛ against the input value vₛ is the same
 as running the compiled source program 〚tₛ〛 (that is a decision tree) against the same input
 value vₛ".
 \subsection{From target programs to decision trees}
 The target programs include the following Lambda constructs:
@ -259,8 +288,10 @@ nodes are never emitted.
 Guards result in branching. In comparison with the source decision
 trees, \texttt{Unreachable} nodes are never emitted.
-We write $\semTEX{t_T}_T$ for the decision tree of the target program
+We write $\semTEX{t_T}_T$ to denote the translation of a target
-$t_T$, satisfying the following correctness statement:
+program tₜ into a decision tree of the target program
 $t_T$, satisfying the following correctness statement that is
 simmetric to the correctness statement for the translation of source programs:
 \[
 \forall t_T, \forall v_T, \quad t_T(v_T) = \semTEX{t_T}_T(v_T)
 \]
@ -819,7 +850,7 @@ existing translators that consists of taking the source and the target
 * Translation validation of the Pattern Matching Compiler
 ** Source program
-The algorithm takes as its input a source program and translates it
+Our algorithm takes as its input a source program and translates it
 into an algebraic data structure which type we call /decision_tree/.
 #+BEGIN_SRC 
@ -957,18 +988,32 @@ All the guards are of the form \texttt{guard <arg> <arg> <arg>}, where the
 <arg> are expressed using the OCaml pattern matching language.
 Similarly, all the right-hand-side expressions are of the form
 \texttt{observe <arg> <arg> ...} with the same constraints on arguments.
 #+BEGIN_SRC 
 type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *)
 let _ = function
  | K1 -> observe 0
  | K2 K1 -> observe 1
-  | K2 x when guard x -> observe 2
+  | K2 x when guard x -> observe 2 (* guard inspects the x variable *)
  | K2 (K2 x) as y when guard x y -> observe 3
  | K2 _ -> observe 4
 #+END_SRC
-
+We note that the right hand side of /observe/ is just an arbitrary
 value and in this case just enumerates the order in which expressions
 appear.
 Even if this is an oversimplification of the problem for the
 prototype, it must be noted that at the compiler level we have the
 possibility to compile the pattern clauses in two separate steps so
 that the guards and right-hand-side expressions are semantically equal
 to their counterparts at the target program level.
 \begin{lstlisting}
 let _ = function
  | K1 -> lambda₀
  | K2 K1 -> lambda₁
  | K2 x when lambda-guard₀ -> lambda₂
  | K2 (K2 x) as y when lambda-guard₁ -> lambda₃
  | K2 _ -> lambda₄
 \end{lstlisting}
 The source program is parsed using the ocaml-compiler-libs library.
 The result of parsing, when successful, results in a list of clauses
 and a list of type declarations.
@ -1008,24 +1053,27 @@ following rules apply
 When a value /v/ matches pattern /p/ we say that /v/ is an /instance/ of /p/.
-During compilation by the translators, expressions are compiled into
+During compilation by the translators, expressions at the
 right-hand-side are compiled into
 Lambda code and are referred as lambda code actions lᵢ.
-The entire pattern matching code is represented as a clause matrix
+We define the /pattern matrix/ P as the matrix |m x n| where m bigger
-that associates rows of patterns (p_{i,1}, p_{i,2}, ..., p_{i,n}) to
+or equal than the number of clauses in the source program and n is
-lambda code action lⁱ
+equal to the arity of the constructor with the gratest arity.
 \begin{equation*}
-(P → L) =
+P =
 \begin{pmatrix}
-p_{1,1} & p_{1,2} & \cdots & p_{1,n} & → l₁ \\
+p_{1,1} & p_{1,2} & \cdots & p_{1,n}  \\
-p_{2,1} & p_{2,2} & \cdots & p_{2,n} & →  l₂ \\
+p_{2,1} & p_{2,2} & \cdots & p_{2,n}  \\
-\vdots & \vdots & \ddots & \vdots & → \vdots \\
+\vdots & \vdots & \ddots & \vdots  \\
-p_{m,1} & p_{m,2} & \cdots & p_{m,n} & → lₘ
+p_{m,1} & p_{m,2} & \cdots & p_{m,n} )
 \end{pmatrix}
 \end{equation*}
-
+every row of /P/ is called a pattern vector
 $\vec{p_i}$ = (p₁, p₂, ..., pₙ); In every instance of P pattern
 vectors appear normalized on the length of the longest pattern vector.
 Considering the pattern matrix P we say that the value vector
-$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the line number i in P  if and only if the following two
+$\vec{v}$ = (v₁, v₂, ..., vᵢ) matches the pattern vector pᵢ in P  if and only if the following two
 conditions are satisfied:
 - p_{i,1}, p_{i,2}, \cdots, p_{i,n}  ≼ (v₁, v₂, ..., vᵢ)
 - ∀j < i p_{j,1}, p_{j,2}, \cdots, p_{j,n} ⋠ (v₁, v₂, ..., vᵢ) 
@ -1038,10 +1086,20 @@ We can define the following three relations with respect to patterns:
 - Patterns p and q are compatible when they share a common instance
 \subsubsection{Matrix decomposition of pattern clauses}
-
+We define a new object, the /clause matrix/ P → L of size |m x n+1| that associates
 pattern vectors $\vec{p_i}$ to lambda code action lᵢ.
 \begin{equation*}
 P → L = 
 \begin{pmatrix}
 p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
 p_{2,1} & p_{2,2} & \cdots & p_{2,n} → l₂ \\
 \vdots & \vdots & \ddots & \vdots →  \vdots \\
 p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ
 \end{pmatrix}
 \end{equation*}
 The initial input of the decomposition algorithm C consists of a vector of variables
 $\vec{x}$ = (x₁, x₂, ..., xₙ)  of size /n/ where /n/ is the arity of
-the type of /x/ and a clause matrix P → L of width n and height m.
+the type of /x/ and the /clause matrix/ P → L.
 That is:
 \begin{equation*}
@ -1050,12 +1108,12 @@ C((\vec{x} = (x₁, x₂, ..., xₙ),
 p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
 p_{2,1} & p_{2,2} & \cdots & p_{2,n} → l₂ \\
 \vdots & \vdots & \ddots & \vdots →  \vdots \\
-p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ)
+p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ
-\end{pmatrix}
+\end{pmatrix})
 \end{equation*}
 The base case C₀ of the algorithm is the case in which the $\vec{x}$
-is empty and the result of the compilation
+is an empty sequence and the result of the compilation
 C₀ is l₁
 \begin{equation*}
 C₀((),
@ -1091,7 +1149,7 @@ following four rules:
    for wildcard patterns and the lambda action lᵢ remains unchanged.
 2) Constructor rule: if all patterns in the first column of P are
-   constructors patterns of the form k(q₁, q₂, ..., qₙ) we define a
+   constructors patterns of the form k(q₁, q₂, ..., q_{n'}) we define a
   new matrix, the specialized clause matrix S, by applying the
   following transformation on every row /p/:
    \begin{lstlisting}[mathescape,columns=fullflexible,basicstyle=\fontfamily{lmvtt}\selectfont,]
@ -1147,11 +1205,40 @@ following four rules:
   largest prefix matrix for which one of the three previous rules
   apply, and P₂ → L₂ containing the remaining rows. The algorithm is
   applied to both matrices.
-
+It is important to note that the application of the decomposition
-
+algorithm converges. This intuition can be verified by defining the
 size of the clause matrix P → L as equal to the length of the longest
 pattern vector $\vec{p_i}$ and the length of a pattern vector as the
 number of symbols that appear in the clause.
 While it is very easy to see that the application of rules 1) and 4)
 produces new matrices of length equal or smaller than the original
 clause matrix, we can show that:
 - with the application of the constructor rule the pattern vector $\vec{p_i}$ loses one
  symbol after its decomposition:
  | \vert{}(p_{i,1} (q₁, q₂, ..., q_{n'}), p_{i,2}, p_{i,3}, ..., p_{i,n})\vert{} = n + n'
  | \vert{}(q_{i,1}, q_{i,2}, ..., q_{i,n'}, p_{i,2}, p_{i,3}, ..., p_{i,n})\vert{} = n + n' - 1
 - with the application of the orpat rule, we add one row to the clause
  matrix P → L but the length of a row containing an
  Or-pattern decreases
 \begin{equation*}
 \vert{}P → L\vert{} = \big\lvert
 \begin{pmatrix}
 (p_{1,1}\vert{}q_{1,1}) & p_{1,2} & \cdots & p_{1,n} → l₁ \\
 \vdots & \vdots & \ddots & \vdots →  \vdots \\
 \end{pmatrix}\big\rvert = n + 1
 \end{equation*}
 \begin{equation*}
 \vert{}P' → L'\vert{} = \big\lvert
 \begin{pmatrix}
 p_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
 q_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
 \vdots & \vdots & \ddots & \vdots →  \vdots \\
 \end{pmatrix}\big\rvert = n
 \end{equation*}
 In our prototype we make use of accessors to encode stored values.
 \begin{minipage}{0.2\linewidth}
 \begin{verbatim}
 let value =  1 :: 2 :: 3 :: []
 (* that can also be written *)
 let value = []
@ -1228,7 +1315,6 @@ Base cases:
   because a source matrix in the case of empty rows returns
   the first expression and (Leaf bb)(v) := Match bb
 2. [| (aⱼ)ʲ, ∅ |] ≡ Failure 
 Regarding non base cases:
 Let's first define
 | let Idx(k) := [0; arity(k)[
@ -1240,16 +1326,15 @@ m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij})
 \[
 (k_k)^k := headconstructor(p_{i0})^i
 \]
-\begin{equation}
+| Groups(m) := (k_{k} \to  ((a)_{0.l})^{l  \in  Idx(k_{k})} +++ (a_{i})^{i\in{}I\DZ}), 
- Groups(m) := ( k_k \to  ((a)_{0.l})^{l  \in  Idx(k_k)} +++ (a_i)^{i \in I\DZ}), \\
+|(
-    ( if p_{0j} is k(q_l) then \\
+| \quad if p_{0j} is k(q_{l}) then 
-        (qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ} \to e_j \\
+| \quad \quad \quad (qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i\in{}I\DZ} \to e_{j} 
-     if p_{0j} is \_ then \\
+| \quad    if p_{0j} is "_" then 
-        (\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\DZ}   \to e_j \\
+| \quad \quad \quad ("_")^{l \in Idx(k_{k})} +++ (p_{ij})^{i\in{}I\DZ}   \to e_{j} 
-     else \bot  )^j ), \\
+|  \quad   else \bot
-  ((a_i)^{i \in I\DZ}, ((p_{ij})^{i \in I\DZ}  \to eⱼ if p_{0j} is \_ else \bot)^{j \in J}) 
+|)^{j ∈ J},
-\end{equation}
+|  ((a_{i})^{i\in{}I\DZ}, ((p_{ij})^{i\in{}I\DZ}  \to eⱼ if p_{0j} is \_ else \bot)^{j\in{}J}) 
 Groups(m) is an auxiliary function that source a matrix m into
 submatrices, according to the head constructor of their first pattern.
 Groups(m) returns one submatrix m_r for each head constructor k that
@ -1266,7 +1351,7 @@ the wildcard submatrix.
 We formalize this intuition as follows:
 Lemma (Groups):
-Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\].
+Let /m/ be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\].
 For any value vector $(v_i)^l$ such that $v_0 = k(v'_l)^l$ for some
 constructor k,
 we have: