correzioni coppo

This commit is contained in:
Francesco Mecca 2020-04-09 23:46:51 +02:00
parent 048affd516
commit 67ae30a28d
2 changed files with 72 additions and 89 deletions

Binary file not shown.

View file

@ -1,5 +1,4 @@
\begin{comment} \begin{comment}
TODO: neg is parsed incorrectly
TODO: chiedi a Gabriel se T e S vanno bene, ma prima controlla che siano coerenti TODO: chiedi a Gabriel se T e S vanno bene, ma prima controlla che siano coerenti
* TODO Scaletta [1/6] * TODO Scaletta [1/6]
- [X] Introduction - [X] Introduction
@ -103,6 +102,10 @@ into simple control flow structures such as \texttt{if, switch}, for example:
| x::[] -> (1, Some x) | x::[] -> (1, Some x)
| _::y::_ -> (2, Some y) | _::y::_ -> (2, Some y)
\end{lstlisting} \end{lstlisting}
Given as input to the pattern matching compiler, this snippet of code
gets translated into the Lambda intermediate representation of
the OCaml compiler. The Lambda representation of a program is shown by
calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
\begin{lstlisting} \begin{lstlisting}
(if scrutinee (if scrutinee
(let (field_1 =a (field 1 scrutinee)) (let (field_1 =a (field 1 scrutinee))
@ -115,12 +118,6 @@ into simple control flow structures such as \texttt{if, switch}, for example:
(makeblock 0 1 (makeblock 0 y))))) (makeblock 0 1 (makeblock 0 y)))))
[0: 0 0a]) [0: 0 0a])
\end{lstlisting} \end{lstlisting}
\begin{comment}
%% TODO: side by side
\end{comment}
The code on the right is in the Lambda intermediate representation of
the OCaml compiler. The Lambda representation of a program is shown by
calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
The OCaml pattern matching compiler is a critical part of the OCaml compiler The OCaml pattern matching compiler is a critical part of the OCaml compiler
in terms of correctness because bugs typically would result in wrong code in terms of correctness because bugs typically would result in wrong code
@ -151,6 +148,7 @@ execution.
Here are the decision trees for the source and target example program. Here are the decision trees for the source and target example program.
\begin{minipage}{0.5\linewidth} \begin{minipage}{0.5\linewidth}
\scriptsize
\begin{verbatim} \begin{verbatim}
Switch(Root) Switch(Root)
/ \ / \
@ -166,21 +164,22 @@ Here are the decision trees for the source and target example program.
\end{verbatim} \end{verbatim}
\end{minipage} \end{minipage}
\hfill \hfill
\begin{minipage}{0.5\linewidth} \begin{minipage}{0.4\linewidth}
\scriptsize
\begin{verbatim} \begin{verbatim}
Switch(Root) Switch(Root)
/ \ / \
(= int 0) (!= int 0) (= int 0) (!= int 0)
/ \ / \
Leaf Switch(Root.1) Leaf Switch(Root.1)
(makeblock 0 / \ (mkblock 0 / \
0 0a) / \ 0 0a) / \
(= int 0) (!= int 0) (= int 0) (!= int 0)
/ \ / \
Leaf Leaf Leaf Leaf
[x = Root.0] [y = Root.1.0] [x = Root.0] [y = Root.1.0]
(makeblock 0 (makeblock 0 (mkblock 0 (mkblock 0
1 (makeblock 0 x)) 2 (makeblock 0 y)) 1 (mkblock 0 x)) 2 (mkblock 0 y))
\end{verbatim} \end{verbatim}
\end{minipage} \end{minipage}
@ -207,7 +206,7 @@ reduced tree is equivalent to $C_i$.
\subsection{From source programs to decision trees} \subsection{From source programs to decision trees}
Our source language supports integers, lists, tuples and all algebraic Our source language supports integers, lists, tuples and all algebraic
datatypes. Patterns support wildcards, constructors and literals, datatypes. Patterns support wildcards, constructors and literals,
or-patterns $(p_1 | p_2)$ and pattern variables. We also support Or-patterns such as $(p_1 | p_2)$ and pattern variables. We also support
\texttt{when} guards, which are interesting as they introduce the \texttt{when} guards, which are interesting as they introduce the
evaluation of expressions during matching. Decision trees have nodes evaluation of expressions during matching. Decision trees have nodes
of the form: of the form:
@ -742,21 +741,17 @@ respectively the /precondition/ and the /postcondition/ that
constitute the assertions of the program. C is the directive being constitute the assertions of the program. C is the directive being
executed. executed.
The language of the assertions P is: The language of the assertions P is:
#+BEGIN_SRC | P ::= true \vert false \vert a < b \vert P_{1} \wedge P_{2} \vert P_{1} \lor P_{2} \vert \not P
P ::= true | false | a < b | P_{1} \wedge P_{2} | P_{1} \lor P_{2} | \notP
#+END_SRC
where a and b are numbers. where a and b are numbers.
In the Hoare rules assertions could also take the form of In the Hoare rules assertions could also take the form of
#+BEGIN_SRC | P ::= \forall i. P \vert \exists i. P \vert P_{1} \Rightarrow P_{2}
P ::= \forall i. P | \exists i. P | P_{1} \Rightarrow P_{2}
#+END_SRC
where i is a logical variable, but assertions of these kinds increases where i is a logical variable, but assertions of these kinds increases
the complexity of the symbolic engine. the complexity of the symbolic engine.
Execution follows the rules of Hoare logic: Execution follows the rules of Hoare logic:
- Empty statement : - Empty statement :
\begin{mathpar} \begin{mathpar}
\infer{ } \infer{ }
{ \{P\} skip \{P\} } { \{P\}skip\{P\} }
\end{mathpar} \end{mathpar}
- Assignment statement : The truthness of P[a/x] is equivalent to the - Assignment statement : The truthness of P[a/x] is equivalent to the
truth of {P} after the assignment. truth of {P} after the assignment.
@ -775,14 +770,14 @@ Execution follows the rules of Hoare logic:
- Conditional : - Conditional :
\begin{mathpar} \begin{mathpar}
\infer { \{P \wedge b \} c_1 \{Q\}, \{P\wedge\not b\}c_2\{Q\} } \infer { \{P \wedge b \} c_1 \{Q\}, \{P\wedge\not b\}c_2\{Q\} }
{ \{P\}if b then c_1 else c_2\{Q\} } { \{P\}\text{if b then $c_1$ else $c_2$}\{Q\} }
\end{mathpar} \end{mathpar}
- Loop : {P} is the loop invariant. After the loop is finished /P/ - Loop : {P} is the loop invariant. After the loop is finished /P/
holds and ¬̸b caused the loop to end. holds and ¬̸b caused the loop to end.
\begin{mathpar} \begin{mathpar}
\infer { \{P \wedge b \}c\{P\} } \infer { \{P \wedge b \}c\{P\} }
{ \{P\} while b do c \{P \wedge \neg b\} } { \{P\}\text{while b do c} \{P \wedge \neg b\} }
\end{mathpar} \end{mathpar}
Even if the semantics of symbolic execution engines are well defined, Even if the semantics of symbolic execution engines are well defined,
@ -816,7 +811,7 @@ existing translators that consists of taking the source and the target
** Source program ** Source program
The algorithm takes as its input a source program and translates it The algorithm takes as its input a source program and translates it
into an algebraic data structure called /decision_tree/. into an algebraic data structure which type we call /decision_tree/.
#+BEGIN_SRC #+BEGIN_SRC
type decision_tree = type decision_tree =
@ -824,7 +819,7 @@ type decision_tree =
| Failure | Failure
| Leaf of source_expr | Leaf of source_expr
| Guard of source_blackbox * decision_tree * decision_tree | Guard of source_blackbox * decision_tree * decision_tree
| Node of accessor * (constructor * decision_tree) list * decision_tree | Switch of accessor * (constructor * decision_tree) list * decision_tree
#+END_SRC #+END_SRC
Unreachable, Leaf of source_expr and Failure are the terminals of the three. Unreachable, Leaf of source_expr and Failure are the terminals of the three.
@ -842,18 +837,16 @@ Let's consider some trivial examples:
function true -> 1 function true -> 1
#+END_SRC #+END_SRC
[ ] Converti a disegni is translated to
|Switch ([(true, Leaf 1)], Failure)
Is translated to
|Node ([(true, Leaf 1)], Failure)
while while
#+BEGIN_SRC #+BEGIN_SRC
function function
true -> 1 true -> 1
| false -> 2 | false -> 2
#+END_SRC #+END_SRC
will give will be translated to
|Node ([(true, Leaf 1); (false, Leaf 2)]) |Switch ([(true, Leaf 1); (false, Leaf 2)])
It is possible to produce Unreachable examples by using It is possible to produce Unreachable examples by using
refutation clauses (a "dot" in the right-hand-side) refutation clauses (a "dot" in the right-hand-side)
@ -864,7 +857,7 @@ true -> 1
| _ -> . | _ -> .
#+END_SRC #+END_SRC
that gets translated into that gets translated into
Node ([(true, Leaf 1); (false, Leaf 2)], Unreachable) Switch ([(true, Leaf 1); (false, Leaf 2)], Unreachable)
We trust this annotation, which is reasonable as the type-checker We trust this annotation, which is reasonable as the type-checker
verifies that it indeed holds. verifies that it indeed holds.
@ -875,20 +868,21 @@ branches, one that is taken in case the guard evaluates to true and
the other one that contains the path taken when the guard evaluates to the other one that contains the path taken when the guard evaluates to
true. true.
[ ] Finisci con Node \begin{comment}
[ ] Finisci con Switch
[ ] Spiega del fallback [ ] Spiega del fallback
[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le Lambda [ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le Lambda
[ ] Specifica che stesso algoritmo usato per compilare a lambda, piu` optimizations \end{comment}
The source code of a pattern matching function has the The source code of a pattern matching function has the
following form: following form:
|match variable with |match variable with
|\vert pattern₁ -> expr₁ |\vert pattern₁ \to expr₁
|\vert pattern₂ when guard -> expr₂ |\vert pattern₂ when guard \to expr₂
|\vert pattern₃ as var -> expr₃ |\vert pattern₃ as var \to expr₃
|⋮ |⋮
|\vert pₙ -> exprₙ |\vert pₙ \to exprₙ
and can include any expression that is legal for the OCaml compiler, and can include any expression that is legal for the OCaml compiler,
such as /when/ guards and assignments. Patterns could or could not such as /when/ guards and assignments. Patterns could or could not
@ -896,45 +890,34 @@ be exhaustive.
Pattern matching code could also be written using the more compact form: Pattern matching code could also be written using the more compact form:
|function |function
|\vert pattern₁ -> expr₁ |\vert pattern₁ \to expr₁
|\vert pattern₂ when guard -> expr₂ |\vert pattern₂ when guard \to expr₂
|\vert pattern₃ as var -> expr₃ |\vert pattern₃ as var \to expr₃
|⋮ |⋮
|\vert pₙ -> exprₙ |\vert pₙ \to exprₙ
This BNF grammar describes formally the grammar of the source program: This BNF grammar describes formally the grammar of the source program:
#+BEGIN_SRC bnf | start ::= "match" id "with" patterns \vert{} "function" patterns
start ::= "match" id "with" patterns | "function" patterns | patterns ::= (pattern0\vert{}pattern1) pattern1+
patterns ::= (pattern0|pattern1) pattern1+ | ;; pattern0 and pattern1 are needed to distinguish the first case in which
;; pattern0 and pattern1 are needed to distinguish the first case in which | ;; we can avoid writing the optional vertical line
;; we can avoid writing the optional vertical line | pattern0 ::= clause
pattern0 ::= clause | pattern1 ::= "\vert" clause
pattern1 ::= "|" clause | clause ::= lexpr "->" rexpr
clause ::= lexpr "->" rexpr | lexpr ::= rule (ε\vert{}condition)
| rexpr ::= _code ;; arbitrary code
lexpr ::= rule (ε|condition) | rule ::= wildcard\vert{}variable\vert{}constructor_pattern\vert{}or_pattern ;;
rexpr ::= _code ;; arbitrary code | ;; rules
| wildcard ::= "_"
rule ::= wildcard|variable|constructor_pattern|or_pattern ;; | variable ::= identifier
| constructor_pattern ::= constructor (rule\vert{}ε) (assignment\vert{}ε)
;; rules | constructor ::= int\vert{}float\vert{}char\vert{}string\vert{}bool \vert{}unit\vert{}record\vert{}exn\vert{}objects\vert{}ref \vert{}list\vert{}tuple\vert{}array\vert{}variant\vert{}parameterized_variant ;; data types
wildcard ::= "_" | or_pattern ::= rule ("\vert{}" wildcard\vert{}variable\vert{}constructor_pattern)+
variable ::= identifier | condition ::= "when" bexpr
constructor_pattern ::= constructor (rule|ε) (assignment|ε) | assignment ::= "as" id
| bexpr ::= _code ;; arbitrary code
constructor ::= int|float|char|string|bool
|unit|record|exn|objects|ref
|list|tuple|array
|variant|parameterized_variant ;; data types
or_pattern ::= rule ("|" wildcard|variable|constructor_pattern)+
condition ::= "when" bexpr
assignment ::= "as" id
bexpr ::= _code ;; arbitrary code
#+END_SRC
\begin{comment} \begin{comment}
Check that it is still coherent to this bnf Check that it is still coherent to this bnf
@ -948,7 +931,7 @@ Patterns are of the form
| c(p₁,p₂,...,pₙ) | constructor pattern | | c(p₁,p₂,...,pₙ) | constructor pattern |
| (p₁\vert p₂) | or-pattern | | (p₁\vert p₂) | or-pattern |
During compilation by the translators expressions are compiled into During compilation by the translators, expressions are compiled into
Lambda code and are referred as lambda code actions lᵢ. Lambda code and are referred as lambda code actions lᵢ.
The entire pattern matching code is represented as a clause matrix The entire pattern matching code is represented as a clause matrix
@ -970,7 +953,7 @@ following rules apply
|--------------------+---+--------------------+-------------------------------------------| |--------------------+---+--------------------+-------------------------------------------|
| _ | ≼ | v | ∀v | | _ | ≼ | v | ∀v |
| x | ≼ | v | ∀v | | x | ≼ | v | ∀v |
| (p₁ \vert\ p₂) | ≼ | v | iff p₁ ≼ v or p₂ ≼ v | | (p₁ \vert p₂) | ≼ | v | iff p₁ ≼ v or p₂ ≼ v |
| c(p₁, p₂, ..., pₐ) | ≼ | c(v₁, v₂, ..., vₐ) | iff (p₁, p₂, ..., pₐ) ≼ (v₁, v₂, ..., vₐ) | | c(p₁, p₂, ..., pₐ) | ≼ | c(v₁, v₂, ..., vₐ) | iff (p₁, p₂, ..., pₐ) ≼ (v₁, v₂, ..., vₐ) |
| (p₁, p₂, ..., pₐ) | ≼ | (v₁, v₂, ..., vₐ) | iff pᵢ ≼ vᵢ ∀i ∈ [1..a] | | (p₁, p₂, ..., pₐ) | ≼ | (v₁, v₂, ..., vₐ) | iff pᵢ ≼ vᵢ ∀i ∈ [1..a] |
|--------------------+---+--------------------+-------------------------------------------| |--------------------+---+--------------------+-------------------------------------------|
@ -1038,14 +1021,14 @@ Similarly, all the right-hand-side expressions are of the form
\texttt{observe <arg> <arg> ...} with the same constraints on arguments. \texttt{observe <arg> <arg> ...} with the same constraints on arguments.
#+BEGIN_SRC #+BEGIN_SRC
type t = Z | S of t type t = K1 | K2 of t (* declaration of an algebraic and recursive datatype t *)
let _ = function let _ = function
| Z -> observe 0 | K1 -> observe 0
| S Z -> observe 1 | K2 K1 -> observe 1
| S x when guard x -> observe 2 | K2 x when guard x -> observe 2
| S (S x) as y when guard x y -> observe 3 | K2 (K2 x) as y when guard x y -> observe 3
| S _ -> observe 4 | K2 _ -> observe 4
#+END_SRC #+END_SRC
Once parsed, the type declarations and the list of clauses are encoded in the form of a matrix Once parsed, the type declarations and the list of clauses are encoded in the form of a matrix
@ -1069,7 +1052,7 @@ p_{m,1} & p_{m,2} & \cdots & p_{m,n} → lₘ)
\end{equation*} \end{equation*}
The base case C₀ of the algorithm is the case in which the $\vec{x}$ The base case C₀ of the algorithm is the case in which the $\vec{x}$
is empty, that is $\vec{x}$ = (), then the result of the compilation is empty and the result of the compilation
C₀ is l₁ C₀ is l₁
\begin{equation*} \begin{equation*}
C₀((), C₀((),
@ -1102,7 +1085,7 @@ following four rules:
\end{equation*} \end{equation*}
That means in every lambda action lᵢ there is a binding of x₁ to the That means in every lambda action lᵢ there is a binding of x₁ to the
variable that appears on the pattern $p_{i,1}. Bindings are omitted variable that appears on the pattern p_{i,1}. Bindings are omitted
for wildcard patterns and the lambda action lᵢ remains unchanged. for wildcard patterns and the lambda action lᵢ remains unchanged.
2) Constructor rule: if all patterns in the first column of P are 2) Constructor rule: if all patterns in the first column of P are
@ -1344,7 +1327,7 @@ possible input values /S/. The fallback is not needed when the user
doesn't use a wildcard pattern. doesn't use a wildcard pattern.
%%% Give example of thing %%% Give example of thing
| Cₛ ::= Leaf bb | Node(a, (Kᵢ → Cᵢ)^{i∈S} , C?) | Cₛ ::= Leaf bb | Switch(a, (Kᵢ → Cᵢ)^{i∈S} , C?)
| a ::= Here | n.a | a ::= Here | n.a
| vₛ ::= K(vᵢ)^{i∈I} | n ∈ | vₛ ::= K(vᵢ)^{i∈I} | n ∈
@ -1492,11 +1475,11 @@ We prove this by induction on Cₜ:
That means that the result of trimming on a Leaf is the Leaf itself That means that the result of trimming on a Leaf is the Leaf itself
b. The same applies to Failure terminal b. The same applies to Failure terminal
| Failure_{/a→π}(v) = Failure(v) | Failure_{/a→π}(v) = Failure(v)
c. When Cₜ = Node(b, (π→Cᵢ)ⁱ)_{/a→π} then c. When Cₜ = Switch(b, (π→Cᵢ)ⁱ)_{/a→π} then
we look at the accessor /a/ of the subtree Cᵢ and we look at the accessor /a/ of the subtree Cᵢ and
we define πᵢ' = πᵢ if a≠b else πᵢ∩π we define πᵢ' = πᵢ if a≠b else πᵢ∩π
Trimming a switch node yields the following result: Trimming a switch node yields the following result:
| Node(b, (π→Cᵢ)ⁱ)_{/a→π} := Node(b, (π'ᵢ→C_{i/a→π})ⁱ) | Switch(b, (π→Cᵢ)ⁱ)_{/a→π} := Switch(b, (π'ᵢ→C_{i/a→π})ⁱ)
\begin{comment} \begin{comment}
Actually in the proof.org file I transcribed: Actually in the proof.org file I transcribed:
@ -1507,10 +1490,10 @@ This is not correct because you don't have Unreachable nodes in target decision
For the trimming lemma we have to prove that running the value vₜ against For the trimming lemma we have to prove that running the value vₜ against
the decistion tree Cₜ is the same as running vₜ against the tree the decistion tree Cₜ is the same as running vₜ against the tree
C_{trim} that is the result of the trimming operation on Cₜ C_{trim} that is the result of the trimming operation on Cₜ
| Cₜ(vₜ) = C_{trim}(vₜ) = Node(b, (πᵢ'→C_{i/a→π})ⁱ)(vₜ) | Cₜ(vₜ) = C_{trim}(vₜ) = Switch(b, (πᵢ'→C_{i/a→π})ⁱ)(vₜ)
We can reason by first noting that when vₜ∉(b→πᵢ)ⁱ the node must be a Failure node. We can reason by first noting that when vₜ∉(b→πᵢ)ⁱ the node must be a Failure node.
In the case where ∃k| vₜ∈(b→πₖ) then we can prove that In the case where ∃k| vₜ∈(b→πₖ) then we can prove that
| C_{k/a→π}(vₜ) = Node(b, (πᵢ'→C_{i/a→π})ⁱ)(vₜ) | C_{k/a→π}(vₜ) = Switch(b, (πᵢ'→C_{i/a→π})ⁱ)(vₜ)
because when a ≠ b then πₖ'= πₖ and this means that vₜ∈πₖ' because when a ≠ b then πₖ'= πₖ and this means that vₜ∈πₖ'
while when a = b then πₖ'=(πₖ∩π) and vt∈πₖ' because: while when a = b then πₖ'=(πₖ∩π) and vt∈πₖ' because:
- by the hypothesis, vₜ∈π - by the hypothesis, vₜ∈π
@ -1580,8 +1563,8 @@ I think the unreachable case should go at the end.
|equiv(S, Failure, Failure) := Yes |equiv(S, Failure, Failure) := Yes
Given that ∀v, Failure(v) = Failure, the statement holds. Given that ∀v, Failure(v) = Failure, the statement holds.
3. When we have a Leaf or a Failure at the left side: 3. When we have a Leaf or a Failure at the left side:
| equiv(S, Failure as Cₛ, Node(a, (πᵢ → Cₜᵢ)ⁱ)) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)ⁱ) | equiv(S, Failure as Cₛ, Switch(a, (πᵢ → Cₜᵢ)ⁱ)) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)ⁱ)
| equiv(S, Leaf bbₛ as Cₛ, Node(a, (πᵢ → Cₜᵢ)ⁱ)) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)ⁱ) | equiv(S, Leaf bbₛ as Cₛ, Switch(a, (πᵢ → Cₜᵢ)ⁱ)) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)ⁱ)
The algorithm either returns Yes for every sub-input space Sᵢ := S∩(a→π(kᵢ)) and The algorithm either returns Yes for every sub-input space Sᵢ := S∩(a→π(kᵢ)) and
subtree Cₜᵢ subtree Cₜᵢ
| equiv(Sᵢ, Cₛ, Cₜᵢ) = Yes ∀i | equiv(Sᵢ, Cₛ, Cₜᵢ) = Yes ∀i
@ -1592,11 +1575,11 @@ I think the unreachable case should go at the end.
| vₛ≃vₜ∈S ∧ Cₛ(vₛ)≠Cₜ(vₜ) | vₛ≃vₜ∈S ∧ Cₛ(vₛ)≠Cₜ(vₜ)
we can say that we can say that
| equiv(Sᵢ, Cₛ, Cₜᵢ) = No(vₛ, vₜ) for some minimal k∈I | equiv(Sᵢ, Cₛ, Cₜᵢ) = No(vₛ, vₜ) for some minimal k∈I
4. When we have a Node on the right we define πₙ as the domain of 4. When we have a Switch on the right we define πₙ as the domain of
values not covered but the union of the constructors kᵢ values not covered but the union of the constructors kᵢ
| πₙ = ¬(⋃π(kᵢ)ⁱ) | πₙ = ¬(⋃π(kᵢ)ⁱ)
The algorithm proceeds by trimming The algorithm proceeds by trimming
| equiv(S, Node(a, (kᵢ → Cₛᵢ)ⁱ, C_{sf}), Cₜ) := | equiv(S, Switch(a, (kᵢ → Cₛᵢ)ⁱ, C_{sf}), Cₜ) :=
| Forall(equiv( S∩(a→π(kᵢ)ⁱ), Cₛᵢ, C_{t/a→π(kᵢ)})ⁱ +++ equiv(S∩(a→π(kᵢ)), Cₛ, C_{a→πₙ})) | Forall(equiv( S∩(a→π(kᵢ)ⁱ), Cₛᵢ, C_{t/a→π(kᵢ)})ⁱ +++ equiv(S∩(a→π(kᵢ)), Cₛ, C_{a→πₙ}))
The statement still holds and we show this by first analyzing the The statement still holds and we show this by first analyzing the
/Yes/ case: /Yes/ case: