ancora tesi

This commit is contained in:
Francesco Mecca 2020-06-20 13:43:13 +02:00
parent b26311c22e
commit 30871c038b
36 changed files with 107 additions and 74 deletions

View file

@ -1,5 +0,0 @@
* TODO [0/1]
- [ ] 002:
- [ ] Laplace adjustment
- [ ] Output code decoding
- [ ] Itemeset bUh frequent: significato frequent

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.9 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 14 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 4.7 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.2 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 538 B

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.8 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.2 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.7 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.5 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 90 B

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.6 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 320 B

Binary file not shown.

After

Width:  |  Height:  |  Size: 6.4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.7 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 334 B

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.2 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.4 KiB

Binary file not shown.

View file

@ -849,7 +849,7 @@ Reasoning about all possible executions of a program is not always
feasible[CIT] and in case of explosion usually symbolic execution engines
implement heuristics to reduce the size of the search space.[CIT]
** Translation validation
** Translation Validation
Translators, such as translators and code generators, are huge pieces of
software usually consisting of multiple subsystem and
constructing an actual specification of a translator implementation for
@ -865,7 +865,7 @@ existing translators that consists of taking the source and the target
- [ ] What happens when there is no semantic equivalence
- [ ] Translation validation through symbolic execution
* Translation validation of the Pattern Matching Compiler
* Translation Validation of the Pattern Matching Compiler
** Accessors
OCaml encourages widespread usage of composite types to encapsulate
data. Composite types include /discriminated unions/, of which we have
@ -954,6 +954,14 @@ At the target level /accessors/ are constructed by compressing the steps taken b
the symbolic engine during the evaluation of a value.
/Accessors/ don't store any kind of information about the concrete
value of the scrutinee.
Accessors respect the following invariants:
| v(Here) = v
| K(vᵢ)ⁱ(k.a) = vₖ(a) if k ∈ [0;n[
We will see in the following chapters how at the source level and the
target level a value v_{S} and a value v_{T} can be deconstructed into
a value vector $(v_i)^{i\in{I}}$ of which we can access the root using
the $Here$ accessor and we can inspect the k-th element using an
accessor of the form $k.a$.
** Source program
The OCaml source code of a pattern matching function has the
@ -1041,7 +1049,7 @@ During compilation by the translator, expressions at the
right-hand-side are compiled into
Lambda code and are referred as lambda code actions lᵢ.
\\
We define the /pattern matrix/ P as the matrix |m x n| where m is bigger
We define the /pattern matrix/ P as the matrix |m × n| where m is bigger
or equal than the number of clauses in the source program and n is
equal to the arity of the constructor with the gratest arity.
\begin{equation*}
@ -1324,10 +1332,10 @@ In our prototype the source matrix mₛ is defined as follows
** Target translation
The target program of the following general form is parsed using a parser
generated by Menhir, a LR(1) parser generator for the OCaml programming language.
generated by Menhir[CIT], a LR(1) parser generator for the OCaml programming language.
Menhir compiles LR(1) a grammar specification, in this case a subset
of the Lambda intermediate language, down to OCaml code.
This is the grammar of the target language (TODO: check menhir grammar)
This is the grammar of the target language[CIT] (TODO: check menhir grammar)
| start ::= sexpr
| sexpr ::= variable \vert{} string \vert{} "(" special_form ")"
| string ::= "\"" identifier "\"" ;; string between doublequotes
@ -1450,7 +1458,7 @@ Guard nodes of the tree are emitted whenever a guard is found. Guards
node contains a blackbox of code that is never evaluated and two
branches, one that is taken in case the guard evaluates to true and
the other one that contains the path taken when the guard evaluates to
true.
false.
\begin{comment}
Are K and Here clear here?
\end{comment}
@ -1467,36 +1475,35 @@ in terms of the decision tree of pattern matrices
by the following:
| 〚((pᵢ → bbᵢ)^{i∈I}〛 := 〚(Here), (pᵢ → bbᵢ)^{i∈I} 〛
Decision tree computed from pattern matrices respect the following invariant:
| ∀v (vᵢ)^{i∈I} = v(aᵢ)^{i∈I} → 〚m〛(v) = m(vᵢ)^{i∈I} for m = ((aᵢ)^{i∈I}, (rᵢ)^{i∈I})
| v(Here) = v
| K(vᵢ)ⁱ(k.a) = vₖ(a) if k ∈ [0;n[
\begin{comment}
TODO: EXPLAIN
\end{comment}
| ∀v (vᵢ)^{i∈I} = v(aᵢ)^{i∈I} → m(vᵢ)^{i∈I} = 〚m〛(v) for m = ((aᵢ)^{i∈I}, (rᵢ)^{i∈I})
The invariant conveys the fact that OCaml pattern matching values can
be deconstructed into a value vector and if we can correctly inspect
the value vector elements using the accessor notation we can build a
decision tree 〚m〛 from a pattern matrix that are equivalent when run
against the value at hand.
We proceed to show the correctness of the invariant by a case
analysys.
Base cases:
1. [| ∅, (∅ → bbᵢ)ⁱ |] ≡ Leaf bbᵢ where i := min(I), that is a
decision tree [|ms|] defined by an empty accessor and empty
decision tree [|m|] defined by an empty accessor and empty
patterns pointing to blackboxes /bbᵢ/. This respects the invariant
because a source matrix in the case of empty rows returns
the first expression and (Leaf bb)(v) := Match bb
2. [| (aⱼ)ʲ, ∅ |] ≡ Failure
the first expression and $(Leaf bb)(v) := Match\enspace bb$
2. [| (aⱼ)ʲ, ∅ |] ≡ Failure, as it is the case with the matrix
decomposition algorithm
Regarding non base cases:
Let's first define some auxiliary functions
- The index family of a constructor
| Idx(K) := [0; arity(K)[
- head of an ordered family (we write x for any object here, value, pattern etc.)
| head((xᵢ)^{i ∈ I}) = x_min(I)
- tail of an ordered family
| tail((xᵢ)^{i ∈ I}) := (xᵢ)^{i ≠ min(I)}
- head constructor of a value or pattern
- The index family of a constructor: $Idx(K) := [0; arity(K)[$
- head of an ordered family (we write x for any object here, value,
pattern etc.): $head((x_i)^{i \in I}) = x_{min(I)}$
- tail of an ordered family: $tail((x_i)^{i \in I}) := (x_i)^{i \ne min(I)}$
- head constructor of a value or pattern:
| constr(K(xᵢ)ⁱ) = K
| constr(_) = ⊥
| constr(x) = ⊥
- first non-⊥ element of an ordered family
- first non-⊥ element of an ordered family:
| First((xᵢ)ⁱ) := ⊥ \quad \quad \quad if ∀i, xᵢ =
| First((xᵢ)ⁱ) := x_min{i \vert{} xᵢ ≠ ⊥} \quad if ∃i, xᵢ ≠ ⊥
- definition of group decomposition:
@ -1520,12 +1527,12 @@ submatrices, according to the head constructor of their first pattern.
Groups(m) returns one submatrix m_r for each head constructor K that
occurs on the first row of m, plus one "wildcard submatrix" m_{wild}
that matches on all values that do not start with one of those head
constructors.
constructors. \\
Intuitively, m is equivalent to its decomposition in the following
sense: if the first pattern of an input vector (v_i)^i starts with one
of the head constructors Kₖ, then running (v_i)^i against m is the same
as running it against the submatrix m_{Kₖ}; otherwise (its head
constructor (Kₖ)ᵏ) it is equivalent to running it against
as running it against the submatrix m_{Kₖ}; otherwise (when its head
constructor is not one of (Kₖ)ᵏ) it is equivalent to running it against
the wildcard submatrix.
\\
We formalize this intuition as follows
@ -1631,8 +1638,8 @@ we have to prove that
# TODO: replace C with D
Symbolic Values during symbolic evaluation have the following form
| vₜ ::= Block(tag ∈ , (vᵢ)^{i∈I}) \vert n ∈
The result of the symbolic evaluation is a target decision tree C
| Cₜ ::= Leaf bb \vert Switch(a, (πᵢ → Cᵢ)^{i∈S} , C?) \vert Failure
The result of the symbolic evaluation is a target decision tree D
| Dₜ ::= Leaf bb \vert Switch(a, (πᵢ → Dᵢ)^{i∈S} , D?) \vert Failure
Every branch of the decision tree is "constrained" by a domain πₜ that
intuitively tells us the set of
possible values that can "flow" through that path.
@ -1641,9 +1648,9 @@ possible values that can "flow" through that path.
πₜ conditions are refined by the engine during the evaluation; at the
root of the decision tree the domain corresponds to the set of
possible values that the type of the function can hold.
C? is the fallback node of the tree that is taken whenever the value
D? is the fallback node of the tree that is taken whenever the value
at that point of the execution can't flow to any other subbranch.
Intuitively, the π_{fallback} condition of the C? fallback node is
Intuitively, the π_{fallback} condition of the D? fallback node is
| π_{fallback} = ¬\bigcup\limits_{i∈I}πᵢ
The fallback node can be omitted in the case where the domain of the
children nodes correspond to set of possible values pointed by the
@ -1672,7 +1679,7 @@ tₛ:
and simmetrically, to run a value vₜ against a target program tₜ:
# TODO t_t todo.1
We denote as $\vrel {v_S} {v_T}$ the equivalence relation between a
source value vₛ and a target value vₜ. The relationship is proven by
source value vₛ and a target value vₜ. The equivalence relation is proven by
structural induction.
\begin{mathpar}
\infer[integers]
@ -1754,35 +1761,59 @@ executed guards and a \emph{matching result} $r$.
\infer
{\forall {\vrel {v_S} {v_T}},\quad \runrel {t_S(v_S)} {t_T(v_T)}}
{\progrel t_S t_T}
{\progrel {t_S} {t_T}}
\end{mathpar}
Once formulated in this way, our equivalence algorithm must check the
natural notion of input-output equivalence for matching programs,
captured by the relation $\progrel {t_S} {t_T}$.
\subsubsection{The equivalence checking algorithm}
\begin{comment}
The equivalence checking algorithm takes as input a domain of
possible values \emph{S} and a pair of source and target decision trees and
in case the two trees are not equivalent it returns a counter example.
The algorithm respects the following correctness statement:
\begin{comment}
% TODO: we have to define what \covers mean for readers to understand the specifications
% (or we use a simplifying lie by hiding \covers in the statements).
\end{comment}
\begin{align*}
\SimpleEquiv S {C_S} {C_T} = \Yes \;\land\; \covers {C_T} S
\SimpleEquiv S {D_S} {D_T} = \Yes \;\land\; \covers {D_T} S
& \implies
\forall {\vrel {v_S} {v_T}} \in S,\; \runrel {C_S(v_S)} {C_T(v_T)}
\forall {\vrel {v_S} {v_T}} \in S,\; \runrel {D_S(v_S)} {D_T(v_T)}
\\
\SimpleEquiv S {C_S} {C_T} = \No {v_S} {v_T} \;\land\; \covers {C_T} S
\SimpleEquiv S {D_S} {D_T} = \No {v_S} {v_T} \;\land\; \covers {D_T} S
& \implies
\vrel {v_S} {v_T} \in S \;\land\; {\nparamrel{run} {C_S(v_S)} {C_T(v_T)}}
\vrel {v_S} {v_T} \in S \;\land\; {\nparamrel{run} {D_S(v_S)} {D_T(v_T)}}
\end{align*}
This algorithm $\SimpleEquiv S {C_S} {C_T}$ is in fact exactly
This algorithm $\SimpleEquiv S {D_S} {D_T}$ is in fact exactly
a decision procedure for the provability of the judgment
$(\Equivrel S {C_S} {C_T} \emptyqueue)$, defined below in the general
form $(\Equivrel S {C_S} {C_T} G)$ where $G$ is a \emph{guard queue},
$(\Equivrel S {D_S} {D_T} \emptyqueue)$, defined below in the general
form $(\Equivrel S {D_S} {D_T} G)$ where $G$ is a \emph{guard queue},
indicating an imbalance between the guards observed in the source tree
and in the target tree.
\end{comment}
During the equivalence checking phase we traverse the two trees,
recursively checking equivalence of pairs of subtrees. When we
traverse a branch condition, we learn a condition on an accessor that
restricts the set of possible input values that can flow in the
corresponding subtree. We represent this in our algorithm as an
\emph{input domain} $S$ of possible values (a mapping from accessors
to target domains).
The equivalence checking algorithm $\SimpleEquiv S {D_S} {D_T}$ takes
an input domain \emph{S} and a pair of source and target decision
trees. In case the two trees are not equivalent, it returns a counter
example.
It is defined exactly as a decision procedure for the provability of
the judgment $(\Equivrel S {D_S} {D_T} \emptyqueue)$, defined below in
the general form $(\Equivrel S {D_S} {D_T} G)$ where $G$ is a
\emph{guard queue}, indicating an imbalance between the guards
observed in the source tree and in the target tree. (For clarity of
exposition, the inference rules do not explain how we build the
counter-example.)
\begin{mathpar}
\begin{array}{l@{~}r@{~}l}
& & \text{\emph{input space}} \\
@ -1822,8 +1853,8 @@ for content equality.
\end{mathpar}
\end{comment}
\begin{mathpar}
\infer{ }
{\Equivrel \emptyset {C_S} {C_T} G}
\infer[empty]{ }
{\Equivrel \emptyset {D_S} {D_T} G}
\infer{ }
{\Equivrel S \Failure \Failure \emptyqueue}
@ -1840,14 +1871,14 @@ child of the right hand side is tested for equality against the left
hand side.
\begin{mathpar}
\infer
\infer[explode-right]
{D_S \in {\Leaf t, \Failure}
\\
\forall i,\; \EquivTEX {(S \land a \in D_i)} {D_S} {D_i} G
\forall i,\; \Equivrel {(S \cap a \in \pi_i)} {D_S} {D_i} G
\\
\EquivTEX {(S \land a \notin \Fam i {D_i})} {D_S} \Dfb G}
{\EquivTEX S
{D_S} {\Switch a {\Fam i {D_i} {D_i}} \Dfb} G}
\Equivrel {(S \cap a \notin \Fam i {\pi_i})} {D_S} \Dfb G}
{\Equivrel S
{D_S} {\Switch a {\Fam i {\pi_i} {D_i}} \Dfb} G}
\end{mathpar}
\begin{comment}
@ -1867,18 +1898,14 @@ result in an empty intersection. If the accessors are different,
\emph{$dom_T$} is left unchanged.
\begin{mathpar}
\infer
{\forall i,\;
\EquivTEX
{(S \land a = K_i)}
{D_i} {\trim {D_T} {a = K_i}} G
\infer[explode-right]
{D_S \in {\Leaf t, \Failure}
\\
\EquivTEX
{(S \land a \notin \Fam i {K_i})}
\Dfb {\trim {D_T} {a \notin \Fam i {K_i}}} G
}
{\EquivTEX S
{\Switch a {\Fam i {K_i, D_i}} \Dfb} {D_T} G}
\forall i,\; \Equivrel {(S \cap a \in \pi_i)} {D_S} {D_i} G
\\
\Equivrel {(S \cap a \notin \Fam i {\pi_i})} {D_S} \Dfb G}
{\Equivrel S
{D_S} {\Switch a {\Fam i {\pi_i} {D_i}} \Dfb} G}
\end{mathpar}
The equivalence checking algorithm deals with guards by storing a
@ -1893,18 +1920,19 @@ Termination of the algorithm is successful only when the guards queue
is empty.
\begin{mathpar}
\infer
{\EquivTEX S {C_0} {C_T} {G, (t_S = 0)}
{\Equivrel S {D_0} {D_T} {G, (e_S = 0)}
\\
\EquivTEX S {C_1} {C_T} {G, (t_S = 1)}}
{\EquivTEX S
{\Guard {t_S} {C_0} {C_1}} {C_T} G}
\Equivrel S {D_1} {D_T} {G, (e_S = 1)}}
{\Equivrel S
{\Guard {e_S} {D_0} {D_1}} {D_T} G}
\infer
{\trel {t_S} {t_T}
{\erel {e_S} {e_T}
\\
\EquivTEX S {C_S} {C_b} G}
{\EquivTEX S
{C_S} {\Guard {t_T} {C_0} {C_1}} {(t_S = b), G}}
\Equivrel S {D_S} {D_b} G}
{\Equivrel S
{D_S} {\Guard {e_T} {D_0} {D_1}} {(e_S = b), G}}
\end{mathpar}
Our equivalence-checking algorithm $\EquivTEX S {C_S} {C_T} G$ is
a exactly decision procedure for the provability of the judgment

View file

@ -1,15 +1,25 @@
* Apprendimento Automatico [2/2]
* Apprendimento Automatico [2/3]
- [X] Scrivile per date di esame
- [X] Richiedi date esame
- [ ] Slides [0/2]
- [ ] 002:
+ [ ] Laplace adjustment
+ [ ] Output code decoding
+ [ ] Itemeset bUh frequent: significato frequent
- [ ] 02:
+ [ ] Inductive Leap
+ [ ] Biased hyp
* Tesi [8/31]
* Tesi [8/33]
- [ ] Rivedere inference rules di Gabriel e aggiustarle con le mie
- [ ] Definisci domain sempre allo stesso modo, con bigcup o |
- [ ] Scegli se usare [[t_s]\] o D(ts)
- [X] Definizione di First(x_i): serve?
- [ ] ~1444: run t_t against value (todo.1)
- [X] ~1435: equivalence v_s v_t: formulala, correggi omitted
- [ ] Definisci Run e =_run
- [ ] Correggi Dfb e D?
- [X] 3.4: DECISION TREE CHAPTER
+ [X] sposta inferenze sotto decision tree in equivalenze?
+ [X] Rivedi sezioni: magari le altre mettile come sottosezioni