UniTO/tesi/gabriel/part4.org
2020-04-09 01:44:19 +02:00

354 lines
14 KiB
Org Mode
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Add headers to export only this section
* Correctness of the algorithm
Running a program tₛ or its translation 〚tₛ〛 against an input vₛ
produces as a result /r/ in the following way:
| ( 〚tₛ〛ₛ(vₛ) = Cₛ(vₛ) ) → r
| tₛ(vₛ) → r
Likewise
| ( 〚tₜ〛ₜ(vₜ) = Cₜ(vₜ) ) → r
| tₜ(vₜ) → r
where result r ::= guard list * (Match blackbox | NoMatch | Absurd)
and guard ::= blackbox.
Having defined equivalence between two inputs of which one is
expressed in the source language and the other in the target language
vₛ ≃ vₜ (TODO define, this talks about the representation of source values in the target)
we can define the equivalence between a couple of programs or a couple
of decision trees
| tₛ ≃ tₜ := ∀vₛ≃vₜ, tₛ(vₛ) = tₜ(vₜ)
| Cₛ ≃ Cₜ := ∀vₛ≃vₜ, Cₛ(vₛ) = Cₜ(vₜ)
The proposed equivalence algorithm that works on a couple of
decision trees is returns either /Yes/ or /No(vₛ, vₜ)/ where vₛ and
vₜ are a couple of possible counter examples for which the constraint
trees produce a different result.
** Statements
Theorem. We say that a translation of a source program to a decision tree
is correct when for every possible input, the source program and its
respective decision tree produces the same result
| ∀vₛ, tₛ(vₛ) = 〚tₛ〛ₛ(vₛ)
Likewise, for the target language:
| ∀vₜ, tₜ(vₜ) = 〚tₜ〛ₜ(vₜ)
Definition: in the presence of guards we can say that two results are
equivalent modulo the guards queue, written /r₁ ≃gs r₂/, when:
| (gs₁, r₁) ≃gs (gs₂, r₂) ⇔ (gs₁, r₁) = (gs₂ ++ gs, r₂)
Definition: we say that Cₜ covers the input space /S/, written
/covers(Cₜ, S) when every value vₛ∈S is a valid input to the
decision tree Cₜ. (TODO: rephrase)
Theorem: Given an input space /S/ and a couple of decision trees, where
the target decision tree Cₜ covers the input space /S/, we say that
the two decision trees are equivalent when:
| equiv(S, Cₛ, Cₜ, gs) = Yes ∧ covers(Cₜ, S) → ∀vₛ≃vₜ ∈ S, Cₛ(vₛ) ≃gs Cₜ(vₜ)
Similarly we say that a couple of decision trees in the presence of
an input space /S/ are /not/ equivalent when:
| equiv(S, Cₛ, Cₜ, gs) = No(vₛ,vₜ) ∧ covers(Cₜ, S) → vₛ≃vₜ ∈ S ∧ Cₛ(vₛ) ≠gs Cₜ(vₜ)
Corollary: For a full input space /S/, that is the universe of the
target program we say:
| equiv(S, 〚tₛ〛ₛ, 〚tₜ〛ₜ, ∅) = Yes ⇔ tₛ ≃ tₜ
*** Proof of the correctness of the translation from source programs to source decision trees
We define a source term tₛ as a collection of patterns pointing to blackboxes
| tₛ ::= (p → bb)^{i∈I}
A pattern is defined as either a constructor pattern, an or pattern or
a constant pattern
| p ::= | K(pᵢ)ⁱ, i ∈ I | (p|q) | n ∈
A decision tree is defined as either a Leaf, a Failure terminal or
an intermediate node with different children sharing the same accessor /a/
and an optional fallback.
Failure is emitted only when the patterns don't cover the whole set of
possible input values /S/. The fallback is not needed when the user
doesn't use a wildcard pattern.
%%% Give example of thing
| Cₛ ::= Leaf bb | Node(a, (Kᵢ → Cᵢ)^{i∈S} , C?)
| a ::= Here | n.a
| vₛ ::= K(vᵢ)^{i∈I} | n ∈
\begin{comment}
Are K and Here clear here?
\end{comment}
We define the decomposition matrix /mₛ/ as
| SMatrix mₛ := (aⱼ)^{j∈J}, ((p_{ij})^{j∈J} → bbᵢ)^{i∈I}
\begin{comment}
Correggi prendendo in considerazione l'accessor
\end{comment}
We define the decision tree of source programs
〚tₛ〛
in terms of the decision tree of pattern matrices
〚mₛ〛
by the following:
〚((pᵢ → bbᵢ)^{i∈I}〛 := 〚(Root), (pᵢ → bbᵢ)^{i∈I} 〛
decision tree computed from pattern matrices respect the following invariant:
| ∀v (vᵢ)^{i∈I} = v(aᵢ)^{i∈I} → 〚m〛(v) = m(vᵢ)^{i∈I} for m = ((aᵢ)^{i∈I}, (rᵢ)^{i∈I})
where
| v(Here) = v
| K(vᵢ)ⁱ(k.a) = vₖ(a) if k ∈ [0;n[
\begin{comment}
TODO: EXPLAIN
\end{comment}
We proceed to show the correctness of the invariant by a case
analysys.
Base cases:
1. [| ∅, (∅ → bbᵢ)ⁱ |] := Leaf bbᵢ where i := min(I), that is a
decision tree [|ms|] defined by an empty accessor and empty
patterns pointing to blackboxes /bbᵢ/. This respects the invariant
because a decomposition matrix in the case of empty rows returns
the first expression and we known that (Leaf bb)(v) := Match bb
2. [| (aⱼ)ʲ, ∅ |] := Failure
Regarding non base cases:
Let's first define
| let Idx(k) := [0; arity(k)[
| let First(∅) := ⊥
| let First((aⱼ)ʲ) := a_{min(j∈J≠∅)}
\[
m := ((a_i)^i ((p_{ij})^i \to e_j)^{ij})
\]
\[
(k_k)^k := headconstructor(p_{i0})^i
\]
\begin{equation}
Groups(m) := ( k_k \to ((a)_{0.l})^{l \in Idx(k_k)} +++ (a_i)^{i \in I\backslash \{0\} }), \\
( if p_{0j} is k(q_l) then \\
(qₗ)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\backslash \{0\}} \to e_j \\
if p_{0j} is \_ then \\
(\_)^{l \in Idx(k_k)} +++ (p_{ij})^{i \in I\backslash \{0\}} \to e_j \\
else \bot )^j ), \\
((a_i)^{i \in I\backslash \{0\}}, ((p_{ij})^{i \in I\backslash \{0\}} \to eⱼ if p_{0j} is \_ else \bot)^{j \in J})
\end{equation}
Groups(m) is an auxiliary function that decomposes a matrix m into
submatrices, according to the head constructor of their first pattern.
Groups(m) returns one submatrix m_r for each head constructor k that
occurs on the first row of m, plus one "wildcard submatrix" m_{wild}
that matches on all values that do not start with one of those head
constructors.
Intuitively, m is equivalent to its decompositionin the following
sense: if the first pattern of an input vector (v_i)^i starts with one
of the head constructors k, then running (v_i)^i against m is the same
as running it against the submatrix m_k; otherwise (its head
constructor is none of the k) it is equivalent to running it against
the wildcard submatrix.
We formalize this intuition as follows:
Lemma (Groups):
Let \[m\] be a matrix with \[Groups(m) = (k_r \to m_r)^k, m_{wild}\].
For any value vector \[(v_i)^l\] such that \[v_0 = k(v'_l)^l\] for some
constructor k,
we have:
\[
if k = kₖ for some k then
m(vᵢ)ⁱ = mₖ((v'ₗ)ˡ +++ (vᵢ)^{i∈I\backslash \{0\}})
else
m(vᵢ)ⁱ = m_{wild}(vᵢ)^{i∈I\backslash \{0\}}
\]
*** Proof:
Let \[m\] be a matrix with \[Group(m) = (k_r \to m_r)^k, m_{wild}\].
Let \[(v_i)^i\] be an input matrix with \[v_0 = k(v'_l)^l\] for some k.
We proceed by case analysis:
- either k is one of the kₖ for some k
- or k is none of the (kₖ)ᵏ
Both m(vᵢ)ⁱ and mₖ(vₖ)ᵏ are defined as the first matching result of
a family over each row rⱼ of a matrix
We know, from the definition of
Groups(m), that mₖ is
\[
((a){0.l})^{l∈Idx(kₖ)} +++ (aᵢ)^{i∈I\backslash \{0\}}),
(
if p_{0j} is k(qₗ) then
(qₗ)ˡ +++ (p_{ij})^{i∈I\backslash \{0\}} → eⱼ
if p_{0j} is _ then
(_)ˡ +++ (p_{ij})^{i∈I\backslash \{0\}} → eⱼ
else ⊥
\]
By definition, m(vᵢ)ⁱ is
m(vᵢ)ⁱ = First(rⱼ(vᵢ)ⁱ)ʲ for m = ((aᵢ)ⁱ, (rⱼ)ʲ)
(pᵢ)ⁱ (vᵢ)ⁱ = {
if p₀ = k(qₗ)ˡ, v₀ = k'(v'ₖ)ᵏ, k=Idx(k') and l=Idx(k)
if k ≠ k' then ⊥
if k = k' then ((qₗ)ˡ +++ (pᵢ)^{i∈I\backslash \{0\}}) ((v'ₖ)ᵏ +++ (vᵢ)^{i∈I\backslash \{0\}})
if p₀ = (q₁|q₂) then
First( (q₁pᵢ^{i∈I \backslash \{0\}}) vᵢ^{i∈I \backslash \{0\}}, (q₂pᵢ^{i∈I \backslash \{0\}}) vᵢ^{i∈I \backslash \{0\}} )
}
For this reason, if we can prove that
| ∀j, rⱼ(vᵢ)ⁱ = r'ⱼ((v'ₖ)ᵏ ++ (vᵢ)ⁱ)
it follows that
| m(vᵢ)ⁱ = mₖ((v'ₖ)ᵏ ++ (vᵢ)ⁱ)
from the above definition.
We can also show that aᵢ = a_{0.l}ˡ +++ a_{i∈I\backslash \{0\}} because v(a₀) = K(v(a){0.l})ˡ)
** Proof of equivalence checking
\begin{comment}
TODO: put ^i∈I where needed
\end{comment}
\subsubsection{The trimming lemma}
The trimming lemma allows to reduce the size of a decision tree given
an accessor → π relation (TODO: expand)
| ∀vₜ ∈ (a→π), Cₜ(vₜ) = C_{t/a→π(kᵢ)}(vₜ)
We prove this by induction on Cₜ:
a. Cₜ = Leaf_{bb}: when the decision tree is a leaf terminal, we
know that
| Leaf_{bb/a→π}(v) = Leaf_{bb}(v)
That means that the result of trimming on a Leaf is the Leaf itself
b. The same applies to Failure terminal
| Failure_{/a→π}(v) = Failure(v)
c. When Cₜ = Node(b, (π→Cᵢ)ⁱ)_{/a→π} then
we look at the accessor /a/ of the subtree Cᵢ and
we define πᵢ' = πᵢ if a≠b else πᵢ∩π
Trimming a switch node yields the following result:
| Node(b, (π→Cᵢ)ⁱ)_{/a→π} := Node(b, (π'ᵢ→C_{i/a→π})ⁱ)
\begin{comment}
Actually in the proof.org file I transcribed:
e. Unreachabe → ⊥
This is not correct because you don't have Unreachable nodes in target decision trees
\end{comment}
For the trimming lemma we have to prove that running the value vₜ against
the decistion tree Cₜ is the same as running vₜ against the tree
C_{trim} that is the result of the trimming operation on Cₜ
| Cₜ(vₜ) = C_{trim}(vₜ) = Node(b, (πᵢ'→C_{i/a→π})ⁱ)(vₜ)
We can reason by first noting that when vₜ∉(b→πᵢ)ⁱ the node must be a Failure node.
In the case where ∃k| vₜ∈(b→πₖ) then we can prove that
| C_{k/a→π}(vₜ) = Node(b, (πᵢ'→C_{i/a→π})ⁱ)(vₜ)
because when a ≠ b then πₖ'= πₖ and this means that vₜ∈πₖ'
while when a = b then πₖ'=(πₖ∩π) and vt∈πₖ' because:
- by the hypothesis, vₜ∈π
- we are in the case where vₜ∈πₖ
So vₜ ∈ πₖ' and by induction
| Cₖ(vₜ) = C_{k/a→π}(vₜ)
We also know that ∀vₜ∈(b→πₖ) → Cₜ(vₜ) = Cₖ(vₜ)
By putting together the last two steps, we have proven the trimming
lemma.
\begin{comment}
TODO: what should I say about covering??? I swap π and π'
Covering lemma:
∀a,π covers(Cₜ,S) → covers(C_{t/a→π}, (S∩a→π))
Uᵢπⁱ ≈ Uᵢπ'∩(a→π) ≈ (Uᵢπ')∩(a→π) %%
%%%%%%% Also: Should I swap π and π' ?
\end{comment}
\subsubsection{Equivalence checking}
The equivalence checking algorithm takes as parameters an input space
/S/, a source decision tree /Cₛ/ and a target decision tree /Cₜ/:
| equiv(S, Cₛ, Cₜ) → Yes | No(vₛ, vₜ)
When the algorithm returns Yes and the input space is covered by /Cₛ/
we can say that the couple of decision trees are the same for
every couple of source value /vₛ/ and target value /vₜ/ that are equivalent.
\begin{comment}
Define "covered"
Is it correct to say the same? How to correctly distinguish in words ≃ and = ?
\end{comment}
| equiv(S, Cₛ, Cₜ) = Yes and cover(Cₜ, S) → ∀ vₛ ≃ vₜ∈S ∧ Cₛ(vₛ) = Cₜ(vₜ)
In the case where the algorithm returns No we have at least a couple
of counter example values vₛ and vₜ for which the two decision trees
outputs a different result.
| equiv(S, Cₛ, Cₜ) = No(vₛ,vₜ) and cover(Cₜ, S) → ∀ vₛ ≃ vₜ∈S ∧ Cₛ(vₛ) ≠ Cₜ(vₜ)
We define the following
| Forall(Yes) = Yes
| Forall(Yes::l) = Forall(l)
| Forall(No(vₛ,vₜ)::_) = No(vₛ,vₜ)
There exists and are injective:
| int(k) ∈ (arity(k) = 0)
| tag(k) ∈ (arity(k) > 0)
| π(k) = {n\vert int(k) = n} x {n\vert tag(k) = n}
where k is a constructor.
\begin{comment}
TODO: explain:
∀v∈a→π, C_{/a→π}(v) = C(v)
\end{comment}
We proceed by case analysis:
\begin{comment}
I start numbering from zero to leave the numbers as they were on the blackboard, were we skipped some things
I think the unreachable case should go at the end.
\end{comment}
0. in case of unreachable:
| Cₛ(vₛ) = Absurd(Unreachable) ≠ Cₜ(vₜ) ∀vₛ,vₜ
1. In the case of an empty input space
| equiv(∅, Cₛ, Cₜ) := Yes
and that is trivial to prove because there is no pair of values (vₛ, vₜ) that could be
tested against the decision trees.
In the other subcases S is always non-empty.
2. When there are /Failure/ nodes at both sides the result is /Yes/:
|equiv(S, Failure, Failure) := Yes
Given that ∀v, Failure(v) = Failure, the statement holds.
3. When we have a Leaf or a Failure at the left side:
| equiv(S, Failure as Cₛ, Node(a, (πᵢ → Cₜᵢ)ⁱ)) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)ⁱ)
| equiv(S, Leaf bbₛ as Cₛ, Node(a, (πᵢ → Cₜᵢ)ⁱ)) := Forall(equiv( S∩a→π(kᵢ)), Cₛ, Cₜᵢ)ⁱ)
The algorithm either returns Yes for every sub-input space Sᵢ := S∩(a→π(kᵢ)) and
subtree Cₜᵢ
| equiv(Sᵢ, Cₛ, Cₜᵢ) = Yes ∀i
or we have a counter example vₛ, vₜ for which
| vₛ≃vₜ∈Sₖ ∧ cₛ(vₛ) ≠ Cₜₖ(vₜ)
then because
| vₜ∈(a→πₖ) → Cₜ(vₜ) = Cₜₖ(vₜ) ,
| vₛ≃vₜ∈S ∧ Cₛ(vₛ)≠Cₜ(vₜ)
we can say that
| equiv(Sᵢ, Cₛ, Cₜᵢ) = No(vₛ, vₜ) for some minimal k∈I
4. When we have a Node on the right we define πₙ as the domain of
values not covered but the union of the constructors kᵢ
| πₙ = ¬(⋃π(kᵢ)ⁱ)
The algorithm proceeds by trimming
| equiv(S, Node(a, (kᵢ → Cₛᵢ)ⁱ, C_{sf}), Cₜ) :=
| Forall(equiv( S∩(a→π(kᵢ)ⁱ), Cₛᵢ, C_{t/a→π(kᵢ)})ⁱ +++ equiv(S∩(a→π(kᵢ)), Cₛ, C_{a→πₙ}))
The statement still holds and we show this by first analyzing the
/Yes/ case:
| Forall(equiv( S∩(a→π(kᵢ)ⁱ), Cₛᵢ, C_{t/a→π(kᵢ)})ⁱ = Yes
The constructor k is either included in the set of constructors kᵢ:
| k \vert k∈(kᵢ)ⁱ ∧ Cₛ(vₛ) = Cₛᵢ(vₛ)
We also know that
| (1) Cₛᵢ(vₛ) = C_{t/a→πᵢ}(vₜ)
| (2) C_{T/a→πᵢ}(vₜ) = Cₜ(vₜ)
(1) is true by induction and (2) is a consequence of the trimming lemma.
Putting everything together:
| Cₛ(vₛ) = Cₛᵢ(vₛ) = C_{T/a→πᵢ}(vₜ) = Cₜ(vₜ)
When the k∉(kᵢ)ⁱ [TODO]
The auxiliary Forall function returns /No(vₛ, vₜ)/ when, for a minimum k,
| equiv(Sₖ, Cₛₖ, C_{T/a→πₖ} = No(vₛ, vₜ)
Then we can say that
| Cₛₖ(vₛ) ≠ C_{t/a→πₖ}(vₜ)
that is enough for proving that
| Cₛₖ(vₛ) ≠ (C_{t/a→πₖ}(vₜ) = Cₜ(vₜ))