results of mumble meeting

This commit is contained in:
Francesco Mecca 2020-04-03 15:53:54 +02:00
parent b433663bdd
commit 42f531b3e3
3 changed files with 271 additions and 18 deletions

121
tesi/mumble_meeting_gabriel Normal file
View file

@ -0,0 +1,121 @@
result r ::= guard list * (Match bb | NoMatch | Absurd)
guard := bb
(equiv S Cₛ Cₜ gs = Yes) ∧ covers(Cₜ, S) ⇒ ∀vₛ≃vₜ ∈ S, Cₛ(vₛ) ≃gs Cₜ(vₜ)
(equiv S Cₛ Cₜ gs = No(vₛ,vₜ) ∧ covers(Cₜ, S) ⇒ vₛ≃vₜ ∈ S ∧ Cₛ(vₛ) ≠gs Cₜ(vₜ)
(vs,vt) \in S
-------------------------
Test(S,false) = No(vs,vt)
Test(S,true) = Yes
-------------------------------
equiv \emptyset Cs Ct gs ~> Yes
-----------------------------------------
equiv S Failure (Leaf BBt) gs ~>No(vs, vt)
-----------------------------------------
equiv S (Leaf BBs) Failure gs ~>No(vs, vt)
----------------------------------------------
equiv S Failure Failure gs ~> Test(S, gs = [])
rBB = Test(S, equiv_BB BBs BBt)
rGuard = Test(s, gs = [])
------------------------------------------------------
equiv S (Leaf BBs) (Leaf BBt) gs ~> Merge(rBB, rGuard)
Cs \in (Leaf BBs, Failure)
(equiv S Cs Ci gs)^i ~> ri)^i
equiv S Cs Cf gs ~> rf
-----------------------------------------------------
equiv S Cs (Node(a, (Domi,Ci)^i, Cf)) gs ~> Merge((ri)^i, rf)
(equiv S Ci (trim Ct a=Ki) gs ~> ri)^i
equiv S Cf (trim Ct (a\notin(K_i)^i)) gs ~> rf
----------------------------------------------------------
equiv S (Node(a, (Ki,Ci)^i, Cf) Ct gs ~> Merge((ri)^i, rf)
<guard here>
equiv S Ctrue Ct (gs++[condition]) ~> rt
equiv S Cfalse Ct (gs++[condition]) ~> rf
-------------------------------------------
equiv S (Guard condition Ctrue Cfalse) Ct gs ~> (Merge rt rf)
rCond = Test(S, cond = condition)
equiv S Cs Ctrue gs ~> rt
equiv S Cs Cfalse gs ~> rf
-------------------------------------------
equiv S Cs (Guard condition Ctrue Cfalse) ([cond]++gs) ~> (Merge rt rf)
(vs,vt) \in S
-------------------------------------------
equiv S Cs (Guard condition Ctrue Cfalse) [] ~> No(vs,vt)
(Simplified rules)
Idea: only talk about the valid equivalences (no No business)
so instead of proving facts of the form
equiv S Cs Ct gs ~> r
we just prove
equiv S Cs Ct gs
------------------------
equiv \emptyset Cs Ct gs
--------------------------
equiv S Failure Failure []
equiv_BB BBs BBt
-------------------------------
equiv S (Leaf BBs) (Leaf BBt) []
Cs \in (Leaf BBs, Failure)
(equiv S Cs Ci gs)^i
equiv S Cs Cf gs
-----------------------------------------
equiv S Cs (Node(a, (Domi,Ci)^i, Cf)) gs
equiv S Ci (trim Ct a=Ki) gs
equiv S Cf (trim Ct (a\notin(K_i)^i) gs
-------------------------------------
equiv S (Node(a, (Ki,Ci)^i, Cf) Ct gs
equiv S Ctrue Ct (gs++[condition])
equiv S Cfalse Ct (gs++[condition])
--------------------------------------------
equiv S (Guard condition Ctrue Cfalse) Ct gs
equiv S Cs Ctrue gs
equiv S Cs Cfalse gs
--------------------------------------------
equiv S Cs (Guard condition Ctrue Cfalse) ([condition]++gs)
(Let's forget about those that are below)
forall i, (equiv S Ci (Ct/a=Ki) = Yes)
equiv S Cf (Ct/a\notin(K_i)^i) = Yes
-------------------------------------------
equiv S (Node(a, (Ki,Ci)^i, Cf) Ct gs ~> Yes
exists i, (equiv S Ci (Ct/a=Ki) = No(vs,vt))
-------------------------------------------------
equiv S (Node(a, (Ki,Ci)^i, Cf) Ct gs ~> No(vs,vt)
equiv S Cf (Ct/a\notin(K_i)^i) = No(vs,vt)
-------------------------------------------------
equiv kS (Node(a, (Ki,Ci)^i, Cf) Ct gs ~> No(vs,vt)
forall S Cs Ct gs, exists (a unique) r such that
(equiv S Cs Ct gs = r) is provable

Binary file not shown.

View file

@ -35,12 +35,18 @@ Magari prima pattern matching poi compilatore?
#+LaTeX_HEADER: \usepackage{comment}
#+LaTeX_HEADER: \usepackage{algpseudocode}
#+LaTeX_HEADER: \usepackage{amsmath,amssymb,amsthm}
#+Latex_HEADER: \newtheorem{definition}{Definition}
#+LaTeX_HEADER: \newtheorem{definition}{Definition}
#+LaTeX_HEADER: \usepackage{mathpartir}
#+LaTeX_HEADER: \usepackage{graphicx}
#+LaTeX_HEADER: \usepackage{listings}
#+LaTeX_HEADER: \usepackage{color}
#+LaTeX_HEADER: \usepackage{stmaryrd}
#+LaTeX_HEADER: \newcommand{\sem}[1]{{\llbracket{#1}\rrbracket}}
#+LaTeX_HEADER: \newcommand{\Equiv}[3]{\mathsf{equiv}(#1, #2, #3)} % \equiv is already taken
#+LaTeX_HEADER: \newcommand{\covers}[2]{#1 \mathrel{\mathsf{covers}} #2}
#+LaTeX_HEADER: \newcommand{\Yes}{\mathsf{Yes}}
#+LaTeX_HEADER: \newcommand{\No}[2]{\mathsf{No}(#1, #2)}
#+EXPORT_SELECT_TAGS: export
#+EXPORT_EXCLUDE_TAGS: noexport
#+OPTIONS: H:2 toc:nil \n:nil @:t ::t |:t ^:{} _:{} *:t TeX:t LaTeX:t
@ -89,16 +95,16 @@ into simple control flow structures such as \texttt{if, switch}, for example:
\begin{comment}
%% TODO: side by side
\end{comment}
The code in the right is in the Lambda intermediate representation of
The code on the right is in the Lambda intermediate representation of
the OCaml compiler. The Lambda representation of a program is shown by
calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
The OCaml pattern matching compiler is a critical part of the OCaml compiler
in terms of correctness because any bug would result in wrong code
in terms of correctness because bugs typically would result in wrong code
production rather than triggering compilation failures.
Such bugs also are hard to catch by testing because they arise in
corner cases of complex patterns which are typically not in the
compiler test suite.
compiler test suite or most user programs.
The OCaml core developers group considered evolving the pattern matching compiler, either by
using a new algorithm or by incremental refactoring of its code base.
@ -120,19 +126,42 @@ Our algorithm translates both source and target programs into a common
representation, decision trees. Decision trees where chosen because
they model the space of possible values at a given branch of
execution.
Here is the decision tree for the source example program.
Here are the decision trees for the source and target example program.
\begin{minipage}{0.5\linewidth}
\begin{verbatim}
Node(Root)
Switch(Root)
/ \
(= []) (= ::)
/ \
Leaf Node(Root.1)
Leaf Switch(Root.1)
(0, None) / \
(= []) (= ::)
/ \
Leaf Leaf
(1, Some(Root.0)) (2, Some(Root.1.0))
[x = Root.0] [y = Root.1.0]
(1, Some x) (2, Some y)
\end{verbatim}
\end{minipage}
\hfill
\begin{minipage}{0.5\linewidth}
\begin{verbatim}
Switch(Root)
/ \
(= int 0) (!= int 0)
/ \
Leaf Switch(Root.1)
(makeblock 0 / \
0 0a) / \
(= int 0) (!= int 0)
/ \
Leaf Leaf
[x = Root.0] [y = Root.1.0]
(makeblock 0 (makeblock 0
1 (makeblock 0 x)) 2 (makeblock 0 y))
\end{verbatim}
\end{minipage}
\texttt{(Root.0)} is called an \emph{accessor}, that represents the
access path to a value that can be reached by deconstructing the
scrutinee. In this example \texttt{Root.0} is the first subvalue of
@ -140,8 +169,8 @@ the scrutinee.
Target decision trees have a similar shape but the tests on the
branches are related to the low level representation of values in
Lambda code. For example, cons cells \texttt{x::xs} are blocks with
tag 0.
Lambda code. For example, cons cells \texttt{x::xs} or tuples
\texttt{(x,y)} are blocks with tag 0.
To check the equivalence of a source and a target decision tree,
we proceed by case analysis.
@ -155,10 +184,11 @@ reduced tree is equivalent to $C_i$.
\subsection{From source programs to decision trees}
Our source language supports integers, lists, tuples and all algebraic
datatypes. Patterns support wildcards, constructors and literals, or
patterns $(p_1|p_2)$ and pattern variables.
We also support \texttt{when} guards.
Decision trees have nodes of the form:
datatypes. Patterns support wildcards, constructors and literals,
or-patterns $(p_1 | p_2)$ and pattern variables. We also support
\texttt{when} guards, which are interesting as they introduce the
evaluation of expressions during matching. Decision trees have nodes
of the form:
\begin{lstlisting}
type decision_tree =
| Unreachable
@ -183,7 +213,7 @@ It satisfies the following correctness statement:
\[
\forall t_S, \forall v_S, \quad t_S(v_S) = \sem{t_S}_S(v_S)
\]
Running any source values $v_S$ against the source program gives the
Running any source value $v_S$ against the source program gives the
same result as running it against the decision tree.
\subsection{From target programs to decision trees}
@ -192,8 +222,10 @@ The target programs include the following Lambda constructs:
various comparison operations, guards. The symbolic execution engine
traverses the target program and builds an environment that maps
variables to accessors. It branches at every control flow statement
and emits a Switch node. The branch condition $\pi_i$ is expressed as
and emits a \texttt{Switch} node. The branch condition $\pi_i$ is expressed as
an interval set of possible values at that point.
In comparison with the source decision trees, \texttt{Unreachable}
nodes are never emitted.
Guards result in branching. In comparison with the source decision
trees, \texttt{Unreachable} nodes are never emitted.
@ -204,8 +236,106 @@ $t_T$, satisfying the following correctness statement:
\]
\subsection{Equivalence checking}
TODO
The equivalence checking algorithm takes as input a domain of
possible values \emph{S} and a pair of source and target decision trees and
in case the two trees are not equivalent it returns a counter example.
The algorithm respects the following correctness statement:
\begin{comment}
% TODO: we have to define what \covers mean for readers to understand the specifications
% (or we use a simplifying lie by hiding \covers in the statements).
\end{comment}
\begin{align*}
\Equiv S {C_S} {C_T} = \Yes \;\land\; \covers {C_T} S
& \implies
\forall v_S \approx v_T \in S,\; C_S(v_S) = C_T(v_T)
\\
\Equiv S {C_S} {C_T} = \No {v_S} {v_T} \;\land\; \covers {C_T} S
& \implies
v_S \approx v_T \in S \;\land\; C_S(v_S) \neq C_T(v_T)
\end{align*}
The algorithm proceeds by case analysis. Inference rules are shown.
If $S$ is empty the results is $\Yes$.
\begin{verbatim}
------------------------
equiv \emptyset Cs Ct gs
\end{verbatim}
If the two decision trees are both terminal nodes the algorithm checks
for content equality.
\begin{verbatim}
--------------------------
equiv S Failure Failure []
equiv_BB BBs BBt
-------------------------------
equiv S (Leaf BBs) (Leaf BBt) []
\end{verbatim}
If the source decision tree (left hand side) is a terminal while the
target decistion tree (right hand side) is not, the algorithm proceeds
by \emph{explosion} of the right hand side. Explosion means that every
child of the right hand side is tested for equality against the left
hand side.
\begin{verbatim}
(equiv S Cs Ci gs)^i
equiv S Cs Cf gs
-----------------------------------------
equiv S Cs (Node(a, (Domi,Ci)^i, Cf)) gs
\end{verbatim}
\begin{comment}
% TODO: [Gabriel] in practice the $dom_S$ are constructors and the
% $dom_T$ are domains. Do we want to hide this fact, or be explicit
% about it? (Maybe we should introduce explicitly/clearly the idea of
% target domains at some point).
\end{comment}
When the left hand side is not a terminal, the algorithm explodes the
left hand side while trimming every right hand side subtree. Trimming
a left hand side tree on an interval set $dom_S$ computed from the right hand
side tree constructor means mapping every branch condition $dom_T$ (interval set of
possible values) on the left to the intersection of $dom_T$ and $dom_S$ when
the accessors on both side are equal, and removing the branches that
result in an empty intersection. If the accessors are different,
\emph{$dom_T$} is left unchanged.
\begin{verbatim}
equiv S Ci (trim Ct a=Ki) gs
equiv S Cf (trim Ct (a\notin(K_i)^i) gs
-------------------------------------
equiv S (Node(a, (Ki,Ci)^i, Cf) Ct gs
\end{verbatim}
The equivalence checking algorithm deals with guards by storing a
queue. A guard blackbox is pushed to the queue whenever the algorithm
encounters a Guard node on the right, while it pops a blackbox from
the queue whenever a Guard node appears on the left hand side.
The algorithm stops with failure if the popped blackbox and the and
blackbox on the left hand Guard node are different, otherwise in
continues by exploding to two subtrees, one in which the guard
condition evaluates to true, the other when it evaluates to false.
Termination of the algorithm is successful only when the guards queue
is empty.
\begin{verbatim}
equiv S Ctrue Ct (gs++[condition])
equiv S Cfalse Ct (gs++[condition])
--------------------------------------------
equiv S (Guard condition Ctrue Cfalse) Ct gs
equiv S Cs Ctrue gs
equiv S Cs Cfalse gs
--------------------------------------------
equiv S Cs (Guard condition Ctrue Cfalse) ([condition]++gs)
\end{verbatim}
\begin{comment}
TODO: replace inference rules with good latex
\end{comment}
* Background
@ -503,7 +633,9 @@ Moreover, the pattern matching compiler emits a warning when a
pattern is not exhaustive or some patterns are shadowed by precedent ones.
** Symbolic execution
TODO
Symbolic execution is a widely used techniques in the field of
computer security.
** Translation validation
Translators, such as translators and code generators, are huge pieces of