coppo #2
This commit is contained in:
parent
7ae47c2e45
commit
d35546fe8b
2 changed files with 28 additions and 22 deletions
3
tesi/.gitignore
vendored
3
tesi/.gitignore
vendored
|
@ -1,6 +1,9 @@
|
||||||
tesi.tex
|
tesi.tex
|
||||||
.#tesi.tex
|
.#tesi.tex
|
||||||
|
|
||||||
|
tesi_unicode.tex
|
||||||
|
|
||||||
|
|
||||||
## Core latex/pdflatex auxiliary files:
|
## Core latex/pdflatex auxiliary files:
|
||||||
*.aux
|
*.aux
|
||||||
*.lof
|
*.lof
|
||||||
|
|
|
@ -49,8 +49,13 @@ TODO: talk about compiler stuff
|
||||||
|
|
||||||
This dissertation presents an algorithm for the translation validation of the OCaml pattern
|
This dissertation presents an algorithm for the translation validation of the OCaml pattern
|
||||||
matching compiler. Given a source program and its compiled version the
|
matching compiler. Given a source program and its compiled version the
|
||||||
algorithm checks wheter the two are equivalent or produce a counter
|
algorithm checks whether the two are equivalent or produce a counter
|
||||||
example in case of a mismatch.
|
example in case of a mismatch.
|
||||||
|
For the prototype of this algorithm we have chosen a subset of the OCaml
|
||||||
|
language and implemented a prototype equivalence checker along with a
|
||||||
|
formal statement of correctness and its proof.
|
||||||
|
The prototype is to be included in the OCaml compiler infrastructure
|
||||||
|
and will aid the development.
|
||||||
|
|
||||||
Our equivalence algorithm works with decision trees. Source patterns are
|
Our equivalence algorithm works with decision trees. Source patterns are
|
||||||
converted into a decision tree using a matrix decomposition algorithm.
|
converted into a decision tree using a matrix decomposition algorithm.
|
||||||
|
@ -58,7 +63,9 @@ Target programs, described in the Lambda intermediate
|
||||||
representation language of the OCaml compiler, are turned into decision trees
|
representation language of the OCaml compiler, are turned into decision trees
|
||||||
by applying symbolic execution.
|
by applying symbolic execution.
|
||||||
|
|
||||||
|
\begin{comment}
|
||||||
\subsection{Translation validation}
|
\subsection{Translation validation}
|
||||||
|
\end{comment}
|
||||||
A pattern matching compiler turns a series of pattern matching clauses
|
A pattern matching compiler turns a series of pattern matching clauses
|
||||||
into simple control flow structures such as \texttt{if, switch}, for example:
|
into simple control flow structures such as \texttt{if, switch}, for example:
|
||||||
\begin{lstlisting}
|
\begin{lstlisting}
|
||||||
|
@ -79,7 +86,9 @@ into simple control flow structures such as \texttt{if, switch}, for example:
|
||||||
(makeblock 0 1 (makeblock 0 y)))))
|
(makeblock 0 1 (makeblock 0 y)))))
|
||||||
[0: 0 0a])
|
[0: 0 0a])
|
||||||
\end{lstlisting}
|
\end{lstlisting}
|
||||||
|
\begin{comment}
|
||||||
%% TODO: side by side
|
%% TODO: side by side
|
||||||
|
\end{comment}
|
||||||
The code in the right is in the Lambda intermediate representation of
|
The code in the right is in the Lambda intermediate representation of
|
||||||
the OCaml compiler. The Lambda representation of a program is shown by
|
the OCaml compiler. The Lambda representation of a program is shown by
|
||||||
calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
|
calling the \texttt{ocamlc} compiler with \texttt{-drawlambda} flag.
|
||||||
|
@ -92,23 +101,23 @@ corner cases of complex patterns which are typically not in the
|
||||||
compiler test suite.
|
compiler test suite.
|
||||||
|
|
||||||
The OCaml core developers group considered evolving the pattern matching compiler, either by
|
The OCaml core developers group considered evolving the pattern matching compiler, either by
|
||||||
using a new algorithm or by incremental refactorings of its codebase.
|
using a new algorithm or by incremental refactoring of its code base.
|
||||||
For this reason we want to verify that new implementations of the
|
For this reason we want to verify that new implementations of the
|
||||||
compiler avoid the introduction of new bugs and that such
|
compiler avoid the introduction of new bugs and that such
|
||||||
modifications don't result in a different behaviour than the current one.
|
modifications don't result in a different behavior than the current one.
|
||||||
|
|
||||||
One possible approach is to formally verify the pattern matching compiler
|
One possible approach is to formally verify the pattern matching compiler
|
||||||
implementation using a machine checked proof.
|
implementation using a machine checked proof.
|
||||||
Another possibility, albeit with a weaker result, is to verify that
|
Another possibility, albeit with a weaker result, is to verify that
|
||||||
each source program and target program pair are semantically correct.
|
each source program and target program pair are semantically correct.
|
||||||
We chose the latter technique, translation validation because is easier to adopt in
|
We chose the latter technique, translation validation because is easier to adopt in
|
||||||
the case of a production compiler and to integrate with an existing codebase. The compiler is treated as a
|
the case of a production compiler and to integrate with an existing code base. The compiler is treated as a
|
||||||
blackbox and proof only depends on our equivalence algorithm.
|
black-box and proof only depends on our equivalence algorithm.
|
||||||
|
|
||||||
\subsection{Our approach}
|
\subsection{Our approach}
|
||||||
%% replace common TODO
|
%% replace common TODO
|
||||||
Our algorithm translates both source and target programs into a common
|
Our algorithm translates both source and target programs into a common
|
||||||
representation, decision trees. Decision trees where choosen because
|
representation, decision trees. Decision trees where chosen because
|
||||||
they model the space of possible values at a given branch of
|
they model the space of possible values at a given branch of
|
||||||
execution.
|
execution.
|
||||||
Here is the decision tree for the source example program.
|
Here is the decision tree for the source example program.
|
||||||
|
@ -144,12 +153,6 @@ subtree $C_i$. For every child $(\pi_i, C_i)$ we reduce $T$ by killing all
|
||||||
the branches that are incompatible with $\pi_i$ and check that the
|
the branches that are incompatible with $\pi_i$ and check that the
|
||||||
reduced tree is equivalent to $C_i$.
|
reduced tree is equivalent to $C_i$.
|
||||||
|
|
||||||
For the prototype we have choosen a simple subset of the OCaml
|
|
||||||
language and implemented a prototype equivalence checker along with a
|
|
||||||
formal statement of correctness and proof sketches.
|
|
||||||
The prototype is to be included in the OCaml compiler infrastructure
|
|
||||||
and will aid the development.
|
|
||||||
|
|
||||||
\subsection{From source programs to decision trees}
|
\subsection{From source programs to decision trees}
|
||||||
Our source language supports integers, lists, tuples and all algebraic
|
Our source language supports integers, lists, tuples and all algebraic
|
||||||
datatypes. Patterns support wildcards, constructors and literals, or
|
datatypes. Patterns support wildcards, constructors and literals, or
|
||||||
|
@ -186,7 +189,7 @@ same result as running it against the decision tree.
|
||||||
\subsection{From target programs to decision trees}
|
\subsection{From target programs to decision trees}
|
||||||
The target programs include the following Lambda constructs:
|
The target programs include the following Lambda constructs:
|
||||||
\texttt{let, if, switch, Match\_failure, catch, exit, field} and
|
\texttt{let, if, switch, Match\_failure, catch, exit, field} and
|
||||||
various comparation operations, guards. The symbolic execution engine
|
various comparison operations, guards. The symbolic execution engine
|
||||||
traverses the target program and builds an environment that maps
|
traverses the target program and builds an environment that maps
|
||||||
variables to accessors. It branches at every control flow statement
|
variables to accessors. It branches at every control flow statement
|
||||||
and emits a Switch node. The branch condition $\pi_i$ is expressed as
|
and emits a Switch node. The branch condition $\pi_i$ is expressed as
|
||||||
|
@ -214,7 +217,7 @@ The main features of ML languages are the use of the Hindley-Milner type system
|
||||||
provides many advantages with respect to static type systems of traditional imperative and object
|
provides many advantages with respect to static type systems of traditional imperative and object
|
||||||
oriented language such as C, C++ and Java, such as:
|
oriented language such as C, C++ and Java, such as:
|
||||||
- Polymorphism: in certain scenarios a function can accept more than one
|
- Polymorphism: in certain scenarios a function can accept more than one
|
||||||
type for the input parameters. For example a function that computes the lenght of a
|
type for the input parameters. For example a function that computes the length of a
|
||||||
list doesn't need to inspect the type of the elements of the list and for this reason
|
list doesn't need to inspect the type of the elements of the list and for this reason
|
||||||
a List.length function can accept lists of integers, lists of strings and in general
|
a List.length function can accept lists of integers, lists of strings and in general
|
||||||
lists of any type. Such languages offer polymorphic functions through subtyping at
|
lists of any type. Such languages offer polymorphic functions through subtyping at
|
||||||
|
@ -231,7 +234,7 @@ oriented language such as C, C++ and Java, such as:
|
||||||
programmer is not allowed to operate on data by ignoring or promoting its type.
|
programmer is not allowed to operate on data by ignoring or promoting its type.
|
||||||
- Type Inference: the principal type of a well formed term can be inferred without any
|
- Type Inference: the principal type of a well formed term can be inferred without any
|
||||||
annotation or declaration.
|
annotation or declaration.
|
||||||
- Algebraic data types: types that are modelled by the use of two
|
- Algebraic data types: types that are modeled by the use of two
|
||||||
algebraic operations, sum and product.
|
algebraic operations, sum and product.
|
||||||
A sum type is a type that can hold of many different types of
|
A sum type is a type that can hold of many different types of
|
||||||
objects, but only one at a time. For example the sum type defined
|
objects, but only one at a time. For example the sum type defined
|
||||||
|
@ -248,14 +251,14 @@ although mutable statements and imperative constructs are permitted.
|
||||||
In addition to that features an object system, that provides
|
In addition to that features an object system, that provides
|
||||||
inheritance, subtyping and dynamic binding, and modules, that
|
inheritance, subtyping and dynamic binding, and modules, that
|
||||||
provide a way to encapsulate definitions. Modules are checked
|
provide a way to encapsulate definitions. Modules are checked
|
||||||
statically and can be reificated through functors.
|
statically and can be reifycated through functors.
|
||||||
|
|
||||||
** Lambda form compilation
|
** Lambda form compilation
|
||||||
\begin{comment}
|
\begin{comment}
|
||||||
https://dev.realworld.org/compiler-backend.html
|
https://dev.realworld.org/compiler-backend.html
|
||||||
\end{comment}
|
\end{comment}
|
||||||
|
|
||||||
provides compilation in form of a byecode executable with an
|
provides compilation in form of a bytecode executable with an
|
||||||
optionally embeddable interpreter and a native executable that could
|
optionally embeddable interpreter and a native executable that could
|
||||||
be statically linked to provide a single file executable.
|
be statically linked to provide a single file executable.
|
||||||
|
|
||||||
|
@ -263,7 +266,7 @@ After the typechecker has proven that the program is type safe,
|
||||||
the compiler lower the code to /Lambda/, an s-expression based
|
the compiler lower the code to /Lambda/, an s-expression based
|
||||||
language that assumes that its input has already been proved safe.
|
language that assumes that its input has already been proved safe.
|
||||||
On the /Lambda/ representation of the source program, the compiler
|
On the /Lambda/ representation of the source program, the compiler
|
||||||
performes a series of optimization passes before translating the
|
performs a series of optimization passes before translating the
|
||||||
lambda form to assembly code.
|
lambda form to assembly code.
|
||||||
|
|
||||||
*** OCaml Native Datatypes
|
*** OCaml Native Datatypes
|
||||||
|
@ -298,7 +301,7 @@ There are several numeric types:
|
||||||
- floats: that use IEEE754 double-precision (64-bit) arithmetic with
|
- floats: that use IEEE754 double-precision (64-bit) arithmetic with
|
||||||
the addition of the literals /infinity/, /neg_infinity/ and /nan/.
|
the addition of the literals /infinity/, /neg_infinity/ and /nan/.
|
||||||
|
|
||||||
The are varios numeric operations defined:
|
The are various numeric operations defined:
|
||||||
|
|
||||||
- Arithmetic operations: +, -, *, /, % (modulo), neg (unary negation)
|
- Arithmetic operations: +, -, *, /, % (modulo), neg (unary negation)
|
||||||
- Bitwise operations: &, |, ^, <<, >> (zero-shifting), a>> (sign extending)
|
- Bitwise operations: &, |, ^, <<, >> (zero-shifting), a>> (sign extending)
|
||||||
|
@ -349,7 +352,7 @@ match color with
|
||||||
#+END_SRC
|
#+END_SRC
|
||||||
|
|
||||||
provides tokens to express data destructoring.
|
provides tokens to express data destructoring.
|
||||||
For example we can examine the content of a list with patten matching
|
For example we can examine the content of a list with pattern matching
|
||||||
|
|
||||||
#+BEGIN_SRC
|
#+BEGIN_SRC
|
||||||
|
|
||||||
|
@ -432,7 +435,7 @@ We distinguish
|
||||||
- Unreachable: statically it is known that no value can go there
|
- Unreachable: statically it is known that no value can go there
|
||||||
- Failure: a value matching this part results in an error
|
- Failure: a value matching this part results in an error
|
||||||
- Leaf: a value matching this part results into the evaluation of a
|
- Leaf: a value matching this part results into the evaluation of a
|
||||||
source blackbox of code
|
source black box of code
|
||||||
|
|
||||||
The algorithm doesn't support type-declaration-based analysis
|
The algorithm doesn't support type-declaration-based analysis
|
||||||
to know the list of constructors at a given type.
|
to know the list of constructors at a given type.
|
||||||
|
@ -758,7 +761,7 @@ following four rules:
|
||||||
\end{equation*}
|
\end{equation*}
|
||||||
4) Mixture rule:
|
4) Mixture rule:
|
||||||
When none of the previous rules apply the clause matrix P → L is
|
When none of the previous rules apply the clause matrix P → L is
|
||||||
splitted into two clause matrices, the first P₁ → L₁ that is the
|
split into two clause matrices, the first P₁ → L₁ that is the
|
||||||
largest prefix matrix for which one of the three previous rules
|
largest prefix matrix for which one of the three previous rules
|
||||||
apply, and P₂ → L₂ containing the remaining rows. The algorithm is
|
apply, and P₂ → L₂ containing the remaining rows. The algorithm is
|
||||||
applied to both matrices.
|
applied to both matrices.
|
||||||
|
|
Loading…
Reference in a new issue