inizio source

This commit is contained in:
Francesco Mecca 2020-03-03 17:18:40 +01:00
parent 2cd2b73451
commit e4fc6b92aa
4 changed files with 346 additions and 149 deletions

View file

@ -3,7 +3,7 @@ import re
from sys import argv
allsymbols = json.load(open('./unicode-latex.json'))
mysymbols = ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
mysymbols = ['', '', '', '', '', '', '', '', '', '', 'ε','', '', '', '', '', '', '', '', '', '', '']
symbols = {s: allsymbols[s] for s in mysymbols}
mathsymbols = {s: '$'+allsymbols[s]+'$' for s in symbols}

Binary file not shown.

View file

@ -2,7 +2,7 @@
* TODO Scaletta [1/5]
- [X] Abstract
- [-] Background [40%]
- [X] Ocaml
- [X]
- [ ] Lambda code [0%]
- [ ] Compiler optimizations
- [ ] other instructions
@ -23,7 +23,7 @@
\end{comment}
#+TITLE: Translation Verification of the OCaml pattern matching compiler
#+TITLE: Translation Verification of the pattern matching compiler
#+AUTHOR: Francesco Mecca
#+EMAIL: me@francescomecca.eu
#+DATE:
@ -44,7 +44,7 @@
#+STARTUP: showall
\begin{abstract}
This dissertation presents an algorithm for the translation validation of the OCaml
This dissertation presents an algorithm for the translation validation of the
pattern matching compiler. Given the source representation of the target program and the
target program compiled in untyped lambda form, the algoritmhm is capable of modelling
the source program in terms of symbolic constraints on it's branches and apply symbolic
@ -53,17 +53,17 @@ produced a valid result.
In this context a valid result means that for every input in the domain of the source
program the untyped lambda translation produces the same output as the source program.
The input of the program is modelled in terms of symbolic constraints closely related to
the runtime representation of OCaml objects and the output consists of OCaml code
the runtime representation of objects and the output consists of OCaml code
blackboxes that are not evaluated in the context of the verification.
\end{abstract}
* Background
** OCaml
Objective Caml (OCaml) is a dialect of the ML (Meta-Language) family of programming
**
Objective Caml () is a dialect of the ML (Meta-Language) family of programming
languages.
OCaml shares many features with other dialects of ML, such as SML and Caml Light,
shares many features with other dialects of ML, such as SML and Caml Light,
The main features of ML languages are the use of the Hindley-Milner type system that
provides many advantages with respect to static type systems of traditional imperative and object
oriented language such as C, C++ and Java, such as:
@ -99,33 +99,33 @@ oriented language such as C, C++ and Java, such as:
Moreover ML languages are functional, meaning that functions are
treated as first class citizens and variables are immutable,
although mutable statements and imperative constructs are permitted.
In addition to that OCaml features an object system, that provides
In addition to that features an object system, that provides
inheritance, subtyping and dynamic binding, and modules, that
provide a way to encapsulate definitions. Modules are checked
statically and can be reificated through functors.
** Lambda form compilation
\begin{comment}
https://dev.realworldocaml.org/compiler-backend.html
https://dev.realworld.org/compiler-backend.html
\end{comment}
OCaml provides compilation in form of a byecode executable with an
provides compilation in form of a byecode executable with an
optionally embeddable interpreter and a native executable that could
be statically linked to provide a single file executable.
After the OCaml typechecker has proven that the program is type safe,
the OCaml compiler lower the code to /Lambda/, an s-expression based
After the typechecker has proven that the program is type safe,
the compiler lower the code to /Lambda/, an s-expression based
language that assumes that its input has already been proved safe.
On the /Lambda/ representation of the source program, the compiler
performes a series of optimization passes before translating the
lambda form to assembly code.
*** OCaml datatypes
*** datatypes
Most native data types in OCaml, such as integers, tuples, lists,
Most native data types in , such as integers, tuples, lists,
records, can be seen as instances of the following definition
#+BEGIN_SRC ocaml
#+BEGIN_SRC
type t = Nil | One of int | Cons of int * t
#+END_SRC
that is a type /t/ with three constructors that define its complete
@ -134,7 +134,7 @@ Every constructor has an arity. Nil, a constructor of arity 0, is
called a constant constructor.
*** Lambda form types
A lambda form target file produced by the ocaml compiler consists of a
A lambda form target file produced by the compiler consists of a
single s-expression. Every s-expression consist of /(/, a sequence of
elements separated by a whitespace and a closing /)/.
Elements of s-expressions are:
@ -182,7 +182,7 @@ if statements and switch statements.
Pattern matching on the other hands express predicates through
syntactic templates that also allow to bind on data structures of
arbitrary shapes. One common example of pattern matching is the use of regular
expressions on strings. OCaml provides pattern matching on ADT and
expressions on strings. provides pattern matching on ADT and
primitive data types.
The result of a pattern matching operation is always one of:
- this value does not match this pattern”
@ -190,7 +190,7 @@ The result of a pattern matching operation is always one of:
names to values and the jump to the expression pointed at the
pattern.
#+BEGIN_SRC ocaml
#+BEGIN_SRC
type color = | Red | Blue | Green | Black | White
match color with
@ -200,10 +200,10 @@ match color with
| _ -> print "white or black"
#+END_SRC
OCaml provides tokens to express data destructoring.
provides tokens to express data destructoring.
For example we can examine the content of a list with patten matching
#+BEGIN_SRC ocaml
#+BEGIN_SRC
begin match list with
| [ ] -> print "empty list"
@ -216,7 +216,7 @@ Parenthesized patterns, such as the third one in the previous example,
matches the same value as the pattern without parenthesis.
The same could be done with tuples
#+BEGIN_SRC ocaml
#+BEGIN_SRC
begin match tuple with
| (Some _, Some _) -> print "Pair of optional types"
@ -231,7 +231,7 @@ pattern₂ if it matches pattern₁ or pattern₂.
Pattern clauses can make the use of /guards/ to test predicates and
variables can captured (binded in scope).
#+BEGIN_SRC ocaml
#+BEGIN_SRC
begin match token_list with
| "switch"::var::"{"::rest -> ...
@ -242,11 +242,97 @@ begin match token_list with
#+END_SRC
Moreover, the OCaml pattern matching compiler emits a warning when a
Moreover, the pattern matching compiler emits a warning when a
pattern is not exhaustive or some patterns are shadowed by precedent ones.
In general pattern matching on primitive and algebraic data types takes the
following form.
** Symbolic execution
** Translation validation
Translators, such as translators and code generators, are huge pieces of
software usually consisting of multiple subsystem and
constructing an actual specification of a translator implementation for
formal validation is a very long task. Moreover, different
translators implement different algorithms, so the correctness proof of
a translator cannot be generalized and reused to prove another translator.
Translation validation is an alternative to the verification of
existing translators that consists of taking the source and the target
(compiled) program and proving /a posteriori/ their semantic equivalence.
- [ ] Techniques for translation validation
- [ ] What does semantically equivalent mean
- [ ] What happens when there is no semantic equivalence
- [ ] Translation validation through symbolic execution
** Translation validation of the Pattern Matching Compiler
*** Source program
The algorithm takes as its input a source program and translates it
into an algebraic data structure called /constraint_tree/.
#+BEGIN_SRC
type constraint_tree =
| Unreachable
| Failure
| Leaf of source_expr
| Guard of source_blackbox * constraint_tree * constraint_tree
| Node of accessor * (constructor * constraint_tree) list * constraint_tree
#+END_SRC
Unreachable, Leaf of source_expr and Failure are the terminals of the three.
We distinguish
- Unreachable: statically it is known that no value can go there
- Failure: a value matching this part results in an error
- Leaf: a value matching this part results into the evaluation of a
source blackbox of code
The algorithm doesn't support type-declaration-based analysis
to know the list of constructors at a given type.
Let's consider some trivial examples:
#+BEGIN_SRC
function true -> 1
#+END_SRC
[ ] Converti a disegni
Is translated to
|Node ([(true, Leaf 1)], Failure)
while
#+BEGIN_SRC
function
true -> 1
| false -> 2
#+END_SRC
will give
|Node ([(true, Leaf 1); (false, Leaf 2)])
It is possible to produce Unreachable examples by using
refutation clauses (a "dot" in the right-hand-side)
#+BEGIN_SRC
function
true -> 1
| false -> 2
| _ -> .
#+END_SRC
that gets translated into
Node ([(true, Leaf 1); (false, Leaf 2)], Unreachable)
We trust this annotation, which is reasonable as the type-checker
verifies that it indeed holds.
Guard nodes of the tree are emitted whenever a guard is found. Guards
node contains a blackbox of code that is never evaluated and two
branches, one that is taken in case the guard evaluates to true and
the other one that contains the path taken when the guard evaluates to
true.
[ ] Finisci con Node
[ ] Spiega del fallback
[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le lambda
[ ] Specifica che stesso algoritmo usato per compilare a lambda, piu` optimizations
The source code of a pattern matching function in has the
following form:
|match variable with
|\vert pattern₁ -> expr₁
@ -255,30 +341,57 @@ following form.
|⋮
|\vert pₙ -> exprₙ
It can be described more formally through a BNF grammar.
and can include any expression that is legal for the compiler,
such as "when" conditions and assignments. Patterns could or could not
be exhaustive.
Pattern matching code could also be written using the more compact form:
|function
|\vert pattern₁ -> expr₁
|\vert pattern₂ when guard -> expr₂
|\vert pattern₃ as var -> expr₃
|⋮
|\vert pₙ -> exprₙ
This BNF grammar describes formally the grammar of the source program:
#+BEGIN_SRC bnf
start ::= "match" id "with" patterns | "function" patterns
patterns ::= (pattern0|pattern1) pattern1+
;; pattern0 and pattern1 are needed to distinguish the first case in which
;; we can avoid writing the optional vertical line
pattern0 ::= clause
pattern1 ::= "|" clause
clause ::= lexpr "->" rexpr
pattern ::= value-name
| _ ;; wildcard pattern
| constant ;; matches a constant value
| pattern as value-name ;; binds to value-name
| ( pattern ) ;; parenthesized pattern
| pattern | pattern ;; or-pattern
| constr pattern ;; variant pattern
| [ pattern { ; pattern } [ ; ] ] ;; list patterns
| pattern :: pattern ;; lists patterns using cons operator (::)
| [| pattern { ; pattern } [ ; ] |] ;; array pattern
| char-literal .. char-literal ;; match on a range of characters
| { field [: typexpr] [= pattern] { ; field [: typexpr] [= pattern] } \
[; _ ] [ ; ] } ;; patterns that match on records
lexpr ::= rule (ε|condition)
rexpr ::= _code ;; arbitrary code
rule ::= wildcard|variable|constructor_pattern|or_pattern ;;
;; rules
wildcard ::= "_"
variable ::= identifier
constructor_pattern ::= constructor (rule|ε) (assignment|ε)
constructor ::= int|float|char|string|bool
|unit|record|exn|objects|ref
|list|tuple|array
|variant|parameterized_variant ;; data types
or_pattern ::= wildcard|variable|constructor_pattern ("|" wildcard|variable|constructor_pattern)+
condition ::= "when" bexpr
assignment ::= "as" id
bexpr ::= _code ;; arbitrary code
#+END_SRC
\begin{comment}
*** 1.2.1 Pattern matching compilation to lambda code
Check that it is still this
\end{comment}
During compilation, patterns are in the form
Patterns are of the form
| pattern | type of pattern |
|-----------------+---------------------|
| _ | wildcard |
@ -286,10 +399,10 @@ During compilation, patterns are in the form
| c(p₁,p₂,...,pₙ) | constructor pattern |
| (p₁\vert p₂) | or-pattern |
Expressions are compiled into lambda code and are referred as lambda
code actions.
During compilation by the translators expressions are compiled into
lambda code and are referred as lambda code actions lᵢ.
The entire pattern matching code can be represented as a clause matrix
The entire pattern matching code is represented as a clause matrix
that associates rows of patterns (p_{i,1}, p_{i,2}, ..., p_{i,n}) to
lambda code action lⁱ
\begin{equation*}
@ -449,34 +562,8 @@ following four rules:
apply, and P₂ → L₂ containing the remaining rows. The algorithm is
applied to both matrices.
\end{comment}
\begin{comment}
#+BEGIN_COMMENT
CITE paper?
#+END_COMMENT
\end{comment}
** Symbolic execution
** Translation validation
Translators, such as translators and code generators, are huge pieces of
software usually consisting of multiple subsystem and
constructing an actual specification of a translator implementation for
formal validation is a very long task. Moreover, different
translators implement different algorithms, so the correctness proof of
a translator cannot be generalized and reused to prove another translator.
Translation validation is an alternative to the verification of
existing translators that consists of taking the source and the target
(compiled) program and proving /a posteriori/ their semantic equivalence.
- [ ] Techniques for translation validation
- [ ] What does semantically equivalent mean
- [ ] What happens when there is no semantic equivalence
- [ ] Translation validation through symbolic execution
** Translation validation of the Pattern Matching Compiler
- [ ] Source translation
- [ ] Formal Grammar
- [ ] Compilation of source patterns
- [ ] Rest?

View file

@ -1,4 +1,4 @@
% Created 2020-03-02 Mon 14:31
% Created 2020-03-03 Tue 17:18
% Intended LaTeX compiler: pdflatex
\documentclass[11pt]{article}
\usepackage[utf8]{inputenc}
@ -24,10 +24,10 @@
\usepackage{color}
\author{Francesco Mecca}
\date{}
\title{Translation Verification of the OCaml pattern matching compiler}
\title{Translation Verification of the pattern matching compiler}
\hypersetup{
pdfauthor={Francesco Mecca},
pdftitle={Translation Verification of the OCaml pattern matching compiler},
pdftitle={Translation Verification of the pattern matching compiler},
pdfkeywords={},
pdfsubject={},
pdfcreator={Emacs 26.3 (Org mode 9.1.9)},
@ -37,12 +37,13 @@
\maketitle
\begin{comment}
\section{{\bfseries\sffamily TODO} Scaletta [1/5]}
\label{sec:org62901c2}
\label{sec:org7578cff}
\begin{itemize}
\item[{$\boxtimes$}] Abstract
\item[{$\boxminus$}] Background [40\%]
\begin{itemize}
\item[{$\boxtimes$}] Ocaml
\item[{$\boxtimes$}]
\item[{$\square$}] Lambda code [0\%]
\begin{itemize}
\item[{$\square$}] Compiler optimizations
@ -58,6 +59,7 @@
\begin{itemize}
\item[{$\square$}] Formal Grammar
\item[{$\square$}] Compilation of source patterns
\item[{$\square$}] Rest?
\end{itemize}
\item[{$\square$}] Target translation
\begin{itemize}
@ -74,7 +76,7 @@
\begin{abstract}
This dissertation presents an algorithm for the translation validation of the OCaml
This dissertation presents an algorithm for the translation validation of the
pattern matching compiler. Given the source representation of the target program and the
target program compiled in untyped lambda form, the algoritmhm is capable of modelling
the source program in terms of symbolic constraints on it's branches and apply symbolic
@ -83,19 +85,19 @@ produced a valid result.
In this context a valid result means that for every input in the domain of the source
program the untyped lambda translation produces the same output as the source program.
The input of the program is modelled in terms of symbolic constraints closely related to
the runtime representation of OCaml objects and the output consists of OCaml code
the runtime representation of objects and the output consists of OCaml code
blackboxes that are not evaluated in the context of the verification.
\end{abstract}
\section{Background}
\label{sec:orgbf4de70}
\label{sec:org5b6accf}
\subsection{OCaml}
\label{sec:orga9a97c9}
Objective Caml (OCaml) is a dialect of the ML (Meta-Language) family of programming
\subsection{}
\label{sec:org3c9e604}
Objective Caml () is a dialect of the ML (Meta-Language) family of programming
languages.
OCaml shares many features with other dialects of ML, such as SML and Caml Light,
shares many features with other dialects of ML, such as SML and Caml Light,
The main features of ML languages are the use of the Hindley-Milner type system that
provides many advantages with respect to static type systems of traditional imperative and object
oriented language such as C, C++ and Java, such as:
@ -133,33 +135,33 @@ in their definition and can be combined.
Moreover ML languages are functional, meaning that functions are
treated as first class citizens and variables are immutable,
although mutable statements and imperative constructs are permitted.
In addition to that OCaml features an object system, that provides
In addition to that features an object system, that provides
inheritance, subtyping and dynamic binding, and modules, that
provide a way to encapsulate definitions. Modules are checked
statically and can be reificated through functors.
\subsection{Lambda form compilation}
\label{sec:org2d10d35}
\label{sec:org6065c14}
\begin{comment}
https://dev.realworldocaml.org/compiler-backend.html
https://dev.realworld.org/compiler-backend.html
\end{comment}
OCaml provides compilation in form of a byecode executable with an
provides compilation in form of a byecode executable with an
optionally embeddable interpreter and a native executable that could
be statically linked to provide a single file executable.
After the OCaml typechecker has proven that the program is type safe,
the OCaml compiler lower the code to \emph{Lambda}, an s-expression based
After the typechecker has proven that the program is type safe,
the compiler lower the code to \emph{Lambda}, an s-expression based
language that assumes that its input has already been proved safe.
On the \emph{Lambda} representation of the source program, the compiler
performes a series of optimization passes before translating the
lambda form to assembly code.
\begin{enumerate}
\item OCaml datatypes
\label{sec:orgd605d09}
\item datatypes
\label{sec:org7b158eb}
Most native data types in OCaml, such as integers, tuples, lists,
Most native data types in , such as integers, tuples, lists,
records, can be seen as instances of the following definition
\begin{verbatim}
@ -171,8 +173,8 @@ Every constructor has an arity. Nil, a constructor of arity 0, is
called a constant constructor.
\item Lambda form types
\label{sec:org1ee20a6}
A lambda form target file produced by the ocaml compiler consists of a
\label{sec:org737fa2f}
A lambda form target file produced by the compiler consists of a
single s-expression. Every s-expression consist of \emph{(}, a sequence of
elements separated by a whitespace and a closing \emph{)}.
Elements of s-expressions are:
@ -203,7 +205,7 @@ The are varios numeric operations defined:
\end{itemize}
\item Functions
\label{sec:org914d5eb}
\label{sec:org369db83}
Functions are defined using the following syntax, and close over all
bindings in scope: (lambda (arg1 arg2 arg3) BODY)
@ -211,19 +213,19 @@ and are applied using the following syntax: (apply FUNC ARG ARG ARG)
Evaluation is eager.
\item Bindings
\label{sec:org055206b}
\label{sec:org120bc74}
The atom \emph{let} introduces a sequence of bindings:
(let BINDING BINDING BINDING \ldots{} BODY)
\item Other atoms
\label{sec:org0f92182}
\label{sec:org58bd28f}
TODO: if, switch, stringswitch\ldots{}
TODO: magari esempi
\end{enumerate}
\subsection{Pattern matching}
\label{sec:org9876fb9}
\label{sec:org5d3b2f5}
Pattern matching is a widely adopted mechanism to interact with ADT.
C family languages provide branching on predicates through the use of
@ -231,7 +233,7 @@ if statements and switch statements.
Pattern matching on the other hands express predicates through
syntactic templates that also allow to bind on data structures of
arbitrary shapes. One common example of pattern matching is the use of regular
expressions on strings. OCaml provides pattern matching on ADT and
expressions on strings. provides pattern matching on ADT and
primitive data types.
The result of a pattern matching operation is always one of:
\begin{itemize}
@ -251,7 +253,7 @@ match color with
| _ -> print "white or black"
\end{verbatim}
OCaml provides tokens to express data destructoring.
provides tokens to express data destructoring.
For example we can examine the content of a list with patten matching
\begin{verbatim}
@ -293,11 +295,114 @@ begin match token_list with
\end{verbatim}
Moreover, the OCaml pattern matching compiler emits a warning when a
Moreover, the pattern matching compiler emits a warning when a
pattern is not exhaustive or some patterns are shadowed by precedent ones.
In general pattern matching on primitive and algebraic data types takes the
following form.
\subsection{Symbolic execution}
\label{sec:orge2e0205}
\subsection{Translation validation}
\label{sec:orgbafe7bc}
Translators, such as translators and code generators, are huge pieces of
software usually consisting of multiple subsystem and
constructing an actual specification of a translator implementation for
formal validation is a very long task. Moreover, different
translators implement different algorithms, so the correctness proof of
a translator cannot be generalized and reused to prove another translator.
Translation validation is an alternative to the verification of
existing translators that consists of taking the source and the target
(compiled) program and proving \emph{a posteriori} their semantic equivalence.
\begin{itemize}
\item[{$\square$}] Techniques for translation validation
\item[{$\square$}] What does semantically equivalent mean
\item[{$\square$}] What happens when there is no semantic equivalence
\item[{$\square$}] Translation validation through symbolic execution
\end{itemize}
\subsection{Translation validation of the Pattern Matching Compiler}
\label{sec:org24ee133}
\begin{enumerate}
\item Source program
\label{sec:org8c21912}
The algorithm takes as its input a source program and translates it
into an algebraic data structure called \emph{constraint\_tree}.
\begin{verbatim}
type constraint_tree =
| Unreachable
| Failure
| Leaf of source_expr
| Guard of source_blackbox * constraint_tree * constraint_tree
| Node of accessor * (constructor * constraint_tree) list * constraint_tree
\end{verbatim}
Unreachable, Leaf of source\_expr and Failure are the terminals of the three.
We distinguish
\begin{itemize}
\item Unreachable: statically it is known that no value can go there
\item Failure: a value matching this part results in an error
\item Leaf: a value matching this part results into the evaluation of a
source blackbox of code
\end{itemize}
The algorithm doesn't support type-declaration-based analysis
to know the list of constructors at a given type.
Let's consider some trivial examples:
\begin{verbatim}
function true -> 1
\end{verbatim}
[ ] Converti a disegni
Is translated to
\begin{center}
\begin{tabular}{l}
Node ([(true, Leaf 1)], Failure)\\
\end{tabular}
\end{center}
while
\begin{verbatim}
function
true -> 1
| false -> 2
\end{verbatim}
will give
\begin{center}
\begin{tabular}{l}
Node ([(true, Leaf 1); (false, Leaf 2)])\\
\end{tabular}
\end{center}
It is possible to produce Unreachable examples by using
refutation clauses (a "dot" in the right-hand-side)
\begin{verbatim}
function
true -> 1
| false -> 2
| _ -> .
\end{verbatim}
that gets translated into
Node ([(true, Leaf 1); (false, Leaf 2)], Unreachable)
We trust this annotation, which is reasonable as the type-checker
verifies that it indeed holds.
Guard nodes of the tree are emitted whenever a guard is found. Guards
node contains a blackbox of code that is never evaluated and two
branches, one that is taken in case the guard evaluates to true and
the other one that contains the path taken when the guard evaluates to
true.
[ ] Finisci con Node
[ ] Spiega del fallback
[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le lambda
[ ] Specifica che stesso algoritmo usato per compilare a lambda, piu` optimizations
The source code of a pattern matching function in has the
following form:
\begin{center}
\begin{tabular}{l}
@ -310,32 +415,61 @@ match variable with\\
\end{tabular}
\end{center}
It can be described more formally through a BNF grammar.
and can include any expression that is legal for the compiler,
such as "when" conditions and assignments. Patterns could or could not
be exhaustive.
Pattern matching code could also be written using the more compact form:
\begin{center}
\begin{tabular}{l}
function\\
\(\vert{}\) pattern₁ -> expr₁\\
\(\vert{}\) pattern₂ when guard -> expr₂\\
\(\vert{}\) pattern₃ as var -> expr₃\\
\\
\(\vert{}\) pₙ -> exprₙ\\
\end{tabular}
\end{center}
This BNF grammar describes formally the grammar of the source program:
\begin{verbatim}
start ::= "match" id "with" patterns | "function" patterns
patterns ::= (pattern0|pattern1) pattern1+
;; pattern0 and pattern1 are needed to distinguish the first case in which
;; we can avoid writing the optional vertical line
pattern0 ::= clause
pattern1 ::= "|" clause
clause ::= lexpr "->" rexpr
pattern ::= value-name
| _ ;; wildcard pattern
| constant ;; matches a constant value
| pattern as value-name ;; binds to value-name
| ( pattern ) ;; parenthesized pattern
| pattern | pattern ;; or-pattern
| constr pattern ;; variant pattern
| [ pattern { ; pattern } [ ; ] ] ;; list patterns
| pattern :: pattern ;; lists patterns using cons operator (::)
| [| pattern { ; pattern } [ ; ] |] ;; array pattern
| char-literal .. char-literal ;; match on a range of characters
| { field [: typexpr] [= pattern] { ; field [: typexpr] [= pattern] } \
[; _ ] [ ; ] } ;; patterns that match on records
lexpr ::= rule (ε|condition)
rexpr ::= _code ;; arbitrary code
rule ::= wildcard|variable|constructor_pattern|or_pattern ;;
;; rules
wildcard ::= "_"
variable ::= identifier
constructor_pattern ::= constructor (rule|ε) (assignment|ε)
constructor ::= int|float|char|string|bool
|unit|record|exn|objects|ref
|list|tuple|array
|variant|parameterized_variant ;; data types
or_pattern ::= wildcard|variable|constructor_pattern ("|" wildcard|variable|constructor_pattern)+
condition ::= "when" bexpr
assignment ::= "as" id
bexpr ::= _code ;; arbitrary code
\end{verbatim}
\begin{comment}
\begin{enumerate}
\item 1.2.1 Pattern matching compilation to lambda code
\label{sec:org4d27bd9}
Check that it is still this
\end{comment}
During compilation, patterns are in the form
Patterns are of the form
\begin{center}
\begin{tabular}{ll}
pattern & type of pattern\\
@ -347,10 +481,10 @@ c(p₁,p₂,\ldots{},pₙ) & constructor pattern\\
\end{tabular}
\end{center}
Expressions are compiled into lambda code and are referred as lambda
code actions.
During compilation by the translators expressions are compiled into
lambda code and are referred as lambda code actions lᵢ.
The entire pattern matching code can be represented as a clause matrix
The entire pattern matching code is represented as a clause matrix
that associates rows of patterns (p\(_{\text{i,1}}\), p\(_{\text{i,2}}\), \ldots{}, p\(_{\text{i,n}}\)) to
lambda code action lⁱ
\begin{equation*}
@ -399,7 +533,7 @@ are the same
\begin{enumerate}
\item Initial state of the compilation
\label{sec:orge9c6bc4}
\label{sec:org9a7b624}
Given a source of the following form:
@ -528,8 +662,6 @@ apply, and P₂ → L₂ containing the remaining rows. The algorithm is
applied to both matrices.
\end{enumerate}
\end{comment}
\begin{comment}
#+BEGIN_COMMENT
CITE paper?
@ -537,26 +669,4 @@ CITE paper?
\end{comment}
\end{enumerate}
\end{enumerate}
\subsection{Symbolic execution}
\label{sec:orgdb60b84}
\subsection{Translation validation}
\label{sec:org096d047}
Translators, such as translators and code generators, are huge pieces of
software usually consisting of multiple subsystem and
constructing an actual specification of a translator implementation for
formal validation is a very long task. Moreover, different
translators implement different algorithms, so the correctness proof of
a translator cannot be generalized and reused to prove another translator.
Translation validation is an alternative to the verification of
existing translators that consists of taking the source and the target
(compiled) program and proving \emph{a posteriori} their semantic equivalence.
\begin{itemize}
\item[{$\square$}] Techniques for translation validation
\item[{$\square$}] What does semantically equivalent mean
\item[{$\square$}] What happens when there is no semantic equivalence
\item[{$\square$}] Translation validation through symbolic execution
\end{itemize}
\end{document}