inizio source

2020-03-03 17:18:40 +01:00 · 2020-03-03 17:18:40 +01:00 · e4fc6b92aa
commit e4fc6b92aa
parent 2cd2b73451
4 changed files with 346 additions and 149 deletions
--- a/tesi/conv.py
+++ b/tesi/conv.py
@ -3,7 +3,7 @@ import re
 from sys import argv

 allsymbols = json.load(open('./unicode-latex.json'))
-mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀', '∈', '₀', '₂', '₁', '₃', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮']
+mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀', '∈', 'ε','₀', '₂', '₁', '₃', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮']

 symbols = {s: allsymbols[s] for s in mysymbols}
 mathsymbols = {s: '$'+allsymbols[s]+'$' for s in symbols}
--- a/tesi/tesi.pdf
+++ b/tesi/tesi.pdf
--- a/tesi/tesi_unicode.org
+++ b/tesi/tesi_unicode.org
@ -2,7 +2,7 @@
 * TODO Scaletta [1/5]
  - [X] Abstract
  - [-] Background [40%]
-    - [X] Ocaml
+    - [X] 
    - [ ] Lambda code [0%]
      - [ ] Compiler optimizations
      - [ ] other instructions
@ -23,7 +23,7 @@
  
 \end{comment}

-#+TITLE:   Translation Verification of the OCaml pattern matching compiler
+#+TITLE:   Translation Verification of the  pattern matching compiler
 #+AUTHOR:   Francesco Mecca
 #+EMAIL:    me@francescomecca.eu
 #+DATE:
@ -44,7 +44,7 @@
 #+STARTUP: showall
 \begin{abstract}

-This dissertation presents an algorithm for the translation validation of the OCaml
+This dissertation presents an algorithm for the translation validation of the 
 pattern matching compiler. Given the source representation of the target program and the
 target program compiled in untyped lambda form, the algoritmhm is capable of modelling
 the source program in terms of symbolic constraints on it's branches and apply symbolic
@ -53,17 +53,17 @@ produced a valid result.
 In this context a valid result means that for every input in the domain of the source
 program the untyped lambda translation produces the same output as the source program.
 The input of the program is modelled in terms of symbolic constraints closely related to
-the runtime representation of OCaml objects and the output consists of OCaml code
+the runtime representation of  objects and the output consists of OCaml code
 blackboxes that are not evaluated in the context of the verification.

 \end{abstract}

 * Background

-** OCaml
-Objective Caml (OCaml) is a dialect of the ML (Meta-Language) family of programming
+** 
+Objective Caml () is a dialect of the ML (Meta-Language) family of programming
 languages.
-OCaml shares many features with other dialects of ML, such as SML and Caml Light,
+ shares many features with other dialects of ML, such as SML and Caml Light,
 The main features of ML languages are the use of the Hindley-Milner type system that
 provides many advantages with respect to static type systems of traditional imperative and object
 oriented language such as C, C++ and Java, such as:
@ -99,33 +99,33 @@ oriented language such as C, C++ and Java, such as:
 Moreover ML languages are functional, meaning that functions are
 treated as first class citizens and variables are immutable,
 although mutable statements and imperative constructs are permitted.
-In addition to that OCaml features an object system, that provides
+In addition to that  features an object system, that provides
 inheritance, subtyping and dynamic binding, and modules, that
 provide a way to encapsulate definitions. Modules are checked
 statically and can be reificated through functors.

 ** Lambda form compilation
 \begin{comment}
-https://dev.realworldocaml.org/compiler-backend.html
+https://dev.realworld.org/compiler-backend.html
 \end{comment}

-OCaml provides compilation in form of a byecode executable with an
+ provides compilation in form of a byecode executable with an
 optionally embeddable interpreter and a native executable that could
 be statically linked to provide a single file executable.

-After the OCaml typechecker has proven that the program is type safe,
-the OCaml compiler lower the code to /Lambda/, an s-expression based
+After the  typechecker has proven that the program is type safe,
+the  compiler lower the code to /Lambda/, an s-expression based
 language that assumes that its input has already been proved safe.
 On the /Lambda/ representation of the source program, the compiler
 performes a series of optimization passes before translating the
 lambda form to assembly code.

-*** OCaml datatypes
+***  datatypes

-Most native data types in OCaml, such as integers, tuples, lists,
+Most native data types in , such as integers, tuples, lists,
 records, can be seen as instances of the following definition

-#+BEGIN_SRC ocaml
+#+BEGIN_SRC 
 type t = Nil | One of int | Cons of int * t 
 #+END_SRC
 that is a type /t/ with three constructors that define its complete
@ -134,7 +134,7 @@ Every constructor has an arity. Nil, a constructor of arity 0, is
 called a constant constructor. 

 *** Lambda form types
-A lambda form target file produced by the ocaml compiler consists of a
+A lambda form target file produced by the  compiler consists of a
 single s-expression. Every s-expression consist of /(/, a sequence of
 elements separated by a whitespace and a closing /)/.
 Elements of s-expressions are:
@ -182,7 +182,7 @@ if statements and switch statements.
 Pattern matching on the other hands express predicates through
 syntactic templates that also allow to bind on data structures of
 arbitrary shapes. One common example of pattern matching is the use of regular
-expressions on strings. OCaml provides pattern matching on ADT and
+expressions on strings.  provides pattern matching on ADT and
 primitive data types.
 The result of a pattern matching operation is always one of:
 - this value does not match this pattern”
@ -190,7 +190,7 @@ The result of a pattern matching operation is always one of:
  names to values and the jump to the expression pointed at the
  pattern.

-#+BEGIN_SRC ocaml
+#+BEGIN_SRC 
 type color = | Red | Blue | Green | Black | White

 match color with
@ -200,10 +200,10 @@ match color with
 | _ -> print "white or black"
 #+END_SRC

-OCaml provides tokens to express data destructoring.
+ provides tokens to express data destructoring.
 For example we can examine the content of a list with patten matching

-#+BEGIN_SRC ocaml
+#+BEGIN_SRC 

 begin match list with
 | [ ] -> print "empty list"
@ -216,7 +216,7 @@ Parenthesized patterns, such as the third one in the previous example,
 matches the same value as the pattern without parenthesis.

 The same could be done with tuples
-#+BEGIN_SRC ocaml
+#+BEGIN_SRC 

 begin match tuple with
 | (Some _, Some _) -> print "Pair of optional types"
@ -231,7 +231,7 @@ pattern₂ if it matches pattern₁ or pattern₂.
 Pattern clauses can make the use of /guards/ to test predicates and
 variables can captured (binded in scope).

-#+BEGIN_SRC ocaml
+#+BEGIN_SRC 

 begin match token_list with
 | "switch"::var::"{"::rest -> ...
@ -242,11 +242,97 @@ begin match token_list with

 #+END_SRC

-Moreover, the OCaml pattern matching compiler emits a warning when a
+Moreover, the  pattern matching compiler emits a warning when a
 pattern is not exhaustive or some patterns are shadowed by precedent ones.

-In general pattern matching on primitive and algebraic data types takes the
-following form.
+** Symbolic execution
+
+** Translation validation
+Translators, such as translators and code generators, are huge pieces of
+software usually consisting of multiple subsystem and
+constructing an actual specification of a translator implementation for
+formal validation is a very long task. Moreover, different
+translators implement different algorithms, so the correctness proof of
+a translator cannot be generalized and reused to prove another translator.
+Translation validation is an alternative to the verification of
+existing translators that consists of taking the source and the target
+(compiled) program and proving /a posteriori/ their semantic equivalence.
+
+- [ ] Techniques for translation validation
+- [ ] What does semantically equivalent mean
+- [ ] What happens when there is no semantic equivalence
+- [ ] Translation validation through symbolic execution
+
+** Translation validation of the Pattern Matching Compiler
+
+*** Source program
+The algorithm takes as its input a source program and translates it
+into an algebraic data structure called /constraint_tree/.
+
+#+BEGIN_SRC 
+type constraint_tree =
+  | Unreachable
+  | Failure
+  | Leaf of source_expr
+  | Guard of source_blackbox * constraint_tree * constraint_tree
+  | Node of accessor * (constructor * constraint_tree) list * constraint_tree
+#+END_SRC
+
+Unreachable, Leaf of source_expr and Failure are the terminals of the three.
+We distinguish
+- Unreachable: statically it is known that no value can go there
+- Failure: a value matching this part results in an error
+- Leaf: a value matching this part results into the evaluation of a
+  source blackbox of  code
+
+The algorithm doesn't support type-declaration-based analysis 
+to know the list of constructors at a given type.
+Let's consider some trivial examples:
+
+#+BEGIN_SRC 
+function true -> 1
+#+END_SRC
+
+[ ] Converti a disegni 
+
+Is translated to
+|Node ([(true, Leaf 1)], Failure)
+while
+#+BEGIN_SRC 
+function 
+true -> 1 
+| false -> 2
+#+END_SRC
+will give
+|Node ([(true, Leaf 1); (false, Leaf 2)])
+
+It is possible to produce Unreachable examples by using
+ refutation clauses (a "dot" in the right-hand-side)
+#+BEGIN_SRC 
+function 
+true -> 1 
+| false -> 2 
+| _ -> .
+#+END_SRC
+that gets translated into
+Node ([(true, Leaf 1); (false, Leaf 2)], Unreachable)
+
+We trust this annotation, which is reasonable as the  type-checker
+verifies that it indeed holds.
+
+Guard nodes of the tree are emitted whenever a guard is found. Guards
+node contains a blackbox of  code that is never evaluated and two
+branches, one that is taken in case the guard evaluates to true and
+the other one that contains the path taken when the guard evaluates to
+true.
+
+[ ] Finisci con Node
+[ ] Spiega del fallback
+[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le lambda
+[ ] Specifica che stesso algoritmo usato per compilare a lambda, piu` optimizations
+
+The source code of a pattern matching function in  has the
+following form:

 |match variable with
 |\vert pattern₁ -> expr₁
@ -255,30 +341,57 @@ following form.
 |⋮
 |\vert pₙ -> exprₙ

-It can be described more formally through a BNF grammar.
+and can include any expression that is legal for the  compiler,
+such as "when" conditions and assignments. Patterns could or could not
+be exhaustive.
+
+Pattern matching code could also be written using the more compact form:
+|function
+|\vert pattern₁ -> expr₁
+|\vert pattern₂ when guard -> expr₂
+|\vert pattern₃ as var -> expr₃ 
+|⋮
+|\vert pₙ -> exprₙ
+
+
+This BNF grammar describes formally the grammar of the source program:

 #+BEGIN_SRC bnf
+start ::= "match" id "with" patterns | "function" patterns
+patterns ::= (pattern0|pattern1) pattern1+
+;; pattern0 and pattern1 are needed to distinguish the first case in which
+;; we can avoid writing the optional vertical line
+pattern0 ::= clause 
+pattern1 ::= "|" clause
+clause ::= lexpr "->" rexpr

-pattern ::= value-name  
-| _   ;; wildcard pattern
-| constant  ;; matches a constant value
-| pattern as  value-name  ;; binds to value-name
-| ( pattern ) ;; parenthesized pattern
-| pattern |  pattern ;; or-pattern 
-| constr  pattern ;; variant pattern
-| [ pattern  { ; pattern }  [ ; ] ] ;; list patterns 
-| pattern ::  pattern   ;; lists patterns using cons operator (::)
-| [| pattern  { ; pattern }  [ ; ] |] ;; array pattern 
-| char-literal ..  char-literal   ;; match on a range of characters
-| { field  [: typexpr]  [= pattern] { ; field  [: typexpr]  [= pattern] } \
-  [; _ ] [ ; ] } ;; patterns that match on records
+lexpr ::= rule (ε|condition)
+rexpr ::= _code ;; arbitrary code

+rule ::= wildcard|variable|constructor_pattern|or_pattern ;;
+
+;; rules
+wildcard ::= "_"
+variable ::= identifier
+constructor_pattern ::= constructor (rule|ε) (assignment|ε)
+
+constructor ::= int|float|char|string|bool
+                |unit|record|exn|objects|ref
+                |list|tuple|array
+                |variant|parameterized_variant ;;  data types
+
+or_pattern ::=  wildcard|variable|constructor_pattern ("|" wildcard|variable|constructor_pattern)+
+
+condition ::= "when" bexpr 
+assignment ::= "as" id
+bexpr ::= _code ;; arbitrary code
 #+END_SRC

 \begin{comment}
-*** 1.2.1 Pattern matching compilation to lambda code
+Check that it is still this
+\end{comment}

-During compilation, patterns are in the form
+Patterns are of the form
 | pattern         | type of pattern    |
 |-----------------+---------------------|
 | _               | wildcard            |
@ -286,10 +399,10 @@ During compilation, patterns are in the form
 | c(p₁,p₂,...,pₙ) | constructor pattern |
 | (p₁\vert p₂)    | or-pattern          |

-Expressions are compiled into lambda code and are referred as lambda
-code actions.
+During compilation by the translators expressions are compiled into
+lambda code and are referred as lambda code actions lᵢ.

-The entire pattern matching code can be represented as a clause matrix
+The entire pattern matching code is represented as a clause matrix
 that associates rows of patterns (p_{i,1}, p_{i,2}, ..., p_{i,n}) to
 lambda code action lⁱ
 \begin{equation*}
@ -449,34 +562,8 @@ following four rules:
   apply, and P₂ → L₂ containing the remaining rows. The algorithm is
   applied to both matrices.

-\end{comment}
-
 \begin{comment}
 #+BEGIN_COMMENT
 CITE paper?
 #+END_COMMENT’ 
 \end{comment}
-
-** Symbolic execution
-
-** Translation validation
-Translators, such as translators and code generators, are huge pieces of
-software usually consisting of multiple subsystem and
-constructing an actual specification of a translator implementation for
-formal validation is a very long task. Moreover, different
-translators implement different algorithms, so the correctness proof of
-a translator cannot be generalized and reused to prove another translator.
-Translation validation is an alternative to the verification of
-existing translators that consists of taking the source and the target
-(compiled) program and proving /a posteriori/ their semantic equivalence.
-
- [ ] Techniques for translation validation
- [ ] What does semantically equivalent mean
- [ ] What happens when there is no semantic equivalence
- [ ] Translation validation through symbolic execution
-
-** Translation validation of the Pattern Matching Compiler
-    - [ ] Source translation
-      - [ ] Formal Grammar
-      - [ ] Compilation of source patterns
-      - [ ] Rest?
--- a/tesi/tesi_unicode.tex
+++ b/tesi/tesi_unicode.tex
@ -1,4 +1,4 @@
-% Created 2020-03-02 Mon 14:31
+% Created 2020-03-03 Tue 17:18
 % Intended LaTeX compiler: pdflatex
 \documentclass[11pt]{article}
 \usepackage[utf8]{inputenc}
@ -24,10 +24,10 @@
 \usepackage{color}
 \author{Francesco Mecca}
 \date{}
-\title{Translation Verification of the OCaml pattern matching compiler}
+\title{Translation Verification of the  pattern matching compiler}
 \hypersetup{
 pdfauthor={Francesco Mecca},
- pdftitle={Translation Verification of the OCaml pattern matching compiler},
+ pdftitle={Translation Verification of the  pattern matching compiler},
 pdfkeywords={},
 pdfsubject={},
 pdfcreator={Emacs 26.3 (Org mode 9.1.9)}, 
@ -37,12 +37,13 @@
 \maketitle
 \begin{comment}
 \section{{\bfseries\sffamily TODO} Scaletta [1/5]}
-\label{sec:org62901c2}
+\label{sec:org7578cff}
 \begin{itemize}
 \item[{$\boxtimes$}] Abstract
 \item[{$\boxminus$}] Background [40\%]
 \begin{itemize}
-\item[{$\boxtimes$}] Ocaml
+\item[{$\boxtimes$}] 
+
 \item[{$\square$}] Lambda code [0\%]
 \begin{itemize}
 \item[{$\square$}] Compiler optimizations
@ -58,6 +59,7 @@
 \begin{itemize}
 \item[{$\square$}] Formal Grammar
 \item[{$\square$}] Compilation of source patterns
+\item[{$\square$}] Rest?
 \end{itemize}
 \item[{$\square$}] Target translation
 \begin{itemize}
@ -74,7 +76,7 @@

 \begin{abstract}

-This dissertation presents an algorithm for the translation validation of the OCaml
+This dissertation presents an algorithm for the translation validation of the 
 pattern matching compiler. Given the source representation of the target program and the
 target program compiled in untyped lambda form, the algoritmhm is capable of modelling
 the source program in terms of symbolic constraints on it's branches and apply symbolic
@ -83,19 +85,19 @@ produced a valid result.
 In this context a valid result means that for every input in the domain of the source
 program the untyped lambda translation produces the same output as the source program.
 The input of the program is modelled in terms of symbolic constraints closely related to
-the runtime representation of OCaml objects and the output consists of OCaml code
+the runtime representation of  objects and the output consists of OCaml code
 blackboxes that are not evaluated in the context of the verification.

 \end{abstract}

 \section{Background}
-\label{sec:orgbf4de70}
+\label{sec:org5b6accf}

-\subsection{OCaml}
-\label{sec:orga9a97c9}
-Objective Caml (OCaml) is a dialect of the ML (Meta-Language) family of programming
+\subsection{}
+\label{sec:org3c9e604}
+Objective Caml () is a dialect of the ML (Meta-Language) family of programming
 languages.
-OCaml shares many features with other dialects of ML, such as SML and Caml Light,
+ shares many features with other dialects of ML, such as SML and Caml Light,
 The main features of ML languages are the use of the Hindley-Milner type system that
 provides many advantages with respect to static type systems of traditional imperative and object
 oriented language such as C, C++ and Java, such as:
@ -133,33 +135,33 @@ in their definition and can be combined.
 Moreover ML languages are functional, meaning that functions are
 treated as first class citizens and variables are immutable,
 although mutable statements and imperative constructs are permitted.
-In addition to that OCaml features an object system, that provides
+In addition to that  features an object system, that provides
 inheritance, subtyping and dynamic binding, and modules, that
 provide a way to encapsulate definitions. Modules are checked
 statically and can be reificated through functors.

 \subsection{Lambda form compilation}
-\label{sec:org2d10d35}
+\label{sec:org6065c14}
 \begin{comment}
-https://dev.realworldocaml.org/compiler-backend.html
+https://dev.realworld.org/compiler-backend.html
 \end{comment}

-OCaml provides compilation in form of a byecode executable with an
+ provides compilation in form of a byecode executable with an
 optionally embeddable interpreter and a native executable that could
 be statically linked to provide a single file executable.

-After the OCaml typechecker has proven that the program is type safe,
-the OCaml compiler lower the code to \emph{Lambda}, an s-expression based
+After the  typechecker has proven that the program is type safe,
+the  compiler lower the code to \emph{Lambda}, an s-expression based
 language that assumes that its input has already been proved safe.
 On the \emph{Lambda} representation of the source program, the compiler
 performes a series of optimization passes before translating the
 lambda form to assembly code.

 \begin{enumerate}
-\item OCaml datatypes
-\label{sec:orgd605d09}
+\item datatypes
+\label{sec:org7b158eb}

-Most native data types in OCaml, such as integers, tuples, lists,
+Most native data types in , such as integers, tuples, lists,
 records, can be seen as instances of the following definition

 \begin{verbatim}
@ -171,8 +173,8 @@ Every constructor has an arity. Nil, a constructor of arity 0, is
 called a constant constructor. 

 \item Lambda form types
-\label{sec:org1ee20a6}
-A lambda form target file produced by the ocaml compiler consists of a
+\label{sec:org737fa2f}
+A lambda form target file produced by the  compiler consists of a
 single s-expression. Every s-expression consist of \emph{(}, a sequence of
 elements separated by a whitespace and a closing \emph{)}.
 Elements of s-expressions are:
@ -203,7 +205,7 @@ The are varios numeric operations defined:
 \end{itemize}

 \item Functions
-\label{sec:org914d5eb}
+\label{sec:org369db83}

 Functions are defined using the following syntax, and close over all
 bindings in scope: (lambda (arg1 arg2 arg3) BODY)
@ -211,19 +213,19 @@ and are applied using the following syntax: (apply FUNC ARG ARG ARG)
 Evaluation is eager.

 \item Bindings
-\label{sec:org055206b}
+\label{sec:org120bc74}
 The atom \emph{let} introduces a sequence of bindings:
 (let BINDING BINDING BINDING \ldots{} BODY)

 \item Other atoms
-\label{sec:org0f92182}
+\label{sec:org58bd28f}
 TODO: if, switch, stringswitch\ldots{}
 TODO: magari esempi
 \end{enumerate}


 \subsection{Pattern matching}
-\label{sec:org9876fb9}
+\label{sec:org5d3b2f5}

 Pattern matching is a widely adopted mechanism to interact with ADT.
 C family languages provide branching on predicates through the use of
@ -231,7 +233,7 @@ if statements and switch statements.
 Pattern matching on the other hands express predicates through
 syntactic templates that also allow to bind on data structures of
 arbitrary shapes. One common example of pattern matching is the use of regular
-expressions on strings. OCaml provides pattern matching on ADT and
+expressions on strings.  provides pattern matching on ADT and
 primitive data types.
 The result of a pattern matching operation is always one of:
 \begin{itemize}
@ -251,7 +253,7 @@ match color with
 | _ -> print "white or black"
 \end{verbatim}

-OCaml provides tokens to express data destructoring.
+ provides tokens to express data destructoring.
 For example we can examine the content of a list with patten matching

 \begin{verbatim}
@ -293,11 +295,114 @@ begin match token_list with

 \end{verbatim}

-Moreover, the OCaml pattern matching compiler emits a warning when a
+Moreover, the  pattern matching compiler emits a warning when a
 pattern is not exhaustive or some patterns are shadowed by precedent ones.

-In general pattern matching on primitive and algebraic data types takes the
-following form.
+\subsection{Symbolic execution}
+\label{sec:orge2e0205}
+
+\subsection{Translation validation}
+\label{sec:orgbafe7bc}
+Translators, such as translators and code generators, are huge pieces of
+software usually consisting of multiple subsystem and
+constructing an actual specification of a translator implementation for
+formal validation is a very long task. Moreover, different
+translators implement different algorithms, so the correctness proof of
+a translator cannot be generalized and reused to prove another translator.
+Translation validation is an alternative to the verification of
+existing translators that consists of taking the source and the target
+(compiled) program and proving \emph{a posteriori} their semantic equivalence.
+
+\begin{itemize}
+\item[{$\square$}] Techniques for translation validation
+\item[{$\square$}] What does semantically equivalent mean
+\item[{$\square$}] What happens when there is no semantic equivalence
+\item[{$\square$}] Translation validation through symbolic execution
+\end{itemize}
+
+\subsection{Translation validation of the Pattern Matching Compiler}
+\label{sec:org24ee133}
+
+\begin{enumerate}
+\item Source program
+\label{sec:org8c21912}
+The algorithm takes as its input a source program and translates it
+into an algebraic data structure called \emph{constraint\_tree}.
+
+\begin{verbatim}
+type constraint_tree =
+  | Unreachable
+  | Failure
+  | Leaf of source_expr
+  | Guard of source_blackbox * constraint_tree * constraint_tree
+  | Node of accessor * (constructor * constraint_tree) list * constraint_tree
+\end{verbatim}
+
+Unreachable, Leaf of source\_expr and Failure are the terminals of the three.
+We distinguish
+\begin{itemize}
+\item Unreachable: statically it is known that no value can go there
+\item Failure: a value matching this part results in an error
+\item Leaf: a value matching this part results into the evaluation of a
+source blackbox of  code
+\end{itemize}
+
+The algorithm doesn't support type-declaration-based analysis 
+to know the list of constructors at a given type.
+Let's consider some trivial examples:
+
+\begin{verbatim}
+function true -> 1
+\end{verbatim}
+
+[ ] Converti a disegni 
+
+Is translated to
+\begin{center}
+\begin{tabular}{l}
+Node ([(true, Leaf 1)], Failure)\\
+\end{tabular}
+\end{center}
+while
+\begin{verbatim}
+function 
+true -> 1 
+| false -> 2
+\end{verbatim}
+will give
+\begin{center}
+\begin{tabular}{l}
+Node ([(true, Leaf 1); (false, Leaf 2)])\\
+\end{tabular}
+\end{center}
+
+It is possible to produce Unreachable examples by using
+ refutation clauses (a "dot" in the right-hand-side)
+\begin{verbatim}
+function 
+true -> 1 
+| false -> 2 
+| _ -> .
+\end{verbatim}
+that gets translated into
+Node ([(true, Leaf 1); (false, Leaf 2)], Unreachable)
+
+We trust this annotation, which is reasonable as the  type-checker
+verifies that it indeed holds.
+
+Guard nodes of the tree are emitted whenever a guard is found. Guards
+node contains a blackbox of  code that is never evaluated and two
+branches, one that is taken in case the guard evaluates to true and
+the other one that contains the path taken when the guard evaluates to
+true.
+
+[ ] Finisci con Node
+[ ] Spiega del fallback
+[ ] rivedi compilazione per tenere in considerazione il tuo albero invece che le lambda
+[ ] Specifica che stesso algoritmo usato per compilare a lambda, piu` optimizations
+
+The source code of a pattern matching function in  has the
+following form:

 \begin{center}
 \begin{tabular}{l}
@ -310,32 +415,61 @@ match variable with\\
 \end{tabular}
 \end{center}

-It can be described more formally through a BNF grammar.
+and can include any expression that is legal for the  compiler,
+such as "when" conditions and assignments. Patterns could or could not
+be exhaustive.
+
+Pattern matching code could also be written using the more compact form:
+\begin{center}
+\begin{tabular}{l}
+function\\
+\(\vert{}\) pattern₁ -> expr₁\\
+\(\vert{}\) pattern₂ when guard -> expr₂\\
+\(\vert{}\) pattern₃ as var -> expr₃\\
+⋮\\
+\(\vert{}\) pₙ -> exprₙ\\
+\end{tabular}
+\end{center}
+
+
+This BNF grammar describes formally the grammar of the source program:

 \begin{verbatim}
+start ::= "match" id "with" patterns | "function" patterns
+patterns ::= (pattern0|pattern1) pattern1+
+;; pattern0 and pattern1 are needed to distinguish the first case in which
+;; we can avoid writing the optional vertical line
+pattern0 ::= clause 
+pattern1 ::= "|" clause
+clause ::= lexpr "->" rexpr

-pattern ::= value-name  
-| _   ;; wildcard pattern
-| constant  ;; matches a constant value
-| pattern as  value-name  ;; binds to value-name
-| ( pattern ) ;; parenthesized pattern
-| pattern |  pattern ;; or-pattern 
-| constr  pattern ;; variant pattern
-| [ pattern  { ; pattern }  [ ; ] ] ;; list patterns 
-| pattern ::  pattern   ;; lists patterns using cons operator (::)
-| [| pattern  { ; pattern }  [ ; ] |] ;; array pattern 
-| char-literal ..  char-literal   ;; match on a range of characters
-| { field  [: typexpr]  [= pattern] { ; field  [: typexpr]  [= pattern] } \
-  [; _ ] [ ; ] } ;; patterns that match on records
+lexpr ::= rule (ε|condition)
+rexpr ::= _code ;; arbitrary code

+rule ::= wildcard|variable|constructor_pattern|or_pattern ;;
+
+;; rules
+wildcard ::= "_"
+variable ::= identifier
+constructor_pattern ::= constructor (rule|ε) (assignment|ε)
+
+constructor ::= int|float|char|string|bool
+                |unit|record|exn|objects|ref
+                |list|tuple|array
+                |variant|parameterized_variant ;;  data types
+
+or_pattern ::=  wildcard|variable|constructor_pattern ("|" wildcard|variable|constructor_pattern)+
+
+condition ::= "when" bexpr 
+assignment ::= "as" id
+bexpr ::= _code ;; arbitrary code
 \end{verbatim}

 \begin{comment}
-\begin{enumerate}
-\item 1.2.1 Pattern matching compilation to lambda code
-\label{sec:org4d27bd9}
+Check that it is still this
+\end{comment}

-During compilation, patterns are in the form
+Patterns are of the form
 \begin{center}
 \begin{tabular}{ll}
 pattern & type of pattern\\
@ -347,10 +481,10 @@ c(p₁,p₂,\ldots{},pₙ) & constructor pattern\\
 \end{tabular}
 \end{center}

-Expressions are compiled into lambda code and are referred as lambda
-code actions.
+During compilation by the translators expressions are compiled into
+lambda code and are referred as lambda code actions lᵢ.

-The entire pattern matching code can be represented as a clause matrix
+The entire pattern matching code is represented as a clause matrix
 that associates rows of patterns (p\(_{\text{i,1}}\), p\(_{\text{i,2}}\), \ldots{}, p\(_{\text{i,n}}\)) to
 lambda code action lⁱ
 \begin{equation*}
@ -399,7 +533,7 @@ are the same

 \begin{enumerate}
 \item Initial state of the compilation
-\label{sec:orge9c6bc4}
+\label{sec:org9a7b624}

 Given a source of the following form:

@ -528,8 +662,6 @@ apply, and P₂ → L₂ containing the remaining rows. The algorithm is
 applied to both matrices.
 \end{enumerate}

-\end{comment}
-
 \begin{comment}
 #+BEGIN_COMMENT
 CITE paper?
@ -537,26 +669,4 @@ CITE paper?
 \end{comment}
 \end{enumerate}
 \end{enumerate}
-
-\subsection{Symbolic execution}
-\label{sec:orgdb60b84}
-
-\subsection{Translation validation}
-\label{sec:org096d047}
-Translators, such as translators and code generators, are huge pieces of
-software usually consisting of multiple subsystem and
-constructing an actual specification of a translator implementation for
-formal validation is a very long task. Moreover, different
-translators implement different algorithms, so the correctness proof of
-a translator cannot be generalized and reused to prove another translator.
-Translation validation is an alternative to the verification of
-existing translators that consists of taking the source and the target
-(compiled) program and proving \emph{a posteriori} their semantic equivalence.
-
-\begin{itemize}
-\item[{$\square$}] Techniques for translation validation
-\item[{$\square$}] What does semantically equivalent mean
-\item[{$\square$}] What happens when there is no semantic equivalence
-\item[{$\square$}] Translation validation through symbolic execution
-\end{itemize}
 \end{document}