accessors
After Width: | Height: | Size: 918 B |
After Width: | Height: | Size: 398 B |
After Width: | Height: | Size: 519 B |
After Width: | Height: | Size: 655 B |
After Width: | Height: | Size: 409 B |
After Width: | Height: | Size: 2.2 KiB |
After Width: | Height: | Size: 2.8 KiB |
After Width: | Height: | Size: 519 B |
BIN
tesi/tesi.pdf
|
@ -439,7 +439,6 @@ Our algorithm respects the following correctness statement:
|
|||
v_S \approx v_T \in S \;\land\; C_S(v_S) \neq C_T(v_T)
|
||||
\end{align*}
|
||||
* Background
|
||||
|
||||
** OCaml
|
||||
Objective Caml (OCaml) is a dialect of the ML (Meta-Language)
|
||||
family of programming that features with other dialects of ML, such
|
||||
|
@ -449,9 +448,10 @@ provides many advantages with respect to static type systems of traditional impe
|
|||
oriented language such as C, C++ and Java, such as:
|
||||
- Polymorphism: in certain scenarios a function can accept more than one
|
||||
type for the input parameters. For example a function that computes the length of a
|
||||
list doesn't need to inspect the type of the elements of the list and for this reason
|
||||
list doesn't need to inspect the type of the elements and for this reason
|
||||
a List.length function can accept lists of integers, lists of strings and in general
|
||||
lists of any type. Such languages offer polymorphic functions through subtyping at
|
||||
lists of any type. JVM languages offer polymorphic functions and
|
||||
classes through subtyping at
|
||||
runtime only, while other languages such as C++ offer polymorphism through compile
|
||||
time templates and function overloading.
|
||||
With the Hindley-Milner type system each well typed function can have more than one
|
||||
|
@ -459,8 +459,8 @@ oriented language such as C, C++ and Java, such as:
|
|||
For example the principal type of the List.length function is "For any /a/, function from
|
||||
list of /a/ to /int/" and /a/ is called the /type parameter/.
|
||||
- Strong typing: Languages such as C and C++ allow the programmer to operate on data
|
||||
without considering its type, mainly through pointers. Other languages such as C#
|
||||
and Go allow type erasure so at runtime the type of the data can't be queried.
|
||||
without considering its type, mainly through pointers[CIT]. Other languages such as Swift
|
||||
and Java performs type erasure[CIT] so at runtime the type of the data can't be queried.
|
||||
In the case of programming languages using an Hindley-Milner type system the
|
||||
programmer is not allowed to operate on data by ignoring or promoting its type.
|
||||
- Type Inference: the principal type of a well formed term can be inferred without any
|
||||
|
@ -479,10 +479,10 @@ oriented language such as C, C++ and Java, such as:
|
|||
Moreover ML languages are functional, meaning that functions are
|
||||
treated as first class citizens and variables are immutable,
|
||||
although mutable statements and imperative constructs are permitted.
|
||||
In addition to that features an object system, that provides
|
||||
In addition to that OCaml features an object system, that provides
|
||||
inheritance, subtyping and dynamic binding, and modules, that
|
||||
provide a way to encapsulate definitions. Modules are checked
|
||||
statically and can be reifycated through functors.
|
||||
statically and can be reifycated through functors[CIT].
|
||||
|
||||
** Compiling OCaml code
|
||||
|
||||
|
@ -496,17 +496,17 @@ is trasformed into an untyped syntax tree. Code with syntax errors is
|
|||
rejected at this stage.
|
||||
After that the AST is processed by the type checker that performs
|
||||
three steps at once:
|
||||
- type inference, using the classical /Algorithm W/
|
||||
- type inference, using the classical /Algorithm W/[CIT]
|
||||
- perform subtyping and gathers type information from the module system
|
||||
- ensures that the code obeys the rule of the OCaml type system
|
||||
At this stage, incorrectly typed code is rejected. In case of success,
|
||||
the untyped AST in trasformed into a /Typed Tree/.
|
||||
After the typechecker has proven that the program is type safe,
|
||||
the compiler lower the code to /Lambda/, an s-expression based
|
||||
language that assumes that its input has already been proved safe.
|
||||
language that assumes that its input has already been proved safe[CIT].
|
||||
After the Lambda pass, the Lambda code is either translated into
|
||||
bytecode or goes through a series of optimization steps performed by
|
||||
the /Flambda/ optimizer before being translated into assembly.
|
||||
the /Flambda/ optimizer[CIT] before being translated into assembly.
|
||||
\begin{comment}
|
||||
TODO: Talk about flambda passes?
|
||||
\end{comment}
|
||||
|
@ -549,6 +549,7 @@ set to 0.
|
|||
\begin{comment}
|
||||
https://dev.realworld.org/compiler-backend.html
|
||||
CITE: realworldocaml
|
||||
[CIT]
|
||||
\end{comment}
|
||||
A Lambda code target file is produced by the compiler and consists of a
|
||||
single s-expression. Every s-expression consist of /(/, a sequence of
|
||||
|
@ -577,7 +578,8 @@ let test = function
|
|||
| Fred -> "fred"
|
||||
#+END_SRC
|
||||
The Lambda output for this code can be obtained by running the
|
||||
compiler with the /-dlambda/ flag:
|
||||
compiler with the /-drawlambda/ flag or in a more compact form with
|
||||
the /-dlambda/ flag:
|
||||
#+BEGIN_SRC
|
||||
(setglobal Prova!
|
||||
(let
|
||||
|
@ -590,9 +592,9 @@ compiler with the /-dlambda/ flag:
|
|||
case int 3: "fred")))
|
||||
(makeblock 0 test/85)))
|
||||
#+END_SRC
|
||||
As outlined by the example, the /makeblock/ directive is responsible
|
||||
for allocating low level OCaml values and every constant constructor
|
||||
ot the algebraic type /t/ is stored in memory as an integer.
|
||||
As outlined by the example, the /makeblock/ directive allows to
|
||||
allocate low level OCaml values and every constant constructor
|
||||
of the algebraic type /t/ is stored in memory as an integer.
|
||||
The /setglobal/ directives declares a new binding in the global scope:
|
||||
Every concept of modules is lost at this stage of compilation.
|
||||
The pattern matching compiler uses a jump table to map every pattern
|
||||
|
@ -624,7 +626,7 @@ are mapped to cons cell that are accessed using the /tag/ directive.
|
|||
case tag 1: (if (!= (field 0 param/90) 0) "fred" "baz"))))
|
||||
(makeblock 0 test/89)))
|
||||
#+END_SRC
|
||||
In the Lambda language are several numeric types:
|
||||
In the Lambda language defines several numeric types:
|
||||
- integers: that us either 31 or 63 bit two's complement arithmetic
|
||||
depending on system word size, and also wrapping on overflow
|
||||
- 32 bit and 64 bit integers: that use 32-bit and 64-bit two's complement arithmetic
|
||||
|
@ -633,14 +635,12 @@ In the Lambda language are several numeric types:
|
|||
- floats: that use IEEE754 double-precision (64-bit) arithmetic with
|
||||
the addition of the literals /infinity/, /neg_infinity/ and /nan/.
|
||||
\\
|
||||
The are various numeric operations defined:
|
||||
|
||||
The are various numeric operations:
|
||||
- Arithmetic operations: +, -, *, /, % (modulo), neg (unary negation)
|
||||
- Bitwise operations: &, |, ^, <<, >> (zero-shifting), a>> (sign extending)
|
||||
- Numeric comparisons: <, >, <=, >=, ==
|
||||
|
||||
*** Functions
|
||||
|
||||
Functions are defined using the following syntax, and close over all
|
||||
bindings in scope: (lambda (arg1 arg2 arg3) BODY)
|
||||
and are applied using the following syntax: (apply FUNC ARG ARG ARG)
|
||||
|
@ -663,8 +663,7 @@ Spiega che la sintassi che supporti e` quella nella BNF
|
|||
|
||||
|
||||
** Pattern matching
|
||||
|
||||
Pattern matching is a widely adopted mechanism to interact with ADT.
|
||||
Pattern matching is a widely adopted mechanism to interact with ADT[CIT].
|
||||
C family languages provide branching on predicates through the use of
|
||||
if statements and switch statements.
|
||||
Pattern matching on the other hands express predicates through
|
||||
|
@ -673,7 +672,7 @@ arbitrary shapes. One common example of pattern matching is the use of regular
|
|||
expressions on strings. provides pattern matching on ADT and
|
||||
primitive data types.
|
||||
The result of a pattern matching operation is always one of:
|
||||
- this value does not match this pattern”
|
||||
- this value does not match this pattern
|
||||
- this value matches this pattern, resulting the following bindings of
|
||||
names to values and the jump to the expression pointed at the
|
||||
pattern.
|
||||
|
@ -687,54 +686,42 @@ match color with
|
|||
| Green -> print "green"
|
||||
| _ -> print "white or black"
|
||||
#+END_SRC
|
||||
|
||||
provides tokens to express data destructoring.
|
||||
Pattern matching clauses provide tokens to express data destructoring.
|
||||
For example we can examine the content of a list with pattern matching
|
||||
|
||||
#+BEGIN_SRC
|
||||
|
||||
begin match list with
|
||||
| [ ] -> print "empty list"
|
||||
| element1 :: [ ] -> print "one element"
|
||||
| (element1 :: element2) :: [ ] -> print "two elements"
|
||||
| head :: tail-> print "head followed by many elements"
|
||||
#+END_SRC
|
||||
\\
|
||||
Parenthesized patterns, such as the third one in the previous example,
|
||||
matches the same value as the pattern without parenthesis.
|
||||
\\
|
||||
The same could be done with tuples
|
||||
#+BEGIN_SRC
|
||||
|
||||
begin match tuple with
|
||||
| (Some _, Some _) -> print "Pair of optional types"
|
||||
| (Some _, None) | (None, Some _) -> print "Pair of optional types, one of which is null"
|
||||
| (None, None) -> print "Pair of optional types, both null"
|
||||
#+END_SRC
|
||||
|
||||
The pattern pattern₁ | pattern₂ represents the logical "or" of the
|
||||
two patterns pattern₁ and pattern₂. A value matches pattern₁ |
|
||||
pattern₂ if it matches pattern₁ or pattern₂.
|
||||
two patterns, pattern₁ and pattern₂. A value matches /pattern₁ | pattern₂/ if it matches pattern₁ or pattern₂.
|
||||
\\
|
||||
Pattern clauses can make the use of /guards/ to test predicates and
|
||||
variables can captured (binded in scope).
|
||||
|
||||
#+BEGIN_SRC
|
||||
|
||||
begin match token_list with
|
||||
| "switch"::var::"{"::rest -> ...
|
||||
| "case"::":"::var::rest when is_int var -> ...
|
||||
| "case"::":"::var::rest when is_string var -> ...
|
||||
| "}"::[ ] -> ...
|
||||
| "}"::rest -> error "syntax error: " rest
|
||||
|
||||
#+END_SRC
|
||||
\\
|
||||
Moreover, the pattern matching compiler emits a warning when a
|
||||
pattern is not exhaustive or some patterns are shadowed by precedent ones.
|
||||
|
||||
** Symbolic execution
|
||||
|
||||
Symbolic execution is a widely used techniques in the field of
|
||||
computer security.
|
||||
It allows to analyze different execution paths of a program
|
||||
|
@ -749,8 +736,12 @@ domain of all possible inputs of a program, detecting infeasible
|
|||
paths, dead code and proving that two code segments are equivalent.
|
||||
\\
|
||||
Let's take as example this signedness bug that was found in the
|
||||
FreeBSD kernel and allowed, when calling the getpeername function, to
|
||||
FreeBSD kernel [CIT] and allowed, when calling the /getpeername/ function, to
|
||||
read portions of kernel memory.
|
||||
\begin{comment}
|
||||
TODO: controlla paginazione
|
||||
\end{comment}
|
||||
\clearpage
|
||||
#+BEGIN_SRC
|
||||
int compat;
|
||||
{
|
||||
|
@ -775,10 +766,11 @@ bad:
|
|||
return (error);
|
||||
}
|
||||
#+END_SRC
|
||||
|
||||
The tree of the execution when the function is evaluated considering
|
||||
/int len/ our symbolic variable α, sa->sa_len as symbolic variable β
|
||||
and π as the set of constraints on a symbolic variable:
|
||||
The tree of the execution is presented below.
|
||||
It is built by evaluating the function by consider the integer
|
||||
variable /len/ the symbolic variable α, sa->sa_len the symbolic variable β
|
||||
and π indicates the set of constraints on the symbolic variables.
|
||||
The input values to the functions are identified by σ.
|
||||
[[./files/symb_exec.png]]
|
||||
\begin{comment}
|
||||
[1] compat (...) { π_{α}: -∞ < α < ∞ }
|
||||
|
@ -793,15 +785,15 @@ We can see that at step 3 the set of possible values of the scrutinee
|
|||
α is bigger than the set of possible values of the input σ to the
|
||||
/cast/ directive, that is: π_{α} ⊈ π_{σ}. For this reason the /cast/ may fail when α is /len/
|
||||
negative number, outside the domain π_{σ}. In C this would trigger undefined behaviour (signed
|
||||
overflow) that made the exploitation possible.
|
||||
\\
|
||||
Every step of evaluation can be modelled as the following transition:
|
||||
overflow) that made the exploit possible.
|
||||
*** Symbolic Execution in terms of Hoare Logic
|
||||
Every step of the evaluation in a symbolic engine can be modelled as the following transition:
|
||||
\[
|
||||
(π_{σ}, (πᵢ)ⁱ) → (π'_{σ}, (π'ᵢ)ⁱ)
|
||||
\]
|
||||
if we express the π constraints as logical formulas we can model the
|
||||
if we express the π transitions as logical formulas we can model the
|
||||
execution of the program in terms of Hoare Logic.
|
||||
State of the computation is a Hoare triple {P}C{Q} where P and Q are
|
||||
The state of the computation is a Hoare triple {P}C{Q} where P and Q are
|
||||
respectively the /precondition/ and the /postcondition/ that
|
||||
constitute the assertions of the program. C is the directive being
|
||||
executed.
|
||||
|
@ -812,7 +804,8 @@ In the Hoare rules assertions could also take the form of
|
|||
| P ::= \forall i. P \vert \exists i. P \vert P_{1} \Rightarrow P_{2}
|
||||
where i is a logical variable, but assertions of these kinds increases
|
||||
the complexity of the symbolic engine.
|
||||
Execution follows the rules of Hoare logic:
|
||||
\
|
||||
Execution follows the following inference rules:
|
||||
- Empty statement :
|
||||
\begin{mathpar}
|
||||
\infer{ }
|
||||
|
@ -826,7 +819,7 @@ Execution follows the rules of Hoare logic:
|
|||
\end{mathpar}
|
||||
\\
|
||||
- Composition : c₁ and c₂ are directives that are executed in order;
|
||||
{Q} is called the /midcondition/.
|
||||
{Q} is called the /mid condition/.
|
||||
\begin{mathpar}
|
||||
\infer { \{P\}c_1\{R\}, \{R\}c_2\{Q\} }
|
||||
{ \{P\}c₁;c₂\{Q\} }
|
||||
|
@ -839,7 +832,7 @@ Execution follows the rules of Hoare logic:
|
|||
\end{mathpar}
|
||||
\\
|
||||
- Loop : {P} is the loop invariant. After the loop is finished /P/
|
||||
holds and ¬̸b caused the loop to end.
|
||||
holds and ¬b caused the loop to end.
|
||||
\begin{mathpar}
|
||||
\infer { \{P \wedge b \}c\{P\} }
|
||||
{ \{P\}\text{while b do c} \{P \wedge \neg b\} }
|
||||
|
@ -847,14 +840,14 @@ Execution follows the rules of Hoare logic:
|
|||
\\
|
||||
Even if the semantics of symbolic execution engines are well defined,
|
||||
the user may run into different complications when applying such
|
||||
analysis to non trivial codebases.
|
||||
analysis to non trivial codebases.[CIT]
|
||||
For example, depending on the domain, loop termination is not
|
||||
guaranteed. Even when termination is guaranteed, looping causes
|
||||
exponential branching that may lead to path explosion or state
|
||||
explosion.
|
||||
explosion.[CIT]
|
||||
Reasoning about all possible executions of a program is not always
|
||||
feasible and in case of explosion usually symbolic execution engines
|
||||
implement heuristics to reduce the size of the search space.
|
||||
feasible[CIT] and in case of explosion usually symbolic execution engines
|
||||
implement heuristics to reduce the size of the search space.[CIT]
|
||||
|
||||
** Translation validation
|
||||
Translators, such as translators and code generators, are huge pieces of
|
||||
|
@ -873,10 +866,97 @@ existing translators that consists of taking the source and the target
|
|||
- [ ] Translation validation through symbolic execution
|
||||
|
||||
* Translation validation of the Pattern Matching Compiler
|
||||
** Accessors
|
||||
OCaml encourages widespread usage of composite types to encapsulate
|
||||
data. Composite types include /discriminated unions/, of which we have
|
||||
seen different use cases, and /records/, that are a form of /product
|
||||
types/ [CIT] such as /structures/ in C. \\
|
||||
\begin{minipage}{0.6\linewidth}
|
||||
\begin{verbatim}
|
||||
struct Shape {
|
||||
int centerx;
|
||||
int centery;
|
||||
enum ShapeKind kind;
|
||||
union {
|
||||
struct { int side; };
|
||||
struct { int length, height; };
|
||||
struct { int radius; };
|
||||
};
|
||||
};
|
||||
\end{verbatim}
|
||||
\end{minipage}
|
||||
\hfill
|
||||
\begin{minipage}{0.4\linewidth}
|
||||
\begin{verbatim}
|
||||
type shape = {
|
||||
x:int;
|
||||
y:int;
|
||||
kind:shapekind
|
||||
}
|
||||
and shapekind
|
||||
| Square of int
|
||||
| Rect of int * int
|
||||
| Circle of int
|
||||
\end{verbatim}
|
||||
\end{minipage}
|
||||
\begin{comment}
|
||||
TODO: metti linea qualcosa a dividere
|
||||
\end{comment}
|
||||
Primitive OCaml datatypes include aggregate types in the form of
|
||||
/tuples/ and /lists/. Other aggregate types are built using module
|
||||
functors. [CIT]
|
||||
Low level /Lambda/ untyped constructors of the form
|
||||
#+BEGIN_SRC
|
||||
type t = Constant | Cell of int * t
|
||||
#+END_SRC
|
||||
provides enough flexibility to encode source higher kinded types.
|
||||
This shouldn't surprise because the /Lambda/ language consists of
|
||||
s-expressions. The /field/ operation allows to address a /Cell/ value
|
||||
and the expressions /(field 0 x)/ and /(field 1 x)/ are equivalent to
|
||||
the Lisp primitives /(car x)/ and /(cdr x)/ respectively.
|
||||
\begin{comment}
|
||||
(* that can also be written *)
|
||||
let value = List.cons 1
|
||||
@@ List.cons 2
|
||||
@@ List.cons 3 []
|
||||
\end{comment}
|
||||
\begin{minipage}{0.3\linewidth}
|
||||
\begin{verbatim}
|
||||
let value = 1 :: 2 :: 3 :: []
|
||||
\end{verbatim}
|
||||
\end{minipage}
|
||||
\hfill
|
||||
\begin{minipage}{0.5\linewidth}
|
||||
\begin{verbatim}
|
||||
(field 0 x) = 1
|
||||
(field 0 (field 1 x)) = 2
|
||||
(field 0 (field 1 (field 1 x)) = 3
|
||||
(field 0 (field 1 (field 1 (field 1 x)) = []
|
||||
\end{verbatim}
|
||||
\end{minipage} \\
|
||||
\\
|
||||
We can represent the concrete value of a higher kinded type as a flat
|
||||
list of cells.
|
||||
In the prototype we call this "view" into the value of a datatype the
|
||||
\emph{accessor} /a/.
|
||||
| a ::= Here \vert n.a
|
||||
Accessors have some resemblance with the low level /Cell/ values, such
|
||||
as the fact that both don't encode type informations; for example the
|
||||
accessor of a list of integers is structurally equivalent to the
|
||||
accessor of a tuple containing the same elements.
|
||||
\\
|
||||
We can intuitively think of the /accessor/ as the
|
||||
access path to a value that can be reached by deconstructing the
|
||||
scrutinee.
|
||||
At the source level /accessors/ are constructed by inspecting the structure of
|
||||
the patterns at hand.
|
||||
At the target level /accessors/ are constructed by compressing the steps taken by
|
||||
the symbolic engine during the evaluation of a value.
|
||||
/Accessors/ don't store informations about the concrete value of the scrutinee or
|
||||
it's symbolic domain of values.
|
||||
|
||||
** Source program
|
||||
|
||||
The source code of a pattern matching function has the
|
||||
The OCaml source code of a pattern matching function has the
|
||||
following form:
|
||||
|
||||
|match variable with
|
||||
|
@ -895,10 +975,10 @@ Pattern matching code could also be written using the more compact form:
|
|||
|\vert pattern₃ as var \to expr₃
|
||||
|⋮
|
||||
|\vert pₙ \to exprₙ
|
||||
|
||||
|
||||
\begin{comment}
|
||||
TODO: paginazione seria per BNF
|
||||
\end{comment}
|
||||
This BNF grammar describes formally the grammar of the source program:
|
||||
|
||||
| start ::= "match" id "with" patterns \vert{} "function" patterns
|
||||
| patterns ::= (pattern0\vert{}pattern1) pattern1+
|
||||
| ;; pattern0 and pattern1 are needed to distinguish the first case in which
|
||||
|
@ -908,16 +988,17 @@ This BNF grammar describes formally the grammar of the source program:
|
|||
| clause ::= lexpr "->" rexpr
|
||||
| lexpr ::= rule (ε\vert{}condition)
|
||||
| rexpr ::= _code ;; arbitrary code
|
||||
| rule ::= wildcard\vert{}variable\vert{}constructor_pattern\vert or_pattern ;;
|
||||
| rule ::= wildcard\vert{}variable\vert{}constructor_pattern\vert or_pattern
|
||||
| wildcard ::= "_"
|
||||
| variable ::= identifier
|
||||
| constructor_pattern ::= constructor (rule\vert{}ε) (assignment\vert{}ε)
|
||||
| constructor ::= int\vert{}float\vert{}char\vert{}string\vert{}bool \vert{}unit\vert{}record\vert{}exn\vert{}objects\vert{}ref \vert{}list\vert{}tuple\vert{}array\vert{}variant\vert{}parameterized_variant ;; data types
|
||||
| constructor ::= int\vert{}float\vert{}char\vert{}string\vert{}bool \vert{}unit
|
||||
| \quad\quad\quad\quad\quad\quad\quad \vert{}record\vert{}exn\vert{}objects\vert{}ref \vert{}list\vert{}tuple\vert{}array\vert{}variant\vert{}parameterized_variant ;; data types
|
||||
| or_pattern ::= rule ("\vert{}" wildcard\vert{}variable\vert{}constructor_pattern)+
|
||||
| condition ::= "when" bexpr
|
||||
| condition ::= "when" b_guard
|
||||
| assignment ::= "as" id
|
||||
| bexpr ::= _code ;; arbitrary code
|
||||
The source program is parsed using the ocaml-compiler-libs library.
|
||||
| b_guard ::= ocaml_expression ;; arbitrary code
|
||||
The source program is parsed using the ocaml-compiler-libs [CIT] library.
|
||||
The result of parsing, when successful, results in a list of clauses
|
||||
and a list of type declarations.
|
||||
Every clause consists of three objects: a left-hand-side that is the
|
||||
|
@ -1037,17 +1118,15 @@ program tₛ:
|
|||
A sound approach for treating these blackboxes would be to inspect the
|
||||
OCaml compiler during translation to Lambda code and extract the
|
||||
blackboxes compiled in their Lambda representation.
|
||||
This would allow to test for equality with the respective blackbox at
|
||||
the target level.
|
||||
This would allow to test for structural equality with the counterpart
|
||||
Lambda blackboxes at the target level.
|
||||
Given that this level of introspection is currently not possibile, we
|
||||
decided to restrict the structure of blackboxes to the following (valid) OCaml
|
||||
code:
|
||||
|
||||
#+BEGIN_SRC
|
||||
external guard : 'a -> 'b = "guard"
|
||||
external observe : 'a -> 'b = "observe"
|
||||
#+END_SRC
|
||||
|
||||
We assume these two external functions /guard/ and /observe/ with a valid
|
||||
type that lets the user pass any number of arguments to them.
|
||||
All the guards are of the form \texttt{guard <arg> <arg> <arg>}, where the
|
||||
|
@ -1068,11 +1147,9 @@ let _ = function
|
|||
We note that the right hand side of /observe/ is just an arbitrary
|
||||
value and in this case just enumerates the order in which expressions
|
||||
appear.
|
||||
Even if this is an oversimplification of the problem for the
|
||||
prototype, it must be noted that at the compiler level we have the
|
||||
possibility to compile the pattern clauses in two separate steps so
|
||||
that the guards and right-hand-side expressions are semantically equal
|
||||
to their counterparts at the target program level.
|
||||
This oversimplification of the structure of arbitrary code blackboxes
|
||||
allows us to test for structural equivalence without querying the
|
||||
compiler during the /TypedTree/ to /Lambda/ translation phase.
|
||||
\begin{lstlisting}
|
||||
let _ = function
|
||||
| K1 -> lambda₀
|
||||
|
@ -1133,6 +1210,7 @@ following four rules:
|
|||
\end{equation*}
|
||||
where
|
||||
\begin{equation*}
|
||||
P' \to L' =
|
||||
\begin{pmatrix}
|
||||
p_{1,2} & \cdots & p_{1,n} & → & (let & y₁ & x₁) & l₁ \\
|
||||
p_{2,2} & \cdots & p_{2,n} & → & (let & y₂ & x₁) & l₂ \\
|
||||
|
@ -1141,9 +1219,9 @@ following four rules:
|
|||
\end{pmatrix}
|
||||
\end{equation*}
|
||||
|
||||
That means in every lambda action lᵢ there is a binding of x₁ to the
|
||||
variable that appears on the pattern p_{i,1}. Bindings are omitted
|
||||
for wildcard patterns and the lambda action lᵢ remains unchanged.
|
||||
That means in every lambda action lᵢ in the P' → L' matrix there is a binding of x₁ to the
|
||||
variable that appears on the pattern p_{i,1}. When there are
|
||||
wildcard patterns, bindings are omitted the lambda action lᵢ remains unchanged.
|
||||
|
||||
2) Constructor rule: if all patterns in the first column of P are
|
||||
constructors patterns of the form k(q₁, q₂, ..., q_{n'}) we define a
|
||||
|
@ -1157,7 +1235,7 @@ following four rules:
|
|||
p ← $q_{i,1}$, $q_{i,2}$, ..., $q_{i,n'}$, $p_{i,2}$, $p_{i,3}$, ..., $p_{i, n}$
|
||||
\end{lstlisting}
|
||||
Patterns of the form $q_{i,j}$ matches on the values of the
|
||||
constructor and we define new fresh variables y₁, y₂, ..., yₐ so
|
||||
constructor and we define the variables y₁, y₂, ..., yₐ so
|
||||
that the lambda action lᵢ becomes
|
||||
|
||||
\begin{lstlisting}[mathescape,columns=fullflexible,basicstyle=\fontfamily{lmvtt}\selectfont,]
|
||||
|
@ -1169,7 +1247,7 @@ following four rules:
|
|||
\end{lstlisting}
|
||||
|
||||
and the result of the compilation for the set of constructors
|
||||
{c₁, c₂, ..., cₖ} is:
|
||||
${\{c_{1}, c_{2}, ..., c_{k}\}}$ is:
|
||||
|
||||
\begin{lstlisting}[mathescape,columns=fullflexible,basicstyle=\fontfamily{lmvtt}\selectfont,]
|
||||
switch x₁ with
|
||||
|
@ -1180,27 +1258,30 @@ following four rules:
|
|||
default: exit
|
||||
\end{lstlisting}
|
||||
\\
|
||||
\begin{comment}
|
||||
TODO: il tre viene cambiato in uno e la lista divisa. Risolvi.
|
||||
\end{comment}
|
||||
3) Orpat rule: there are various strategies for dealing with
|
||||
or-patterns. The most naive one is to split the or-patterns.
|
||||
For example a row p containing an or-pattern:
|
||||
\begin{equation*}
|
||||
(p_{i,1}|q_{i,1}|r_{i,1}), p_{i,2}, ..., p_{i,m} → lᵢ
|
||||
(p_{i,1}|q_{i,1}|r_{i,1}), p_{i,2}, ..., p_{i,m} → l
|
||||
\end{equation*}
|
||||
results in three rows added to the clause matrix
|
||||
\begin{equation*}
|
||||
p_{i,1}, p_{i,2}, ..., p_{i,m} → lᵢ \\
|
||||
p_{i,1}, p_{i,2}, ..., p_{i,m} → l \\
|
||||
\end{equation*}
|
||||
\begin{equation*}
|
||||
q_{i,1}, p_{i,2}, ..., p_{i,m} → lᵢ \\
|
||||
q_{i,1}, p_{i,2}, ..., p_{i,m} → l \\
|
||||
\end{equation*}
|
||||
\begin{equation*}
|
||||
r_{i,1}, p_{i,2}, ..., p_{i,m} → lᵢ
|
||||
r_{i,1}, p_{i,2}, ..., p_{i,m} → l
|
||||
\end{equation*}
|
||||
4) Mixture rule:
|
||||
When none of the previous rules apply the clause matrix P → L is
|
||||
split into two clause matrices, the first P₁ → L₁ that is the
|
||||
When none of the previous rules apply the clause matrix ${P\space{}\to\space{}L}$ is
|
||||
split into two clause matrices, the first $P_{1}\space\to\space{}L_{1}$ that is the
|
||||
largest prefix matrix for which one of the three previous rules
|
||||
apply, and P₂ → L₂ containing the remaining rows. The algorithm is
|
||||
apply, and $P_{2}\space\to\space{}L_{2}$ containing the remaining rows. The algorithm is
|
||||
applied to both matrices.
|
||||
It is important to note that the application of the decomposition
|
||||
algorithm converges. This intuition can be verified by defining the
|
||||
|
@ -1232,41 +1313,6 @@ q_{1,1} & p_{1,2} & \cdots & p_{1,n} → l₁ \\
|
|||
\vdots & \vdots & \ddots & \vdots → \vdots \\
|
||||
\end{pmatrix}\big\rvert = n
|
||||
\end{equation*}
|
||||
In our prototype we make use of accessors to encode stored values.
|
||||
|
||||
\begin{minipage}{0.6\linewidth}
|
||||
\begin{verbatim}
|
||||
|
||||
let value = 1 :: 2 :: 3 :: []
|
||||
(* that can also be written *)
|
||||
let value = []
|
||||
|> List.cons 3
|
||||
|> List.cons 2
|
||||
|> List.cons 1
|
||||
\end{verbatim}
|
||||
\end{minipage}
|
||||
\hfill
|
||||
\begin{minipage}{0.5\linewidth}
|
||||
\begin{verbatim}
|
||||
|
||||
|
||||
(field 0 x) = 1
|
||||
(field 0 (field 1 x)) = 2
|
||||
(field 0 (field 1 (field 1 x)) = 3
|
||||
(field 0 (field 1 (field 1 (field 1 x)) = []
|
||||
\end{verbatim}
|
||||
\end{minipage}
|
||||
An \emph{accessor} /a/ represents the
|
||||
access path to a value that can be reached by deconstructing the
|
||||
scrutinee.
|
||||
| a ::= Here \vert n.a
|
||||
The above example, in encoded form:
|
||||
\begin{verbatim}
|
||||
Here = 1
|
||||
1.Here = 2
|
||||
1.1.Here = 3
|
||||
1.1.1.Here = []
|
||||
\end{verbatim}
|
||||
In our prototype the source matrix mₛ is defined as follows
|
||||
| SMatrix mₛ := (aⱼ)^{j∈J}, ((p_{ij})^{j∈J} → bbᵢ)^{i∈I}
|
||||
|
||||
|
|
1
todo.org
|
@ -126,6 +126,7 @@
|
|||
+ [ ] Line breaks
|
||||
+ [ ] Esempio full signature match: allinea
|
||||
+ [ ] correct. stat. di eq. checking: fuori margine
|
||||
+ [ ] Minipages: spazio o divisore dalle linee prima e dopo
|
||||
- [X] segreteria: tesi inglese
|
||||
- [X] Gatti: inglese
|
||||
- [X] Gatti: Coppo mio relatore
|
||||
|
|