symbolic execution done
parent 42f531b3e3
commit d19c25abbd
4 changed files with 903 additions and 1 deletions
@@ -3,7 +3,7 @@ import re
 from sys import argv

 allsymbols = json.load(open('./unicode-latex.json'))
-mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀', '∈', 'ε','₀', '₂', '₁', '₃', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮', 'ₛ', 'ₜ', '≃', '⇔', '∧', '∅', 'ℕ', 'ⱼ', 'ʲ', '⊥', 'π']
+mysymbols = ['≡', '≠', '≼', '→', '←', '⊀', '⋠', '≺', '∀', '∈', 'ε','₀', '₂', '₁', '₃', 'ₐ', 'ₖ', 'ₘ', 'ₙ', 'ᵢ', 'ⁱ', '⋮', 'ₛ', 'ₜ', '≃', '⇔', '∧', '∅', 'ℕ', 'ⱼ', 'ʲ', '⊥', 'π', 'α', 'β', '∞', 'σ', '≤', '⊈', '∧', '∨', '∃', '⇒' ]
 extrasymbols = {'〚': '\llbracket', '〛': '\rrbracket'}

 symbols = {s: allsymbols[s] for s in mysymbols}

777
tesi/referenze/blexim.txt
Normal file
@@ -0,0 +1,777 @@

                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x3c, Phile #0x0a of 0x10


|=--------------------=[ Basic Integer Overflows ]=----------------------=|
|=-----------------------------------------------------------------------=|
|=-------------------=[ by blexim <blexim@hush.com> ]=-------------------=|

1: Introduction
    1.1 What is an integer?
    1.2 What is an integer overflow?
    1.3 Why can they be dangerous?

2: Integer overflows
    2.1 Widthness overflows
        2.1.1 Exploiting
    2.2 Arithmetic overflows
        2.2.1 Exploiting

3: Signedness bugs
    3.1 What do they look like?
        3.1.1 Exploiting
    3.2 Signedness bugs caused by integer overflows

4: Real world examples
    4.1 Integer overflows
    4.2 Signedness bugs


--[ 1.0 Introduction

In this paper I'm going to describe two classes of programming bugs which
can sometimes allow a malicious user to modify the execution path of an
affected process. Both of these classes of bug work by causing variables
to contain unexpected values, and so are not as "direct" as classes which
overwrite memory, e.g. buffer overflows or format strings. All the
examples given in the paper are in C, so a basic familiarity with C is
assumed. A knowledge of how integers are stored in memory is also useful,
but not essential.


----[ 1.1 What is an integer?

An integer, in the context of computing, is a variable capable of
representing a real number with no fractional part. Integers are typically
the same size as a pointer on the system they are compiled on (i.e. on a 32
bit system, such as i386, an integer is 32 bits long, on a 64 bit system,
such as SPARC, an integer is 64 bits long). Some compilers don't use
integers and pointers of the same size however, so for the sake of
simplicity all the examples refer to a 32 bit system with 32 bit integers,
longs and pointers.

Integers, like all variables, are just regions of memory. When we talk
about integers, we usually represent them in decimal, as that is the
numbering system humans are most used to. Computers, being digital, cannot
deal with decimal, so internally to the computer integers are stored in
binary. Binary is another system of representing numbers which uses only
two numerals, 1 and 0, as opposed to the ten numerals used in decimal. As
well as binary and decimal, hexadecimal (base sixteen) is often used in
computing as it is very easy to convert between binary and hexadecimal.

Since it is often necessary to store negative numbers, there needs to be a
mechanism to represent negative numbers using only binary. The way this is
accomplished is by using the most significant bit (MSB) of a variable to
determine the sign: if the MSB is set to 1, the variable is interpreted as
negative; if it is set to 0, the variable is positive. This can cause some
confusion, as will be explained in the section on signedness bugs, because
not all variables are signed, meaning they do not all use the MSB to
determine whether they are positive or negative. These variables are known
as unsigned and can only be assigned non-negative values, whereas variables
which can be either positive or negative are called signed.

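To make the signed/unsigned distinction concrete, here is a minimal sketch
showing that one 32 bit pattern reads as two different numbers depending on
the type used to interpret it. The helper names as_signed and msb are made
up for illustration, and the sketch assumes a platform that provides the
C99 fixed-width types from <stdint.h>.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32 bits of an unsigned value as a signed value.
   memcpy copies the bit pattern unchanged; no arithmetic conversion
   takes place. */
int32_t as_signed(uint32_t bits)
{
    int32_t out;
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Return the most significant bit, which is the sign bit when the same
   32 bits are interpreted as a signed integer. */
unsigned int msb(uint32_t bits)
{
    return (unsigned int)(bits >> 31);
}
```

With these helpers, the pattern 0xffffffff has its MSB set and reads as -1
when treated as signed, while the same bits read as 4294967295 when treated
as unsigned.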
----[ 1.2 What is an integer overflow?

Since an integer is a fixed size (32 bits for the purposes of this paper),
there is a fixed maximum value it can store. When an attempt is made to
store a value greater than this maximum value it is known as an integer
overflow. The ISO C99 standard says that overflow of a signed integer
causes "undefined behaviour", meaning that compilers conforming to the
standard may do anything they like from completely ignoring the overflow to
aborting the program. Most compilers seem to ignore the overflow,
resulting in an unexpected or erroneous result being stored.

----[ 1.3 Why can they be dangerous?

Integer overflows cannot be detected after they have happened, so there is
no way for an application to tell if a result it has calculated previously
is in fact correct. This can get dangerous if the calculation has to do
with the size of a buffer or how far into an array to index. Of course
most integer overflows are not exploitable because memory is not being
directly overwritten, but sometimes they can lead to other classes of bugs,
frequently buffer overflows. As well as this, integer overflows can be
difficult to spot, so even well audited code can spring surprises.

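Because the overflow cannot be seen in the result afterwards, the only
reliable place to catch it is before the operation. The following is a
minimal sketch of such a pre-check for unsigned addition; the function name
checked_add_uint is hypothetical and not taken from any library.

```c
#include <limits.h>
#include <stdbool.h>

/* Store a + b in *out and return true, or return false (leaving *out
   untouched) if the mathematical sum would not fit in an unsigned int.
   The test is rearranged so that it cannot itself overflow. */
bool checked_add_uint(unsigned int a, unsigned int b, unsigned int *out)
{
    if (a > UINT_MAX - b)
        return false;
    *out = a + b;
    return true;
}
```

A caller would test the return value and refuse to use the sum when it is
false, instead of carrying on with a wrapped result.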
--[ 2.0 Integer overflows

So what happens when an integer overflow does happen? ISO C99 has this to
say:

    "A computation involving unsigned operands can never overflow,
    because a result that cannot be represented by the resulting unsigned
    integer type is reduced modulo the number that is one greater than
    the largest value that can be represented by the resulting type."

NB: modulo arithmetic involves dividing two numbers and taking the
remainder,
e.g.
    10 modulo 5 = 0
    11 modulo 5 = 1
so reducing a large value modulo (MAXINT + 1) can be seen as discarding the
portion of the value which cannot fit into an integer and keeping the rest.
In C, the modulo operator is a % sign.
</NB>

This is a bit wordy, so maybe an example will better demonstrate what
typically happens:

We have two unsigned integers, a and b, both of which are 32 bits long. We
assign to a the maximum value a 32 bit integer can hold, and to b we assign
1. We add a and b together and store the result in a third unsigned 32 bit
integer called r:

    a = 0xffffffff
    b = 0x1
    r = a + b

Now, since the result of the addition cannot be represented using 32 bits,
the result, in accordance with the ISO standard, is reduced modulo
0x100000000.

    r = (0xffffffff + 0x1) % 0x100000000
    r = (0x100000000) % 0x100000000 = 0

Reducing the result using modulo arithmetic basically ensures that only the
lowest 32 bits of the result are used, so integer overflows cause the
result to be truncated to a size that can be represented by the variable.
This is often called a "wrap around", as the result appears to wrap around
to 0.

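The reduction can be checked directly. This is a small sketch, with made-up
function names, that computes the exact sum in a wider type, reduces it
modulo 0x100000000 by hand, and compares that with what plain 32 bit
unsigned addition produces.

```c
#include <stdint.h>

/* Compute the exact sum in 64 bits, then reduce it modulo 2^32 by hand. */
uint32_t add_modulo32(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a + (uint64_t)b;   /* needs at most 33 bits */
    return (uint32_t)(wide % 0x100000000ULL);
}

/* Ordinary 32 bit unsigned addition; the standard guarantees the same
   modulo 2^32 reduction happens implicitly. */
uint32_t add_native32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

Both functions return 0 for a = 0xffffffff and b = 0x1, matching the worked
arithmetic above.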
----[ 2.1 Widthness overflows

So an integer overflow is the result of attempting to store a value in a
variable which is too small to hold it. The simplest example of this can
be demonstrated by simply assigning the contents of a large variable to a
smaller one:

    /* ex1.c - loss of precision */
    #include <stdio.h>

    int main(void){
        int l;
        short s;
        char c;

        l = 0xdeadbeef;
        s = l;
        c = l;

        /* sizeof yields a size_t, so cast it to match the %d format */
        printf("l = 0x%x (%d bits)\n", l, (int)(sizeof(l) * 8));
        printf("s = 0x%x (%d bits)\n", s, (int)(sizeof(s) * 8));
        printf("c = 0x%x (%d bits)\n", c, (int)(sizeof(c) * 8));

        return 0;
    }
    /* EOF */

The output of which looks like this:

    nova:signed {48} ./ex1
    l = 0xdeadbeef (32 bits)
    s = 0xffffbeef (16 bits)
    c = 0xffffffef (8 bits)

Since each assignment causes the bounds of the values that can be stored in
each type to be exceeded, the value is truncated so that it can fit in the
variable it is assigned to.

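The leading ff digits in the output are not part of the stored values: s
and c are signed here, so they are sign-extended to a full int when passed
to printf. A minimal sketch of both steps follows. The helper names are
hypothetical, and the 8 bit case assumes char is signed on the platform, as
the output above suggests.

```c
#include <stdint.h>

/* Keep only the low 16 bits, as the assignment s = l does. */
uint16_t truncate16(uint32_t v)
{
    return (uint16_t)(v & 0xffffu);
}

/* Keep only the low 8 bits, as the assignment c = l does. */
uint8_t truncate8(uint32_t v)
{
    return (uint8_t)(v & 0xffu);
}

/* Widen a 16 bit pattern to 32 bits the way a signed short is widened:
   the sign bit (bit 15) is copied into the upper 16 bits. */
uint32_t sign_extend16(uint16_t v)
{
    return (v & 0x8000u) ? (0xffff0000u | v) : (uint32_t)v;
}

/* The same for an 8 bit pattern held in a signed char (sign bit is bit 7). */
uint32_t sign_extend8(uint8_t v)
{
    return (v & 0x80u) ? (0xffffff00u | v) : (uint32_t)v;
}
```

Truncating 0xdeadbeef gives 0xbeef and 0xef; sign-extending those patterns
gives the 0xffffbeef and 0xffffffef that ex1 prints.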
It is worth mentioning integer promotion here. When a calculation
involving operands of different sizes is performed, the smaller operand is
"promoted" to the size of the larger one. The calculation is then
performed with these promoted sizes and, if the result is to be stored in
the smaller variable, the result is truncated to the smaller size again.
For example:

    int i;
    short s;

    s = i;

No arithmetic is performed here, but the operands still differ in size:
the 32 bit contents of i must be "demoted" to 16 bits in order to be saved
in s. This demotion can cause the result to be truncated if it is greater
than the maximum value s can hold.

------[ 2.1.1 Exploiting

Integer overflows are not like most common bug classes. They do not allow
direct overwriting of memory or direct execution flow control, but are much
more subtle. The root of the problem lies in the fact that there is no way
for a process to check the result of a computation after it has happened,
so there may be a discrepancy between the stored result and the correct
result. Because of this, most integer overflows are not actually
exploitable. Even so, in certain cases it is possible to force a crucial
variable to contain an erroneous value, and this can lead to problems later
in the code.

Because of the subtlety of these bugs, there is a huge number of situations
in which they can be exploited, so I will not attempt to cover all
exploitable conditions. Instead, I will provide examples of some
situations which are exploitable, in the hope of inspiring the reader in
their own research :)

Example 1:

    /* width1.c - exploiting a trivial widthness bug */
    #include <stdio.h>
    #include <stdlib.h>     /* atoi */
    #include <string.h>

    int main(int argc, char *argv[]){
        unsigned short s;
        int i;
        char buf[80];

        if(argc < 3){
            return -1;
        }

        i = atoi(argv[1]);
        s = i;

        if(s >= 80){            /* [w1] */
            printf("Oh no you don't!\n");
            return -1;
        }

        printf("s = %d\n", s);

        memcpy(buf, argv[2], i);
        buf[i] = '\0';
        printf("%s\n", buf);

        return 0;
    }

While a construct like this would probably never show up in real life code,
it serves well as an example. Take a look at the following inputs:

    nova:signed {100} ./width1 5 hello
    s = 5
    hello
    nova:signed {101} ./width1 80 hello
    Oh no you don't!
    nova:signed {102} ./width1 65536 hello
    s = 0
    Segmentation fault (core dumped)

The length argument is taken from the command line and held in the integer
i. When this value is transferred into the short integer s, it is
truncated if the value is too great to fit into s (i.e. if the value is
greater than 65535). Because of this, it is possible to bypass the bounds
check at [w1] and overflow the buffer. After this, standard stack smashing
techniques can be used to exploit the process.

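The difference between checking the truncated copy and checking the value
that is actually used can be sketched as follows. The function names and the
BUF_SIZE macro are hypothetical; uint16_t stands in for the 16 bit unsigned
short of the example.

```c
#include <stdint.h>

#define BUF_SIZE 80   /* size of buf in width1.c */

/* The check as width1.c performs it: on a 16 bit copy of the length.
   Returns 1 if the check lets the length through. */
int check_truncated(int len)
{
    uint16_t s = (uint16_t)len;     /* high bits are silently discarded */
    return s < BUF_SIZE;
}

/* Check the full-width value that will be handed to memcpy, and reject
   negative lengths as well. */
int check_full_width(int len)
{
    return len >= 0 && len < BUF_SIZE;
}
```

A length of 65536 passes check_truncated (the 16 bit copy is 0) but is
rejected by check_full_width.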
----[ 2.2 Arithmetic overflows

As shown in section 2.0, if an attempt is made to store a value in an
integer which is greater than the maximum value the integer can hold, the
value will be truncated. If the stored value is the result of an
arithmetic operation, any part of the program which later uses the result
will run incorrectly as the result of the arithmetic being incorrect.
Consider this example demonstrating the wrap around shown earlier:

    /* ex2.c - an integer overflow */
    #include <stdio.h>

    int main(void){
        unsigned int num = 0xffffffff;

        /* sizeof yields a size_t, so cast it to match the %d format */
        printf("num is %d bits long\n", (int)(sizeof(num) * 8));
        printf("num = 0x%x\n", num);
        printf("num + 1 = 0x%x\n", num + 1);

        return 0;
    }
    /* EOF */

The output of this program looks like this:

    nova:signed {4} ./ex2
    num is 32 bits long
    num = 0xffffffff
    num + 1 = 0x0

Note:
The astute reader will have noticed that 0xffffffff is decimal -1, so it
appears that we're just doing
    1 + (-1) = 0
Whilst this is one way of looking at what's going on, it may cause some
confusion since the variable num is unsigned and therefore all arithmetic
done on it will be unsigned. As it happens, a lot of signed arithmetic
depends on integer overflows, as the following demonstrates (assume both
operands are 32 bit variables):

    -700       + 800   = 100
    0xfffffd44 + 0x320 = 0x100000064

Since the result of the addition exceeds the range of the variable, the
lowest 32 bits are used as the result. These low 32 bits are 0x64, which
is equal to decimal 100.
</note>

Since an integer is signed by default, an integer overflow can cause a
change in signedness which can often have interesting effects on subsequent
code. Consider the following example:

    /* ex3.c - change of signedness */
    #include <stdio.h>

    int main(void){
        int l;

        l = 0x7fffffff;

        printf("l = %d (0x%x)\n", l, l);
        printf("l + 1 = %d (0x%x)\n", l + 1 , l + 1);

        return 0;
    }
    /* EOF */

The output of which is:

    nova:signed {38} ./ex3
    l = 2147483647 (0x7fffffff)
    l + 1 = -2147483648 (0x80000000)

Here the integer is initialised with the highest positive value a signed
long integer can hold. When it is incremented, the most significant bit
(indicating signedness) is set and the integer is interpreted as being
negative.

Addition is not the only arithmetic operation which can cause an integer to
overflow. Almost any operation which changes the value of a variable can
cause an overflow, as demonstrated in the following example:

    /* ex4.c - various arithmetic overflows */
    #include <stdio.h>

    int main(void){
        int l, x;

        l = 0x40000000;

        printf("l = %d (0x%x)\n", l, l);

        x = l + 0xc0000000;
        printf("l + 0xc0000000 = %d (0x%x)\n", x, x);

        x = l * 0x4;
        printf("l * 0x4 = %d (0x%x)\n", x, x);

        x = l - 0xffffffff;
        printf("l - 0xffffffff = %d (0x%x)\n", x, x);

        return 0;
    }
    /* EOF */

Output:

    nova:signed {55} ./ex4
    l = 1073741824 (0x40000000)
    l + 0xc0000000 = 0 (0x0)
    l * 0x4 = 0 (0x0)
    l - 0xffffffff = 1073741825 (0x40000001)

The addition is causing an overflow in exactly the same way as the first
example, and so is the multiplication, although it may seem different. In
both cases the result of the arithmetic is too great to fit in an integer,
so it is reduced as described above. The subtraction is slightly
different, as it is causing an underflow rather than an overflow: an
attempt is made to store a value lower than the minimum value the integer
can hold, causing a wrap around. In this way we are able to force an
addition to subtract, a multiplication to divide or a subtraction to add.

------[ 2.2.1 Exploiting

One of the most common ways arithmetic overflows can be exploited is when a
calculation is made about how large a buffer must be allocated. Often a
program must allocate space for an array of objects, so it uses the
malloc(3) or calloc(3) routines to reserve the space and calculates how
much space is needed by multiplying the number of elements by the size of
an object. As has been previously shown, if we are able to control either
of these operands (number of elements or object size) we may be able to
mis-size the buffer, as the following code fragment shows:

    /* returns a copy of array, or NULL on allocation failure */
    int *myfunction(int *array, int len){
        int *myarray, i;

        myarray = malloc(len * sizeof(int));    /* [1] */
        if(myarray == NULL){
            return NULL;
        }

        for(i = 0; i < len; i++){               /* [2] */
            myarray[i] = array[i];
        }

        return myarray;
    }

This seemingly innocent function could bring about the downfall of a system
due to its lack of checking of the len parameter. The multiplication at
[1] can be made to overflow by supplying a high enough value for len, so we
can force the buffer to be any length we choose. By choosing a suitable
value for len, we can cause the loop at [2] to write past the end of the
myarray buffer, resulting in a heap overflow. This could be leveraged into
executing arbitrary code on certain implementations by overwriting malloc
control structures, but that is beyond the scope of this article.

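One way to close the hole at [1] is to refuse any element count whose byte
size cannot be represented. The sketch below is an illustrative rewrite, not
the author's code: the name copy_int_array is hypothetical, and it takes the
count as a size_t so that negative counts cannot be expressed at all.

```c
#include <stdint.h>
#include <stdlib.h>

/* Return a heap copy of the first len elements of array, or NULL if the
   byte count would overflow or the allocation fails. The caller frees
   the result. */
int *copy_int_array(const int *array, size_t len)
{
    int *copy;
    size_t i;

    /* len * sizeof(int) fits in a size_t only if len is at most
       SIZE_MAX / sizeof(int); check before multiplying. */
    if (len > SIZE_MAX / sizeof(int))
        return NULL;

    copy = malloc(len * sizeof(int));
    if (copy == NULL)
        return NULL;

    for (i = 0; i < len; i++)
        copy[i] = array[i];

    return copy;
}
```

With the check in place the buffer is always large enough for the loop that
follows, because the multiplication can no longer wrap.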
Another example:

    int catvars(char *buf1, char *buf2, unsigned int len1,
                unsigned int len2){
        char mybuf[256];

        if((len1 + len2) > 256){    /* [3] */
            return -1;
        }

        memcpy(mybuf, buf1, len1);      /* [4] */
        memcpy(mybuf + len1, buf2, len2);

        do_some_stuff(mybuf);

        return 0;
    }

In this example, the check at [3] can be bypassed by using suitable values
for len1 and len2 that will cause the addition to overflow and wrap around
to a low number. For example, the following values:

    len1 = 0x104
    len2 = 0xfffffffc

when added together would result in a wrap around with a result of 0x100
(decimal 256). This would pass the check at [3], then the memcpy(3)'s at
[4] would copy data well past the end of the buffer.

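The check at [3] can be restated so that no intermediate value is able to
wrap. The following is a sketch under the paper's assumption of 32 bit
unsigned lengths; the names fits_naive, fits_checked and MYBUF_SIZE are made
up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MYBUF_SIZE 256u   /* size of mybuf in catvars */

/* The check as catvars performs it: the sum can wrap before it is
   compared. Returns true if the check lets the lengths through. */
bool fits_naive(uint32_t len1, uint32_t len2)
{
    return (uint32_t)(len1 + len2) <= MYBUF_SIZE;
}

/* Compare each length against the space that is left instead of adding
   them; MYBUF_SIZE - len1 cannot wrap once len1 <= MYBUF_SIZE holds. */
bool fits_checked(uint32_t len1, uint32_t len2)
{
    return len1 <= MYBUF_SIZE && len2 <= MYBUF_SIZE - len1;
}
```

For len1 = 0x104 and len2 = 0xfffffffc, fits_naive returns true while
fits_checked returns false.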
--[ 3 Signedness Bugs

Signedness bugs occur when an unsigned variable is interpreted as signed,
or when a signed variable is interpreted as unsigned. This type of
behaviour can happen because internally to the computer, there is no
distinction between the way signed and unsigned variables are stored.
Recently, several signedness bugs showed up in the FreeBSD and OpenBSD
kernels, so there are many examples readily available.


----[ 3.1 What do they look like?

Signedness bugs can take a variety of forms, but some of the things to look
out for are:
    * signed integers being used in comparisons
    * signed integers being used in arithmetic
    * unsigned integers being compared to signed integers

Here is a classic example of a signedness bug:

    int copy_something(char *buf, int len){
        char kbuf[800];

        /* the cast keeps the comparison signed; without it len would be
           converted to an unsigned type before being compared */
        if(len > (int)sizeof(kbuf)){    /* [1] */
            return -1;
        }

        memcpy(kbuf, buf, len);         /* [2] */
        return 0;
    }

The problem here is that memcpy takes an unsigned int as the len parameter,
but the bounds check performed before the memcpy is done using signed
integers. By passing a negative value for len, it is possible to pass the
check at [1], but then in the call to memcpy at [2], len will be interpreted
as a huge unsigned value, causing memory to be overwritten well past the
end of the buffer kbuf.

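The conversion that memcpy performs on its length argument, and a check that
takes it into account, can be sketched like this. The names as_copy_length,
len_is_safe and KBUF_SIZE are hypothetical and only serve the illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define KBUF_SIZE 800   /* size of kbuf in copy_something */

/* What memcpy sees when handed a signed length: the value is converted
   to size_t, so a negative number becomes a very large positive one. */
size_t as_copy_length(int len)
{
    return (size_t)len;
}

/* Reject negative lengths first, then compare as unsigned values. */
int len_is_safe(int len)
{
    return len >= 0 && (size_t)len <= (size_t)KBUF_SIZE;
}
```

as_copy_length(-1) yields the largest value a size_t can hold, which is why
the negative length has to be rejected before the copy rather than after.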
Another problem that can stem from signed/unsigned confusion occurs when
arithmetic is performed. Consider the following example:

    int table[800];

    int insert_in_table(int val, int pos){
        /* the cast keeps the comparison signed; without it pos would be
           converted to an unsigned type before being compared */
        if(pos > (int)(sizeof(table) / sizeof(int))){
            return -1;
        }

        table[pos] = val;

        return 0;
    }

Since the line
    table[pos] = val;
is equivalent to
    *(table + pos) = val;
and pointer arithmetic scales pos by sizeof(int), the write lands
pos * sizeof(int) bytes away from the start of table. We can see that the
problem here is that the code does not expect a negative operand for the
addition: it expects (table + pos) to be greater than table, so providing a
negative value for pos causes a situation which the program does not expect
and can therefore not deal with.

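A bounds check that covers both ends of the table might look like the
following sketch. The name insert_in_table_checked and the TABLE_LEN macro
are hypothetical; the check also uses >= for the upper bound, since index
800 is already one past the last element.

```c
#define TABLE_LEN 800

static int table[TABLE_LEN];

/* Store val at table[pos]. Returns 0 on success, or -1 if pos falls
   outside the table on either side. */
int insert_in_table_checked(int val, int pos)
{
    if (pos < 0 || pos >= TABLE_LEN)
        return -1;

    table[pos] = val;
    return 0;
}
```

Both a negative index and an index of 800 or more are rejected before the
pointer arithmetic takes place.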
------[ 3.1.1 Exploiting

This class of bug can be problematic to exploit, due to the fact that
signed integers, when interpreted as unsigned, tend to be huge. For
example, -1 when represented in hexadecimal is 0xffffffff. When
interpreted as unsigned, this becomes the greatest value it is possible to
represent in an integer (4,294,967,295), so if this value is passed to
memcpy as the len parameter (for example), memcpy will attempt to copy 4GB
of data to the destination buffer. Obviously this is likely to cause a
segfault or, if not, to trash a large amount of the stack or heap.
Sometimes it is possible to get around this problem by passing a very low
value for the source address and hoping, but this is not always possible.


----[ 3.2 Signedness bugs caused by integer overflows

Sometimes, it is possible to overflow an integer so that it wraps around to
a negative number. Since the application is unlikely to expect such a
value, it may be possible to trigger a signedness bug as described above.

An example of this type of bug could look like this:

    int get_two_vars(int sock, char *out, int len){
        char buf1[512], buf2[512];
        unsigned int size1, size2;
        int size;

        if(recv(sock, buf1, sizeof(buf1), 0) < 0){
            return -1;
        }
        if(recv(sock, buf2, sizeof(buf2), 0) < 0){
            return -1;
        }

        /* packet begins with length information */
        memcpy(&size1, buf1, sizeof(int));
        memcpy(&size2, buf2, sizeof(int));

        size = size1 + size2;       /* [1] */

        if(size > len){             /* [2] */
            return -1;
        }

        memcpy(out, buf1, size1);
        memcpy(out + size1, buf2, size2);

        return size;
    }

This example shows what can sometimes happen in network daemons, especially
when length information is passed as part of the packet (in other words, it
is supplied by an untrusted user). The addition at [1], used to check that
the data does not exceed the bounds of the output buffer, can be abused by
setting size1 and size2 to values that will cause the size variable to wrap
around to a negative value. Example values could be:

    size1 = 0x7fffffff
    size2 = 0x7fffffff
    (0x7fffffff + 0x7fffffff = 0xfffffffe (-2)).

When this happens, the bounds check at [2] passes, and a lot more of the
out buffer can be written to than was intended (in fact, arbitrary memory
can be written to, as the (out + size1) dest parameter in the second memcpy
call allows us to get to any location in memory).

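A check for [2] that never forms the wrapped sum could be sketched as below.
The function name sizes_fit is hypothetical, uint32_t stands in for the 32
bit unsigned int of the example, and the sketch only covers the output
buffer bound, not the sizes of buf1 and buf2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if size1 + size2 bytes fit into an output buffer of len
   bytes. The two sizes are never added together, so nothing can wrap. */
bool sizes_fit(uint32_t size1, uint32_t size2, int len)
{
    if (len < 0)
        return false;
    if (size1 > (uint32_t)len)
        return false;
    /* size1 <= len holds here, so the subtraction cannot underflow */
    return size2 <= (uint32_t)len - size1;
}
```

With size1 = size2 = 0x7fffffff the function returns false for any
realistic buffer length, whereas the signed sum in the original wraps to -2
and passes.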
These bugs can be exploited in exactly the same way as regular signedness
bugs and have the same problems associated with them - i.e. negative values
translate to huge positive values, which can easily cause segfaults.


--[ 4 Real world examples

There are many real world applications containing integer overflows and
signedness bugs, particularly network daemons and, frequently, in operating
system kernels.

----[ 4.1 Integer overflows

This (non-exploitable) example was taken from a security module for Linux.
This code runs in the kernel context:

    int rsbac_acl_sys_group(enum rsbac_acl_group_syscall_type_t call,
                            union rsbac_acl_group_syscall_arg_t arg)
    {
    ...
        switch(call)
        {
            case ACLGS_get_group_members:
                if(   (arg.get_group_members.maxnum <= 0) /* [A] */
                   || !arg.get_group_members.group
                  )
                {
                ...
                rsbac_uid_t * user_array;
                rsbac_time_t * ttl_array;

                user_array = vmalloc(sizeof(*user_array) *
                             arg.get_group_members.maxnum);   /* [B] */
                if(!user_array)
                    return -RSBAC_ENOMEM;
                ttl_array = vmalloc(sizeof(*ttl_array) *
                            arg.get_group_members.maxnum);    /* [C] */
                if(!ttl_array)
                {
                    vfree(user_array);
                    return -RSBAC_ENOMEM;
                }

                err =
                rsbac_acl_get_group_members(arg.get_group_members.group,
                                            user_array,
                                            ttl_array,

                                            arg.get_group_members.maxnum);
                ...
        }

In this example, the bounds checking at [A] is not sufficient to prevent
the integer overflows at [B] and [C]. By passing a high enough (i.e.
greater than 0xffffffff / 4) value for arg.get_group_members.maxnum, we
can cause the multiplications at [B] and [C] to overflow and force the
buffers ttl_array and user_array to be smaller than the application
expects. Since rsbac_acl_get_group_members copies user controlled data
to these buffers, it is possible to write past the end of the user_array
and ttl_array buffers. In this case, the application used vmalloc() to
allocate the buffers, so an attempt to write past the end of the buffers
will simply raise an error, so it cannot be exploited. Even so, it
provides an example of what these bugs can look like in real code.

Another example of a recent real world integer overflow vulnerability
was the problem in the XDR RPC library (discovered by ISS X-Force). In this
case, user supplied data was used in the calculation of the size of a
dynamically allocated buffer which was filled with user supplied data. The
vulnerable code was this:

    bool_t
    xdr_array (xdrs, addrp, sizep, maxsize, elsize, elproc)
        XDR *xdrs;
        caddr_t *addrp;     /* array pointer */
        u_int *sizep;       /* number of elements */
        u_int maxsize;      /* max number of elements */
        u_int elsize;       /* size in bytes of each element */
        xdrproc_t elproc;   /* xdr routine to handle each element */
    {
        u_int i;
        caddr_t target = *addrp;
        u_int c;            /* the actual element count */
        bool_t stat = TRUE;
        u_int nodesize;

        ...

        c = *sizep;
        if ((c > maxsize) && (xdrs->x_op != XDR_FREE))
        {
            return FALSE;
        }
        nodesize = c * elsize;      /* [1] */

        ...

        *addrp = target = mem_alloc (nodesize);     /* [2] */

        ...

        for (i = 0; (i < c) && stat; i++)
        {
            stat = (*elproc) (xdrs, target, LASTUNSIGNED);      /* [3] */
            target += elsize;
        }

As you can see, by supplying large values for elsize and c (sizep), it
was possible to cause the multiplication at [1] to overflow and cause
nodesize to be much smaller than the application expected. Since
nodesize was then used to allocate a buffer at [2], the buffer could be
mis-sized leading to a heap overflow at [3]. For more information on this
hole, see the CERT advisory listed in the appendix.


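The multiplication at [1] can be guarded with a division-based pre-check.
This is a generic sketch of that technique, not the fix that was actually
applied to the XDR library; the function name checked_mul_uint is
hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Store count * elsize in *out and return true, or return false (leaving
   *out untouched) if the product would not fit in an unsigned int.
   count * elsize <= UINT_MAX holds exactly when
   count <= UINT_MAX / elsize, for any non-zero elsize. */
bool checked_mul_uint(unsigned int count, unsigned int elsize,
                      unsigned int *out)
{
    if (elsize != 0 && count > UINT_MAX / elsize)
        return false;
    *out = count * elsize;
    return true;
}
```

A caller in the position of xdr_array would bail out when the function
returns false, before any buffer is allocated from the product.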
----[ 4.2 Signedness bugs
|
||||||
|
|
||||||
|
Recently, several signedness bugs were brought to light in the freebsd
|
||||||
|
kernel. These allowed large portions of kernel memory to be read by
|
||||||
|
passing
|
||||||
|
negative length paramters to various syscalls. The getpeername(2) function
|
||||||
|
had such a problem and looked like this:
|
||||||
|
|
||||||
|
static int
|
||||||
|
getpeername1(p, uap, compat)
|
||||||
|
struct proc *p;
|
||||||
|
register struct getpeername_args /* {
|
||||||
|
int fdes;
|
||||||
|
caddr_t asa;
|
||||||
|
int *alen;
|
||||||
|
} */ *uap;
|
||||||
|
int compat;
|
||||||
|
{
|
||||||
|
struct file *fp;
|
||||||
|
register struct socket *so;
|
||||||
|
struct sockaddr *sa;
|
||||||
|
int len, error;
|
||||||
|
|
||||||
|
...
|
||||||
|
|
||||||
|
error = copyin((caddr_t)uap->alen, (caddr_t)&len, sizeof (len));
|
||||||
|
if (error) {
|
||||||
|
fdrop(fp, p);
|
||||||
|
return (error);
|
||||||
|
}
|
||||||
|
|
||||||
|
...
|
||||||
|
|
||||||
|
len = MIN(len, sa->sa_len); /* [1] */
|
||||||
|
error = copyout(sa, (caddr_t)uap->asa, (u_int)len);
|
||||||
|
if (error)
|
||||||
|
goto bad;
|
||||||
|
gotnothing:
|
||||||
|
error = copyout((caddr_t)&len, (caddr_t)uap->alen, sizeof (len));
|
||||||
|
bad:
|
||||||
|
if (sa)
|
||||||
|
FREE(sa, M_SONAME);
|
||||||
|
fdrop(fp, p);
|
||||||
|
return (error);
|
||||||
|
}
|
||||||
|
|
||||||
|
This is a classic example of a signedness bug - the check at [1] did not
|
||||||
|
take into account the fact that len could be negative, in which case the
|
||||||
|
MIN macro would always return len. When this negative len parameter was
|
||||||
|
passed to copyout, it was interpreted as a huge positive integer which
|
||||||
|
caused copyout to copy up to 4GB of kernel memory to user space.
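
The effect of the signed comparison followed by the unsigned cast can be
reproduced in isolation. The sketch below is not the kernel code:
flawed_copy_len and checked_copy_len are hypothetical names, and rejecting
negative lengths is just one possible fix, assumed here for illustration.

```c
#define MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Mirrors the flawed computation: a signed MIN followed by a cast to
 * unsigned, as in the line marked [1] and the copyout call after it. */
unsigned int flawed_copy_len(int user_len, int actual_len)
{
    int len = MIN(user_len, actual_len); /* a negative user_len always wins */
    return (unsigned int)len;            /* and wraps to a huge value here */
}

/* Hypothetical fix: refuse negative lengths before comparing.
 * Returns 0 on success and -1 if the length is rejected. */
int checked_copy_len(int user_len, int actual_len, unsigned int *out)
{
    if (user_len < 0)
        return -1;
    *out = (unsigned int)MIN(user_len, actual_len);
    return 0;
}
```

Calling flawed_copy_len(-1, 16) yields the largest unsigned int, whereas
checked_copy_len(-1, 16, &n) reports an error and leaves n untouched.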
|
||||||
|
|
||||||
|
|
||||||
|
--[ Conclusion
|
||||||
|
|
||||||
|
Integer overflows can be extremely dangerous, partly because it is
|
||||||
|
impossible to detect them after they have happened. If an integer overflow
|
||||||
|
takes place, the application cannot know that the calculation it has
|
||||||
|
performed is incorrect, and it will continue under the assumption that it
|
||||||
|
is. Even though they can be difficult to exploit, and frequently cannot be
|
||||||
|
exploited at all, they can cause unexpected behaviour, which is never a good
|
||||||
|
thing in a secure system.
|
||||||
|
|
||||||
|
|
||||||
|
--[ Appendix
|
||||||
|
|
||||||
|
CERT advisory on the XDR bug:
|
||||||
|
http://www.cert.org/advisories/CA-2002-25.html
|
||||||
|
FreeBSD advisory: http://online.securityfocus.com/advisories/4407
|
||||||
|
|
||||||
|
|
||||||
|
|=[ EOF ]=---------------------------------------------------------------=|
|
||||||
|
|
BIN
tesi/tesi.pdf
Binary file not shown.
|
@ -636,6 +636,131 @@ pattern is not exhaustive or some patterns are shadowed by precedent ones.
|
||||||
|
|
||||||
Symbolic execution is a widely used technique in the field of
computer security.
|
||||||
|
It makes it possible to analyze different execution paths of a program
simultaneously while tracking which inputs trigger the execution of
|
||||||
|
different parts of the program.
|
||||||
|
Inputs are modelled symbolically rather than taking "concrete" values.
|
||||||
|
A symbolic execution engine keeps track of expressions and variables
|
||||||
|
in terms of these symbolic values and attaches logical constraints to every
|
||||||
|
branch that is being followed.
|
||||||
|
Symbolic execution engines are used to find bugs by modelling the
domain of all possible inputs of a program, to detect infeasible
paths and dead code, and to prove that two code segments are equivalent.
|
||||||
|
|
||||||
|
Let's take as an example this signedness bug that was found in the
FreeBSD kernel and that allowed a caller of the getpeername function to
read portions of kernel memory.
|
||||||
|
#+BEGIN_SRC c
|
||||||
|
int compat;
|
||||||
|
{
|
||||||
|
struct file *fp;
|
||||||
|
register struct socket *so;
|
||||||
|
struct sockaddr *sa;
|
||||||
|
int len, error;
|
||||||
|
|
||||||
|
...
|
||||||
|
|
||||||
|
len = MIN(len, sa->sa_len); /* [1] */
|
||||||
|
error = copyout(sa, (caddr_t)uap->asa, (u_int)len);
|
||||||
|
if (error)
|
||||||
|
goto bad;
|
||||||
|
|
||||||
|
...
|
||||||
|
|
||||||
|
bad:
|
||||||
|
if (sa)
|
||||||
|
FREE(sa, M_SONAME);
|
||||||
|
fdrop(fp, p);
|
||||||
|
return (error);
|
||||||
|
}
|
||||||
|
#+END_SRC
|
||||||
|
|
||||||
|
The execution tree of the function is shown below, where /int len/ is
our symbolic variable α, sa->sa_len is the symbolic variable β
and π is the set of constraints on a symbolic variable:
|
||||||
|
|
||||||
|
#+BEGIN_SRC
|
||||||
|
[1] compat (...) { π_{α}: -∞ < α < ∞ }
|
||||||
|
|
|
||||||
|
[2] min (σ₁, σ₂) { π_{σ}: -∞ < (σ₁,σ₂) < ∞ ; π_{α}: -∞ < α < β ; π_{β}: ...}
|
||||||
|
|
|
||||||
|
[3] cast(u_int) (...) { π_{σ}: 0 ≤ (σ) < ∞ ; π_{α}: -∞ < α < β ; π_{β}: ...}
|
||||||
|
|
|
||||||
|
... // rest of the execution
|
||||||
|
#+END_SRC
|
||||||
|
We can see that at step 3 the set of possible values of the scrutinee
α is bigger than the set of possible values of the input σ to the
/cast/ directive, that is: π_{α} ⊈ π_{σ}. For this reason the /cast/ does not
preserve the value of α when /len/ is a negative number, outside the domain π_{σ}.
In C this conversion is well defined but wraps modulo 2^N (N being the width of
u_int), so a negative /len/ becomes a very large unsigned length; this is what
made the exploitation possible.
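
The check π_{α} ⊈ π_{σ} can be sketched with plain intervals. This is a
minimal illustration under stated assumptions, not the representation used
by an actual engine: the struct interval type and the functions after_min
and interval_subset are hypothetical names, and every constraint is assumed
to be a single inclusive range.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical constraint representation: one inclusive range. */
struct interval {
    long long lo;
    long long hi;
};

/* Step 2: after len = MIN(len, beta) the upper bound of alpha is
 * clamped to beta, while the lower bound is left untouched. */
struct interval after_min(struct interval alpha, long long beta)
{
    if (alpha.hi > beta)
        alpha.hi = beta;
    return alpha;
}

/* Step 3: the cast preserves values only if every value allowed by a
 * is also allowed by b, that is, if a is a subset of b. */
bool interval_subset(struct interval a, struct interval b)
{
    return a.lo >= b.lo && a.hi <= b.hi;
}
```

Starting from α in [INT_MIN, INT_MAX] and β = 16, after_min gives
[INT_MIN, 16], which is not a subset of the unsigned domain [0, UINT_MAX];
an engine would report the cast at that point.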
|
||||||
|
|
||||||
|
Every step of evaluation can be modelled as the following transition:
|
||||||
|
#+BEGIN_SRC
|
||||||
|
(π_{σ}, (πᵢ)ⁱ) → (π'_{σ}, (π'ᵢ)ⁱ)
|
||||||
|
#+END_SRC
|
||||||
|
If we express the π constraints as logical formulas we can model the
|
||||||
|
execution of the program in terms of Hoare Logic.
|
||||||
|
The state of the computation is a Hoare triple {P}C{Q} where P and Q are
|
||||||
|
respectively the /precondition/ and the /postcondition/ that
|
||||||
|
constitute the assertions of the program. C is the directive being
|
||||||
|
executed.
|
||||||
|
The language of the assertions P is:
|
||||||
|
#+BEGIN_SRC
|
||||||
|
P ::= true | false | a < b | P₁ ∧ P₂ | P₁ ∨ P₂ | ~P
|
||||||
|
#+END_SRC
|
||||||
|
where a and b are numbers.
|
||||||
|
In the Hoare rules assertions could also take the form of
|
||||||
|
#+BEGIN_SRC
|
||||||
|
P ::= ∀i. P | ∃i. P | P₁ ⇒ P₂
|
||||||
|
#+END_SRC
|
||||||
|
where i is a logical variable, but assertions of these kinds increase
|
||||||
|
the complexity of the symbolic engine.
|
||||||
|
Execution follows the rules of Hoare logic:
|
||||||
|
- Empty statement :
|
||||||
|
\begin{verbatim}
|
||||||
|
————————————
|
||||||
|
{P}/skip/{P}
|
||||||
|
\end{verbatim}
|
||||||
|
- Assignment statement : The truth of P[a/x] before the assignment is
equivalent to the truth of {P} after the assignment.
|
||||||
|
\begin{verbatim}
|
||||||
|
————————————
|
||||||
|
{P[a/x]}x:=a{P}
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
- Composition : c₁ and c₂ are directives that are executed in order;
|
||||||
|
{R} is the /midcondition/.
|
||||||
|
\begin{verbatim}
|
||||||
|
{P}c₁{R}, {R}c₂{Q}
|
||||||
|
——————————————————
|
||||||
|
{P}c₁;c₂{Q}
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
- Conditional :
|
||||||
|
\begin{verbatim}
|
||||||
|
{P∧b}c₁{Q}, {P∧~b}c₂{Q}
|
||||||
|
————————————————————————
|
||||||
|
{P}if b then c₁ else c₂{Q}
|
||||||
|
\end{verbatim}
|
||||||
|
|
||||||
|
- Loop : {P} is the loop invariant. When the loop terminates, /P/
still holds and so does ~b, since that is what ended the loop.
|
||||||
|
\begin{verbatim}
|
||||||
|
{P∧b}c{P}
|
||||||
|
————————————————————————
|
||||||
|
{P}while b do c{P∧~b}
|
||||||
|
\end{verbatim}
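
The rules above can be exercised on a small program by turning the
assertions into runtime checks. The sketch below is an illustration, not
part of the formal development: sum_below is a hypothetical function, the
invariant was chosen by hand, and n is assumed small enough that the
arithmetic does not overflow.

```c
#include <assert.h>

/* {n >= 0} s := 0; i := 0; while i < n do (s := s + i; i := i + 1)
 * {2s = n(n-1)}, with loop invariant P: 2s = i(i-1) and i <= n. */
int sum_below(int n)
{
    assert(n >= 0);                         /* precondition */
    int s = 0;
    int i = 0;                              /* assignment rule, twice */
    assert(2 * s == i * (i - 1) && i <= n); /* P holds on loop entry */
    while (i < n) {                         /* here P and b both hold */
        s += i;
        i += 1;
        assert(2 * s == i * (i - 1) && i <= n); /* body re-establishes P */
    }
    assert(i == n);                         /* follows from P and ~b */
    assert(2 * s == n * (n - 1));           /* postcondition */
    return s;
}
```

For instance sum_below(5) adds 0 + 1 + 2 + 3 + 4 and every assertion along
the way holds; a symbolic engine would discharge the same assertions for
all admissible n instead of for one concrete input.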
|
||||||
|
|
||||||
|
Even if the semantics of symbolic execution engines are well defined,
|
||||||
|
the user may run into different complications when applying such
|
||||||
|
analysis to non-trivial codebases.
|
||||||
|
For example, depending on the domain, loop termination is not
guaranteed. Even when termination is guaranteed, every branch inside a
loop body forks the analysis at each iteration, so the number of paths
grows exponentially and may lead to path explosion or state explosion.
|
||||||
|
Reasoning about all possible executions of a program is not always
feasible, and in case of explosion symbolic execution engines usually
implement heuristics to reduce the size of the search space.
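
The growth of the search space can be made concrete with a toy model. The
sketch assumes a loop whose body contains a single two-way branch on a
symbolic condition; count_paths and count_paths_bounded are hypothetical
names, and bounding the number of unrolled iterations is only one common
heuristic, chosen here as an example.

```c
#include <stdint.h>

/* Number of distinct paths through `iterations` loop iterations, each
 * containing one two-way symbolic branch. Deliberately explores both
 * successors, so keep the argument small (it must stay below 64). */
uint64_t count_paths(unsigned iterations)
{
    if (iterations == 0)
        return 1;
    /* each iteration forks into a taken and a not-taken successor */
    return count_paths(iterations - 1) + count_paths(iterations - 1);
}

/* Hypothetical heuristic: unroll at most `bound` iterations and give up
 * on anything deeper, trading completeness for a finite search space. */
uint64_t count_paths_bounded(unsigned iterations, unsigned bound)
{
    unsigned explored = iterations < bound ? iterations : bound;
    return count_paths(explored);
}
```

Ten iterations already produce 1024 paths, while a bound of 3 keeps the
engine at 8 paths regardless of how many iterations the loop could run.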
|
||||||
|
|
||||||
** Translation validation
Translators, such as compilers and code generators, are huge pieces of
|
||||||
|
|