- Daniel Kroening
- SMT Lists/Sets/Maps
- Google groups:
SATABS – Predicate Abstraction with SAT
Tutorials
Example: Buffer Overflow in a Mail Transfer Agent
We explain how to model check Aeon version 0.2a, a small mail transfer agent written by Piotr Benetkiewicz. The description advertises Aeon as a "good choice for hardened or minimalistic boxes". The sources are available here.
Our first naive attempt to verify Aeon using
satabs *.c
produces a positive result, but also warns us that the property holds
trivially. It also reveals that a large number library functions are
missing: SATABS is unable to find the source code for library functions like
send
, write
and close
.
Now, do you have to provide a body for all missing library functions? There is no easy answer to this question, but a viable answer would be "most likely not". It is necessary to understand how SATABS handles functions without bodies: It simply assumes that such a function returns an arbitrary value, but that no other locations than the one on the left hand side of the assignment are changed. Obviously, there are cases in which this assumption is unsound, since the function potentially modifies all memory locations that it can somehow address.
We now use static analysis to generate array bounds checks for Aeon:
satabs *.c --pointer-check --bounds-check --show-properties
SATABS will show about 300 properties in various functions
(read this for more information
on the property instrumentation).
Now consider the first few lines of the main
function
of Aeon:
int main(int argc, char **argv)
{
char settings[MAX_SETTINGS][MAX_LEN];
...
numSet = getConfig(settings);
if (numSet == -1) {
logEntry("Missing config file!");
exit(1);
}
...
and the function getConfig
in lib_aeon.c
:
int getConfig(char settings[MAX_SETTINGS][MAX_LEN])
{
char home[MAX_LEN];
FILE *fp; /* .rc file handler */
int numSet = 0; /* number of settings */
strcpy(home, getenv("HOME")); /* get home path */
strcat(home, "/.aeonrc"); /* full path to rc file */
fp = fopen(home, "r");
if (fp == NULL) return -1; /* no cfg - ERROR */
while (fgets(settings[numSet], MAX_LEN-1, fp)
&& (numSet < MAX_SETTINGS)) numSet++;
fclose(fp);
return numSet;
}
The function getConfig
makes calls to
strcpy
, strcat
, getenv
, fopen
,
fgets
, and fclose
.
It is very easy to provide an implementation for the functions from
the string library (string.h), and SATABS comes with meaningful definitions
for most of them.
The definition of getenv
is not so straight-forward.
The man-page of getenv
(which we obtain by entering
man 3 getenv
in a Unix or cygwin command prompt) tells us:
`getenv'
searches the list of environment variable names and values (using the global pointerchar **environ
) for a variable whose name matches the string atNAME
. If a variable name matches,getenv
returns a pointer to the associated value.
SATABS has no information whatsoever about the content of
environ
. Even if SATABS could access the
environment variables on your computer, a successful verification
of Aeon
would then only guarantee that the properties for
this program hold on your computer with a specific set of
environment variables. We have to assume that environ
contains environment variables that have an arbitrary content
of arbitrary length. The content of environment
variables is not only arbitrary but could be malefic, since it
can be modified by the user. The approximation of the behavior of
getenv
that is shipped with SATABS completely ignores the
content of the string.
Now let us have another look at the properties that SATABS generates for the
models of the the string library and for getenv
. Most of these
properties require that we verify that the upper and lower bounds of buffers or
arrays are not violated. Let us look at one of the properties that SATABS
generates for the code in function getConfig
:
Claim getConfig.3:
file lib_aeon.c line 19 function getConfig
dereference failure: NULL plus offset pointer
!(SAME-OBJECT(src, NULL))
The model of the function strcpy
dereferences the pointer returned by getenv
, which may
return a NULL pointer. This possibility is detected by the static
analysis, and thus a corresponding property is generated. Let us
check this specific property:
satabs *.c --pointer-check --bounds-check --property getConfig.3
SATABS immediately returns a counterexample path
that demonstrates how getenv
returns a NULL, which
is subsequently dereferenced. We have identified the first
bug in this program: it requires that the environment variable
HOME is set, and crashes otherwise.
Let us examine one more property in the same function:
Claim getConfig.7:
file lib_aeon.c line 19 function getConfig
dereference failure: array `home' upper bound
!(POINTER_OFFSET(dst) + (int)i >= 512) || !(SAME-OBJECT(dst, &home[0]))
This property asserts that the upper bound of the array home
is not violated. The variable home
looks familiar: We encountered it in the function getConfig
given above. The function getenv
in combination with functions
strcpy
, strcat
or sprintf
is indeed
often the source for buffer overflows. Therefore, we try to use SATABS
to check the upper bound of the array home
:
satabs *.c --pointer-check --bounds-check --property getConfig.7
SATABS runs for quite a while and will eventually give up,
telling us that its upper bound for abstraction refinement iterations
has been exceeded. This is not exactly the result we were hoping for,
and we could now increase the bound for iterations with help of the
--iterations
command line switch of SATABS.
Before we do this, let us investigate why SATABS has failed to provide
a useful result. The function strcpy
contains a loop
that counts from 1 to the length of the input string. Predicate
abstraction, the mechanism SATABS is based on, is unable to detect
such loops and will therefore unroll the loop body as often as necessary.
The array home
has MAX_LEN
elements, and
MAX_LEN
is defined to be 512 in aeon.h
.
Therefore, SATABS would have to run through at least 512 iterations, only to
verify (or reject) one of the more than 300 properties! Does this fact
defeat the purpose of static verification?
We can make the job easier: after reducing the value of MAX_LEN
in aeon.h
to a small value, say to 10, SATABS provides a
counterexample trace that demonstrates how the buffer overflow be
reproduced. If you use the Eclipse plugin (as described here), you can step through this
counterexample. The trace contains the string that is returned
by getenv
.