SATABS Manual

SATABS – Predicate Abstraction with SAT

Tutorials

Example: Buffer Overflow in a Mail Transfer Agent

We explain how to model check Aeon version 0.2a, a small mail transfer agent written by Piotr Benetkiewicz. The description advertises Aeon as a "good choice for hardened or minimalistic boxes". The sources are available here.

Our first naive attempt to verify Aeon using

satabs *.c

produces a positive result, but also warns us that the property holds trivially. It also reveals that a large number library functions are missing: SATABS is unable to find the source code for library functions like send, write and close.

Now, do you have to provide a body for all missing library functions? There is no easy answer to this question, but a viable answer would be "most likely not". It is necessary to understand how SATABS handles functions without bodies: It simply assumes that such a function returns an arbitrary value, but that no other locations than the one on the left hand side of the assignment are changed. Obviously, there are cases in which this assumption is un­sound, since the function potentially modifies all memory locations that it can somehow address.

We now use static analysis to generate array bounds checks for Aeon:

satabs *.c --pointer-check --bounds-check --show-properties

SATABS will show about 300 properties in various functions (read this for more information on the property instrumentation). Now consider the first few lines of the main function of Aeon:

int main(int argc, char **argv)
{
  char settings[MAX_SETTINGS][MAX_LEN];
  ...
  numSet = getConfig(settings);
  if (numSet == -1) {
    logEntry("Missing config file!");
    exit(1);
  }
  ...

and the function getConfig in lib_aeon.c:

int getConfig(char settings[MAX_SETTINGS][MAX_LEN])
{
  char home[MAX_LEN];
FILE *fp; /* .rc file handler */
int numSet = 0; /* number of settings */
 
strcpy(home, getenv("HOME"));  /* get home path */
strcat(home, "/.aeonrc"); /* full path to rc file */
fp = fopen(home, "r");
if (fp == NULL) return -1; /* no cfg - ERROR */
  while (fgets(settings[numSet], MAX_LEN-1, fp)
    && (numSet < MAX_SETTINGS)) numSet++;
fclose(fp);
 
return numSet;
}

The function getConfig makes calls to strcpy, strcat, getenv, fopen, fgets, and fclose. It is very easy to provide an implementation for the functions from the string library (string.h), and SATABS comes with meaningful definitions for most of them. The definition of getenv is not so straight-forward. The man-page of getenv (which we obtain by entering man 3 getenv in a Unix or cygwin command prompt) tells us:

`getenv' searches the list of en­vi­ron­ment variable names and values (using the global pointer char **environ) for a variable whose name matches the string at NAME. If a variable name matches, getenv returns a pointer to the associated value.

SATABS has no information whatsoever about the content of environ. Even if SATABS could access the en­vi­ron­ment variables on your computer, a successful verification of Aeon would then only guarantee that the properties for this program hold on your computer with a specific set of en­vi­ron­ment variables. We have to assume that environ contains en­vi­ron­ment variables that have an arbitrary content of arbitrary length. The content of en­vi­ron­ment variables is not only arbitrary but could be malefic, since it can be modified by the user. The approximation of the behavior of getenv that is shipped with SATABS completely ignores the content of the string.

Now let us have another look at the properties that SATABS generates for the models of the the string library and for getenv. Most of these properties require that we verify that the upper and lower bounds of buffers or arrays are not violated. Let us look at one of the properties that SATABS generates for the code in function getConfig:

Claim getConfig.3:
  file lib_aeon.c line 19 function getConfig
  dereference failure: NULL plus offset pointer
  !(SAME-OBJECT(src, NULL))

The model of the function strcpy dereferences the pointer returned by getenv, which may return a NULL pointer. This possibility is detected by the static analysis, and thus a corresponding property is generated. Let us check this specific property:

satabs *.c --pointer-check --bounds-check --property getConfig.3

SATABS immediately returns a counterexample path that demonstrates how getenv returns a NULL, which is subsequently dereferenced. We have identified the first bug in this program: it requires that the environment variable HOME is set, and crashes otherwise.

Let us examine one more property in the same function:

Claim getConfig.7:
  file lib_aeon.c line 19 function getConfig
  dereference failure: array `home' upper bound
  !(POINTER_OFFSET(dst) + (int)i >= 512) || !(SAME-OBJECT(dst, &home[0]))

This property asserts that the upper bound of the array home is not violated. The variable home looks familiar: We encountered it in the function getConfig given above. The function getenv in combination with functions strcpy, strcat or sprintf is indeed often the source for buffer overflows. Therefore, we try to use SATABS to check the upper bound of the array home:

satabs *.c --pointer-check --bounds-check --property getConfig.7

SATABS runs for quite a while and will eventually give up, telling us that its upper bound for abstraction refinement iterations has been exceeded. This is not exactly the result we were hoping for, and we could now increase the bound for iterations with help of the --iterations command line switch of SATABS.

Before we do this, let us investigate why SATABS has failed to provide a useful result. The function strcpy contains a loop that counts from 1 to the length of the input string. Predicate abstraction, the mechanism SATABS is based on, is unable to detect such loops and will therefore unroll the loop body as often as necessary. The array home has MAX_LEN elements, and MAX_LEN is defined to be 512 in aeon.h. Therefore, SATABS would have to run through at least 512 iterations, only to verify (or reject) one of the more than 300 properties! Does this fact defeat the purpose of static verification?

We can make the job easier: after reducing the value of MAX_LEN in aeon.h to a small value, say to 10, SATABS provides a counterexample trace that demonstrates how the buffer overflow be reproduced. If you use the Eclipse plugin (as described here), you can step through this counterexample. The trace contains the string that is returned by getenv.