SATABS Manual

SATABS – Predicate Abstraction with SAT

Background

Sound Abstractions

This section provides background information on how SATABS operates. Even for very trivial C programs it is impossible to exhaustively examine their state space (which is potentially unbounded). However, not all details in a C program necessarily contribute to a bug, so it may be sufficient to only examine those parts of the program that are somehow related to a bug.

In practice, many static verification tools (such as lint) try to achieve this goal by applying heuristics. This approach comes at a cost: bugs might be overlooked because the heuristics do not cover all relevant aspects of the program. Therefore, the conclusion that a program is correct whenever such a static verification tool is unable to find an error is invalid.

CEGAR Loop

A more sophisticated approach that has been very successful recently is to generate a sound abstraction of the original program. In this context, soundness refers to the fact that the abstract program contains (at least) all relevant behaviors (i.e., bugs) that are present in the original program. In the Figure above, the first component strips details from the original program. The number of possible behaviors increases as the number of details in the abstract program decreases. Intuitively, the reason is that whenever the model checking tool lacks the information that is necessary to make an accurate decision on whether a branch of an control flow statement can be taken or not, both branches have to be considered.

In the resulting abstract program, a set of concrete states is subsumed by means of a single abstract state. Consider the following figure:

The concrete states x1 and x2 are mapped to an abstract state X, and similarly Y subsumes y1 and y2. However, all transitions that are possible in the concrete program are also possible in the abstract model. The abstract transition XY summarizes the concrete transitions x1y1 and x1x1, and YX corresponds to x1x2. The behavior XYX is feasible in the original program, because it maps to x1x1x2. However, YXY is feasible only in the abstract model.

Spurious Counterexamples

The consequence is that the model checker (component number two in the figure above) possibly reports a spurious counterexample. We call a counterexample spurious whenever it is feasible in the current abstract model but not in the original program. However, whenever the model checker is unable to find an execution trace that violates the given property, we can conclude that there is no such trace in the original program, either.

The feasibility of counterexamples is checked by symbolic simulation (performed by component three in the figure above). If the counterexample is indeed feasible, SATABS found a bug in the original program and reports it to the user.

Automatic Refinement

On the other hand, infeasible counterexamples (that originate from abstract behaviors that result from the omission of details and are not present in the original program) are never reported to the user. Instead, the information is used in order to refine the abstraction such that the spurious counterexample is not part of the refined model anymore. For instance, the reason for the infeasibility of YXY is that neither y1 nor x1 can be reached from x2. Therefore, the abstraction can be refined by partitioning X.

The refinement steps can be illustrated as follows:

Iterative refinement

The first step (1) is to generate a very coarse abstraction with a very small state space. This abstraction is then successively refined (2, 3, ...) until either a feasible counterexample is found or the abstract program is detailed enough to show that there is no path that leads to a violation of the given property. The problem is that this point is not necessarily reached for every input program, i.e., it is possible that the the abstraction refinement loop never terminates. Therefore, SATABS allows to specify an upper bound for the number of iterations.

When this upper bound is reached and no counterexample was found, this does not necessarily mean that there is none. In this case, you cannot make any conclusions at all with respect to the correctness of the input program.