Memory Layouts and Pointers
Specifying Memory Layout
Programs are about more than just numeric values. First Example: Counting Set Bits describes a program that works on integer values, but most C programs involve changes to values in memory. In addition to describing the return value, specifying most C programs involves describing an initial state of the heap and then relating it to the state of the heap after the program has run. SAW supports specifying programs that involve heaps and pointers.
The specification for popcount
could get away with talking only about the integer values of arguments to a function and its return value. This section introduces minmax
, which swaps two pointers if the first pointer’s target is greater than the second pointer’s target. The return value is -1
if the first pointer’s original target was less than the second’s, 0
if they were equal, and 1
if the second pointer’s original target was greater than the first’s.
A reference implementation of minmax
follows the English specification closely:
However, the ordering of the modifications to memory and the comparisons of values can be tricky to get right in C. Instead of using a C program as the specification, this section will use a specification written in a language called Cryptol.
Cryptol
SAWScript has good facilities for describing memory layouts and pre- and postconditions, but not for specifying algorithms. It is often used together with Cryptol, a domain-specific language for implementing low-level cryptographic algorithms or DSP transforms that reads much like a mathematical description. This helps bridge the gap between formal descriptions and real implementations.
A Cryptol specification for minmax
looks like this:
The first line of the file is a module header. It states that the current module is named MinMax
. In this module, there are two definitions: minmax
, which specifies the values expected in the pointers’ targets after running minmax
, and minmax_return
, which specifies the value to be returned from minmax
.
Each definition begins with a type declaration. These are optional: Cryptol always type checks code, but it can usually infer types on its own. Nevertheless, they make the specification easier to understand. Also, Cryptol’s type system is very general, and some of the types that it finds on its own may be complicated. The type of minmax
can be read as “a function that takes an pair of 64-bit values as an argument, and returns a pair of 64-bit values” (the arrow ->
separates the argument type from the return type). The type of minmax_return
can be read as “a function that takes a pair of 64-bit values as an argument, and returns a single 8-bit value”.
The Cryptol definition of minmax
uses pattern matching to name the first and second elements of the incoming pair as x
and y
, respectively. The right side of the =
specifies that the return value is the pair (y, x)
if x
is greater than y
, or the original argument pair (x, y)
otherwise. Because Cryptol’s type system doesn’t distinguish between signed and unsigned integers, the operator >$
is used for signed comparison, while >
is used for unsigned comparison.
Alternatively, the definition could be written without pattern matching. In this case, the first and second elements of the pair are accessed using the .1
and .0
operators. Pairs can be seen as analogous to structs whose fields are named by numbers.
Cryptol is useful in two different ways in SAW: it is used as a standalone specification language, and it also provides a syntax for explicit expressions in SAWScript specification, in which case it occurs in double braces ({{ }}
).
Here is the complete SAWScript for verifying our minmax
function.
After including helpers.saw
, the first step in using a Cryptol specification for minmax
is to load the Cryptol module.
Note: In SAWScript, include
is used to include the contents of a SAWScript file, while import
is used for Cryptol files.
The SAWScript definition minmax_ok
specifies the following:
Symbolic integers and pointers to them in the heap are established.
pointer_to_fresh
returns a tuple - the first element is a symbolic variable that’s accessible from Cryptol, the second element is a pointer to allocated memory of some type (in this case,int64_t
). The pointer’s value is set to point at the allocated memory. This is done twice, once for each argument.The arguments to be provided to
minmax
are specified usingexecute
. In this case, the function will be called on the two pointers.The desired targets of the pointers (that is, the values that they should point at after the function call) are specified using
points_to
afterexecute
. In this case, the Cryptolminmax
function is called, and the resulting pair is saved inresult_spec
, which is then used to provide the pointers’ targets.The return value is specified in the same manner as that of
popcount
, by usingreturns
. In this case, rather than specifying the constantTRUE
, the result is also given by a Cryptol specification.
Note: Cryptol snippets in double braces can refer to both minmax
and to x
and y
. The Cryptol snippets can refer to anything imported from a Cryptol module with import
, and also to any name in scope that refers to a SAWCore term. In other words, the SAWScript name x
can also be used as a Cryptol name to point at a SAWCore term.
Finally, verification is invoked just as in
popcount
, usingllvm_verify
.
Exercises: Getting Started with SAW and Pointers
This exercise does not require the use of Cryptol.
Write a C function that zeroes out the target of a pointer. It should have the following prototype:
Write a C function
zero_spec
that returnstrue
whenzero
is correct for some input. It should have the following prototype:Use SAW to verify that
zero_spec
always returnstrue
for your implementation ofzero
.
Exercise: Unsigned Arithmetic
Create a version of minmax
that specifies its arguments as uint64_t
instead of int64_t
, and attempt to verify it using minmax_ok
. What does the counterexample tell you about the bug that is introduced?
Exercise: Alternative Implementations
This version of minmax
avoids conditional statements, relying heavily on C’s ternary operator. Verify that it fulfills the specification.
Now, implement a version of minmax
that uses the XOR swap trick to move the values instead of a temporary variable. Verify it.
Exercise: Swapping and Rotating
Using SAW, write a specification for a C function that unconditionally swaps the targets of two pointers. Implement the function in C, and verify that it fulfills the specification. Both the specification and the implementation are simpler versions of minmax
, and the specification for swap
can be written without a Cryptol specification.
In the course of ordinary software development, requirements change over time. As requirements change, both programs and their specifications must evolve. A verification-oriented workflow can help maintain a correspondence between updated specifcations and code.
Modify the specification so that it describes a function rotr3
. After invoking rotr3
on pointers x
, y
, and z
, x
points to the previous target of y
, y
points to the previous target of z
, and z
points to the previous target of x
. Note the error message that occurs when using this specification for swap
.
Implement rotr3
, and verify it using the new specification.
Exercise: Arrays
In SAW, a C array type can be referred to using llvm_array
, which takes the number of elements and their type as arguments. For instance, uint32[3]
can be represented as llvm_array 3 (llvm_int 32)
. Similarly, the setup value that corresponds to an index in an array can be referred to using element
. For instance, if arr
refers to an array allocated using alloc
, then element arr 0
is the first element in arr
. These can be used with points_to
.
Write a version of rotr3
that expects its argument to be an array of three integers. Verify it using SAW.
Last updated