1 of 12

Tutorials and Publications

Tutorials

Program Verification with SAW
Rust Verification with SAW
Verification with LLVM/Java

Publications

"Constructing semantic models of programs with the software analysis workbench" by Robert Dockins, Adam Foltzer, Joe Hendrix, Brian Huffman, Dylan McNamee, and Aaron Tomb
“Verified Cryptographic Code for Everybody” by Brett Boston, Samuel Breese, Joey Dodds, Mike Dodds, Brian Huffman, Adam Petcher, and Andrei Stefanescu
“Continuous Formal Verification of Amazon s2n” by Andrey Chudnov, Nathan Collins, Byron Cook, Joey Dodds, Brian Huffman, Colm MacCárthaigh, Stephen Magill, Eric Mertens, Eric Mullen, Serdar Tasiran, Aaron Tomb, and Eddy Westbrook

Program Verification with SAW

SAW, the Software Analysis Workbench, is a tool for program verification. This example-driven tutorial demonstrates how to use SAW for common tasks and how to integrate it into the software development lifecycle. The tutorial makes use of several example files.

Getting Started

This tutorial is intended to be an interactive experience. It is much easier to learn how to use a new tool by actually using it, making mistakes, and recovering from them. Please follow along in the provided exercises.

Background

This tutorial is written for programmers who know C, but who do not necessarily have any experience with formal verification. Deep knowledge of C is not required, but familiarity with pointers and the concept of undefined behavior are assumed. It is not necessary to be able to identify undefined behavior on sight.

Notation

Code examples, filenames, and commands that should be literally typed into a computer are represented with monospace font. For instance, the file main.c might contain the function

int main(int argc, char** argv) { ... }

which has an argument called argc. At times, italic text is used to represent mathematical variables. For instance, when relating programs to mathematical specifications, the program variable n might have the mathematical value $𝑥2$ .

Exercises: Initial Setup

The first step is to install all of the necessary tools. For this tutorial, you’ll need the following:

SAW
SAW can be dowloaded from the SAW web page.
Yices and Z3
This tutorial uses Yices and Z3. If you plan to work seriously with SAW, it is also
a good idea to install the other solvers listed on the SAW download page.
Cryptol
Cryptol is included with SAW. Please use the version of Cryptol that’s included,
because each SAW release requires a specific Cryptol version.
LLVM and Clang
Please make sure that you have LLVM and clang installed.

To make sure that you have everything working, download the example files. In the examples/intro directory, run the following commands:

make swap.bc
saw swap_cryptol.saw
cryptol Swap.cry

If everything succeeds, you’ll be at a Cryptol prompt. Use :q to exit Cryptol.

Troubleshooting / Installation Alternatives

If things don’t succeed, the most likely cause is that you have a newly-released version of LLVM. SAW is dependent on LLVM’s bitcode format, which often change between releases. If you get an error along these lines:

Are you sure you’re using a supported version of LLVM/Clang?
Check here: https://github.com/GaloisInc/llvm-pretty-bc-parser

you have a couple options:

Install an earlier version of clang and configure your platform’s PATH to use it instead of the current version, or
Use docker or vagrant to run saw and its tools in a virtual machine. The SAW VM configurations for docker and vagrant include known-good versions of all of SAW’s dependencies. The SAW install page describes how to install SAW in a Docker container.

Using Vagrant to Install and Use SAW

In some cases, it can be easiest to run the SAW tools in a virtual machine. Vagrant is a tool that manages the installation, configuration, starting and stopping of virtual machines with Linux guests. Here’s how to run SAW in a Vagrant virtual machine:

Install VirtualBox - instructions here
Install Vagrant - instructions here
cd to the examples directory unpacked from example files, which includes a Vagrantfile
Start and log in to the virtual machine with the SAW tools configured with these commands:

vagrant up       # launch the virtual machine
vagrant ssh      # log in to your virtual machine
cd /vagrant/examples/intro
make popcount.bc
saw pop.saw      # should run to completion

The first time you type vagrant up the system will download and configure SAW and its dependencies, so it will take a few minutes. Subsequent launches will be much faster.
When you’re done with a session, log out of the guest and cleanly shut down your virtual machine with the host command vagrant halt
Editing files while logged in to a virtual machine can be inconvenient. Vagrant guests have access to the host file system in the directory with the Vagrantfile, which is located in the guest at /vagrant, so it can be convenient to do your work in that directory, editing on your host, but running the SAW tools inside the virtual machine. In some cases you may have to install the “VirtualBox guest additions” to enable the shared vagrant folder.

Specifications and Verification

First Example: Counting Set Bits

Most developers are used to techniques like testing, continuous integration, and thoughtful documentation that can help prevent mistakes from being introduced into a system during its development and evolution. These techniques can be relatively effective, but they risk missing certain classes of bugs. For the most important systems, like those that protect human life or information security, it can make sense to also use formal verification, in which a program is mathematically proved to be correct for all inputs.

Testing takes the actual binary executable and runs it on a subset of the possible inputs to check for expected outputs. The downside of this approach is that tests may miss some critical case. Compared to testing, verification is the process of building a mathematical model of the software and proving properties of that model for all possible inputs.

In this lesson you’ll learn how to use a system called SAW, the Software Analysis Workbench, to build models of functions written in C. You’ll learn how to specify what those functions are supposed to do, and how to write a program in SAWScript that orchestrates the proof that the functions meet their specifications.

The Code

The first program to be verified is pop_count. This function takes a 32-bit integer and returns the number of bits that are set (“populated”). For example pop_count(0) is 0, pop_count(3) is 2, and pop_count(8) is 1. This description is an English language specification of the pop_count function. A specification can be written in a number of formats, including English sentences, but also in machine-readable forms. The advantage of machine-readable specifications is that they can be used as part of an automated workflow.

Note: The pop_count function has uses in many kinds of algorithms and has an interesting folklore.

Here is a sophisticated implementation of pop_count from the book Hacker’s Delight by Henry S. Warren Jr.:

int pop_count(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0F0F0F0F;
    x = x + (x >> 8);
    x = x + (x >> 16);
    return x & 0x0000003F;
}

Exercise: A Safe and a Broken `pop_count`

Write a version of pop_count that you believe to be correct, and also a version that includes the kind of error that could be made by accident (perhaps by adding a typo to the optimized version). Add them as pop_count_ok and pop_count_broken1 to popcount.c in the examples/intro directory.

Testing Programs

You’re not likely to be able to convince yourself that the optimized pop_count function is correct just by inspection. A unit test, like the following pop_check can help:

/* Test pop_count on a few values to make sure it's at least sometimes correct */
bool pop_check() {
    return (pop_count(0x0) == 0) &&
           (pop_count(0x3) == 2) &&
           (pop_count(0xFFFFFFFF) == 32) &&
           (pop_count(0xAAAAAAAA) == 16) &&
           (pop_count(0x55555555) == 16);
}

There are some downsides to testing only with chosen values, however. First off, these tests are usually selected by the author of the code, and there is a risk that important values are not tested. This can be ameliorated by a systematic, disciplined approach to choosing test values, but it can never be completely eliminated. This particular unit test is likely to catch egregiously buggy versions of popcount, but not subtle or tricky bugs.

A second approach to testing is to choose many random values at each execution. This approach may eventually find subtle or tricky mistakes, but not reliably or in a predictable amount of time.

Testing with random values requires an executable specification. This specification may just describe some properties of the output (e.g. that the length of two appended lists is the sum of the lengths of the input lists, or that the output of a sorting function is sorted), or it may be a simpler, more straightforward version of the code that uses an easier algorithm. An executable specification for popcount can loop over the bits in the word, masking them off one at a time. While this implementation is straightforward, it is also slow.

int pop_spec(uint32_t x) {
    uint32_t pop = 0;
    uint32_t mask = 1;
    for (int i = 0; i < 32; i++) {
        if (x & mask) { pop++; }
        mask = mask << 1;
    }
    return pop;
}

The function random_value_test performs randomized testing of a provided population count function, comparing its output to that of pop_spec. When they are not identical, it prints the offending input, which can aid in debugging.

void random_value_test(int (*fun)(uint32_t), char *name) {
    srand(time(NULL));

    int failures = 0;
    for (int i = 0; i < 100000; i ++) {
        uint32_t x = rand();
        int test = (*fun)(x);
        int check = pop_spec(x);
        if (test != check) {
            printf("Test failure: %s(%u) was %u, != %u\n",
                    name, x, test, check);
            failures++;
        }
    }
    if (failures == 0) {
        printf("Testing %s succeeded!\n", name);
    }
}

Finally, one could attempt to exhaustively check the values by enumerating and testing all possible combinations. In the simple case of pop_count, which only takes one 32-bit integer, this will take about 20 seconds. With a 64-bit version of the program, however, the test would take longer than a normal human lifetime, so this technique is not practical for ongoing software development.

The way formal verification addresses this is by reasoning about mathematical models of a program, which allows it to eliminate huge regions of the state space with single steps. There are many tools and techniques for performing full formal verification, each suitable to different classes of problem. SAW is particularly suited to imperative programs that don’t contain potentially-unbounded loops. In general, the cost of verification is that it requires specialized knowledge and developing mathematical proofs can take much longer than writing test cases. However, for many programs, automated tools like SAW can be used with similar levels of effort to testing, but resulting in much stronger guarantees. At the same time, re-checking a proof can sometimes be much faster than testing large parts of the input space, leading to quicker feedback during development.

Exercise: Testing `popcount`

Write a test that detects the defects in your pop_count_broken1 function, and also check that your pop_count_ok and the optimized pop_count function have no defects by using manual and random testing. How much confidence do those techniques provide?

Finally, consider pop_count_broken2, which is only incorrect for exactly one input value. Check how often the randomized test detects the one buggy input.

int pop_count_broken2(uint32_t x) {
    if (x == 0xDEADBEEF) return 22;
    return pop_count(x);
}

Symbolic Execution

The way SAW can prove properties about programs is by converting them into an internal representation that is much closer to a pure mathematical function. For instance, pop_count might be converted to a function like:

There are complications, of course, such as what to do with conditional branches, but as a user of the tool you won’t have to worry about them except when they introduce limitations to what you can reason about. The main such limitation is that symbolic simulation can’t effectively deal with loops whose termination depends on a symbolic value. For example, this simple implementation of add would not be easily analyzed:

unsigned int add(unsigned int x, unsigned int y) {
    for (unsigned int i = 0; i < y; i ++) {
        x++;
    }
    return x;
}

Running SAW

Note: This section uses a library of SAW helpers, in the file helpers.saw. If you’re comparing this text to the SAW manual, you may notice that a few operations have been abbreviated.

SAW is a tool for extracting models from compiled programs and then applying both automatic and manual reasoning to compare them against a specification of some kind. SAW builds models of programs by symbolically executing them, and is capable of building models from LLVM bitcode, JVM bytecode, x86 machine code, Rust’s MIR internal representation, and a number of other formats.

The first step to verifying pop_count with SAW is to use clang to construct its representation in LLVM bitcode. It is important to pass clang the -O1 flag, because important symbols are stripped at higher optimization levels, while lower optimization levels yield code that is less amenable to symbolic execution. The -g flag leaves symbols in the output which helps SAW produce helpful messages when verification fails. It can be convenient to include this rule in a Makefile:

.SUFFIXES: .c .bc

%.bc : %.c
	clang -g -O1 -c -emit-llvm $< -o $@

After building the LLVM bitcode file (by typing make popcount.bc), the next step is to use SAW to verify that the program meets its specification. SAW is controlled using a language called SAWScript. SAWScript contains commands for loading code artifacts, for describing program specifications, for comparing code artifacts to specifications, and for helping SAW in situations when fully automatic proofs are impossible.

The specific fact to be verified using SAW is that pop_count and pop_spec always return the same answer, no matter their input. For any particular input, this can be checked using pop_spec_check:

bool pop_spec_check(uint32_t x) {
    return (pop_spec(x) == pop_count(x));
}

The SAWScript to verify pop_count is really checking that pop_spec_check always returns true.

include "helpers.saw";
popmod <- llvm_load_module "popcount.bc";

let pop_is_ok = do {
     x <- symbolic_variable uint32_t "x";
     execute [x];
     returns TRUE;
};

llvm_verify popmod "pop_spec_check" [] pop_is_ok;

To execute the verification, we invoke saw on the SAWScript file:

$ saw pop.saw
[20:24:45.159] Loading file "/.../pop.saw"
[20:24:45.160] Loading file "/.../helpers.saw"
[20:24:45.282] Verifying pop_spec ...
[20:24:45.282] Simulating pop_spec ...
[20:24:45.291] Checking proof obligations pop_spec ...
[20:24:46.212] Proof succeeded! pop_spec

The Proof succeeded! message indicates to us that our pop_spec_check function returns True for all possible inputs. Hooray!

Returning to the SAWScript we used, it has three parts:

Lines 1–2 load helper functions and the LLVM module to be verified. This step builds the model from your code.
Lines 4–8 defines the pop_is_ok SAWScript specification, which sets up the symbolic inputs to the pop_spec function, calls the function on those symbolic inputs, and asserts that the return value is True.
Line 10 instructs SAW to verify that pop_is_ok is true for all possible input values.

The LLVM module is loaded using the llvm_load_module command. This command takes a string that contains the filename as an argument, and returns the module itself. In SAWScript, the results of a command are saved using the <- operator; here, the name popmod is made to refer to the module.

SAW specifications have three main parts:

Preconditions which state what the code being verified may assume to be true when it is called,
Instructions for executing the code.
Postconditions which state what the code must ensure to be true after it is called.

The function is invoked using the execute command, which takes an array of SAWScript variables that correspond to the function’s arguments. The function being executed is the one named by the string argument in the call to llvm_verify.

In the postcondition, the expected return value of the function is specified using returns. In this example, the function is expected to return TRUE.

Translated to English, pop_is_ok says:

In other words, pop_is_ok wraps the C function pop_spec_check. This C function computes the believed-correct result (by calling pop_spec), calls the pop_count function we are analyzing and returns TRUE if the results agree. The SAW wrapper creates the symbolic input variable, executes the function on its input, and ensures that the return value is TRUE.

Note: SAWScript distinguishes between defining a name and saving the result of a command. Use let to define a name, which may refer to a command or a value, and <- to run a command and save the result under the given name. Defining a command with let is analogous to defining a C function, and invoking commands with <- is analogous to calling it.

The arguments to llvm_verify (on line 10 above) are popmod, which specifies the LLVM module that contains the code to be verified; "pop_spec_check", the C function to be symbolically executed; and pop_is_ok, the SAW specification to check "pop_spec_check" against. The empty list ([]) is an optional list of previously proven statements, which is used in larger verification projects as described later in this tutorial. This verification script provides the same level of assurance that exhaustive testing would provide, but it completes in a tiny fraction of the time, fast enough to be part of a standard CI (continuous integration) workflow.

Exercise: Verifying Clever Versions of `popcount`

The following versions of popcount are quite different from the preceding implementations, but they should always return the same value. For both pop_count_mul and pop_count_sparse, do the following:

Write a C function, analogous to pop_spec_check, that relates pop_spec to the new implementation.
Use pop_is_ok in pop.saw together with additional calls to llvm_verify to asserts that the modified versions pop_spec_check also always return true. The string argument to llvm_verify states the name of the C function being verified - modify it to point to your new specification.
Use SAW to verify the implementation. Remember to rebuild the bitcode file after modifying the C sources.

/* A version of popcount that uses multiplication */
int pop_count_mul(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = ((x + (x >> 4)) & 0x0F0F0F0F);
    return (x * 0x01010101) >> 24;
}

/* A version of popcount that uses an indefinite while loop(!) */
int pop_count_sparse(uint32_t x) {
    int n;
    n = 0;
    while (x != 0) {
        n = n + 1;
        x = x & (x - 1);
    }
    return n;
}

Exercise: Verifying Your `pop_count` Implementations

Verification is useful for more than just carefully-chosen examples. This exercise is about your programs.

Start with your solutions pop_count_ok and pop_count_broken1 from the first exercise. Repeat the tasks from the previous exercise, creating specifications and extending pop.saw to attempt to verify the functions.

$ make popcount.bc
$ saw pop.saw
...
[19:27:38.518] Proof succeeded! pop_ok_check
[19:27:38.520] Verifying pop_broken1_check ...
... many lines deleted
[19:27:38.856] ----------Counterexample----------
[19:27:38.856]   x: 3735928559
[19:27:38.856] ----------------------------------

As in the output above, you should see one successful verification (for the wrapper corresponding to pop_count_ok) and one failed one (for pop_count_broken1). SAW’s messages for failed verifications are quite verbose, but the most important part is the counterexample, which is a concrete input value for which the program fails to return TRUE. Next apply verification popcount_broken2 from the exercise above, which is only incorrect for exactly one input value, you will see SAW comes up with exactly that counterexample without any guidance from you.

Memory Layouts and Pointers

Specifying Memory Layout

Programs are about more than just numeric values. First Example: Counting Set Bits describes a program that works on integer values, but most C programs involve changes to values in memory. In addition to describing the return value, specifying most C programs involves describing an initial state of the heap and then relating it to the state of the heap after the program has run. SAW supports specifying programs that involve heaps and pointers.

The specification for popcount could get away with talking only about the integer values of arguments to a function and its return value. This section introduces minmax, which swaps two pointers if the first pointer’s target is greater than the second pointer’s target. The return value is -1 if the first pointer’s original target was less than the second’s, 0 if they were equal, and 1 if the second pointer’s original target was greater than the first’s.

A reference implementation of minmax follows the English specification closely:

int8_t minmax(int64_t *x, int64_t *y) {
    if (*x > *y) {
        int64_t tmp = *x;
        *x = *y;
        *y = tmp;
        return 1;
    }
    else {
        return -(*x != *y);
    }
}

However, the ordering of the modifications to memory and the comparisons of values can be tricky to get right in C. Instead of using a C program as the specification, this section will use a specification written in a language called Cryptol.

Cryptol

SAWScript has good facilities for describing memory layouts and pre- and postconditions, but not for specifying algorithms. It is often used together with Cryptol, a domain-specific language for implementing low-level cryptographic algorithms or DSP transforms that reads much like a mathematical description. This helps bridge the gap between formal descriptions and real implementations.

A Cryptol specification for minmax looks like this:

module MinMax where

minmax : ([64], [64]) -> ([64], [64])
minmax (x, y) =
  if x >$ y
  then (y, x)
  else (x, y)

minmax_return : ([64], [64]) -> [8]
minmax_return (x, y) =
  if x <$ y then -1
   | x == y then 0
   else 1

The first line of the file is a module header. It states that the current module is named MinMax. In this module, there are two definitions: minmax, which specifies the values expected in the pointers’ targets after running minmax, and minmax_return, which specifies the value to be returned from minmax.

Each definition begins with a type declaration. These are optional: Cryptol always type checks code, but it can usually infer types on its own. Nevertheless, they make the specification easier to understand. Also, Cryptol’s type system is very general, and some of the types that it finds on its own may be complicated. The type of minmax can be read as “a function that takes an pair of 64-bit values as an argument, and returns a pair of 64-bit values” (the arrow -> separates the argument type from the return type). The type of minmax_return can be read as “a function that takes a pair of 64-bit values as an argument, and returns a single 8-bit value”.

The Cryptol definition of minmax uses pattern matching to name the first and second elements of the incoming pair as x and y, respectively. The right side of the = specifies that the return value is the pair (y, x) if x is greater than y, or the original argument pair (x, y) otherwise. Because Cryptol’s type system doesn’t distinguish between signed and unsigned integers, the operator >$ is used for signed comparison, while > is used for unsigned comparison.

Alternatively, the definition could be written without pattern matching. In this case, the first and second elements of the pair are accessed using the .1 and .0 operators. Pairs can be seen as analogous to structs whose fields are named by numbers.

minmax : ([64], [64]) -> ([64], [64])
minmax pair =
  if pair.0 >$ pair.1
  then (pair.1, pair.0)
  else (pair.0, pair.1)

Cryptol is useful in two different ways in SAW: it is used as a standalone specification language, and it also provides a syntax for explicit expressions in SAWScript specification, in which case it occurs in double braces ({{ }}).

Here is the complete SAWScript for verifying our minmax function.

include "helpers.saw";
import "MinMax.cry";

minmax_mod <- llvm_load_module "minmax.bc";

let minmax_ok = do {
     // 1. Establish the symbolic integers and pointers to them
     (x, xp) <- pointer_to_fresh int64_t "x";
     (y, yp) <- pointer_to_fresh int64_t "y";

     // 2. Call the function being verified with the two pointers
     execute [xp, yp];

     // 3. Use Cryptol to specify the desired values at the pointers' new targets
     let result_spec = {{ minmax (x, y) }};
     points_to xp (from_cryptol {{ result_spec.0 }});
     points_to yp (from_cryptol {{ result_spec.1 }});

     // 4. Use Cryptol to specify the desired return value
     let return_spec = {{ minmax_return (x, y) }};
     returns (from_cryptol return_spec);
};

// 5. Verify C function minmax using minmax_ok
llvm_verify minmax_mod "minmax" [] minmax_ok;

After including helpers.saw, the first step in using a Cryptol specification for minmax is to load the Cryptol module.

Note: In SAWScript, include is used to include the contents of a SAWScript file, while import is used for Cryptol files.

The SAWScript definition minmax_ok specifies the following:

Symbolic integers and pointers to them in the heap are established. pointer_to_fresh returns a tuple - the first element is a symbolic variable that’s accessible from Cryptol, the second element is a pointer to allocated memory of some type (in this case, int64_t). The pointer’s value is set to point at the allocated memory. This is done twice, once for each argument.
The arguments to be provided to minmax are specified using execute. In this case, the function will be called on the two pointers.
The desired targets of the pointers (that is, the values that they should point at after the function call) are specified using points_to after execute. In this case, the Cryptol minmax function is called, and the resulting pair is saved in result_spec, which is then used to provide the pointers’ targets.
The return value is specified in the same manner as that of popcount, by using returns. In this case, rather than specifying the constant TRUE, the result is also given by a Cryptol specification.

Note: Cryptol snippets in double braces can refer to both minmax and to x and y. The Cryptol snippets can refer to anything imported from a Cryptol module with import, and also to any name in scope that refers to a SAWCore term. In other words, the SAWScript name x can also be used as a Cryptol name to point at a SAWCore term.

Finally, verification is invoked just as in popcount, using llvm_verify.

Exercises: Getting Started with SAW and Pointers

This exercise does not require the use of Cryptol.

Write a C function that zeroes out the target of a pointer. It should have the following prototype:
```
void zero(uint32_t* x);
```
Write a C function zero_spec that returns true when zero is correct for some input. It should have the following prototype:
```
bool zero_spec(uint32_t x);
```
Use SAW to verify that zero_spec always returns true for your implementation of zero.

Exercise: Unsigned Arithmetic

Create a version of minmax that specifies its arguments as uint64_t instead of int64_t, and attempt to verify it using minmax_ok. What does the counterexample tell you about the bug that is introduced?

Exercise: Alternative Implementations

This version of minmax avoids conditional statements, relying heavily on C’s ternary operator. Verify that it fulfills the specification.

int8_t minmax_ternary(int64_t *x, int64_t *y) {
    int64_t xv = *x, yv = *y;
    *x = xv < yv ? xv : yv;
    *y = xv < yv ? yv : xv;
    return xv < yv ? -1 : xv == yv ? 0 : 1;
}

Now, implement a version of minmax that uses the XOR swap trick to move the values instead of a temporary variable. Verify it.

Exercise: Swapping and Rotating

Using SAW, write a specification for a C function that unconditionally swaps the targets of two pointers. Implement the function in C, and verify that it fulfills the specification. Both the specification and the implementation are simpler versions of minmax, and the specification for swap can be written without a Cryptol specification.

In the course of ordinary software development, requirements change over time. As requirements change, both programs and their specifications must evolve. A verification-oriented workflow can help maintain a correspondence between updated specifcations and code.

Modify the specification so that it describes a function rotr3. After invoking rotr3 on pointers x, y, and z, x points to the previous target of y, y points to the previous target of z, and z points to the previous target of x. Note the error message that occurs when using this specification for swap.

Implement rotr3, and verify it using the new specification.

Exercise: Arrays

In SAW, a C array type can be referred to using llvm_array, which takes the number of elements and their type as arguments. For instance, uint32[3] can be represented as llvm_array 3 (llvm_int 32). Similarly, the setup value that corresponds to an index in an array can be referred to using element. For instance, if arr refers to an array allocated using alloc, then element arr 0 is the first element in arr. These can be used with points_to.

Write a version of rotr3 that expects its argument to be an array of three integers. Verify it using SAW.

Compositional Verification and Salsa20

demonstrates verification and maintenance for a small standalone function. Most interesting programs are not just single functions, however. Good software engineering practice entails splitting programs into smaller functions, each of which can be understood and tested independently. Compositional verification in SAW allows this structure to be reflected in proofs as well, so that each function can be verified independently. In addition to being more maintainable, this can greatly increase the performance of a verification script.

This section describes the verification of an implementation of the Salsa20 encryption algorithm. Complete example code can be found in the examples/salsa20 directory of the example code.

Salsa20 Verification Overview

is a stream cipher developed in 2005 by Daniel J. Bernstein, built on a pseudorandom function utilizing add-rotate-XOR (ARX) operations on 32-bit words. The original specification can be found .

The specification for this task is a trusted implementation written in . This is analogous to what is covered in in the minmax example, but for a larger system. Some examples from this specification are explored below for the sake of showing what larger Cryptol programs look like.

The implementation to be verified is written in C. This implementation is shown in part alongside the specification for comparison purposes.

A SAWScript containing the specifications of memory layouts and orchestration of the verification itself ties everything together. This will be covered last, including some performance comparisons between compositional and non-compositional verification.

A Cryptol Specification

The Cryptol specification in examples/salsa20/salsa20.cry directly implements the functions defined in Bernstein’s . Because there is so much code, this section will only go through some of the functions in detail, in order to highlight some features of Cryptol.

The first example function is quarterround. Its type is [4][32] -> [4][32], which means that it is a function that maps sequences of four 32-bit words into sequences of four 32-bit words. The [y0, y1, y2, y3] notation is pattern matching that pulls apart the four elements of the input sequence, naming each 32-bit word. The Cryptol operator <<< performs a left rotation on a sequence.

This Cryptol code closely resembles the definition in Section 3 of the specification. The definition reads:

Contrast this with the C implementation of s20_quarterround, which makes heavy use of in-place mutation rather than the functional paradigm of building and returning a new sequence:

This function directly modifies the targets of its argument pointers, a shift in paradigm that will be highlighted by the SAW specification since that is where the memory management of the C is connected to the pure computation of the Cryptol.

quarterround is used in the definition of two other functions, rowround and columnround, which perform the operation on the rows and columns of a particular matrix, represented as a flat sequence of 16 32-bit words.

These two operations are composed (rowround after columnround) to form the doubleround operation. The Cryptol code for this composition closely resembles the definition in the specification:

Combined with some utility functions for mapping sequences of four bytes to and from little-endian 32-bit words, doubleround gives us the Salsa20 hash function:

All three definitions in the where clause are sequence comprehensions, which are similar to Python’s generator expressions or C#’s LINQ. A sequence comprehension consists of square brackets that contain an expression, and then one or more branches. Branches begin with a vertical bar, and they contain one or more comma-separated bindings. Each binding is a name, an arrow, and another sequence.

The value of a comprehension with one branch is found by evaluating the expression for each element of the sequence in the branch, with the name to the left of the arrow set to the current element. The value of [x + 1 | x <- [1,2,3]] is [2, 3, 4]. If there are multiple bindings in the branch, later bindings are repeated for each earlier value. So the value of [(x + 1, y - 1) | x <- [1,2], y <- [11, 12]] is [(2, 10), (2, 11), (3, 10), (3, 11)]. The value of a comprehension with multiple branches is found by evaluating each branch in parallel; thus, the value of [(x + 1, y - 1) | x <- [1,2] | y <- [11,12]] is [(2, 10), (3, 11)].

In the where clause, the definition of xw can be read as “First split xs into 4-byte words, then combine them in a little-endian manner to obtain 32-bit words.” The specific sizes are automatically found by Cryptol’s type checker.

The definition of zs is an infinite sequence. It begins with xw, the little-endian reorganization of xs from the previous paragraph. The # operator appends sequences. The rest of the sequence consists of doubleround applied to each element of zs itself. In other words, the second element is found by applying doubleround to xw, the third by applying doubleround to the second, and so forth. Stepping through the evaluation yields this sequence:

The final definition is ar, which adds xw to the tenth element of zs, which is the result of applying doubleround ten times to xw. In Cryptol, + is extended over sequences so that adding two sequences adds their elements. The final result of Salsa20 is computed by re-joining the split words into the appropriate-sized sequence.

The C implementation uses in-place mutation and an explicit loop. Due to the use of mutation, it must be careful to copy data that will be used again later.

Note again the pervasive use of in-place mutation - as with s20_quarterround, the connection between this and the functionally pure Cryptol specification will be made clear through the SAW specification.

Salsa20 supports two key sizes: 16 and 32 bytes. Rather than writing two separate implementations, Salsa20_expansion uses two unique feature of Cryptol’s type system to implement both at once. These features are numbers in types and arithmetic predicates. Numbers in types, seen earlier, are used for the lengths of sequences, and it is possible to write functions that work on any length.

In Cryptol, some types accept arguments, which are written at the beginning of the type in curly braces. For instance, the most general type signature for a swap function on pairs is swap : {a, b} (a, b) -> (b, a). This is equivalent to the Java signature Pair<B, A> swap<A, B> (Pair<A, B> x). The {a, b} corresponds to the <A,B> immediately after swap. Arguments to types can be both ordinary types, like [8] or ([16][8], [8]), or numbers.

Type arguments can additionally be constrained. This means that a type or number argument must satisfy certain properties in order to be used. These constraints are written in parentheses and followed by a double arrow. For instance, the type of a function that takes the first element of a sequence is {n, a} (n > 0) => [n]a -> a, where n must be greater than zero (because empty sequences have no first element).

The beginning of the type signature for Salsa20_expansion reads {a} (a >= 1, 2 >= a) => ..., which says that a can only be 1 or 2. Later on in the type, [16*a][8] is used for the key length, resulting in a length of either 16 or 32 8-bit bytes. The back-tick operator allows a program to inspect the value of a length from a type, which is used in the if expression to select the appropriate input to Salsa20. Cryptol strings, like C string literals, represent sequences of ASCII byte values. The specific strings used here come from the Salsa20 specification.

SAW Specification and Verification

The SAW specification for this Salsa20 implementation is comprised of a couple of convenient helper functions, a specification for each of the interesting functions in the Salsa20 specification (i.e. the functions detailed in Bernstein’s specification document), and a defined command main that performs the actual verification.

One big difference between the Cryptol specification and the C implementation is that Cryptol, a functional language, returns new values, while programs in C, an imperative language, tend to write new values to a pointer’s target. In this case, the C version of the program overwrites an argument with the value that the Cryptol version returns. This pattern is abstracted over in oneptr_update_func, a SAWScript command that describes this relationship between the C and Cryptol versions of a function. The arguments are type : LLVMType that describes the parameter type, name : String that names the parameter for pretty-printing, and the function f : Term to apply to the parameter.

Note: If you haven’t already, look at the file helpers.saw - it defines a number of SAW functions that factor out common patterns as in oneptr_update_func, but also give more user-friendly names to various functions. Feel free to use, modify or ignore helpers.saw in SAW programs you write, and be on the lookout for new helpful functions when you work with SAW programs written by others. Good choice of names can make SAW programs much more readable.

All of Salsa20 depends on s20_quarterround. Here is its specification:

The specification for s20_hash is an example of one for which oneptr_update_func is useful.

The third argument to crucible_llvm_verify is a list of CrucibleMethodSpec objects. While performing verification, the work that was done to construct a CrucibleMethodSpec is re-used. Specifically, instead of recursively symbolically executing a verified function, the prior specification is used as an axiomatization of its behavior. In the definition of main, the results of earlier verifications are passed along:

This example also uses the fourth argument to crucible_llvm_verify. During symbolic execution, conditionals require that both branches be explored. If the fourth argument is true, then an SMT solver is used to rule out impossible branches. For some problems, the overhead of the solver exceeds the time saved on exploring branches; for others, a short time spent in the solver saves a long time spent in the symbolic execution engine. Ruling out impossible branches can also allow termination of programs in which the number of iterations can depend on a symbolic value. This is called path satisfiability checking.

The 16-byte version of Salsa20 is not verified, because the C program does not implement it. Also, Salsa20 is verified only with respect to some particular message lengths, because SAW is not yet capable of verifying infinite programs. This is why main verifies multiple lengths, in the hope that this is sufficient to increase our confidence.

Comparing Compositional and Non-compositional Verification

In examples/salsa20, there are two SAW specifications: salsa20_compositional.saw, which contains main as presented above, and salsa20_noncompositional.saw, which replaces the CrucibleMethodSpec list parameter in each call to crucible_llvm_verify with the empty list, effectively disabling compositional verification. The one exception to this is in the verification of s20_hash; not using compositional verification for this function did not terminate in a reasonable amount of time.

These two verification tasks were run on a 2019 15-inch MacBook Pro, 2.4 GHz 8-Core Intel i9 processor, 32 GB DDR4 RAM. The values shown are the average over five runs:

Even with this limited data set, the benefits of using compositional verification are clear: There’s effectively a 2x increase in speed in this example, even accounting for the fact that the verification of s20_hash is still treated compositionally.

Exercise: Rot13

Rot13 is a Caesar cipher that is its own inverse. In it, each letter is mapped to the letter that is 13 places greater than it in the alphabet, modulo 26. Non-letters are untouched, and case is preserved. For instance, “abc” becomes “nop”, and “SAW is fun!” becomes “FNJ vf sha!”.

Your task is to implement rot13 in C, and verify it using SAW.

Start by writing a function that performs a single character of rot13, assuming 7-bit ASCII encoding. Verify it using SAW and Cryptol.

Then, write a function that uses your single-character rot13 to perform rot13 on a string with precisely 20 characters in it. Verify this using SAW and Cryptol with compositional verification.

Extended Exercise: HMAC Maintenance

Proof Maintenance Exercises: s2n HMAC

The evolution of a program is accompanied by the evolution of its specifications. A key part of using SAW and Cryptol to verify a software system is the ongoing maintenance of proof artifacts through the software development lifecycle.

is the process of preserving the correspondence between a program, its specification, and its proof of correctness as requirements change over time. This section poses as an exercise an extended proof-maintenance task, adapted from . The code’s file structure has been reorganized slightly, but the code itself is untouched.

This task will be approached as if the changes to the implementation are given, and the goal will be to evolve the relevant specifications to match. While completing the exercises, take note of the correspondences between the changes to the code and the changes to the specifications.

Background: The Updates to the Implementation

This section provides an overview of the changes to the implementation that form the basis of the proof maintenance task to be completed.

The s2n HMAC implementation needed to be updated to make use of an additional piece of hashing state, outer_just_key, for the implementation of TLS. At its core, this change is captured by the addition of a new field to the s2n_hmac_state structure as it is defined in s2n_hmac_old.h. The resulting structure looks like this:

The addition of this new field saw corresponding changes to the implementation code, which can be found in s2n_hmac_new.c, below.

These changes included memory allocations, initializations, updates, and frees. The following code sample gives a good sense of the types of changes involved:

The complete diff between s2n_hmac_old.c and s2n_hmac_new.c shows a number of updates similar to that above:

From these changes alone, the work needed to keep the proofs up-to-date with the implementation can be very reasonably estimated. In this case, it will be necessary to complete the following tasks:

Add the new field to the correct type(s) in the Cryptol reference implementation
Add the relevant implementation details to the function(s) using the changed type
Update the SAWScript to reflect new memory layouts, initializations, etc implied by the updated type

Exercise: Update the Cryptol Specification

In order for verification to go through, the Cryptol specification (that is, the implementation trusted to be correct) must be updated to reflect the existence of the new state field introduced above.

Your task is to perform these updates in HMAC_iterative_old.cry.

Use the bullet points above as a rough guide, and if you get stuck, there is a complete solution presented on the next page.

Exercise: Update the SAW Specifications

The final step to proof maintenance is updating the SAW portion of the specification. This can range in difficulty from simply updating memory layouts to changing what the specification actually asserts about the program. For the HMAC updates, the necessary changes are closer to the former rather than the latter, since the implementation change was the addition of a data field rather than overall changes to the control flow.

In this exercise, you will edit the file HMAC_old.saw ...

... to add the memory layout information for the state field added to the C implementation. Hint: A reliable strategy for updating HMAC_old.saw to account for outer_just_key is a simple search for the names of other fields already present in the structure; these will likely appear where memory layouts and initializations that need to be augmented are specified.

Note: HMAC_old.saw does not use the helpers.saw file as the previous examples did. Feel free to consult helpers.saw to help understand what the various functions do, and perhaps even rewrite HMAC_old.saw to use the helper functions.

As before, if you get stuck, there is a complete solution presented on the next page.

Example Solution: HMAC Maintenance

Proof Maintenance Exercises: Solutions

This section provides a detailed solution to the two exercises in .

Updating the Cryptol Specification

The Cryptol type corresponding to the updated state container must, like the C structure, be augmented with an outer_just_key field that has the appropriate type, like so:

This very clearly corresponds to the change to the s2n_hmac_state structure in the C implementation, other than the specialization to SHA512. In the C implementation, the code is abstracted over the chosen hashing algorithm.

Here is a sample of how the functions that use the HMAC_c_state type must change:

Take note of how similar these changes are to those in the analogous C code; this is true more generally, as can be seen in the complete diff between HMAC_iterative_old.cry and HMAC_iterative_new.cry:

Updating the SAW Specifications

Using the hint given in the exercise, a search for the term “outer” in HMAC_old.saw reveals not only where memory layouts are specified, but embedded Cryptol terms of the type adjusted in the previous section. One of the memory layout specifications found through this search looks like this:

Another improvement that can be made to this code is to use the crucible_field primitive instead of crucible_elem, which allows reference to structure fields by name rather than by index. This, and the necessary change to memory layout, appear below.

The other change necessary is the aforementioned update to embedded Cryptol terms using the HMAC_c_state type augmented in the previous section. The original code found by searching looks like this:

And the update corresponds exactly to the one in the Cryptol specification:

The complete set of changes to the SAW specification can be seen in the diff between HMAC_old.saw and HMAC_new.saw:

With this, the specifications have been updated to account for the changes to the implementation, and verification via SAW will go through as intended.

Glossary

compositional verification

A verification technique based on the idea that, when proving properties of a given method or function, we can make use of properties we have already proved about its callees.

Cryptol

A specification language for algorithms. Used as the notation for in .

proof maintenance

The process of keeping verification artifacts, such as specifications and proofs, up to date with changes in a software system over time.

SAWCore

The internal representation for programs in SAW.

SAWScript

The language used to write specifications and describe verification tasks in SAW.

SetupValue

A SAWScript SetupValue can be either a or a pointer. Arguments passed to symbolically executed functions must be SetupValues.

specification

A description of what is desired of a program. Specifications can be written in anything from informal English to precise, machine-readable logical formulations.

symbolic execution

symbolic value

testing

Term

verification

Memory Layouts and Pointers

Specifying Memory Layout

A reference implementation of minmax follows the English specification closely:

int8_t minmax(int64_t *x, int64_t *y) {
    if (*x > *y) {
        int64_t tmp = *x;
        *x = *y;
        *y = tmp;
        return 1;
    }
    else {
        return -(*x != *y);
    }
}

Cryptol

A Cryptol specification for minmax looks like this:

module MinMax where

minmax : ([64], [64]) -> ([64], [64])
minmax (x, y) =
  if x >$ y
  then (y, x)
  else (x, y)

minmax_return : ([64], [64]) -> [8]
minmax_return (x, y) =
  if x <$ y then -1
   | x == y then 0
   else 1

minmax : ([64], [64]) -> ([64], [64])
minmax pair =
  if pair.0 >$ pair.1
  then (pair.1, pair.0)
  else (pair.0, pair.1)

Here is the complete SAWScript for verifying our minmax function.

include "helpers.saw";
import "MinMax.cry";

minmax_mod <- llvm_load_module "minmax.bc";

let minmax_ok = do {
     // 1. Establish the symbolic integers and pointers to them
     (x, xp) <- pointer_to_fresh int64_t "x";
     (y, yp) <- pointer_to_fresh int64_t "y";

     // 2. Call the function being verified with the two pointers
     execute [xp, yp];

     // 3. Use Cryptol to specify the desired values at the pointers' new targets
     let result_spec = {{ minmax (x, y) }};
     points_to xp (from_cryptol {{ result_spec.0 }});
     points_to yp (from_cryptol {{ result_spec.1 }});

     // 4. Use Cryptol to specify the desired return value
     let return_spec = {{ minmax_return (x, y) }};
     returns (from_cryptol return_spec);
};

// 5. Verify C function minmax using minmax_ok
llvm_verify minmax_mod "minmax" [] minmax_ok;

After including helpers.saw, the first step in using a Cryptol specification for minmax is to load the Cryptol module.

Note: In SAWScript, include is used to include the contents of a SAWScript file, while import is used for Cryptol files.

The SAWScript definition minmax_ok specifies the following:

Symbolic integers and pointers to them in the heap are established. pointer_to_fresh returns a tuple - the first element is a symbolic variable that’s accessible from Cryptol, the second element is a pointer to allocated memory of some type (in this case, int64_t). The pointer’s value is set to point at the allocated memory. This is done twice, once for each argument.
The arguments to be provided to minmax are specified using execute. In this case, the function will be called on the two pointers.
The desired targets of the pointers (that is, the values that they should point at after the function call) are specified using points_to after execute. In this case, the Cryptol minmax function is called, and the resulting pair is saved in result_spec, which is then used to provide the pointers’ targets.
The return value is specified in the same manner as that of popcount, by using returns. In this case, rather than specifying the constant TRUE, the result is also given by a Cryptol specification.

Finally, verification is invoked just as in popcount, using llvm_verify.

Exercises: Getting Started with SAW and Pointers

This exercise does not require the use of Cryptol.

Write a C function that zeroes out the target of a pointer. It should have the following prototype:
```
void zero(uint32_t* x);
```
Write a C function zero_spec that returns true when zero is correct for some input. It should have the following prototype:
```
bool zero_spec(uint32_t x);
```
Use SAW to verify that zero_spec always returns true for your implementation of zero.

Exercise: Unsigned Arithmetic

Exercise: Alternative Implementations

This version of minmax avoids conditional statements, relying heavily on C’s ternary operator. Verify that it fulfills the specification.

int8_t minmax_ternary(int64_t *x, int64_t *y) {
    int64_t xv = *x, yv = *y;
    *x = xv < yv ? xv : yv;
    *y = xv < yv ? yv : xv;
    return xv < yv ? -1 : xv == yv ? 0 : 1;
}

Now, implement a version of minmax that uses the XOR swap trick to move the values instead of a temporary variable. Verify it.

Exercise: Swapping and Rotating

Implement rotr3, and verify it using the new specification.

Exercise: Arrays

Write a version of rotr3 that expects its argument to be an array of three integers. Verify it using SAW.

Specifications and Verification

First Example: Counting Set Bits

The Code

Note: The pop_count function has uses in many kinds of algorithms and has an interesting folklore.

Here is a sophisticated implementation of pop_count from the book Hacker’s Delight by Henry S. Warren Jr.:

int pop_count(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0F0F0F0F;
    x = x + (x >> 8);
    x = x + (x >> 16);
    return x & 0x0000003F;
}

Exercise: A Safe and a Broken `pop_count`

Testing Programs

You’re not likely to be able to convince yourself that the optimized pop_count function is correct just by inspection. A unit test, like the following pop_check can help:

/* Test pop_count on a few values to make sure it's at least sometimes correct */
bool pop_check() {
    return (pop_count(0x0) == 0) &&
           (pop_count(0x3) == 2) &&
           (pop_count(0xFFFFFFFF) == 32) &&
           (pop_count(0xAAAAAAAA) == 16) &&
           (pop_count(0x55555555) == 16);
}

A second approach to testing is to choose many random values at each execution. This approach may eventually find subtle or tricky mistakes, but not reliably or in a predictable amount of time.

int pop_spec(uint32_t x) {
    uint32_t pop = 0;
    uint32_t mask = 1;
    for (int i = 0; i < 32; i++) {
        if (x & mask) { pop++; }
        mask = mask << 1;
    }
    return pop;
}

void random_value_test(int (*fun)(uint32_t), char *name) {
    srand(time(NULL));

    int failures = 0;
    for (int i = 0; i < 100000; i ++) {
        uint32_t x = rand();
        int test = (*fun)(x);
        int check = pop_spec(x);
        if (test != check) {
            printf("Test failure: %s(%u) was %u, != %u\n",
                    name, x, test, check);
            failures++;
        }
    }
    if (failures == 0) {
        printf("Testing %s succeeded!\n", name);
    }
}

Exercise: Testing `popcount`

Finally, consider pop_count_broken2, which is only incorrect for exactly one input value. Check how often the randomized test detects the one buggy input.

int pop_count_broken2(uint32_t x) {
    if (x == 0xDEADBEEF) return 22;
    return pop_count(x);
}

Symbolic Execution

\text{pop\_count}(\text{bit\_string}) = \sum_{i=0}^{32} \text{bit\_string}_i

In this version, the details of the call stack, registers vs. memory and the specific execution model of the CPU have been removed. The technique for doing this conversion is called symbolic execution or symbolic simulation. It works by first replacing some of the inputs to a program with symbolic values, which are akin to mathematical variables. The term concrete values is used to describe honest-to-goodness bits and bytes. As the program runs, operations on symbolic values result in descriptions of operations rather than actual values. Just as adding 1 to the concrete value 5 yields the concrete value 6, adding 1 to the symbolic value $𝑥x$ yields the symbolic value $𝑥+1x+1$ . Incrementing the values again yields 7 and $𝑥+2x+2$ , respectively. By simulating the entire function this way, SAW creates a mathematical function out of the C function you provide.

unsigned int add(unsigned int x, unsigned int y) {
    for (unsigned int i = 0; i < y; i ++) {
        x++;
    }
    return x;
}

The problem is that the loop termination depends on the symbolic value $𝑦y$ , rather than on some pre-determined concrete number. This means that each time through the for loop two new branches must be explored: one in which the present concrete value of i is less than the symbolic value of $𝑦y$ , and one in which it is not. The key thing to remember is that symbolic execution is most applicable to programs that “obviously” terminate, or programs in which the number of loop iterations do not depend on which specific input is provided.

Running SAW

Note: This section uses a library of SAW helpers, in the file helpers.saw. If you’re comparing this text to the SAW manual, you may notice that a few operations have been abbreviated.

.SUFFIXES: .c .bc

%.bc : %.c
	clang -g -O1 -c -emit-llvm $< -o $@

bool pop_spec_check(uint32_t x) {
    return (pop_spec(x) == pop_count(x));
}

The SAWScript to verify pop_count is really checking that pop_spec_check always returns true.

include "helpers.saw";
popmod <- llvm_load_module "popcount.bc";

let pop_is_ok = do {
     x <- symbolic_variable uint32_t "x";
     execute [x];
     returns TRUE;
};

llvm_verify popmod "pop_spec_check" [] pop_is_ok;

To execute the verification, we invoke saw on the SAWScript file:

$ saw pop.saw
[20:24:45.159] Loading file "/.../pop.saw"
[20:24:45.160] Loading file "/.../helpers.saw"
[20:24:45.282] Verifying pop_spec ...
[20:24:45.282] Simulating pop_spec ...
[20:24:45.291] Checking proof obligations pop_spec ...
[20:24:46.212] Proof succeeded! pop_spec

The Proof succeeded! message indicates to us that our pop_spec_check function returns True for all possible inputs. Hooray!

Returning to the SAWScript we used, it has three parts:

Lines 1–2 load helper functions and the LLVM module to be verified. This step builds the model from your code.
Lines 4–8 defines the pop_is_ok SAWScript specification, which sets up the symbolic inputs to the pop_spec function, calls the function on those symbolic inputs, and asserts that the return value is True.
Line 10 instructs SAW to verify that pop_is_ok is true for all possible input values.

SAW specifications have three main parts:

Preconditions which state what the code being verified may assume to be true when it is called,
Instructions for executing the code.
Postconditions which state what the code must ensure to be true after it is called.

Here, the precondition consists of creating one symbolic variable. Internally, symbolic variables are represented in the internal language SAWCore. symbolic_variable takes two arguments: the new variable’s type and a string that names the symbolic variable (which may show up in error messages). After the precondition, the SAWScript variable x is bound to the respective symbolic value $𝑥x$ . In more complicated verifications the preconditions are more interesting, as we’ll see soon.

In the postcondition, the expected return value of the function is specified using returns. In this example, the function is expected to return TRUE.

Translated to English, pop_is_ok says:

Let $𝑥x$ be a 32-bit integer. The result of calling pop_spec_check on $𝑥x$ is TRUE.

If verification reports success, we know that this is the case for all possible values of $𝑥x$ and $𝑦y$ .

Exercise: Verifying Clever Versions of `popcount`

Write a C function, analogous to pop_spec_check, that relates pop_spec to the new implementation.
Use pop_is_ok in pop.saw together with additional calls to llvm_verify to asserts that the modified versions pop_spec_check also always return true. The string argument to llvm_verify states the name of the C function being verified - modify it to point to your new specification.
Use SAW to verify the implementation. Remember to rebuild the bitcode file after modifying the C sources.

/* A version of popcount that uses multiplication */
int pop_count_mul(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = ((x + (x >> 4)) & 0x0F0F0F0F);
    return (x * 0x01010101) >> 24;
}

/* A version of popcount that uses an indefinite while loop(!) */
int pop_count_sparse(uint32_t x) {
    int n;
    n = 0;
    while (x != 0) {
        n = n + 1;
        x = x & (x - 1);
    }
    return n;
}

Exercise: Verifying Your `pop_count` Implementations

Verification is useful for more than just carefully-chosen examples. This exercise is about your programs.

$ make popcount.bc
$ saw pop.saw
...
[19:27:38.518] Proof succeeded! pop_ok_check
[19:27:38.520] Verifying pop_broken1_check ...
... many lines deleted
[19:27:38.856] ----------Counterexample----------
[19:27:38.856]   x: 3735928559
[19:27:38.856] ----------------------------------

Compositional Verification and Salsa20

This section describes the verification of an implementation of the Salsa20 encryption algorithm. Complete example code can be found in the examples/salsa20 directory of the example code.

Salsa20 Verification Overview

is a stream cipher developed in 2005 by Daniel J. Bernstein, built on a pseudorandom function utilizing add-rotate-XOR (ARX) operations on 32-bit words. The original specification can be found .

The implementation to be verified is written in C. This implementation is shown in part alongside the specification for comparison purposes.

A Cryptol Specification

This Cryptol code closely resembles the definition in Section 3 of the specification. The definition reads:

Contrast this with the C implementation of s20_quarterround, which makes heavy use of in-place mutation rather than the functional paradigm of building and returning a new sequence:

static void s20_quarterround(uint32_t *y0, uint32_t *y1, uint32_t *y2, uint32_t *y3)
{
  *y1 = *y1 ^ rotl(*y0 + *y3, 7);
  *y2 = *y2 ^ rotl(*y1 + *y0, 9);
  *y3 = *y3 ^ rotl(*y2 + *y1, 13);
  *y0 = *y0 ^ rotl(*y3 + *y2, 18);
}

These two operations are composed (rowround after columnround) to form the doubleround operation. The Cryptol code for this composition closely resembles the definition in the specification:

doubleround : [16][32] -> [16][32]
doubleround(xs) = rowround(columnround(xs))

Combined with some utility functions for mapping sequences of four bytes to and from little-endian 32-bit words, doubleround gives us the Salsa20 hash function:

Salsa20 : [64][8] -> [64][8]
Salsa20 xs = join ar
  where
    ar = [ littleendian_inverse words | words <- xw + zs@10 ]
    xw = [ littleendian xi | xi <- split xs ]
    zs = [xw] # [ doubleround zi | zi <- zs ]

[xw] # [ doubleround zi | zi <- zs ]

[xw] # [ doubleround zi | zi <- [xw] # [doubleround zi | zi <- zs] ]

[xw] # [doubleround xw] # [ doubleround zi | zi <- [doubleround zi | zi <- zs] ]

[xw] # [doubleround xw] # [ doubleround zi | zi <- [doubleround zi | zi <- [xw] # [doubleround zi | zi <- zs]] ]

[xw] # [doubleround xw] # [ doubleround zi | zi <- [doubleround xw] # [doubleround zi | zi <- [doubleround zi | zi <- zs]] ]

[xw] # [doubleround xw] # [doubleround (doubleround xw)] # [ doubleround zi | zi <- [doubleround zi | zi <- [doubleround zi | zi <- zs]] ]

The resulting sequence consists of doubleround applied times to xw at position . This process could, in principle, continue forever. In Cryptol, however, sequences are computed lazily, so as long as nothing ever asks for the last element, the program will still terminate.

The C implementation uses in-place mutation and an explicit loop. Due to the use of mutation, it must be careful to copy data that will be used again later.

// The core function of Salsa20
static void s20_hash(uint8_t seq[static 64])
{
  int i;
  uint32_t x[16];
  uint32_t z[16];

  // Create two copies of the state in little-endian format
  // First copy is hashed together
  // Second copy is added to first, word-by-word
  for (i = 0; i < 16; ++i)
    x[i] = z[i] = s20_littleendian(seq + (4 * i));

  for (i = 0; i < 10; ++i)
    s20_doubleround(z);

  for (i = 0; i < 16; ++i) {
    z[i] += x[i];
    s20_rev_littleendian(seq + (4 * i), z[i]);
  }
}

// Salsa 20 supports two key sizes, [16][8] and [32][8]
Salsa20_expansion : {a} (a >= 1, 2 >= a) => ([16*a][8], [16][8]) -> [64][8]
Salsa20_expansion(k, n) = z
  where
    [s0, s1, s2, s3] = split "expand 32-byte k" : [4][4][8]
    [t0, t1, t2, t3] = split "expand 16-byte k" : [4][4][8]
    x = if(`a == 2) then s0 # k0 # s1 # n # s2 # k1 # s3
                    else t0 # k0 # t1 # n # t2 # k0 # t3
    z = Salsa20(x)
    [k0, k1] = (split(k#zero)):[2][16][8]

The encryption function takes a tuple of three parameters: a key k, an eight-byte v, and a message m of at most bytes. In accordance with Section 10 of the specification, it computes the Salsa20_expansion of the nonce and sufficient subsequent numbers, and take truncates it to the desired length. The message is combined with this sequence, yielding the result.

Salsa20_encrypt : {a, l} (a >= 1, 2 >= a, l <= 2^^70) => ([16*a][8], [8][8], [l][8]) -> [l][8]
Salsa20_encrypt(k, v, m) = c
  where
    salsa = take (join [ Salsa20_expansion(k, v#(reverse (split i))) | i <- [0, 1 ... ] ])
    c = m ^ salsa

SAW Specification and Verification

let oneptr_update_func (type : LLVMType) (name : String) (f : Term) = do {
    (x, p) <- pointer_to_fresh type name;
    crucible_execute_func [p];
    crucible_points_to p (crucible_term {{ f x }});
};

All of Salsa20 depends on s20_quarterround. Here is its specification:

let quarterround_setup : CrucibleSetup () = do {
    (y0, p0) <- pointer_to_fresh (llvm_int 32) "y0";
    (y1, p1) <- pointer_to_fresh (llvm_int 32) "y1";
    (y2, p2) <- pointer_to_fresh (llvm_int 32) "y2";
    (y3, p3) <- pointer_to_fresh (llvm_int 32) "y3";

    crucible_execute_func [p0, p1, p2, p3];

    let zs = {{ quarterround [y0,y1,y2,y3] }};
    crucible_points_to p0 (crucible_term {{ zs@0 }});
    crucible_points_to p1 (crucible_term {{ zs@1 }});
    crucible_points_to p2 (crucible_term {{ zs@2 }});
    crucible_points_to p3 (crucible_term {{ zs@3 }});
};

The helper pointer_to_fresh is the same as the one in . It allocates space for a new symbolic variable of the given type, returning both the symbolic value and the pointer to it. The symbolic values are passed to the Cryptol function quarterround to compute the expected result values. Because the function’s inputs are symbolic, the outputs are also mathematical expressions that reflect the function’s behavior. These expected result values are then used as the expected targets of the pointers in the post-condition of the SAW specification.

The specification for s20_hash is an example of one for which oneptr_update_func is useful.

let salsa20_setup =
  oneptr_update_func (llvm_array 64 (llvm_int 8)) "seq" {{ Salsa20 }};

Putting everything together, main verifies the implementation functions according to these specifications. main has the type TopLevel () — this is the type of commands that can be run at the top level of a SAWScript program. In , crucible_llvm_verify was used on its own, and its return value was discarded. However, verification actually returns a useful result: it returns an association between a specification and the fact that the given function has been verified to fulfill it. In SAWScript, this association has the type CrucibleMethodSpec. Because crucible_llvm_verify is a command, the returned value is saved using the <- operator.

let main : TopLevel () = do {
    m      <- llvm_load_module "salsa20.bc";
    qr     <- crucible_llvm_verify m "s20_quarterround" []      false quarterround_setup   z3;
    rr     <- crucible_llvm_verify m "s20_rowround"     [qr]    false rowround_setup       z3;
    cr     <- crucible_llvm_verify m "s20_columnround"  [qr]    false columnround_setup    z3;
    dr     <- crucible_llvm_verify m "s20_doubleround"  [cr,rr] false doubleround_setup    z3;
    s20    <- crucible_llvm_verify m "s20_hash"         [dr]    false salsa20_setup        z3;
    s20e32 <- crucible_llvm_verify m "s20_expand32"     [s20]   true  salsa20_expansion_32 z3;
    s20encrypt_63 <- crucible_llvm_verify m "s20_crypt32" [s20e32] true (s20_encrypt32 63) z3;
    s20encrypt_64 <- crucible_llvm_verify m "s20_crypt32" [s20e32] true (s20_encrypt32 64) z3;
    s20encrypt_65 <- crucible_llvm_verify m "s20_crypt32" [s20e32] true (s20_encrypt32 65) z3;

    print "Done!";
};

Comparing Compositional and Non-compositional Verification

These two verification tasks were run on a 2019 15-inch MacBook Pro, 2.4 GHz 8-Core Intel i9 processor, 32 GB DDR4 RAM. The values shown are the average over five runs:

Compositional

Non-Compositional

Exercise: Rot13

Your task is to implement rot13 in C, and verify it using SAW.

Start by writing a function that performs a single character of rot13, assuming 7-bit ASCII encoding. Verify it using SAW and Cryptol.

Then, write a function that uses your single-character rot13 to perform rot13 on a string with precisely 20 characters in it. Verify this using SAW and Cryptol with compositional verification.

Tutorial (Rust)

A tutorial on Rust verification with SAW

SAW is a special-purpose programming environment developed by Galois to help orchestrate and track the results of the large collection of proof tools necessary for analysis and verification of complex software artifacts.

SAW adopts the functional paradigm, and largely follows the structure of many other typed functional languages, with some special features specifically targeted at the coordination of verification and analysis tasks.

This tutorial introduces the details of the language by walking through several examples, and showing how simple verification tasks can be described. The complete examples are available . Most of the examples make use of inline specifications written in Cryptol, a language originally designed for high-level descriptions of cryptographic algorithms. For readers unfamiliar with Cryptol, various documents describing its use are available .

This tutorial also include a on how to use SAW to verify a real-world implementation of the Salsa20 stream cipher based on the Rust project. The code for this case study is also available .

Prerequisites

In order to run the examples in this tutorial, you will need to install the following prerequisite tools:

SAW itself, which can be installed by following the instructions .
The Z3 and Yices SMT solvers. Z3 can be downloaded from , and Yices can be downloaded from .
The mir-json tool, which can be installed by following the instructions .

About `mir-json`

We are interested in verifying code written in Rust, but Rust is an extremely rich programming language that has many distinct language features. To make the process of verifying Rust simpler, SAW targets an intermediate language used in the Rust compiler called (short for “Mid-level IR”). MIR takes the variety of different features and syntactic extensions in Rust and boils them down to a minimal subset that is easier for a computer program to analyze.

The process of extracting MIR code that is suitable for SAW’s needs is somewhat tricky, so we wrote a suite of tools called mir-json to automate this process. mir-json provides a plugin for tools like rustc and cargo that lets you compile Rust code as you normally would and produce an additional .json file as output. This .json file contains the MIR code that was produced internally by the Rust compiler, with some additional minor tweaks to make it suitable for SAW’s needs.

mir-json is not a single tool but rather a suite of related tools that leverage the same underlying plugin. For SAW purposes, the two mir-json tools that are most relevant are:

saw-rustc: A thin wrapper around rustc (the Rust compiler), which is suitable for individual .rs files.
cargo-saw-build: A thin wrapper around the cargo build command, which is suitable for cargo-based Rust projects.

Most of the examples in this tutorial involve self-contained examples, which will use saw-rustc. Later in the tutorial, we will examine a Salsa20 case study that involves a cargo-based project, which will use cargo-saw-build.

A first example with `saw-rustc`

Let’s try out saw-rustc on a small example file, which we’ll name first-example.rs:

pub fn id_u8(x: u8) -> u8 {
    x
}

This is the identity function, but specialized to the type u8. We can compile this with saw-rustc like so:

$ saw-rustc first-example.rs
<snip>
note: Emitting MIR for first_example/abef32c5::id_u8

linking 1 mir files into first-example.linked-mir.json
<snip>

saw-rustc prints out some additional information that rustc alone does not print, and we have displayed the parts of this information that are most interesting. In particular:

saw-rustc notes that is is Emitting MIR for first_example/abef32c5::id_u8, where first_example/abef32c5::id_u8 is the full identifier used to uniquely refer to the id_u8 function. It’s entirely possible that the abef32c5 bit will look different on your machine; we’ll talk more about identifiers in the “Identifiers” section.
Once saw-rustc produced a MIR JSON file named first-example.linked-mir.json. This is an important bit of information, as SAW will ingest this JSON file.

If you’d like, you can inspect the first-example.linked-mir.json file with JSON tools (e.g., ), but it is not important to understand everything that is going on there. This is machine-generated JSON, and as such, it is not meant to be particularly readable by human eyes.

The `SAW_RUST_LIBRARY_PATH` environment variable

Rust has a large set of standard libraries that ship with the compiler, and parts of the standard library are quite low-level and tricky. SAW’s primary objective is to provide a tool that can analyze code in a tractable way. For this reason, SAW sometimes needs to invoke simplified versions of Rust standard library functions that are more reasonable for an SMT-based tool like SAW to handle. These simplified functions are equivalent in behavior, but avoid using problematic code patterns (e.g., gnarly pointer arithmetic or the function).

If you are only ever compiling self-contained pieces of code with saw-rustc, there is a good chance that you can get away without needing to use SAW’s custom version of the Rust standard libraries. However, if you ever need to build something more complicated than that (e.g., the Salsa20 case study later in this tutorial), then you will need the custom libraries. For this reason, it is worthwhile to teach SAW the location of the custom libraries now.

At present, the best way to obtain the custom version of the Rust standard libraries is to perform the following steps:

Clone the repo like so:

$ git clone https://github.com/GaloisInc/crucible

Navigate to the subdirectory of the crucible checkout:
```
$ cd crucible/crux-mir/
```
Run the translate_libs.sh script:
```
$ ./translate_libs.sh
```
This will compile the custom versions of the Rust standard libraries using mir-json, placing the results under the rlibs subdirectory.
Finally, define a SAW_RUST_LIBRARY_PATH environment variable that points to the newly created rlibs subdirectory:
```
$ export SAW_RUST_LIBRARY_PATH=<...>/crucible/crux-mir/rlibs
```

An upcoming release of SAW will include these custom libraries pre-built, which will greatly simplify the steps above. Either way, you will need to set the SAW_RUST_LIBRARY_PATH environment variable to point to the location of the custom libraries.

A note about generics

The id_u8 function above is likely not how most Rust programmers would define the identity function. Instead, it would seem more natural to define it generically, that is, by parameterizing the function by a type parameter:

pub fn id<A>(x: A) -> A {
    x
}

If you compile this with saw-rustc, however, the resulting JSON file will lack a definition for id! We can see this by using jq:

$ saw-rustc generics-take-1.rs
<snip>
$ jq . generics-take-1.linked-mir.json
{
  "fns": [],
  "adts": [],
  "statics": [],
  "vtables": [],
  "traits": [],
  "intrinsics": [],
  "tys": [],
  "roots": []
}

What is going on here? This is the result of an important design choice that SAW makes: SAW only supports verifying monomorphic functions. To be more precise, SAW’s approach to symbolic simulation requires all of the code being simulated to have fixed types without any type parameters.

In order to verify a function using generics in your Rust code, you must provide a separate, monomorphic function that calls into the generic function. For example, you can rewrite the example above like so:

pub fn id<A>(x: A) -> A {
    x
}

pub fn id_u8(x: u8) -> u8 {
    id(x)
}

If you compile this version with saw-rustc, you’ll see:

$ saw-rustc generics-take-2.rs
<snip>
note: Emitting MIR for generics_take_2/8b1bf337::id_u8

note: Emitting MIR for generics_take_2/8b1bf337::id::_instaddce72e1232152c[0]

linking 1 mir files into generics-take-2.linked-mir.json
<snip>

This time, the resulting JSON file contains a definition for id_u8. The reason that this works is because when id_u8 calls id, the Rust compile will generate a specialized version of id where A is instantiated with the type u8. This specialized version of id is named id::_instaddce72e1232152c[0] in the output above. (You don’t have to remember this name, thankfully!)

Although the resulting JSON file contains a definition for id_u8, it does not contain a definition for the generic id function. As a result, SAW will only be able to verify the id_u8 function from this file. If you are ever in doubt about which functions are accessible for verification with SAW, you can check this with jq like so:

$ jq '.intrinsics | map(.name)' generics-take-2.linked-mir.json
[
  "generics_take_2/8b1bf337::id_u8",
  "generics_take_2/8b1bf337::id::_instaddce72e1232152c[0]"
]

Here, “intrinsics” are monomorphic functions that are visible to SAW. Note that saw-rustc will optimize away all definitions that are not accessible from one of these intrinsic functions. This explains why the original program that only defined a generic id function resulted in a definition-less JSON file, as that program did not contain monomorphic functions (and therefore no intrinsics).

Generally speaking, we prefer to verify functions that are defined directly in the Rust source code, such as id_u8, as these functions’ names are more stable than the specialized versions of functions that the compiler generates, such as id::_instaddce72e1232152c[0]. Do note that SAW is capable of verifying both types of functions, however. (We will see an example of verifying an autogenerated function in the Salsa20 case study later in this tutorial.)

Identifiers

When you compile a function named id_u8, saw-rustc will expand it to a much longer name such as first_example/abef32c5::id_u8. This longer name is called an identifier, and it provides a globally unique name for that function. In the small examples we’ve seen up to this point, there hasn’t been any risk of name collisions, but you could imagine compiling this code alongside another file (or crate) that also defines an id_u8 function. If that happens, then it is essential that we can tell apart all of the different id_u8 functions, and identifiers provide us the mechanism for doing so.

Let’s take a closer look at what goes into an identifier. In general, an identifier will look like the following:

<crate name>/<disambiguator>::<function path>

<crate name> is the name of the crate in which the function is defined. All of the examples we’ve seen up to this point have been defined in standalone files, and as a result, the crate name has been the same as the file name, but without the .rs extension and with all hyphens replaced with underscores (e.g., first-example.rs is given the crate name first_example). In cargo-based projects, the crate name will likely differ from the file name.

<disambiguator> is a hash of the crate and its dependencies. In extreme cases, it is possible for two different crates to have identical crate names, in which case the disambiguator must be used to distinguish between the two crates. In the common case, however, most crate names will correspond to exactly one disambiguator. (More on this in a bit.)

<function path> is the path to the function within the crate. Sometimes, this is as simple as the function name itself. In other cases, a function path may involve multiple segments, depending on the module hierarchy for the program being verified. For instance, a read function located in core/src/ptr/mod.rs will have the identifier:

core/<disambiguator>::ptr::read

Where core is the crate name and ptr::read is the function path, which has two segments ptr and read. There are also some special forms of segments that appear for functions defined in certain language constructs. For instance, if a function is defined in an impl block, then it will have {impl} as one of its segments, e.g.,

core/<disambiguator>::ptr::const_ptr::{impl}::offset

The most cumbersome part of writing an identifier is the disambiguator, as it is extremely sensitive to changes in the code (not to mention hard to remember and type). Luckily, the vast majority of crate names correspond to exactly one disambiguator, and we can exploit this fact to abbreviate identifiers that live in such crates. For instance, we can abbreviate this identifier:

core/<disambiguator>::ptr::read

To simply:

core::ptr::read

We will adopt the latter, shorter notation throughout the rest of the tutorial. SAW also understands this shorthand, so we will also use this notation when passing identifiers to SAW commands.

SAW basics

A first SAW example

We now have the knowledge necessary to compile Rust code in a way that is suitable for SAW. Let’s put our skills to the test and verify something! We will build on the example from above, which we will put into a file named saw-basics.rs:

pub fn id<A>(x: A) -> A {
    x
}

pub fn id_u8(x: u8) -> u8 {
    id(x)
}

Our goal is to verify the correctness of the id_u8 function. However, it is meaningless to talk about whether a function is correct without having a specification for how the function should behave. This is where SAW enters the picture. SAW provides a scripting language named SAWScript that allows you to write a precise specification for describing a function’s behavior. For example, here is a specification that captures the intended behavior of id_u8:

let id_u8_spec = do {
  x <- mir_fresh_var "x" mir_u8;
  mir_execute_func [mir_term x];
  mir_return (mir_term x);
};

At a high level, this specification says that id_u8 is a function that accepts a single argument of type u8, and it returns its argument unchanged. Nothing too surprising there, but this example illustrates many of the concepts that one must use when working with SAW. Let’s unpack what this is doing, line by line:

In SAWScript, specifications are ordinary values that are defined with let. In this example, we are defining a specification named id_u8_spec.
Specifications are defined using “do-notation”. That is, they are assembled by writing do { <stmt>; <stmt>; ...; <stmt>; }, where each <stmt> is a statement that declares some property about the function being verified. A statement can optionally bind a variable that can be passed to later statements, which is accomplished by writing <var> <- <stmt>.
The x <- mir_fresh_var "x" mir_u8; line declares that x is a fresh variable of type u8 (represented by mir_u8 in SAWScript) that has some unspecified value. In SAW parlance, we refer to these unspecified values as symbolic values. SAW uses an SMT solver under the hood to reason about symbolic values.
The "x" string indicates what name the variable x should have when sent to the underlying SMT solver. This is primarily meant as a debugging aid, and it is not required that the string match the name of the SAWScript variable. (For instance, you could just as well have passed "x_smt" or something else.)
The mir_execute_func [mir_term x]; line declares that the function should be invoked with x as the argument. For technical reasons, we pass mir_term x to mir_execute_func rather than just x; we will go over what mir_term does later in the tutorial.
Finally, the mir_return (mir_term x); line declares that the function should return x once it has finished.

Now that we have a specification in hand, it’s time to prove that id_u8 actually adheres to the spec. To do so, we need to load the MIR JSON version of id_u8 into SAW, which is done with the mir_load_module command:

m <- mir_load_module "saw-basics.linked-mir.json";

This m variable contains the definition of id_u8, as well as the other code defined in the program. We can then pass m to the mir_verify command, which actually verifies that id_u8 behaves according to id_u8_spec:

mir_verify m "saw_basics::id_u8" [] false id_u8_spec z3;

Here is what is going on in this command:

The m and "saw_basics::id_u8" arguments instruct SAW to verify the id_u8 function located in the saw_basics crate defined in m. Note that we are using the shorthand identifier notation here, so we are allowed to omit the disambiguator for the saw_basics crate.
The [] argument indicates that we will not provide any function overrides to use when SAW simulates the id_u8 function. (We will go over how overrides work later in the tutorial.)
The false argument indicates that SAW should not use path satisfiability checking when analyzing the function. Path satisfiability checking is an advanced SAW feature that we will not be making use of in this tutorial, so we will always use false here.
The id_u8_spec argument indicates that id_u8 should be checked against the specification defined by id_u8_spec.
The z3 argument indicates that SAW should use the Z3 SMT solver to solve any proof goals that are generated during verification. SAW also supports other SMT solvers, although we will mostly use Z3 in this tutorial.

Putting this all together, our complete saw-basics.saw file is:

enable_experimental;

let id_u8_spec = do {
  x <- mir_fresh_var "x" mir_u8;
  mir_execute_func [mir_term x];
  mir_return (mir_term x);
};

m <- mir_load_module "saw-basics.linked-mir.json";

mir_verify m "saw_basics::id_u8" [] false id_u8_spec z3;

One minor detail that we left out until just now is that the SAW’s interface to MIR is still experimental, so you must explicitly opt into it with the enable_experimental command.

Now that everything is in place, we can check this proof like so:

$ saw saw-basics.saw



[16:14:07.006] Loading file "saw-basics.saw"
[16:14:07.009] Verifying saw_basics/f77ebf43::id_u8[0] ...
[16:14:07.017] Simulating saw_basics/f77ebf43::id_u8[0] ...
[16:14:07.017] Checking proof obligations saw_basics/f77ebf43::id_u8[0] ...
[16:14:07.017] Proof succeeded! saw_basics/f77ebf43::id_u8[0]

Tada! SAW was successfully able to prove that id_u8 adheres to its spec.

Cryptol

The spec in the previous section is nice and simple. It’s also not very interesting, as it’s fairly obvious at a glance that id_u8’s implementation is correct. Most of the time, we want to verify more complicated functions where the correspondence between the specification and the implementation is not always so clear.

For example, consider this function, which multiplies a number by two:

pub fn times_two(x: u32) -> u32 {
    x << 1 // Gotta go fast
}

The straightforward way to implement this function would be to return 2 * x, but the author of this function really cared about performance. As such, the author applied a micro-optimization that computes the multiplication with a single left-shift (<<). This is the sort of scenario where we are pretty sure that the optimized version of the code is equivalent to the original version, but it would be nice for SAW to check this.

Let’s write a specification for the times_two function:

let times_two_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_execute_func [mir_term x];
  mir_return (mir_term {{ 2 * x }});
};

This spec introduces code delimited by double curly braces {{ ... }}, which is a piece of syntax that we haven’t seen before. The code in between the curly braces is written in , a language designed for writing high-level specifications of various algorithms. Cryptol supports most arithmetic operations, so 2 * x works exactly as you would expect. Also note that the x variable was originally bound in the SAWScript language, but it is possible to embed x into the Cryptol language by referencing x within the curly braces. (We’ll say more about how this embedding works later.)

{{ 2 * x }} takes the Cryptol expression 2 * x and lifts it to a SAW expression. As such, this SAW spec declares that the function takes a single u32-typed argument x and returns 2 * x. We could have also wrote the specification to declare that the function returns x << 1, but that would have defeated the point of this exercise: we specifically want to check that the function against a spec that is as simple and readable as possible.

Our full SAW file is:

enable_experimental;

let times_two_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_execute_func [mir_term x];
  mir_return (mir_term {{ 2 * x }});
};

m <- mir_load_module "times-two.linked-mir.json";

mir_verify m "times_two::times_two" [] false times_two_spec z3;

Which we can verify is correct like so:

$ saw times-two.saw



[17:51:35.469] Loading file "times-two.saw"
[17:51:35.497] Verifying times_two/6f4e41af::times_two[0] ...
[17:51:35.512] Simulating times_two/6f4e41af::times_two[0] ...
[17:51:35.513] Checking proof obligations times_two/6f4e41af::times_two[0] ...
[17:51:35.527] Proof succeeded! times_two/6f4e41af::times_two[0]

Nice! Even though the times_two function does not literally return 2 * x, SAW is able to confirm that the function behaves as if it were implemented that way.

`Term`s and other types

Now that we know how Cryptol can be used within SAW, we can go back and explain what the mir_term function does. It is helpful to examine the type of mir_term by using SAW’s interactive mode. To do so, run the saw binary without any other arguments:

$ saw

Then run enable_experimental (to enable MIR-related commands) and run :type mir_term:

sawscript> enable_experimental
sawscript> :type mir_term
Term -> MIRValue

Here, we see that mir_term accepts a Term as an argument and returns a MIRValue. In this context, the Term type represents a Cryptol value, and the MIRValue type represents SAW-related MIR values. Terms can be thought of as a subset of MIRValues, and the mir_term function is used to promote a Term to a MIRValue.

Most other MIR-related commands work over MIRValues, as can be seen with SAW’s :type command:

sawscript> :type mir_execute_func
[MIRValue] -> MIRSetup ()
sawscript> :type mir_return
MIRValue -> MIRSetup ()

Note that MIRSetup is the type of statements in a MIR specification, and two MIRSetup-typed commands can be chained together by using do-notation. Writing MIRSetup () means that the statement does not return anything interesting, and the use of () here is very much analogous to how () is used in Rust. There are other MIRSetup-typed commands that do return something interesting, as is the case with mir_fresh_var:

sawscript> :type mir_fresh_var
String -> MIRType -> MIRSetup Term

This command returns a MIRSetup Term, which means that when you write x <- mir_fresh_var ... ... in a MIR specification, then x will be bound at type Term.

Values of type Term have the property that they can be embedded into Cryptol expression that are enclosed in double curly braces {{ ... }}. This is why our earlier {{ 2 * x }} example works, as x is of type Term.

Preconditions and postconditions

As a sanity check, let’s write a naïve version of times_two that explicitly returns 2 * x:

pub fn times_two_ref(x: u32) -> u32 {
    2 * x
}

It seems like we should be able to verify this times_two_ref function using the same spec that we used for times_two:

mir_verify m "times_two::times_two_ref" [] false times_two_spec z3;

Somewhat surprisingly, SAW fails to verify this function:

$ saw times-two-ref-fail.saw



[18:58:22.578] Loading file "times-two-ref-fail.saw"
[18:58:22.608] Verifying times_two/56182919::times_two_ref[0] ...
[18:58:22.621] Simulating times_two/56182919::times_two_ref[0] ...
[18:58:22.622] Checking proof obligations times_two/56182919::times_two_ref[0] ...
[18:58:22.640] Subgoal failed: times_two/56182919::times_two_ref[0] attempt to compute `const 2_u32 * move _2`, which would overflow
[18:58:22.640] SolverStats {solverStatsSolvers = fromList ["SBV->Z3"], solverStatsGoalSize = 375}
[18:58:22.640] ----------Counterexample----------
[18:58:22.640]   x: 2147483648
[18:58:22.640] Stack trace:
"mir_verify" (times-two-ref-fail.saw:11:1-11:11)
Proof failed.

The “which would overflow” portion of the error message suggests what went wrong. When a Rust program is compiled with debug settings (which is the default for rustc and saw-rustc), arithmetic operations such as multiplication will check if the result of the operation can fit in the requested number of bits. If not, the program will raise an error.

In this case, we must make the result of multiplication fit in a u32, which can represent values in the range 0 to 2^^32 - 1 (where ^^ is Cryptol’s exponentiation operator). But it is possible to take a number in this range, multiply it by two, and have the result fall outside of the range. In fact, SAW gives us a counterexample with exactly this number: 2147483648, which can also be written as 2^^31. Multiplying this by two yields 2^^32, which is just outside of the range of values expressible with u32. SAW’s duties include checking that a function cannot fail at runtime, so this function falls afoul of that check.

Note that we didn’t have this problem with the original definition of times_two because the semantics of << are such that if the result is too large to fit in the requested type, then the result will overflow, i.e., wrap back around to zero and count up. This means that (2^^31) << 1 evaluates to 0 rather than raising an error. Cryptol’s multiplication operation also performs integer overflow (unlike Rust in debug settings), which is why we didn’t notice any overflow-related issues when verifying times_two.

There are two possible ways that we can repair this. One way is to rewrite times_two_ref to use Rust’s function, a variant of multiplication that always uses integer overflow. This work around the issue, but it is a bit more verbose.

The other way is to make our spec more precise such that we only verify times_two_ref for particular inputs. Although times_two_ref will run into overflow-related issues when the argument is 2^^31 or greater, it is perfectly fine for inputs smaller than 2^^31. We can encode such an assumption in SAW by adding a precondition. To do so, we write a slightly modified version of times_two_spec:

let times_two_ref_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_precond {{ x < 2^^31 }};
  mir_execute_func [mir_term x];
  mir_return (mir_term {{ 2 * x }});
};

The most notable change is the mir_precond {{ x < 2^^31 }}; line. mir_precond (where “precond” is short for “precondition”) is a command that takes a Term argument that contains a boolean predicate, such as {{ x < 2^^31 }}. Declaring a precondition requires that this predicate must hold during verification, and any values of x that do not satisfy this predicate are not considered.

By doing this, we have limited the range of the function from 0 to 2^^31 - 1, which is exactly the range of values for which times_two_ref is well defined. SAW will confirm this if we run it:

mir_verify m "times_two::times_two_ref" [] false times_two_ref_spec z3;

[19:23:53.480] Verifying times_two/56182919::times_two_ref[0] ...
[19:23:53.496] Simulating times_two/56182919::times_two_ref[0] ...
[19:23:53.497] Checking proof obligations times_two/56182919::times_two_ref[0] ...
[19:23:53.531] Proof succeeded! times_two/56182919::times_two_ref[0]

We can add as many preconditions to a spec as we see fit. For instance, if we only want to verify times_two_ref for positive integers, we could add an additional assumption:

let times_two_ref_positive_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_precond {{ 0 < x }}; // The input must be positive
  mir_precond {{ x < 2^^31 }};
  mir_execute_func [mir_term x];
  mir_return (mir_term {{ 2 * x }});
};

In addition to preconditions, SAW also supports postconditions. Whereas preconditions represent conditions that must hold before invoking a function, postconditions represent conditions that must hold after invoking a function. We have already seen one type of postcondition in the form of the mir_return command, which imposes a postcondition on what the return value must be equal to.

We can introduce additional postconditions with the mir_postcond command. For example, if we call times_two_ref with a positive argument, then it should be the case that the return value should be strictly greater than the argument value. We can check for this using mir_postcond like so:

let times_two_ref_positive_postcond_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_precond {{ 0 < x }}; // The input must be positive
  mir_precond {{ x < 2^^31 }};
  mir_execute_func [mir_term x];
  mir_postcond {{ x < (2 * x) }}; // Argument value < return value
  mir_return (mir_term {{ 2 * x }});
};

An additional convenience that SAW offers is the mir_assert command. mir_assert has the same type as mir_precond and mir_postcond, but mir_assert can be used to declare both preconditions and postconditions. The difference is where mir_assert appears in a specification. If mir_assert is used before the call to mir_execute_func, then it declares a precondition. If mir_assert is used after the call to mir_execute_func, then it declares a postcondition.

For example, we can rewrite times_two_ref_positive_postcond_spec to use mir_asserts like so:

let times_two_ref_positive_postcond_assert_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_assert {{ 0 < x }}; // The input must be positive
  mir_assert {{ x < 2^^31 }};
  mir_execute_func [mir_term x];
  mir_assert {{ x < (2 * x) }}; // Argument value < return value
  mir_return (mir_term {{ 2 * x }});
};

The choice of whether to use mir_precond/mir_postcond versus mir_assert is mostly a matter personal taste.

Reference types

All of the examples we have seen up to this point involve simple integer types such as u8 and u32. While these are useful, Rust’s type system features much more than just integers. A key part of Rust’s type system are its reference types. For example, in this read_ref function:

pub fn read_ref(r: &u32) -> u32 {
    *r
}

The function reads the value that r (of type &u32) points to and returns it. Writing SAW specifications involving references is somewhat trickier than with other types of values because we must also specify what memory the reference points to. SAW provides a special command for doing this called mir_alloc:

sawscript> :type mir_alloc
MIRType -> MIRSetup MIRValue

mir_alloc will allocate a reference value with enough space to hold a value of the given MIRType. Unlike mir_fresh_var, mir_alloc returns a MIRValue instead of a Term. We mentioned before that Terms are only a subset of MIRValues, and this is one of the reasons why. Cryptol does not have a notion of reference values, but MIRValues do. As a result, you cannot embed the result of a call to mir_alloc in a Cryptol expression.

mir_alloc must be used with some care. Here is a first, not-quite-correct attempt at writing a spec for read_ref using mir_alloc:

let read_ref_fail_spec = do {
  r <- mir_alloc mir_u32;
  mir_execute_func [r];
  // mir_return ??;
};

m <- mir_load_module "ref-basics.linked-mir.json";

mir_verify m "ref_basics::read_ref" [] false read_ref_fail_spec z3;

As the comment suggests, it’s not entirely clear what this spec should return. We can’t return r, since read_ref returns something of type u32, not &u32. On the other hand, we don’t have any values of type u32 lying around that are obviously the right thing to use here. Nevertheless, it’s not required for a SAW spec to include a mir_return statement, so let’s see what happens if we verify this as-is:

$ saw ref-basics-fail.saw



[20:13:27.224] Loading file "ref-basics-fail.saw"
[20:13:27.227] Verifying ref_basics/54ae7b63::read_ref[0] ...
[20:13:27.235] Simulating ref_basics/54ae7b63::read_ref[0] ...
[20:13:27.235] Stack trace:
"mir_verify" (ref-basics-fail.saw:11:1-11:11)
Symbolic execution failed.
Abort due to assertion failure:
  ref-basics.rs:2:5: 2:7: error: in ref_basics/54ae7b63::read_ref[0]
  attempted to read empty mux tree

Clearly, SAW didn’t like what we gave it. The reason this happens is although we allocated memory for the reference r, we never told SAW what value should live in that memory. When SAW simulated the read_ref function, it attempted to dereference r, which pointed to uninitialized memory. This is constitutes an error in SAW, which is what this “attempted to read empty mux tree” business is about.

SAW provides a mir_points_to command to declare what value a reference should point to:

sawscript> :type mir_points_to
MIRValue -> MIRValue -> MIRSetup ()

Here, the first MIRValue argument represents a reference value, and the second MIRValue argument represents the value that the reference should point to. In our spec for read_ref, we can declare that the reference should point to a symbolic u32 value like so:

let read_ref_spec = do {
  r_ref <- mir_alloc mir_u32;
  r_val <- mir_fresh_var "r_val" mir_u32;
  mir_points_to r_ref (mir_term r_val);
  mir_execute_func [r_ref];
  mir_return (mir_term r_val);
};

We have renamed r to r_ref in this revised spec to more easily distinguish it from r_val, which is the value that r_ref is declared to point to using mir_points_to. In this version of the spec, it is clear that we should return r_val using mir_return, as r_val is exactly the value that will be computed by dereferencing r_ref.

This pattern, where a call to mir_alloc/mir_alloc_mut to followed by a call to mir_points_to, is common with function specs that involve references. Later in the tutorial, we will see other examples of mir_points_to where the reference argument does not come from mir_alloc/mir_alloc_mut.

The argument to read_ref is an immutable reference, so the implementation of the function is not allowed to modify the memory that the argument points to. Rust also features mutable references that do permit modifying the underlying memory, as seen in this swap function:

pub fn swap(a: &mut u32, b: &mut u32) {
    let a_tmp: u32 = *a;
    let b_tmp: u32 = *b;

    *a = b_tmp;
    *b = a_tmp;
}

A corresponding spec for swap is:

let swap_spec = do {
  a_ref <- mir_alloc_mut mir_u32;
  a_val <- mir_fresh_var "a_val" mir_u32;
  mir_points_to a_ref (mir_term a_val);

  b_ref <- mir_alloc_mut mir_u32;
  b_val <- mir_fresh_var "b_val" mir_u32;
  mir_points_to b_ref (mir_term b_val);

  mir_execute_func [a_ref, b_ref];

  mir_points_to a_ref (mir_term b_val);
  mir_points_to b_ref (mir_term a_val);
};

There are two interesting things worth calling out in this spec:

Instead of allocating the reference values with mir_alloc, we instead use mir_alloc_mut. This is a consequence of the fact that &mut u32 is a different type from &mut in Rust (and in MIR), and and such, we need a separate mir_alloc_mut to get the types right.
This spec features calls to mir_points_to before and after mir_execute_func. This is because the values that a_ref and b_ref point to before calling the function are different than the values that they point to after calling the function. The two uses to mir_points_to after the function has been called swap the order of a_val and b_val, reflecting the fact that the swap function itself swaps the values that the references point to.

Compound data types

Besides integer types and reference types, Rust also features a variety of other interesting data types. This part of the tutorial will briefly go over some of these data types and how to interface with them in SAW.

Array types

Rust includes array types where the length of the array is known ahead of time. For instance, this index function takes an arr argument that must contain exactly three u32 values:

pub fn index(arr: [u32; 3], idx: usize) -> u32 {
    arr[idx]
}

While Rust is good at catching many classes of programmer errors at compile time, one thing that it cannot catch in general is out-of-bounds array accesses. In this index example, calling the function with a value of idx ranging from 0 to 2 is fine, but any other choice of idx will cause the program to crash, since the idx will be out of the bounds of arr.

SAW is suited to checking for these sorts of out-of-bound accesses. Let’s write an incorrect spec for index to illustrate this:

let index_fail_spec = do {
  arr <- mir_fresh_var "arr" (mir_array 3 mir_u32);
  idx <- mir_fresh_var "idx" mir_usize;

  mir_execute_func [mir_term arr, mir_term idx];

  mir_return (mir_term {{ arr @ idx }});
};

m <- mir_load_module "arrays.linked-mir.json";

mir_verify m "arrays::index" [] false index_fail_spec z3;

Before we run this with SAW, let’s highlight some of the new concepts that this spec uses:

The type of the arr variable is specified using mir_array 3 mir_u32. Here, the mir_array function takes the length of the array and the element type as arguments, just as in Rust.
The spec declares the return value to be {{ arr @ idx }}, where @ is Cryptol’s indexing operator. Also note that it is completely valid to embed a MIR array type into a Cryptol expression, as Cryptol has a sequence type that acts much like arrays do in MIR.

As we hinted above, this spec is wrong, as it says that this should work for any possible values of idx. SAW will catch this mistake:

$ saw arrays-fail.saw



[21:03:05.374] Loading file "arrays-fail.saw"
[21:03:05.411] Verifying arrays/47a26581::index[0] ...
[21:03:05.425] Simulating arrays/47a26581::index[0] ...
[21:03:05.426] Checking proof obligations arrays/47a26581::index[0] ...
[21:03:05.445] Subgoal failed: arrays/47a26581::index[0] index out of bounds: the length is move _4 but the index is _3
[21:03:05.445] SolverStats {solverStatsSolvers = fromList ["SBV->Z3"], solverStatsGoalSize = 53}
[21:03:05.445] ----------Counterexample----------
[21:03:05.445]   idx: 2147483648
[21:03:05.445] Stack trace:
"mir_verify" (arrays-fail.saw:14:1-14:11)
Proof failed.

We can repair this spec by adding some preconditions:

let index_spec = do {
  arr <- mir_fresh_var "arr" (mir_array 3 mir_u32);
  idx <- mir_fresh_var "idx" mir_usize;
  mir_precond {{ 0 <= idx }}; // Lower bound of idx
  mir_precond {{ idx <= 2 }}; // Upper bound of idx

  mir_execute_func [mir_term arr, mir_term idx];

  mir_return (mir_term {{ arr @ idx }});
};

An alternative way of writing this spec is by using SAW’s mir_array_value command:

sawscript> :type mir_array_value
MIRType -> [MIRValue] -> MIRValue

Here, the MIRType argument represents the element type, and the list of MIRValue arguments are the element values of the array. We can rewrite index_spec using mir_array_value like so:

let index_alt_spec = do {
  arr0 <- mir_fresh_var "arr0" mir_u32;
  arr1 <- mir_fresh_var "arr1" mir_u32;
  arr2 <- mir_fresh_var "arr2" mir_u32;
  let arr = mir_array_value mir_u32 [mir_term arr0, mir_term arr1, mir_term arr2];

  idx <- mir_fresh_var "idx" mir_usize;
  mir_precond {{ 0 <= idx }}; // Lower bound of idx
  mir_precond {{ idx <= 2 }}; // Upper bound of idx

  mir_execute_func [arr, mir_term idx];

  mir_return (mir_term {{ [arr0, arr1, arr2] @ idx }});

Here, [arr0, arr1, arr2] is Cryptol notation for constructing a length-3 sequence consisting of arr0, arr1, and arr2 as the elements. index_alt_spec is equivalent to index_spec, albeit more verbose. For this reason, it is usually preferable to use mir_fresh_var to create an entire symbolic array rather than creating separate symbolic values for each element and combining them with mir_array_value.

There are some situations where mir_array_value is the only viable choice, however. Consider this variant of the index function:

pub fn index_ref_arr(arr: [&u32; 3], idx: usize) -> u32 {
    *arr[idx]
}

When writing a SAW spec for index_ref_arr, we can’t just create a symbolic variable for arr using mir_alloc (mir_array 3 ...), as the reference values in the array wouldn’t point to valid memory. Instead, we must individually allocate the elements of arr using separate calls to mir_alloc and then build up the array using mir_array_value. (As an exercise, try writing and verifying a spec for index_ref_arr).

Tuple types

Rust includes tuple types where the elements of the tuple can be of different types. For example:

pub fn flip(x: (u32, u64)) -> (u64, u32) {
    (x.1, x.0)
}

SAW includes a mir_tuple function for specifying the type of a tuple value. In addition, one can embed MIR tuples into Cryptol, as Cryptol also includes tuple types whose fields can be indexed with .0, .1, etc. Here is a spec for flip that makes use of all these features:

let flip_spec = do {
  x <- mir_fresh_var "x" (mir_tuple [mir_u32, mir_u64]);

  mir_execute_func [mir_term x];

  mir_return (mir_term {{ (x.1, x.0) }});
};

SAW also includes a mir_tuple_value function for constructing a tuple value from other MIRValues:

sawscript> :type mir_tuple_value
[MIRValue] -> MIRValue

mir_tuple_value plays a similar role for tuples as mir_array_value does for arrays.

Struct types

Rust supports the ability for users to define custom struct types. Structs are uniquely identified by their names, so if you have two structs like these:

pub struct S(u32, u64);
pub struct T(u32, u64);

Then even though the fields of the S and T structs are the same, they are not the same struct. This is a type system feature that Cryptol does not have, and for this reason, it is not possible to embed MIR struct values into Cryptol. It is also not possible to use mir_fresh_var to create a symbolic struct value. Instead, one can use the mir_struct_value command:

sawscript> :type mir_struct_value
MIRAdt -> [MIRValue] -> MIRValue

Like with mir_array_value and mir_tuple_value, the mir_struct_value function takes a list of MIRValues as arguments. What makes mir_struct_value unique is its MIRAdt argument, which we have not seen up to this point. In this context, “Adt” is shorthand for “”, and Rust’s structs are an example of ADTs. (Rust also supports enums, another type of ADT that we will see later in this tutorial.)

ADTs in Rust are named entities, and as such, they have unique identifiers in the MIR JSON file in which they are defined. Looking up these identifiers can be somewhat error-prone, so SAW offers a mir_find_adt command that computes an ADT’s identifier and returns the MIRAdt associated with it:

sawscript> :type mir_find_adt
MIRModule -> String -> [MIRType] -> MIRAdt

Here, MIRModule correspond to the MIR JSON file containing the ADT definition, and the String is the name of the ADT whose identifier we want to look up. The list of MIRTypes represent types to instantiate any type parameters to the struct (more on this in a bit).

As an example, we can look up the S and T structs from above like so:

m <- mir_load_module "structs.linked-mir.json";

s_adt <- mir_find_adt m "structs::S" [];
t_adt <- mir_find_adt m "structs::T" [];

We pass an empty list of MIRTypes to each use of mir_find_adt, as neither S nor T have any type parameters. An example of a struct that does include type parameters can be seen here:

pub struct Foo<A, B>(A, B);

As mentioned before, SAW doesn’t support generic definitions out of the box, so the only way that we can make use of the Foo struct is by looking up a particular instantiation of Foo’s type parameters. If we define a function like this, for example:

pub fn make_foo() -> Foo<u32, u64> {
    Foo(27, 42)
}

Then this function instantiates Foo’s A type parameter with u32 and the B type parameter with u64. We can use mir_find_adt to look up this particular instantiation of Foo like so:

foo_adt <- mir_find_adt m "structs::Foo" [mir_u32, mir_u64];

In general, a MIR JSON file can have many separate instantiations of a single struct’s type parameters, and each instantiation must be looked up separately using mir_find_adt.

Having looked up Foo<u32, u64> using mir_find_adt, let’s use the resulting MIRAdt in a spec:

let make_foo_spec = do {
  mir_execute_func [];

  let ret = mir_struct_value
              foo_adt
              [mir_term {{ 27 : [32] }}, mir_term {{ 42 : [64] }}];
  mir_return ret;
};

mir_verify m "structs::make_foo" [] false make_foo_spec z3;

Note that we are directly writing out the values 27 and 42 in Cryptol. Cryptol’s numeric literals can take on many different types, so in order to disambiguate which type they should be, we give each numeric literal an explicit type annotation. For instance, the expression 27 : [32] means that 27 should be a 32-bit integer.

Symbolic structs

Let’s now verify a function that takes a struct value as an argument:

pub struct Bar(u8, u16, Foo<u32, u64>);

pub fn do_stuff_with_bar(b: Bar) {
    // ...
}

Moreover, let’s verify this function for all possible Bar values. One way to do this is to write a SAW spec that constructs a struct value whose fields are themselves symbolic:

bar_adt <- mir_find_adt m "structs::Bar" [];

let do_stuff_with_bar_spec1 = do {
  z1 <- mir_fresh_var "z1" mir_u32;
  z2 <- mir_fresh_var "z2" mir_u64;
  let z = mir_struct_value
            foo_adt
            [mir_term z1, mir_term z2];

  x <- mir_fresh_var "x" mir_u8;
  y <- mir_fresh_var "y" mir_u16;
  let b = mir_struct_value
            bar_adt
            [mir_term x, mir_term y, z];

  mir_execute_func [b];

  // ...
};

This is a rather tedious process, however, as we had to repeatedly use mir_fresh_var to create a fresh, symbolic value for each field. Moreover, because mit_fresh_var does not work for structs, we had to recursively apply this process in order to create a fresh Foo value. It works, but it takes a lot of typing to accomplish.

To make this process less tedious, SAW offers a mir_fresh_expanded_value command that allows one to create symbolic values of many more types. While mir_fresh_var is limited to those MIR types that can be directly converted to Cryptol, mir_fresh_expanded_value can create symbolic structs by automating the process of creating fresh values for each field. This process also applies recursively for struct fields, such as the Foo field in Bar.

As an example, a much shorter way to write the spec above using mir_fresh_expanded_value is:

let do_stuff_with_bar_spec2 = do {
  b <- mir_fresh_expanded_value "b" (mir_adt bar_adt);

  mir_execute_func [b];

  // ...
};

That’s it! Note that the string "b" is used as a prefix for all fresh names that mir_fresh_expanded_value generates, so if SAW produces a counterexample involving this symbolic struct value, one can expect to see names such as b_0, b_1, etc. for the fields of the struct.

mir_fresh_expanded_value makes it easier to construct large, compound values that consist of many smaller, inner values. The drawback is that you can’t refer to these inner values in the postconditions of a spec. As a result, there are some functions for which mir_fresh_expanded_value isn’t suitable, so keep this in mind before reaching for it.

Enum types

Besides structs, another form of ADT that Rust supports are enums. Each enum has a number of different variants that describe the different ways that an enum value can look like. A famous example of a Rust enum is the Option type, which is defined by the standard library like so:

enum Option<T> {
    None,
    Some(T),
}

Option is commonly used in Rust code to represent a value that may be present (Some) or absent (None). For this reason, we will use Option as our motivating example of an enum in this section.

First, let’s start by defining some functions that make use of Option’s variants:

pub fn i_found_something(x: u32) -> Option<u32> {
    Some(x)
}

pub fn i_got_nothing() -> Option<u32> {
    None
}

Both functions return an Option<u32> value, but each function returns a different variant. In order to tell these variants apart, we need a SAW function which can construct an enum value that allows the user to pick which variant they want to construct. The `mir_enum_value function does exactly that:

sawscript> :type mir_enum_value
MIRAdt -> String -> [MIRValue] -> MIRValue

Like mir_struct_value, mir_enum_value also requires a MIRAdt argument in order to discern which particular enum you want. Unlike mir_struct_value, however, it also requires a String which variant of the enum you want. In the case of Option, this String will either be "None" or "Some". Finally, the [MIRValue] arguments represent the fields of the enum variant.

Let’s now verify some enum-related code with SAW. First, we must look up the Option<u32> ADT, which works just as if you had a struct type:

option_u32 <- mir_find_adt m "core::option::Option" [mir_u32];

Next, we can use this ADT to construct enum values. We shall use mir_enum_value to create a Some value in the spec for i_found_something:

let i_found_something_spec = do {
  x <- mir_fresh_var "x" mir_u32;

  mir_execute_func [mir_term x];

  let ret = mir_enum_value option_u32 "Some" [mir_term x];
  mir_return ret;
};

mir_verify m "enums::i_found_something" [] false i_found_something_spec z3;

Note that while we used the full identifier core::option::Option to look up the Option ADT, we do not need to use the core::option prefix when specifying the "Some" variant. This is because SAW already knows what the prefix should be from the option_u32 ADT, so the "Some" shorthand suffices.

Similarly, we can also write a spec for i_got_nothing, which uses the None variant:

let i_got_nothing_spec = do {
  mir_execute_func [];

  let ret = mir_enum_value option_u32 "None" [];
  mir_return ret;
};

mir_verify m "enums::i_got_nothing" [] false i_got_nothing_spec z3;

Symbolic enums

In order to create a symbolic struct, one could create symbolic fields and pack them into a larger struct value using mir_struct_value. The same process is not possible with mir_enum_value, however, as a symbolic enum value would need to range over all possible variants in an enum.

Just as mir_fresh_expanded_value supports creating symbolic structs, mir_fresh_expanded_value also supports creating symbolic enum values. For example, given this function that accepts an Option<u32> value as an argument:

pub fn do_stuff_with_option(o: Option<u32>) {
    // ...
}

We can write a spec for this function that considers all possible Option<u32> values like so:

let do_stuff_with_option_spec = do {
  o <- mir_fresh_expanded_value "o" (mir_adt option_u32);

  mir_execute_func [o];

  // ...
};

Here, o can be a None value, or it can be a Some value with a symbolic field.

Slices

Slices are a particular type of reference that allow referencing contiguous sequences of elements in a collection, such as an array. Unlike ordinary references (e.g., &u32), SAW does not permit allocating a slice directly. Instead, one must take a slice of an existing reference. To better illustrate this distinction, consider this function:

pub fn sum_of_prefix(s: &[u32]) -> u32 {
    s[0].wrapping_add(s[1])
}

sum_of_prefix takes a slice to a sequence of u32s as an argument, indexes into the first two elements in the sequence, and adds them together. There are many possible ways we can write a spec for this function, as the slice argument may be backed by many different sequences. For example, the slice might be backed by an array whose length is exactly two:

    let a1 = [1, 2];
    let s1 = &a1[..];
    let sum1 = sum_of_prefix(s1);

We could also make a slice whose length is longer than two:

    let a2 = [1, 2, 3, 4, 5];
    let s2 = &a2[..];
    let sum2 = sum_of_prefix(s2);

Alternatively, the slice might be a subset of an array whose length is longer than two:

    let a3 = [1, 2, 3, 4, 5];
    let s3 = &a3[0..2];
    let sum3 = sum_of_prefix(s3);

All of these are valid ways of building the slice argument to sum_of_prefix. Let’s try to write SAW specifications that construct these different forms of slices. To do so, we will need SAW functions that take a reference to a collection (e.g., an array) and converts them into a slice reference. The mir_slice_value function is one such function:

sawscript> :type mir_slice_value
MIRValue -> MIRValue

mir_slice_value arr_ref is the SAW equivalent of writing arr_ref[..]. That is, if arr_ref is of type &[T; N], then mir_slice_value arr_ref is of type &[T]. Note that arr_ref must be a reference to an array, not an array itself.

Let’s use mir_slice_value to write a spec for sum_of_prefix when the slice argument is backed by an array of length two:

let sum_of_prefix_spec1 = do {
  a_ref <- mir_alloc (mir_array 2 mir_u32);
  a_val <- mir_fresh_var "a" (mir_array 2 mir_u32);
  mir_points_to a_ref (mir_term a_val);

  let s = mir_slice_value a_ref;

  mir_execute_func [s];

  mir_return (mir_term {{ (a_val @ 0) + (a_val @ 1) }});
};

The first part of this spec allocates an array reference a_ref and declares that it points to a fresh array value a_val. The next part declares a slice s that is backed by the entirety of a_ref, which is then passed as an argument to the function itself. Finally, the return value is declared to be the sum of the first and second elements of a_val, which are the same values that back the slice s itself.

As noted above, the sum_of_prefix function can work with slices of many different lengths. Here is a slight modification to this spec that declares it to take a slice of length 5 rather than a slice of length 2:

let sum_of_prefix_spec2 = do {
  a_ref <- mir_alloc (mir_array 5 mir_u32);
  a_val <- mir_fresh_var "a" (mir_array 5 mir_u32);
  mir_points_to a_ref (mir_term a_val);

  let s = mir_slice_value a_ref;

  mir_execute_func [s];

  mir_return (mir_term {{ (a_val @ 0) + (a_val @ 1) }});
};

Both of these examples declare a slice whose length matches the length of the underlying array. In general, there is no reason that these have to be the same, and it is perfectly fine for a slice’s length to be less than the the length of the underlying array. In Rust, for example, we can write a slice of a subset of an array by writing &arr_ref[0..2]. The SAW equivalent of this can be achieved with the mir_slice_range_value function:

sawscript> :type mir_slice_range_value
MIRValue -> Int -> Int -> MIRValue

mir_slice_range_value takes takes two additional Int arguments that represent (1) the index to start the slice from, and (2) the index at which the slice ends. For example, mir_slice_range_value arr_ref 0 2 creates a slice that is backed by the first element (index 0) and the second element (index 1) of arr_ref. Note that the range [0..2] is half-open, so this range does not include the third element (index 2).

For example, here is how to write a spec for sum_of_prefix where the slice is a length-2 subset of the original array:

let sum_of_prefix_spec3 = do {
  a_ref <- mir_alloc (mir_array 5 mir_u32);
  a_val <- mir_fresh_var "a" (mir_array 5 mir_u32);
  mir_points_to a_ref (mir_term a_val);

  let s = mir_slice_range_value a_ref 0 2;

  mir_execute_func [s];

  mir_return (mir_term {{ (a_val @ 0) + (a_val @ 1) }});
};

Note that both Int arguments to mir_slice_range_value must be concrete (i.e., not symbolic). (See the section below if you want an explanation for why they are not allowed to be symbolic.)

Aside: slices of arbitrary length

After reading the section about slices above, one might reasonably wonder: is there a way to write a more general spec for sum_of_prefix: that covers all possible slice lengths n, where n is greater than or equal to 2? In this case, the answer is “no”.

This is a fundamental limitation of the way SAW’s symbolic execution works. The full reason for why this is the case is somewhat technical (keep reading if you want to learn more), but the short answer is that if SAW attempts to simulate code whose length is bounded by a symbolic integer, then SAW will go into an infinite loop. To avoid this pitfall, the mir_slice_range_value function very deliberately requires the start and end values to be concrete integers, as allowing these values to be symbolic would allow users to inadvertently introduce infinite loops in their specifications.

A longer answer as to why SAW loops forever on computations that are bounded by symbolic lengths: due to the way SAW’s symblolic execution works, it creates a complete model of the behavior of a function for all possible inputs. The way that SAW achieves this is by exploring all possible execution paths through a program. If a program involves a loop, for example, then SAW will unroll all iterations of the loop to construct a model of the loop’s behavior. Similarly, if a sequence (e.g., a slice or array) has an unspecified length, then SAW must consider all possible lengths of the array.

SAW’s ability to completely characterize the behavior of all paths through a function is one of its strengths, as this allows it to prove theorems that other program verification techniques would not. This strength is also a weakness, however. If a loop has a symbolic number of iterations, for example, then SAW will spin forever trying to unroll the loop. Similarly, if a slice were to have a symbolic length, then SAW would spin forever trying to simulate the program for all possible slice lengths.

In general, SAW cannot prevent users from writing programs whose length is bounded by a symbolic value. For now, however, SAW removes one potential footgun by requiring that slice values always have a concrete length.

Overrides and compositional verification

Up until this point, all uses of mir_verify in this tutorial have provided an empty list ([]) of overrides. This means that any time SAW has simulated a function which calls another function, it will step into the definition of the callee function and verify its behavior alongside the behavior of the callee function. This is a fine thing to do, but it can be inefficient. For example, consider a function like this:

pub fn f(x: u32) -> u32 {
    let x1 = g(x);
    let x2 = g(x1);
    g(x2)
}

Here, the caller function f invokes the callee function g three separate times. If we verify f with mir_verify as we have done up until this point, then SAW must analyze the behavior of g three separate times. This is wasteful, and especially so if g is a large and complicated function.

This is where compositional verification enters the picture. The idea behind compositional verification is that when we prove properties of a caller function, we can reuse properties that we have already proved about callee functions. These properties are captured as override specifications, which are also referred to by the shorthand term overrides. When a caller invokes a callee with a corresponding override specification, the override’s properties are applied without needing to re-simulate the entire function.

As it turns out, the command needed to produce an override specification is already familiar to us—it’s mir_verify! If you examine the type of this command:

sawscript> :type mir_verify
MIRModule -> String -> [MIRSpec] -> Bool -> MIRSetup () -> ProofScript () -> TopLevel MIRSpec

The returned value is a MIRSpec, which captures the behavior of the function that was verified as an override spec. This override can then be passed to another call to mir_verify to use as part of a larger proof.

Let’s now try compositional verification in practice. To do so, we will first prove a spec for the g function above. For demonstration purposes, we will pick a simplistic implementation of g:

pub fn g(x: u32) -> u32 {
    x.wrapping_add(1)
}

Note that we don’t really have to use compositional verification when g is this simple, as SAW is capable of reasoning about g’s behavior directly when proving a spec for f. It’s still worth going along with this exercise, however, as the same principles of compositional verification apply whether the implementation of g is small or large.

The first step of compositional verification is to prove a spec for g, the callee function:

let g_spec = do {
  x <- mir_fresh_var "x" mir_u32;

  mir_execute_func [mir_term x];

  mir_return (mir_term {{ x + 1 }});
};

g_ov <- mir_verify m "overrides::g" [] false g_spec z3;

There’s nothing that different about this particular proof from the proofs we’ve seen before. The only notable difference is that we bind the result of calling mir_verify to a MIRSpec value that we name g_ov (short for “g override”). This part is important, as we will need to use g_ov shortly.

The next step is to write a spec for f. Since g adds 1 to its argument, f will add 3 to its argument:

let f_spec = do {
  x <- mir_fresh_var "x" mir_u32;

  mir_execute_func [mir_term x];

  mir_return (mir_term {{ x + 3 }});
};

Again, nothing too surprising. Now let’s prove f against f_spec by using g_ov as a compositional override:

mir_verify m "overrides::f" [g_ov] false f_spec z3;

Here, note that instead of passing an empty list ([]) as we have done before, we now pass a list containing g_ov. This informs mir_verify that whenever it simulates a call to g, it should reuse the properties captured in g_ov. In general, we can pass as many overrides as we want (we will see examples of this later in the tutorial), but for now, one override will suffice.

Let’s run the proof of f against f_spec, making sure to pay attention to the output of SAW:

[19:06:17.392] Verifying overrides/96c5af24::f[0] ...
[19:06:17.406] Simulating overrides/96c5af24::f[0] ...
[19:06:17.407] Matching 1 overrides of  overrides/96c5af24::g[0] ...
[19:06:17.407] Branching on 1 override variants of overrides/96c5af24::g[0] ...
[19:06:17.407] Applied override! overrides/96c5af24::g[0]
[19:06:17.407] Matching 1 overrides of  overrides/96c5af24::g[0] ...
[19:06:17.407] Branching on 1 override variants of overrides/96c5af24::g[0] ...
[19:06:17.407] Applied override! overrides/96c5af24::g[0]
[19:06:17.407] Matching 1 overrides of  overrides/96c5af24::g[0] ...
[19:06:17.407] Branching on 1 override variants of overrides/96c5af24::g[0] ...
[19:06:17.407] Applied override! overrides/96c5af24::g[0]
[19:06:17.407] Checking proof obligations overrides/96c5af24::f[0] ...
[19:06:17.422] Proof succeeded! overrides/96c5af24::f[0]

We’ve now proven f compositionally! The first two lines (“Verifying ...” and “Simulating ...”) and the last two lines (“Checking proof obligations ...” and “Proof succeeded! ...") are the same as before, but this time, we have some additional lines of output in between:

Whenever SAW prints “Matching <N> overrides of <function>”, that’s when you know that SAW is about to simulate a call to <function>. At that point, SAW will check to see how many overrides (<N>) for <function> are available.
Whenever SAW prints “Brancing on <N> override variants of <function>", SAW is trying to figure out which of the` overrides to apply. In this example, there is only a single override, so the choice is easy. In cases where there are multiple overrides, however, SAW may have to work harder (possibly even consulting an SMT solver) to figure out which override to use.
If SAW successfully picks an override to apply, it will print “Applied override! ...”.

In the example above, we used a single g override that applies for all possible arguments. In general, however, there is no requirement that overrides must work for all arguments. In fact, it is quite common for SAW verification efforts to write different specifications for the same function, but with different arguments. We can then provide multiple overrides for the same function as part of a compositional verification, and SAW will be able to pick the right override depending on the shape of the argument when invoking the function being overridden.

For example, let’s suppose that we wrote different g specs, one where the argument to g is even, and another where the argument to g is odd:

let g_even_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_precond {{ x % 2 == 0 }};

  mir_execute_func [mir_term x];

  mir_return (mir_term {{ x + 1 }});
};

let g_odd_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_precond {{ x % 2 == 1 }};

  mir_execute_func [mir_term x];

  mir_return (mir_term {{ x + 1 }});
};

g_even_ov <- mir_verify m "overrides::g" [] false g_even_spec z3;
g_odd_ov  <- mir_verify m "overrides::g" [] false g_odd_spec  z3;

We can then prove f compositionally by passing both of the g overrides to mir_verify:

mir_verify m "overrides::f" [g_even_ov, g_odd_ov] false f_spec z3;

Like before, this will successfully verify. The only different now is that SAW will print output involving two overrides instead of just one:

[20:48:07.649] Simulating overrides/96c5af24::f[0] ...
[20:48:07.650] Matching 2 overrides of  overrides/96c5af24::g[0] ...
[20:48:07.650] Branching on 2 override variants of overrides/96c5af24::g[0] ...
[20:48:07.652] Applied override! overrides/96c5af24::g[0]
...

Keep in mind that if you provide at least one override for a function as part of a compositional verification, then SAW must apply an override whenever it invokes that function during simulation. If SAW cannot find a matching override, then the verification will fail. For instance, consider what would happen if you tried proving f like so:

mir_verify m "overrides::f" [g_even_ov] false f_spec z3;

This time, we supply one override for g that only matches when the argument is even. This is a problem, as SAW will not be able to find a matching override when the argument is odd. Indeed, SAW will fail to verify this:

[20:53:29.588] Verifying overrides/96c5af24::f[0] ...
[20:53:29.602] Simulating overrides/96c5af24::f[0] ...
[20:53:29.602] Matching 1 overrides of  overrides/96c5af24::g[0] ...
[20:53:29.602] Branching on 1 override variants of overrides/96c5af24::g[0] ...
[20:53:29.603] Applied override! overrides/96c5af24::g[0]
[20:53:29.603] Matching 1 overrides of  overrides/96c5af24::g[0] ...
[20:53:29.603] Branching on 1 override variants of overrides/96c5af24::g[0] ...
[20:53:29.604] Applied override! overrides/96c5af24::g[0]
[20:53:29.604] Matching 1 overrides of  overrides/96c5af24::g[0] ...
[20:53:29.604] Branching on 1 override variants of overrides/96c5af24::g[0] ...
[20:53:29.605] Applied override! overrides/96c5af24::g[0]
[20:53:29.605] Symbolic simulation completed with side conditions.
[20:53:29.606] Checking proof obligations overrides/96c5af24::f[0] ...
[20:53:29.623] Subgoal failed: overrides/96c5af24::f[0] No override specification applies for overrides/96c5af24::g[0].
Arguments:
- c@26:bv
Run SAW with --sim-verbose=3 to see a description of each override.
[20:53:29.623] SolverStats {solverStatsSolvers = fromList ["SBV->Z3"], solverStatsGoalSize = 388}
[20:53:29.624] ----------Counterexample----------
[20:53:29.624]   x: 1
...
Proof failed.

Here, we can see that No override specification applies, and SAW also generates a counterexample of x: 1. Sure enough, 1 is an odd number!

Overrides and mutable references

Compositional overrides provide great power, as they effectively allow you to skip over certain functions when simulating them and replace them with simpler implementations. With great power comes great responsibility, however. In particular, one must be careful when using overrides for functions that modify mutable references. If an override does not properly capture the behavior of a mutable reference, it could potentially lead to incorrect proofs.

This is the sort of thing that is best explained with an example, so consider these two functions:

pub fn side_effect(a: &mut u32) {
    *a = 0;
}

pub fn foo(x: u32) -> u32 {
    let mut b: u32 = x;
    side_effect(&mut b);
    b
}

The side_effect function does not return anything interesting; it is only ever invoked to perform a side effect of changing the mutable reference a to point to 0. The foo function invokes side_effect, and as a result, it will always return 0, regardless of what the argument to foo is. No surprises just yet.

Now let’s make a first attempt at verifying foo using compositional verification. First, we will write a spec for side_effect:

let side_effect_spec = do {
  a_ref <- mir_alloc_mut mir_u32;
  a_val <- mir_fresh_var "a_val" mir_u32;
  mir_points_to a_ref (mir_term a_val);
  mir_execute_func [a_ref];
};

side_effect_spec is somewhat odd. Although it goes through the effort of allocating a mutable reference a_ref and initializing it, nothing about this spec states that a_ref will point to 0 after the function has been invoked. This omission is strange, but not outright wrong—the spec just underspecifies what the behavior of the function is. Indeed, SAW will successfully verify this spec using mir_verify:

side_effect_ov <- mir_verify m "overrides_mut::side_effect" [] false side_effect_spec z3;

Next, let’s try to write a spec for foo:

let foo_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_execute_func [mir_term x];
  mir_return (mir_term {{ x }});
};

At this point, alarm bells should be going off in your head. This spec incorrectly states that foo(x) should return x, but it should actually return 0! This looks wrong, but consider what would happen if you tried to verify this compositionally using our side_effect_ov override:

mir_verify m "overrides_mut::foo" [side_effect_ov] false foo_spec z3;

If SAW were to simulate foo(x), it would invoke create a temporary variable b and assign it to the value x, and then it would invoke side_effect(&mut b). At this point, the side_effect_ov override would apply. According to side_effect_spec, the argument to side_effect is not modified at all after the function returns. This means that when the foo function returns b, it will still retain its initial value of x. This shows that if we were to use side_effect_ov, we could prove something that’s blatantly false!

Now that we’ve made you sweat a little bit, it’s time for some good news: SAW won’t actually let you prove foo_spec. If you try this compositional proof in practice, SAW will catch your mistake:

[14:50:29.170] Verifying overrides_mut/11e47708::foo[0] ...
[14:50:29.181] Simulating overrides_mut/11e47708::foo[0] ...
[14:50:29.181] Matching 1 overrides of  overrides_mut/11e47708::side_effect[0] ...
[14:50:29.181] Branching on 1 override variants of overrides_mut/11e47708::side_effect[0] ...
...
State of memory allocated in precondition (at overrides-mut-fail.saw:6:12) not described in postcondition

The line of code that SAW points to in the “State of memory ...” error message is:

  a_ref <- mir_alloc_mut mir_u32;

SAW informs us that although we allocated the mutable reference a_ref, we never indicated what it should point to after the function has returned. This is an acceptable (if somewhat unusual) thing to do when verifying side_effect_spec using mir_verify, but it is not acceptable to do this when using this spec as an override. To avoid unsound behavior like what is described above, any override that allocates a mutable reference in its preconditions must declare what its value should be in the postconditions, no exceptions.

Thankfully, repairing this spec is relatively straightforward. Simply add a mir_points_to statement in the postconditions of side_effect_spec:

let side_effect_spec = do {
  a_ref <- mir_alloc_mut mir_u32;
  a_val <- mir_fresh_var "a_val" mir_u32;
  mir_points_to a_ref (mir_term a_val);
  mir_execute_func [a_ref];

  // This is new
  mir_points_to a_ref (mir_term {{ 0 : [32] }});
};

Then use the correct return value in foo_spec:

let foo_spec = do {
  x <- mir_fresh_var "x" mir_u32;
  mir_execute_func [mir_term x];

  // This is new
  mir_return (mir_term {{ 0 : [32] }});
};

And now the compositional proof of foo_spec works!

Unsafe overrides

Now that we’ve made it this far into the tutorial, it’s time to teach you a more advanced technique: unsafe overrides. Up until this point, we have relied on SAW to check all of our work, and this is usually what you’d want from a formal verification tool. In certain circumstances, however, it can be useful to say “I know what I’m doing, SAW—just believe me when I say this spec is valid!” In order to say this, you can use mir_unsafe_assume_spec:

sawscript> :type mir_unsafe_assume_spec
MIRModule -> String -> MIRSetup () -> TopLevel MIRSpec

mir_unsafe_assume_spec is mir_verify’s cousin who likes to live a little more dangerously. Unlike mir_verify, the specification that you pass to mir_unsafe_assume_spec (the MIRSetup () argument) is not checked for full correctness. That is, mir_unsafe_assume_spec will bypass SAW’s usual symbolic execution pipeline, which is why one does not need to pass a ProofScript argument (e.g., z3) to mir_unsafe_assume_spec. SAW will believe whatever spec you supply mir_unsafe_assume_spec to be valid, and the MIRSpec that mir_unsafe_assume_spec returns can then be used in later compositional verifications.

Why would you want to do this? The main reason is that writing proofs can be difficult, and sometimes, there are certain functions in a SAW verification effort that are disproportionately harder to write a spec for than others. It is tempting to write specs for each function in sequence, but this can run the risk of getting stuck on a particularly hard-to-verify function, blocking progress on other parts of the proofs.

In these situations, mir_unsafe_assume_spec can be a useful prototyping tool. One can use mir_unsafe_assume_spec to assume a spec for the hard-to-verify function and then proceed with the remaining parts of the proof. Of course, you should make an effort to go back and prove the hard-to-verify function’s spec later, but it can be nice to try something else first.

For example, here is how one can unsafely assume g_spec and use it in a compositional proof of f_spec:

g_ov <- mir_unsafe_assume_spec m "overrides::g" g_spec;
mir_verify m "overrides::f" [g_ov] false f_spec z3;

It should be emphasized that when we say “unsafe”, we really mean it. mir_unsafe_assume_spec can be used to prove specs that are blatantly wrong, so use it with caution.

Static items

Sometimes, Rust code makes use of , which are definitions that are defined in a precise memory location for the entire duration of the program. As such, static items can be thought of as a form of global variables.

Immutable static items

There are two kinds of static items in Rust: mutable static items (which have a mut keyword) and immutable static items (which lack mut). Immutable static items are much easier to deal with, so let’s start by looking at an example of a program that uses immutable static data:

static ANSWER: u32 = 42;

pub fn answer_to_the_ultimate_question() -> u32 {
    ANSWER
}

This function will return ANSWER, i.e., 42. We can write a spec that says as much:

let answer_to_the_ultimate_question_spec1 = do {
  mir_execute_func [];

  mir_return (mir_term {{ 42 : [32] }});
};

This works, but it is somewhat unsatisfying, as it requires hard-coding the value of ANSWER into the spec. Ideally, we’d not have to think about the precise implementation of static items like ANSWER. Fortunately, SAW makes this possible by providing a mir_static_initializer function which computes the initial value of a static item at the start of the program:

sawscript> :type mir_static_initializer
String -> MIRValue

In this case, mir_static_initializer "statics::ANSWER" is equivalent to writing mir_term {{ 42 : [32] }}, so this spec is also valid:

let answer_to_the_ultimate_question_spec2 = do {
  mir_execute_func [];

  mir_return (mir_static_initializer "statics::ANSWER");
};

Like mir_verify, the mir_static_initializer function expects a full identifier as an argument, so we must write "statics::ANSWER" instead of just `“ANSWER”.

At the MIR level, there is a unique reference to every static item. You can obtain this reference by using the mir_static function:

sawscript> :type mir_static
String -> MIRValue

Here is one situation in which you would need to use a reference to a static item (which mir_static computes) rather than the value of a static item (which mir_static_initializer computes):

pub fn answer_in_ref_form() -> &'static u32 {
    &ANSWER
}

A spec for this function would look like this:

let answer_in_ref_form_spec = do {
  mir_execute_func [];

  mir_return (mir_static "statics::ANSWER");
};

That’s about all there is to say regarding immutable static items.

Mutable static items

Mutable static items are a particularly tricky aspect of Rust. Including a mutable static item in your program is tantamount to having mutable global state that any function can access and modify. They are so tricky, in fact, that Rust does not even allow you to use them unless you surround them in an unsafe block:

static mut MUT_ANSWER: u32 = 42;

pub fn mut_answer_to_the_ultimate_question() -> u32 {
    unsafe { MUT_ANSWER }
}

The mir_static_initializer and mut_static functions work both immutable and mutable static items, so we can write specs for mutable items using mostly the same techniques as for writing specs for immutable items. We must be careful, however, as SAW is pickier when it comes to specifying the initial values of mutable static items. For example, here is naïve attempt at porting the spec for answer_to_the_ultimate_question over to its mutable static counterpart, mut_answer_to_the_ultimate_question:

let mut_answer_to_the_ultimate_question_spec = do {
  mir_execute_func [];

  mir_return (mir_static_initializer "statics::MUT_ANSWER");
};

This looks plausible, but SAW will fail to verify it:

[21:52:32.738] Verifying statics/28a97e47::mut_answer_to_the_ultimate_question[0] ...
[21:52:32.745] Simulating statics/28a97e47::mut_answer_to_the_ultimate_question[0] ...
...
Symbolic execution failed.
Abort due to assertion failure:
  statics.rs:14:14: 14:24: error: in statics/28a97e47::mut_answer_to_the_ultimate_question[0]
  attempted to read empty mux tree

Oh no! Recall that we have seen this “attempted to read empty mux tree” error message once before when discussing reference types. This error arose when we attempted to read from uninitialized memory from a reference value. The same situation applies here. A static item is backed by a reference, and SAW deliberately does not initialize the memory that a mutable static reference points to upon program startup. Since we did not initialize MUT_ANSWER’s reference value in the preconditions of the spec, SAW crashes at simulation time when it attempts to read from the uninitialized memory.

The solution to this problem is to perform this initialization explicitly using mir_points_to in the preconditions of the spec. For example, this is a valid spec:

We don’t necessarily have to use mir_static_initializer as the starting value for MUT_ANSWER, however. This spec, which uses 27 as the starting value, is equally valid:

At this point, you are likely wondering: why do we need to explicitly initialize mutable static references but not immutable static references? After all, when we wrote a spec for answer_to_the_ultimate_question earlier, we managed to get away with not initializing the reference for ANSWER (which is an immutable static item). The difference is that the value of a mutable static item can change over the course of a program, and SAW requires that you be very careful in specifying what a mutable static value is at the start of a function. For example, consider a slightly extended version of the earlier Rust code we saw:

static mut MUT_ANSWER: u32 = 42;

pub fn mut_answer_to_the_ultimate_question() -> u32 {
    unsafe { MUT_ANSWER }
}

pub fn alternate_universe() -> u32 {
    unsafe {
        MUT_ANSWER = 27;
    }
    mut_answer_to_the_ultimate_question()
}

Suppose someone were to ask you “what value does mut_answer_to_the_ultimate_question return?” This is not a straightforward question to answer, as the value that it returns depends on the surrounding context. If you were to call mut_answer_to_the_ultimate_question right as the program started, it would return 42. If you were to call mut_answer_to_the_ultimate_question as part of the implementation of alternate_universe, however, then it would return 27! This is an inherent danger of using mutable static items, as they can be modified at any time by any function. For this reason, SAW requires you to be explicit about what the initial values of mutable static items should be.

Mutable static items and compositional overrides

In the “Overrides and mutable references” section, we discussed the potential pitfalls of using mutable references in compositional overrides. Mutable static items are also mutable values that are backed by references, and as such, they are also subject to the same pitfalls. Let’s see an example of this:

static mut A: u32 = 42;

pub fn side_effect() {
    unsafe {
        A = 0;
    }
}

pub fn foo() -> u32 {
    side_effect();
    unsafe { A }
}

The setup is almost the same, except that instead of passing a mutable reference as an argument to side_effect, we instead declare a mutable static item A that is shared between side_effect and foo. We could potentially write SAW specs for side_effect and foo like these:

let a = "statics_compositional::A";

let side_effect_spec = do {
  mir_points_to (mir_static a) (mir_static_initializer a);
  mir_execute_func [];
};

let foo_spec = do {
  let a_init = mir_static_initializer a;
  mir_points_to (mir_static a) a_init;
  mir_execute_func [];
  mir_return a_init;
};

side_effect_ov <- mir_verify m "statics_compositional::side_effect" [] false side_effect_spec z3;
mir_verify m "statics_compositional::foo" [side_effect_ov] false foo_spec z3;

Note that we have once again underspecified the behavior of side_effect, as we do not say what A’s value should be in the postconditions of side_effect_spec. Similarly, foo_spec is wrong, as it should return 0 rather than the initial value of A. By similar reasoning as before, we run the risk that using side_effect_ov could lead use to prove something incorrect. Thankfully, SAW can also catch this sort of mistake:

[15:46:38.525] Verifying statics_compositional/16fea9c0::side_effect[0] ...
[15:46:38.533] Simulating statics_compositional/16fea9c0::side_effect[0] ...
[15:46:38.533] Checking proof obligations statics_compositional/16fea9c0::side_effect[0] ...
[15:46:38.533] Proof succeeded! statics_compositional/16fea9c0::side_effect[0]
[15:46:38.533] Verifying statics_compositional/16fea9c0::foo[0] ...
[15:46:38.542] Simulating statics_compositional/16fea9c0::foo[0] ...
[15:46:38.542] Matching 1 overrides of  statics_compositional/16fea9c0::side_effect[0] ...
[15:46:38.542] Branching on 1 override variants of statics_compositional/16fea9c0::side_effect[0] ...
...
State of mutable static variable "statics_compositional/16fea9c0::A[0]" not described in postcondition

To repair this proof, add a mir_points_to statement in the postconditions of side_effect_spec:

let side_effect_spec = do {
  mir_points_to (mir_static a) (mir_static_initializer a);
  mir_execute_func [];

  // This is new
  mir_points_to (mir_static a) (mir_term {{ 0 : [32] }});
};

And then correct the behavior of foo_spec:

let foo_spec = do {
  let a_init = mir_static_initializer a;
  mir_points_to (mir_static a) a_init;
  mir_execute_func [];

  // This is new
  mir_return (mir_term {{ 0 : [32] }});
};

Be warned that if your program declares any mutable static items, then any compositional override must state what the value of each mutable static item is in its postconditions. This applies even if the override does not directly use the mutable static items. For example, if we had declared a second mutable static item alongside A:

static mut A: u32 = 42;
static mut B: u32 = 27;

Then side_effect_spec would need an additional mir_points_to statement involving B to satisfy this requirement. This requirement is somewhat heavy-handed, but it is necessary in general to avoid unsoundness. Think carefully before you use mutable static items!

Case study: Salsa20

If you’ve made it this far into the tutorial, congrats! You’ve now been exposed to all of the SAW fundamentals that you need to verify Rust code found in the wild. Of course, talking about verifying real-world code is one thing, but actually doing the verification is another thing entirely. Making the jump from the small examples to “industrial-strength” code can be intimidating.

To make this jump somewhat less frightening, the last part of this tutorial will consist of a case study in using SAW to verify a non-trivial piece of Rust code. In particular, we will be looking at a Rust implementation of the stream cipher. We do not assume any prior expertise in cryptography or stream ciphers in this tutorial, so don’t worry if you are not familiar with Salsa20.

More than anything, this case study is meant to emphasize that verification is an iterative process. It’s not uncommon to try something with SAW and encounter an error message. That’s OK! We will explain what can go wrong when verifying Salsa20 and how to recover from these mistakes. Later, if you encounter similar issues when verifying your own code with SAW, the experience you have developed when developing these proofs can inform you of possible ways to fix the issues.

The `salsa20` crate

The code for this Salsa20 implementation we will be verifying can be found under the subdirectory. This code is adapted from version 0.3.0 of the salsa20 crate, which is a part of the project. The code implements Salsa20 as well as variants such as HSalsa20 and XSalsa20, but we will only be focusing on the original Salsa20 cipher in this tutorial.

The parts of the crate that are relevant for our needs are mostly contained in the file, as well as some auxiliary definitions in the and files. You can take a look at these files if you’d like, but you don’t need to understand everything in them just yet. We will introduce the relevant parts of the code in the tutorial as they come up.

Salsa20 preliminaries

Salsa20 is a stream cipher, which is a cryptographic technique for encrypting and decrypting messages. A stream cipher encrypts a message by combining it with a keystream to produce a ciphertext (the encrypted message). Moreover, the same keystream can then be combined with the ciphertext to decrypt it back into the original message.

The original author of Salsa20 has published a specification for Salsa20 . This is a great starting point for a formal verification project, as this gives us a high-level description of Salsa20’s behavior that will guide us in proving the functional correctness of the salsa20 crate. When we say that salsa20 is functionally correct, we really mean “proven correct with respect to the Salsa20 specification”.

The first step in our project would be to port the Salsa20 spec to Cryptol code, as we will need to use this code when writing SAW proofs. The process of transcribing an English-language specification to executable Cryptol code is interesting in its own right, but it is not the primary focus of this tutorial. As such, we will save you some time by providing a pre-baked Cryptol implementation of the Salsa20 spec . (This implementation is from the repo.)

Writing the Cryptol version of the spec is only half the battle, however. We still have to prove that the Rust implementation in the salsa20 crate adheres to the behavior prescribed by the spec, which is where SAW enters the picture. As we will see shortly, the code in salsa20 is not a direct port of the pseudocode shown in the Salsa20 spec, as it is somewhat more low-level. SAW’s role is to provide us assurance that the behavior of the low-level Rust code and the high-level Cryptol code coincide.

A note about cryptographic security

As noted in the previous section, our goal is to prove that the behavior of salsa20 functions is functionally correct. This property should not be confused with cryptographic security. While functional correctness is an important aspect of cryptographic security, a full cryptographic security audit would encompass additional properties such as whether the code runs in constant time on modern CPUs. As such, the SAW proofs we will write would not constitute a full security audit (and indeed, the states that the crate has never received such an audit).

An overview of the `salsa20` code

Before diving into proofs, it will be helpful to have a basic understanding of the functions and data types used in the salsa20 crate. Most of the interesting code lives in . At the top of this file, we have the Core struct:

pub struct Core<R: Rounds> {
    /// Internal state of the core function
    state: [u32; STATE_WORDS],

    /// Number of rounds to perform
    rounds: PhantomData<R>,
}

Let’s walk through this:

The state field is an array that is STATE_WORDS elements long, where STATE_WORDS is a commonly used alias for 16:
```
/// Number of 32-bit words in the Salsa20 state
const STATE_WORDS: usize = 16;
```
The rounds field is of type PhantomData<R>. If you haven’t seen it before, is a special type that tells the Rust compiler to pretend as though the struct is storing something of type R, even though a PhantomData value will not take up any space at runtime.

The reason that Core needs a PhantomData<R> field is because R implements the Rounds trait:

//! Numbers of rounds allowed to be used with a Salsa20 family stream cipher

pub trait Rounds: Copy {
    const COUNT: usize;
}

A core operation in Salsa20 is hashing its input through a series of rounds. The COUNT constant indicates how many rounds should be performed. The Salsa20 spec assumes 20 rounds:

/// 20-rounds (Salsa20/20)
#[derive(Copy, Clone)]
pub struct R20;

impl Rounds for R20 {
    const COUNT: usize = 20;
}

However, there are also reduced-round variants that perform 8 and 12 rounds, respectively:

Each number of rounds has a corresponding struct whose names begins with the letter R. For instance, a Core<R20> value represents a 20-round Salsa20 cipher. Here is the typical use case for a Core value:

A Core value is created using the new function:
We’ll omit the implementation for now. This function takes a secret Key value and a unique Nonce value and uses them to produce the initial state in the Core value.
After creating a Core value, the counter_setup and rounds functions are used to produce the Salsa20 keystream:
We’ll have more to say about these functions later.
The pièce de résistance is the apply_keystream function. This takes a newly created Core value, produces its keystream, and applies it to a message to produce the output:

Our ultimate goal is to verify the apply_keystream function, which is the Rust equivalent of the Salsa20 encryption function described in the spec.

Building `salsa20`

The next step is to build the salsa20 crate. Unlike the examples we have seen up to this point, which have been self-contained Rust files, salsa20 is a cargo-based project. As such, we will need to build it using cargo saw-build, an extension to the cargo package manager that integrates with mir-json. Before you proceed, make sure that you have defined the SAW_RUST_LIBRARY_PATH environment variable as described in .

To build the salsa20 crate, perform the following steps:

Near the end of the build output, you will see a line that looks like this:

This is the location of the MIR JSON file that we will need to provide to SAW. (When we tried it, the hash in the file name was dd0d90f28492b9cb, but it could very well be different on your machine.) Due to how cargo works, the location of this file is in a rather verbose, hard-to-remember location. For this reason, we recommend copying this file to a different path, e.g.,

As a safeguard, we have also checked in a compressed version of this MIR JSON file as . In a pinch, you can extract this archive to obtain a copy of the MIR JSON file, which is approximately 4.6 megabytes when uncompressed.

Getting started with SAW

Now that we’ve built the salsa20 crate, it’s time to start writing some proofs! Let’s start a new code/salsa20/salsa20.saw file as fill it in with the usual preamble:

We are also going to need to make use of the Cryptol implementation of the Salsa20 spec, which is defined in . SAW allows you to import standalone Cryptol .cry files by using the import command:

As an aside, note that we have also checked in a , which contains a complete SAW file. We encourage you not to look at this file for now, since following along with the tutorial is meant to illustrate the “a-ha moments” that one would have in the process of writing the proofs. In you become stuck while following along and absolutely need a hint, however, then this file can help you become unstuck.

Verifying our first `salsa20` function

Now it’s time to start verifying some salsa20 code. But where do we start? It’s tempting to start with apply_keystream, which is our end goal. This is likely going to be counter-productive, however, as apply_keystream is a large function with several moving parts. Throwing SAW at it immediately is likely to cause it to spin forever without making any discernible progress.

For this reason, we will instead take the approach of working from the bottom-up. That is, we will first verify the functions that apply_keystream transitively invokes, and then leverage compositional verification to verify a proof of apply_keystream using overrides. This approach naturally breaks up the problem into smaller pieces that are easier to understand in isolation.

If we look at the implementation of apply_keystream, we see that it invokes the round function, which in turn invokes the quarter_round function:

quarter_round is built on top of the standard library functions and , so we have finally reached the bottom of the call stack. This makes quarter_round a good choice for the first function to verify.

The implementation of the Rust quarter_round function is quite similar to the Cryptol quarterround function in Salsa20.cry:

The Cryptol quarterround function doesn’t have anything like the state argument in the Rust quarter_round function, but let’s not fret about that too much yet. Our SAW spec is going to involve quarterround somehow—we just have to figure out how to make it fit.

Let’s start filling out the SAW spec for quarter_round:

We are going to need some fresh variables for the a, b, c, and d arguments:

We will also need to allocate a reference for the state argument. The reference’s underlying type is STATE_WORDS (16) elements long:

Finally, we will need to pass these arguments to the function:

With that, we have a spec for quarter_round! It’s not very interesting just yet, as we don’t specify what state_ref should point to after the function has returned. But that’s fine for now. When developing a SAW proof, it can be helpful to first write out the “skeleton” of a function spec that only contains the call to mir_execute_func, without any additional preconditions or postconditions. We can add those later after ensuring that the skeleton works as expected.

Let’s check our progress thus far by running this through SAW:

We’ve already run into some type errors. Not too surprising, considering this was our first attempt. The error message contains that STATE_WORDS is unbound. This makes sense if you think about it, as STATE_WORDS is defined in the Rust code, but not in the SAW file itself. Let’s fix that by adding this line to salsa20.saw:

That change fixes the type errors in quarter_round_spec. Hooray! Let’s press on.

Next, we need to add a call to mir_verify. In order to do this, we need to know what the full identifier for the quarter_round function is. Because it is defined in the salsa20 crate and in the core.rs file, so we would expect the identifier to be named salsa20::core::quarter_round:

However, SAW disagrees:

Ugh. This is a consequence of how mir-json disambiguates identifiers. Because there is a separate core crate in the Rust standard libraries, mir-json uses “core#1”, a distinct name, to refer to the core.rs file. You can see this for yourself by digging around in the MIR JSON file, if you’d like. (In a future version of SAW, one will be able to more easily.)

Once we change the identifier:

We can run SAW once more. This time, SAW complains about a different thing:

Here, SAW complains that we have an index out of bounds. Recall that we are indexing into the state array, which is of length 16, using the a/b/c/d arguments. Each of these arguments are of type usize, and because we are declaring these to be symbolic, it is quite possible for each argument to be 16 or greater, which would cause the index into state to be out of bounds.

In practice, however, the only values of a/b/c/d that we will use are less than 16. We can express this fact as a precondition:

That is enough to finally get SAW to verify this very stripped-down version of quarter_round_spec. Some good progress! But we aren’t done yet, as we don’t yet say what happens to the value that state points to after the function returns. This will be a requirement if we want to use quarter_round_spec in compositional verification (and we do want this), so we should address this shortly.

Recall that unlike the Rust quarter_round function, the Cryptol quarterround function doesn’t have a state argument. This is because the Rust function does slightly more than what the Cryptol function does. The Rust function will look up elements of the state array, use them to perform the computations that the Cryptol function does, and then insert the new values back into the state array. To put it another way: the Rust function can be thought of as a wrapper around the Cryptol function that also performs an in-place bulk update of the state array.

In Cryptol, one can look up elements of an array using the (@@) function, and one can perform in-place array updates using the updates function. This translates into a postcondition that looks like this:

What does SAW think of this? Someone surprisingly, SAW finds a counterexample:

Note that in this counterexample, the values of c and d are the same. In the Rust version of the function, each element in state is updated sequentially, so if two of the array indices are the same, then the value that was updated with the first index will later be overwritten by the value at the later index. In the Cryptol version of the function, however, all of the positions in the array are updated simultaneously. This implicitly assumes that all of the array indices are disjoint from each other, an assumption that we are not encoding into quarter_round_spec’s preconditions.

At this point, it can be helpful to observe how the quarter_round function is used in practice. The call sites are found in the rounds function:

Here, we can see that the values of a/b/c/d will only ever be chosen from a set of eight possible options. We can take advantage of this fact to constrain the possible set of values for a/b/c/d. The latest iteration of the quarter_round_spec is now:

Note that:

The indices value is constrained (via a precondition) to be one of the set of values that is chosen in the rounds function. (Note that \/ is the logical-or function in Cryptol.) Each of these are concrete values that are less than STATE_WORDS (16), so we no longer need a precondition stating a <STATE_WORDS / …`.
Because we now reference indices in the preconditions, we have moved its definition up. (Previously, it was defined in the postconditions section.)

With this in place, will SAW verify quarter_round_spec now?

At long last, it succeeds. Hooray! SAW does have to think for a while, however, as this proof takes about 17 seconds to complete. It would be unfortunate to have to wait 17 seconds on every subsequent invocation of SAW, and since we still have other functions to verify, this is a very real possibility. For this reason, it can be helpful to temporarily turn this use of mir_verify into a mir_unsafe_assume_spec:

Once we are done with the entire proof, we can come back and remove this use of mir_unsafe_assume_spec, as we’re only using it as a time-saving measure.

Verifying the `rounds` function

Now that we’ve warmed up, let’s try verifying the rounds function, which is where quarter_round is invoked. Here is the full definition of rounds:

First, rounds performs COUNT rounds on the state argument. After this, it takes each element of self.state and adds it to the corresponding element in state.

Linking back at the Salsa20 spec, we can see that the rounds function is almost an implementation of the Salsa20(x) hash function. The only notable difference is that while the Salsa20(x) hash function converts the results to little-endian form, the rounds function does not. Salsa20.cry implements this part of the spec here:

Where Salsa20 is the hash function, and Salsa20_rounds is the part of the hash function that excludes the little-endian conversions. In other words, Salsa20_rounds precisely captures the behavior of the Rust rounds function.

One aspect of the rounds function that will make verifying it slightly different from verifying quarter_rounds is that rounds is defined in an impl block for the Core struct. This means that the &mut self argument in rounds has the type &mut Core<R>. As such, we will have to look up the Core ADT in SAW using mir_find_adt.

This raises another question, however: when looking up Core<R>, what type should we use to instantiate R? As noted above, our choices are R8, R12, and R20, depending on how many rounds you want. For now, we’ll simply hard-code it so that R is instantiated to be R8, but we will generalize this a bit later.

Alright, enough chatter—time to start writing a proof. First, let’s look up the R8 ADT. This is defined in the salsa20 crate in the rounds.rs file, so its identifier becomes salsa20::rounds::R8:

Next, we need to look up the PhantomData<R8> ADT, which is used in the rounds field of the Core<R8> struct. This is defined in core::marker:

Finally, we must look up Core<R8> itself. Like quarter_round, the Core structis defined insalsa20::core#1`:

Now that we have the necessary prerequisites, let’s write a spec for the rounds function. First, we need to allocate a reference for the self argument:

Next, we need to create symbolic values for the fields of the Core struct, which self_ref will point to. The self.state field will be a fresh array, and the self.rounds field will be a simple, empty struct value:

Finally, putting all of the self values together:

Next, we need a state argument (not to be confused with the self.state field in Core). This is handled much the same as it was in quarter_round_spec:

Lastly, we cap it off with a call to mir_execute_func:

(Again, we’re missing some postconditions describing what self_ref and state_ref point to after the function returns, but we’ll return to that in a bit.)

If we run SAW at this point, we see that everything in rounds_spec typechecks, so we’re off to a good start. Let’s keep going and add a mir_verify call.

Here, we are faced with an interesting question: what is the identifier for rounds::<R8>? The rounds function is defined using generics, so we can’t verify it directly—we must instead verify a particular instantiation of rounds. At present, there isn’t a good way to look up what the identifiers for instantiations of generic functions are (there ), but it turns out that the identifier for rounds::<R8> is this:

Note that we are using quarter_round_ov as a compositional override. Once again, SAW is happy with our work thus far:

Nice. Now let’s go back and fill in the missing postconditions in rounds_spec. In particular, we must declare what happens to both self_ref and state_ref. A closer examination of the code in the Rust rounds function reveals that the self argument is never modified at all, so that part is easy:

The state argument, on the other hand, is modified in-place. This time, our job is made easier by the fact that Salsa20_rounds implements exactly what we need. Because we are instantiating rounds at type R8, we must explicitly state that we are using 8 rounds:

Once again, SAW is happy with our work. We’re on a roll!

Now let’s address the fact that we are hard-coding everything to R8, which is somewhat uncomfortable. We can make things better by allowing the user to specify the number of rounds. The first thing that we will need to change is the r_adt definition, which is response for looking up R8. We want to turn this into a function that, depending on the user input, will look up R8, R12, or R20:

Where str_concat is a SAW function for concatenating strings together:

We also want to parameterize phantom_data_adt and core_adt:

Next, we need to parameterize rounds_spec by the number of rounds. This will require changes in both the preconditions and postconditions. On the preconditions side, we must pass the number of rounds to the relevant functions:

And on the postconditions side, we must pass the number of rounds to the Cryptol Salsa20_rounds function:

Finally, we must adjust the call to rounds_spec in the context of mir_verify so that we pick 8 as the number of rounds:

SAW is happy with this generalization. To demonstrate that we have generalized things correctly, we can also verify the same function at R20 instead of R8:

The only things that we had to change were the identifier and the argument to rounds_spec. Not bad!

Verifying the `counter_setup` function

We’re very nearly at the point of being able to verify apply_keystream. Before we do, however, there is one more function that apply_keystream calls, which we ought to verify first: counter_setup. Thankfully, the implementation of counter_setup is short and sweet:

This updates the elements of the state array at indices 8 and 9 with the lower 32 bits and the upper 32 bits of the counter argument, respecitvely. At a first glance, there doesn’t appear to be any function in Salsa20.cry that directly corresponds to what counter_setup does. This is a bit of a head-scratcher, but the answer to this mystery will become more apparent as we get further along in the proof.

For now, we should take matters into our own hands and write our own Cryptol spec for counter_setup. To do this, we will create a new Cryptol file named Salsa20Extras.cry, which imports from Salsa20.cry:

The Cryptol implementation of counter_setup will need arrays of length STATE_WORDS, so we shall define STATE_WORDS first:

Note that we preceded this definition with the type keyword. In Cryptol, sequence lengths are encoded at the type level, so if we want to use STATE_WORDS at the type level, we must declare it as a type.

Finally, we can write a Cryptol version of counter_setup using our old friend updates to perform a bulk sequence update:

Note that counter is a 64-bit word, but the elements of the state sequence are 32-bit words. As a result, we have to explicitly truncate counter and counter >> 32 to 32-bit words by using the drop function, which drops the first 32 bits from each word.

Returning to salsa20.saw, we must now make use of our new Cryptol file by importing it at the top:

With the counter_setup Cryptol implementation in scope, we can now write a spec for the Rust counter_setup function. There’s not too much to remark on here, as the spec proves relatively straightforward to write:

We can now verify counter_setup against counter_setup_spec at lengths 8 and 20:

That wasn’t so bad. It’s a bit unsatisfying that we had to resort to writing a Cryptol function not found in Salsa20.cry, but go along with this for now—it will become apparent later why this needed to be done.

Verifying the `apply_keystream` function (first attempt)

It’s time. Now that we’ve verified rounds and counter_setup, it’s time to tackle the topmost function in the call stack: apply_keystream:

There aren’t that many lines of code in this function, but there is still quite a bit going on. Let’s walk through apply_keystream in more detail:

The output argument represents the message to encrypt (or decrypt). output is a slice of bytes, so in principle, output can have an arbitrary length. That being said, the first line of apply_keystream’s implementation checks that output’s length is equal to BLOCK_SIZE:
Where BLOCK_SIZE is defined here:
So in practice, output must have exactly 64 elements.
Next, apply_keystream invokes the counter_setup and rounds functions to set up the keystream (the local state variable).
Finally, apply_keystream combines the keystream with output. It does so by chunking output into a sequence of 4 bytes, and then it XOR’s the value of each byte in-place with the corresponding byte from the keystream. This performs little-endian conversions as necessary.

The fact that we are XOR’ing bytes strongly suggests that this is an implementation of the Salsa20 encryption function from the spec. There is an important difference between how the Salsa20 spec defines the encryption function versus how apply_keystream defines it, however. In the Salsa20 spec, encryption is a function of a key, nonce, and a message. apply_keystream, on the other hand, is a function of self’s internal state, a counter, and a message. The two aren’t quite the same, which is makes it somewhat tricky to describe one in terms of the other.

Salsa20.cry defines a straightforward Cryptol port of the Salsa20 encryption function from the spec, named Salsa20_encrypt. Because it takes a key and a nonce as an argument, it’s not immediately clear how we’d tie this back to apply_keystream. But no matter: we can do what we did before and define our own Cryptol version of apply_keystream in Salsa20Extras.cry:

This implementation builds on top of the Cryptol counter_setup and Salsa20_rounds functions, which we used as the reference implementations for the Rust counter_setup and rounds functions, respectively. We also make sure to define a BLOCK_SIZE type alias at the top of the file:

Now let’s write a SAW spec for apply_keystream. Once again, we will need to reference BLOCK_SIZE when talking about the output-related parts of the spec, so make sure to define BLOCK_SIZE at the top of the .saw file:

First, we need to declare all of our arguments, which proceeds as you would expect:

What about the postconditions? We have two mutable references to contend with: self_ref and output_ref. The postcondition for self_ref is fairly straightforward: the only time it is ever modified is when counter_setup is called. This means that after the apply_keystream function has returned, self_ref will point to the results of calling the counter_setup Cryptol function:

output_ref is where the interesting work happenings. After the Rust apply_keystream function has returned, it will point to the results of calling the Cryptol apply_keystream function that we just defined:

Finally, we can put this all together and verify apply_keystream against apply_keystream_spec at lengths 8 and 20:

SAW will successfully verify these. We’ve achieved victory… or have we? Recall that we had to tailor the Cryptol apply_keystream function to specifically match the behavior of the corresponding Rust code. This makes the proof somewhat underwhelming, since the low-level implementation is nearly identical to the high-level spec.

A more impressive proof would require linking apply_keystream to a Cryptol function in the Salsa20.cry file, which was developed independently of the Rust code. As we mentioned before, however, doing so will force us to reconcile the differences in the sorts of arguments that each function takes, as apply_keystream doesn’t take a key or nonce argument. Time to think for a bit.

Verifying the `new_raw` function

At this point, we should ask ourselves: why doesn’t apply_keystream take a key or nonce argument? The reason lies in the fact that the salsa20 crate implements Salsa20 in a stateful way. Specifically, the Core struct stores internal state that is used to compute the keystream to apply when hashing. In order to use this internal state, however, we must first initialize it. The new function that is responsible for this initialization:

Sure enough, this function takes a key and a nonce as an argument! This is a critical point that we overlooked. When using the salsa20 crate, you wouldn’t use the apply_keystream function in isolation. Instead, you would create an initial Core value using new, and then you would invoke apply_keystream. The Salsa20 spec effectively combines both of these operations in is encryption function, whereas the salsa20 splits these two operations into separate functions altogether.

Strictly speaking, we don’t need to verify new in order to verify apply_keystream, as the latter never invokes the former. Still, it will be a useful exercise to verify new, as the insights we gain when doing so will help us write a better version of apply_keystream_spec.

All that being said, we probably to verify new_raw (a lower-level helper function) rather than new itself. This is because the definitions of Key and Nonce are somewhat involved. For instance, Key is defined as:

The full implementation of new_raw is rather long, so we won’t inline the whole thing here. At a high level, it initializes the state array of a Core value by populating each element of the array with various things. Some elements of the array are populated with key, some parts are populated with iv (i.e., the nonce), and other parts are populated with an array named CONSTANTS:

The comment about "expand 32-byte k" is a strong hint that new_raw is implementing a portion of the Salsa20 expansion function from the spec. (No really, the spec literally says to use the exact string "expand 32-byte k"—look it up!) The Salsa20.cry Cryptol file has an implementation of this portion of the expansion function, which is named Salsa20_init:

Note that we were careful to say a portion of the Salsa20 expansion function. There is also a Cryptol implementation of the full expansion function, named Salsa20_expansion:

This calls Salsa20_init followed by Salsa20, the latter of which performs hashing. Importantly, new_raw does not do any hashing on its own, just initialization. For this reason, we want to use Salsa20_init as the reference implementation of new_raw, not Salsa20_expansion.

Alright, time to write a SAW spec. The first part of the spec is straightforward:

As is usually the case, the postconditions are the tricky part. We know that the behavior of new_raw will roughly coincide with the Salsa20_init function, so let’s try that first:

If we attempt to verify this using mir_verify:

SAW complains thusly:

Here, the 2nd tuple field is the nonce_arr in Salsa20_init(key_arr, nonce_arr). And sure enough, Salsa20_init expects the 2nd tuple field to be a sequence of 16 elements, but nonce_arr only has 8 elements. Where do we get the remaining 8 elements from?

The answer to this question can be found by looking at the implementation of new_raw more closely. Let’s start at this code:

This will chunk up iv (the nonce) into two 4-byte chunks and copies them over to the elements of state array at indices 6 and 7. This is immediately followed by two updates at indices 8 and 9, which are updated to be 0:

If you take the two 4-bytes chunks of iv and put two 4-byte 0 values after them, then you would have a total of 16 bytes. This suggests that the nonce value that Salsa20_init expects is actually this:

Where zero : [8][8] is a Cryptol expression that returns all zeroes, and (#) is the Cryptol operator for concatenating two sequences together. Let’s update new_raw_spec to reflect this:

This is closer to what we want, but not quite. SAW still complains:

This is because Salsa20_init returns something of type [64][8], which corresponds to the Rust type [u8; 64]. self.state, on the other hand, is of type [u32; 16]. These types are very close, as they both contain the same number of bytes, but they are chunked up differently. Recall the code that copies the nonce value over to self.state:

Note that [64][8] is the Cryptol equivalent of [u8; 64], and [16][32] is the Cryptol equivalent of [u32; 16]. As such, this is exactly the function that we need to resolve the differences in types:

With that change, SAW is finally happy with new_raw_spec and successfully verifies it.

There is an interesting connection between the new_raw and counter_setup functions. Both functions perform in-place updates on state at indices 8 and 9. Whereas new_raw always sets these elements of state to 0, counter_setup will set them to the bits of the counter argument (after converting counter to little-endian form). This means that if you invoke counter_setup right after new_raw, then counter_setup would overwrite the 0 values with the counter argument. In order words, it would be tantamount to initializing state like so:

Where littleendian_inverse (a sibling of littleendian_state) converts a [64] value to a [8][8] one. This pattern is a curious one…

Verifying the `apply_keystream` function (second attempt)

Let’s now return to the problem of linking apply_keystream up to Salsa20_encrypt. In particular, let’s take a closer look at the definition of Salsa20_encrypt itself:

Does anything about this definition strike you as interesting? Take a look at the v#(littleendian_inverse i) part—we just saw a use of littleendian_inverse earlier in our discussion about initializing the state! Moreover, v is the nonce argument, so it is becoming clearer that Sals20_encrypt is creating an initial state is much the same way that new_raw is.

A related question: what is the i value? The answer is somewhat technical: the Salsa20 encryption function is designed to work with messages with differing numbers of bytes (up to 2^^70 bytes, to be exact). Each 8-byte chunk in the message will be encrypted with a slightly difference nonce. For instance, the first 8-byte chunk’s nonce will have its lower 32 bits set to 0, the second 8-byte chunk’s nonce will have its lower 32 bits set to 1, and so on. In general, the ith 8-byte chunk’s nonce will have its lower 32 bits set to i, and this corresponds exactly to the i in the expression littleendian_inverse i.

Note, however, that apply_keystream only ever uses a message that consists of exactly eight 8-byte chunks. This means that Salsa20_encrypt will only ever invoke Salsa20_expansion once with a nonce value where the lower 32 bits are set to 0. That is, it will perform encryption with an initial state derived from:

Which can be further simplified to Salsa20_init(k, v # zero). This is very nearly what we want, as this gives us the behavior of the Rust new_raw function. There’s just one problem though: it doesn’t take the behavior of counter_setup into account. How do we go from zero to littleendian_inverse counter?

While Salsa20_encrypt doesn’t take counters into account at all, it is not too difficult to generalize Salsa20_encrypt in this way. There is a variant of Salsa20_encrypt in the same file named Salsa20_encrypt_with_offset, where the offset argument o serves the same role that counter does in counter_setup:

(Observe that Salsa20_encrypt(count, k, v, m) is equivalent to Salsa20_encrypt_with_offset(count, k, v, 0, m).)

At long last, we have discovered the connection between apply_keystream and the Salsa20 spec. If you assume that you invoke new_raw beforehand, then the behavior of apply_keystream corresponds exactly to that of Salsa20_encrypt_with_offset. This insight will inform us how to write an alternative SAW spec for apply_keystream:

Observe the following differences between apply_keystream_alt_spec and our earlier apply_keystream_spec:

In apply_keystream_alt_spec, we declare fresh key and nonce values, which weren’t present at all in apply_keystream_spec.
In apply_keystream_alt_spec, we no longer make self_state a fresh, unconstrained value. Instead, we declare that it must be the result of calling Salsa20_init on the key, nonce, and counter values. This is the part that encodes the assumption that new_raw was invoked beforehand.

The parts of the spec relating to output remain unchanged:

The postconditions are slightly different in apply_keystream_alt_spec. While the parts relating to self_ref remain unchanged, we now have output_ref point to the results of calling Salsa20_encrypt_with_offset:

Tying this all together, we call mir_verify, making sure to use compositional overrides involving counter_setup and rounds:

At long last, it is time to run SAW on this. When we do, we see this:

After this, SAW loops forever. Oh no! While somewhat disheartening, this is a reality of SMT-based verification that we must content with. SMT solvers are extremely powerful, but their performance can sometimes be unpredictable. The task of verifying apply_keystream_alt_spec is just complicated enough that Z3 cannot immediately figure out that the proof is valid, so it resorts to much slower algorithms to solve proof goals.

We could try waiting for Z3 to complete, but we’d be waiting for a long time. It’s not unheard of for SMT solvers to take many hours on especially hard problems, but we don’t have that many hours to spare. We should try a slightly different approach instead.

When confronted with an infinite loop in SAW, there isn’t a one-size-fits-all solution that will cure the problem. Sometimes, it is worth stating your SAW spec in a slightly different way such that the SMT solver can spot patterns that it couldn’t before. Other times, it can be useful to try and break the problem up into smaller functions and use compositional verification to handle the more complicated subfunctions. As we mentioned before, the performance of SMT solvers in unpredictable, and it’s not always obvious what the best solution is.

In this example, however, the problem lies with Z3 itself. As it turns out, Yices (a different SMT solver) can spot the patterns needed to prove apply_keystream_alt_spec immediately. Fortunately, SAW includes support for both Z3 and Yices. In order to switch from Z3 to Yices, swap out the z3 proof script with yices:

After doing this, SAW is leverage Yices to solve the proof goals almost immediately:

And with that, we’re finally done! You’ve successfully completed a non-trivial SAW exercise in writing some interesting proofs. Give yourself a well-deserved pat on the back.

The process of developing these proofs was bumpy at times, but that is to be expected. You very rarely get a proof correct on the very first try, and when SAW doesn’t accept your proof, it is important to be able to figure out what went wrong and how to fix it. This is a skill that takes some time to grow, but with enough time and experience, you will be able to recognize common pitfalls. This case study showed off some of these pitfalls, but there are likely others.

A final word

Tutorial (LLVM/Java)

Introduction

SAWScript is a special-purpose programming language developed by Galois to help orchestrate and track the results of the large collection of proof tools necessary for analysis and verification of complex software artifacts.

The language adopts the functional paradigm, and largely follows the structure of many other typed functional languages, with some special features specifically targeted at the coordination of verification and analysis tasks.

This tutorial introduces the details of the language by walking through several examples, and showing how simple verification tasks can be described. The complete examples are available on GitHub. Most of the examples make use of inline specifications written in Cryptol, a language originally designed for high-level descriptions of cryptographic algorithms. For readers unfamiliar with Cryptol, various documents describing its use are available here.

Example: Find First Set

As a first example, we consider showing the equivalence of several quite different implementations of the POSIX ffs function, which identifies the position of the first 1 bit in a word. The function takes an integer as input, treated as a vector of bits, and returns another integer which indicates the index of the first bit set (zero being the least significant). This function can be implemented in several ways with different performance and code clarity tradeoffs, and we would like to show those different implementations are equivalent.

Reference Implementation

One simple implementation takes the form of a loop with an index initialized to zero, and a mask initialized to have the least significant bit set. On each iteration, we increment the index, and shift the mask to the left. Then we can use a bitwise “and” operation to test the bit at the index indicated by the index variable. The following C code (which is also in the ffs.c file on GitHub) uses this approach.

uint32_t ffs_ref(uint32_t word) {
    int i = 0;
    if(!word)
        return 0;
    for(int cnt = 0; cnt < 32; cnt++)
        if(((1 << i++) & word) != 0)
            return i;
    return 0; // notreached
}

This implementation is relatively straightforward, and a proficient C programmer would probably have little difficulty believing its correctness. However, the number of branches taken during execution could be as many as 32, depending on the input value. It’s possible to implement the same algorithm with significantly fewer branches, and no backward branches.

Alternative Implementations

An alternative implementation, taken by the following function (also in ffs.c), treats the bits of the input word in chunks, allowing sequences of zero bits to be skipped over more quickly.

uint32_t ffs_imp(uint32_t i) {
    char n = 1;
    if (!(i & 0xffff)) { n += 16; i >>= 16; }
    if (!(i & 0x00ff)) { n += 8;  i >>= 8; }
    if (!(i & 0x000f)) { n += 4;  i >>= 4; }
    if (!(i & 0x0003)) { n += 2;  i >>= 2; }
    return (i) ? (n+((i+1) & 0x01)) : 0;
}

Another optimized version, in the following rather mysterious program (also in ffs.c), based on the ffs implementation in musl libc.

uint32_t ffs_musl (uint32_t x)
{
  static const char debruijn32[32] = {
    0, 1, 23, 2, 29, 24, 19, 3, 30, 27, 25, 11, 20, 8, 4, 13,
    31, 22, 28, 18, 26, 10, 7, 12, 21, 17, 9, 6, 16, 5, 15, 14
  };
  return x ? debruijn32[(x&-x)*0x076be629 >> 27]+1 : 0;
}

These optimized versions are much less obvious than the reference implementation. They might be faster, but how do we gain confidence that they calculate the same results as the reference implementation?

Finally, consider a buggy implementation which is correct on all but one possible input (also in ffs.c). Although contrived, this program represents a case where traditional testing – as opposed to verification – is unlikely to be helpful.

uint32_t ffs_bug(uint32_t word) {
    // injected bug:
    if(word == 0x101010) return 4; // instead of 5
    return ffs_ref(word);
}

SAWScript allows us to state these problems concisely, and to quickly and automatically 1) prove the equivalence of the reference and optimized implementations on all possible inputs, and 2) find an input exhibiting the bug in the third version.

Generating LLVM Code

SAW can analyze LLVM code, but most programs are originally written in a higher-level language such as C, as in our example. Therefore, the C code must be translated to LLVM, using something like the following command:

> clang -g -c -emit-llvm -o ffs.bc ffs.c

The -g flag instructs clang to include debugging information, which is useful in SAW to refer to variables and struct fields using the same names as in C. We have tested SAW successfully with versions of clang from 3.6 to 7.0. Please report it as a bug on GitHub if SAW fails to parse any LLVM bitcode file.

This command, and following command examples in this tutorial, can be run from the code directory on GitHub. A Makefile also exists in that directory, providing quick shortcuts for tasks like this. For instance, we can get the same effect as the previous command by running:

> make ffs.bc

Equivalence Proof

We now show how to use SAWScript to prove the equivalence of the reference and implementation versions of the FFS algorithm, and exhibit the bug in the buggy implementation.

A SAWScript program is typically structured as a sequence of commands, potentially along with definitions of functions that abstract over commonly-used combinations of commands.

The following script (in ffs_llvm.saw) is sufficient to automatically prove the equivalence of ffs_ref with ffs_imp and ffs_musl, and identify the bug in ffs_bug.

set_base 16;

print "Extracting reference term: ffs_ref";
l <- llvm_load_module "ffs.bc";
ffs_ref <- llvm_extract l "ffs_ref";

print "Extracting implementation term: ffs_imp";
ffs_imp <- llvm_extract l "ffs_imp";

print "Extracting implementation term: ffs_musl";
ffs_musl <- llvm_extract l "ffs_musl";

print "Extracting buggy term: ffs_bug";
ffs_bug <- llvm_extract l "ffs_bug";

print "Proving equivalence: ffs_ref == ffs_imp";
let thm1 = {{ \x -> ffs_ref x == ffs_imp x }};
result <- prove abc thm1;
print result;

print "Proving equivalence: ffs_ref == ffs_musl";
let thm2 = {{ \x -> ffs_ref x == ffs_musl x }};
result <- prove abc thm2;
print result;

print "Finding bug via sat search: ffs_ref != ffs_bug";
let thm3 = {{ \x -> ffs_ref x != ffs_bug x }};
result <- sat abc thm3;
print result;

print "Finding bug via failed proof: ffs_ref == ffs_bug";
let thm4 = {{ \x -> ffs_ref x == ffs_bug x }};
result <- prove abc thm4;
print result;

print "Done.";

In this script, the print commands simply display text for the user. The llvm_extract command instructs the SAWScript interpreter to perform symbolic simulation of the given C function (e.g., ffs_ref) from a given bitcode file (e.g., ffs.bc), and return a term representing the semantics of the function.

The let statement then constructs a new term corresponding to the assertion of equality between two existing terms. Arbitrary Cryptol expressions can be embedded within SAWScript; to distinguish Cryptol code from SAWScript commands, the Cryptol code is placed within double brackets {{ and }}.

The prove command can verify the validity of such an assertion, or produce a counter-example that invalidates it. The abc parameter indicates what theorem prover to use; SAWScript offers support for many other SAT and SMT solvers as well as user definable simplification tactics.

Similarly, the sat command works in the opposite direction to prove. It attempts to find a value for which the given assertion is true, and fails if there is no such value.

If the saw executable is in your PATH, you can run the script above with

> saw ffs_llvm.saw

producing the output

Loading file "ffs_llvm.saw"
Extracting reference term: ffs_ref
Extracting implementation term: ffs_imp
Extracting implementation term: ffs_musl
Extracting buggy term: ffs_bug
Proving equivalence: ffs_ref == ffs_imp
Valid
Proving equivalence: ffs_ref == ffs_musl
Valid
Finding bug via sat search: ffs_ref != ffs_bug
Sat: [x = 0x101010]
Finding bug via failed proof: ffs_ref == ffs_bug
prove: 1 unsolved subgoal(s)
Invalid: [x = 0x101010]
Done.

Note that both explicitly searching for an input exhibiting the bug (with sat) and attempting to prove the false equivalence (with prove) exhibit the bug. Symmetrically, we could use sat to prove the equivalence of ffs_ref and ffs_imp, by checking that the corresponding disequality is unsatisfiable. Indeed, this exactly what happens behind the scenes: prove abc <goal> is essentially not (sat abc (not <goal>)).

Cross-Language Proofs

We can implement the FFS algorithm in Java with code almost identical to the C version.

The reference version (in FFS.java) uses a loop, like the C version:

    static int ffs_ref(int word) {
        int i = 0;
        if(word == 0)
            return 0;
        for(int cnt = 0; cnt < 32; cnt++)
            if(((1 << i++) & word) != 0)
                return i;
        return 0;
    }

And the efficient implementation uses a fixed sequence of masking and shifting operations:

    static int ffs_imp(int i) {
        byte n = 1;
        if ((i & 0xffff) == 0) { n += 16; i >>= 16; }
        if ((i & 0x00ff) == 0) { n +=  8; i >>=  8; }
        if ((i & 0x000f) == 0) { n +=  4; i >>=  4; }
        if ((i & 0x0003) == 0) { n +=  2; i >>=  2; }
        if (i != 0) { return (n+((i+1) & 0x01)); } else { return 0; }
    }

Although in this case we can look at the C and Java code and see that they perform almost identical operations, the low-level operators available in C and Java do differ somewhat. Therefore, it would be nice to be able to gain confidence that they do, indeed, perform the same operation.

We can do this with a process very similar to that used to compare the reference and implementation versions of the algorithm in a single language.

First, we compile the Java code to a JVM class file.

> javac -g FFS.java

Like with clang, the -g flag instructs javac to include debugging information, which can be useful to preserve variable names.

Using saw with Java code requires a command-line option -b that locates Java. Run the code in this section with the command:

> saw -b <path to directory where Java lives> ffs_compare.saw

Alternatively, if Java is located on your PATH, you can omit the -b option entirely.

Both Oracle JDK and OpenJDK versions 6 through 8 work well with SAW. SAW also includes experimental support for JDK 9 and later. Code that only involves primitive data types (such as FFS.java above) is known to work well under JDK 9+, but there are some as-of-yet unresolved issues in verifying code involving classes such as String. For more information on these issues, refer to this GitHub issue.

Now we can do the proof both within and across languages (from ffs_compare.saw):

import "ffs.cry";
j <- java_load_class "FFS";
java_ffs_ref <- jvm_extract j "ffs_ref";
java_ffs_imp <- jvm_extract j "ffs_imp";

l <- llvm_load_module "ffs.bc";
c_ffs_ref <- llvm_extract l "ffs_ref";
c_ffs_imp <- llvm_extract l "ffs_imp";

print "java ref <-> java imp";
let thm1 = {{ \x -> java_ffs_ref x == java_ffs_imp x }};
prove_print abc thm1;

print "c ref <-> c imp";
let thm2 = {{ \x -> c_ffs_ref x == c_ffs_imp x }};
prove_print abc thm2;

print "java imp <-> c imp";
let thm3 = {{ \x -> java_ffs_imp x == c_ffs_imp x }};
prove_print abc thm3;

print "cryptol imp <-> c imp";
let thm4 = {{ \x -> ffs_imp x == c_ffs_imp x }};
prove_print abc thm4;

print "cryptol imp <-> cryptol ref";
let thm5 = {{ \x -> ffs_imp x == ffs_ref x }};
prove_print abc thm5;

print "Done.";

Here, the jvm_extract function works like llvm_extract, but on a Java class and method name. The prove_print command works similarly to the prove followed by print combination used for the LLVM example above.

Using SMT-Lib Solvers

The examples presented so far have used the internal proof system provided by SAWScript, based primarily on a version of the ABC tool from UC Berkeley linked into the saw executable. However, there is internal support for other proof tools – such as Yices and Z3 as illustrated in the next example – and more general support for exporting models representing theorems as goals in the SMT-Lib language. These exported goals can then be solved using an external SMT solver.

Consider the following C file:

int double_ref(int x) {
    return x * 2;
}

int double_imp(int x) {
    return x << 1;
}

In this trivial example, an integer can be doubled either using multiplication or shifting. The following SAWScript program (in double.saw) verifies that the two are equivalent using both internal Yices and Z3 modes, and by exporting an SMT-Lib theorem to be checked later, by an external SAT solver.

l <- llvm_load_module "double.bc";
double_imp <- llvm_extract l "double_imp";
double_ref <- llvm_extract l "double_ref";
let thm = {{ \x -> double_ref x == double_imp x }};

r <- prove yices thm;
print r;

r <- prove z3 thm;
print r;

let thm_neg = {{ \x -> ~(thm x) }};
write_smtlib2 "double.smt2" thm_neg;

print "Done.";

The new primitives introduced here are the tilde operator, ~, which constructs the logical negation of a term, and write_smtlib2, which writes a term as a proof obligation in SMT-Lib version 2 format. Because SMT solvers are satisfiability solvers, their default behavior is to treat free variables as existentially quantified. By negating the input term, we can instead treat the free variables as universally quantified: a result of “unsatisfiable” from the solver indicates that the original term (before negation) is a valid theorem. The prove primitive does this automatically, but for flexibility the write_smtlib2 primitive passes the given term through unchanged, because it might be used for either satisfiability or validity checking.

The SMT-Lib export capabilities in SAWScript make use of the Haskell SBV package, and support ABC, Boolector, CVC4, CVC5, MathSAT, Yices, and Z3.

External SAT Solvers

In addition to the abc, z3, and yices proof tactics used above, SAWScript can also invoke arbitrary external SAT solvers that read CNF files and produce results according to the SAT competition input and output conventions, using the external_cnf_solver tactic. For example, you can use PicoSAT to prove the theorem thm from the last example, with the following commands:

let picosat = external_cnf_solver "picosat" ["%f"];
prove_print picosat thm;

The use of let is simply a convenient abbreviation. The following would be equivalent:

prove_print (external_cnf_solver "picosat" ["%f"]) thm;

The first argument to external_cnf_solver is the name of the executable. It can be a fully-qualified name, or simply the bare executable name if it’s in your PATH. The second argument is an array of command-line arguments to the solver. Any occurrence of %f is replaced with the name of the temporary file containing the CNF representation of the term you’re proving.

The external_cnf_solver tactic is based on the same underlying infrastructure as the abc tactic, and is generally capable of proving the same variety of theorems.

To write a CNF file without immediately invoking a solver, use the offline_cnf tactic, or the write_cnf top-level command.

Compositional Proofs

The examples shown so far treat programs as monolithic entities. A Java method or C function, along with all of its callees, is translated into a single mathematical model. SAWScript also has support for more compositional proofs, as well as proofs about functions that use heap data structures.

Compositional Imperative Proofs

As a simple example of compositional reasoning on imperative programs, consider the following Java code.

class Add {
    public static int add(int x, int y) {
        return x + y;
    }

    public static int dbl(int x) {
        return add(x, x);
    }
}

Here, the add function computes the sum of its arguments. The dbl function then calls add to double its argument. While it would be easy to prove that dbl doubles its argument by following the call to add, it’s also possible in SAWScript to prove something about add first, and then use the results of that proof in the proof of dbl, as in the following SAWScript code (java_add.saw on GitHub).

let add_spec = do {
    x <- jvm_fresh_var "x" java_int;
    y <- jvm_fresh_var "y" java_int;
    jvm_execute_func [jvm_term x, jvm_term y];
    jvm_return (jvm_term {{ x + y }});
};

let dbl_spec = do {
    x <- jvm_fresh_var "x" java_int;
    jvm_execute_func [jvm_term x];
    jvm_return (jvm_term {{ x + x }});
};

cls <- java_load_class "Add";
ms <- jvm_verify cls "add" [] false add_spec abc;
ms' <- jvm_verify cls "dbl" [ms] false dbl_spec abc;
print "Done.";

This can be run as follows:

> saw -b <path to directory where Java lives> java_add.saw

In this example, the definitions of add_spec and dbl_spec provide extra information about how to configure the symbolic simulator when analyzing Java code. In this case, the setup blocks provide explicit descriptions of the implicit configuration used by jvm_extract (used in the earlier Java FFS example and in the next section). The jvm_fresh_var commands instruct the simulator to create fresh symbolic inputs to correspond to the Java variables x and y. Then, the jvm_return commands indicate the expected return value of the each method, in terms of existing models (which can be written inline). Because Java methods can operate on references, as well, which do not exist in Cryptol, Cryptol expressions must be translated to JVM values with jvm_term.

To make use of these setup blocks, the jvm_verify command analyzes the method corresponding to the class and method name provided, using the setup block passed in as a parameter. It then returns an object that describes the proof it has just performed. This object can be passed into later instances of jvm_verify to indicate that calls to the analyzed method do not need to be followed, and the previous proof about that method can be used instead of re-analyzing it.

Interactive Interpreter

The examples so far have used SAWScript in batch mode on complete script files. It also has an interactive Read-Eval-Print Loop (REPL) which can be convenient for experimentation. To start the REPL, run SAWScript with no arguments:

> saw

The REPL can evaluate any command that would appear at the top level of a standalone script, or in the main function, as well as a few special commands that start with a colon:

:env     display the current sawscript environment
:type    check the type of an expression
:browse  display the current environment
:eval    evaluate an expression and print the result
:?       display a brief description about a built-in operator
:help    display a brief description about a built-in operator
:quit    exit the REPL
:load    load a module
:add     load an additional module
:cd      set the current working directory

As an example of the sort of interactive use that the REPL allows, consider the file code/NQueens.cry, which provides a Cryptol specification of the problem of placing a specific number of queens on a chess board in such a way that none of them threaten any of the others.

all : {n, a} (fin n) => (a -> Bit, [n]a) -> Bit
all (f, xs) = [ f x | x <- xs ] == ~zero

contains xs e = [ x == e | x <- xs ] != zero

distinct : {n,a} (fin n, Cmp a) => [n]a -> Bit
distinct xs =
    [ if n1 < n2 then x != y else True
    | (x,n1) <- numXs , (y,n2) <- numXs
    ] == ~zero
  where
  numXs = [ (x,n) | x <- xs | n <- [ (0:[width n]) ... ] ]

type Position n = [width (n - 1)]

type Board n = [n](Position n)

type Solution n = Board n -> Bit

checkDiag : {n} (fin n, n >= 1) => Board n -> (Position n, Position n) -> Bit
checkDiag qs (i, j) = (i >= j) || (diffR != diffC)
  where   qi = qs @ i
          qj = qs @ j
          diffR = if qi >= qj then qi-qj else qj-qi
          diffC = j - i                   // we know i < j

nQueens : {n} (fin n, n >= 1) => Solution n
nQueens qs = all (inRange qs, qs) && all (checkDiag qs, ijs `{n}) && distinct qs

ijs : {n}(fin n, n>= 1)=> [_](Position n, Position n)
ijs = [ (i, j) | i <- [0 .. (n-1)], j <- [0 .. (n-1)]]

inRange : {n} (fin n, n >= 1) => Board n -> Position n -> Bit
inRange qs x = x <= `(n - 1)

property nQueensProve x = (nQueens x) == False

This example gives us the opportunity to use the satisfiability checking capabilities of SAWScript on a problem other than equivalence verification.

First, we can load a model of the nQueens term from the Cryptol file.

sawscript> m <- cryptol_load "NQueens.cry"
sawscript> let nq8 = {{ m::nQueens`{8} }}

Once we’ve extracted this model, we can try it on a specific configuration to see if it satisfies the property that none of the queens threaten any of the others.

sawscript> print {{ nq8 [0,1,2,3,4,5,6,7] }}
False

This particular configuration didn’t work, but we can use the satisfiability checking tools to automatically find one that does.

sawscript> sat_print abc nq8
Sat [qs = [3, 1, 6, 2, 5, 7, 4, 0]]

And, finally, we can double-check that this is indeed a valid solution.

sawscript> print {{ nq8 [3,1,6,2,5,7,4,0] }}
True

Other Examples

The code directory on GitHub contains a few additional examples not mentioned so far. These remaining examples don’t cover significant new material, but help fill in some extra use cases that are similar, but not identical to those already covered.

Java Equivalence Checking

The previous examples showed comparison between two different LLVM implementations, and cross-language comparisons between Cryptol, Java, and LLVM. The script in ffs_java.saw compares two different Java implementations, instead.

print "Extracting reference term";
j <- java_load_class "FFS";
ffs_ref <- jvm_extract j "ffs_ref";

print "Extracting implementation term";
ffs_imp <- jvm_extract j "ffs_imp";

print "Proving equivalence";
let thm1 = {{ \x -> ffs_ref x == ffs_imp x }};
prove_print abc thm1;
print "Done.";

As with previous Java examples, this one needs to be run with the -b flag to tell the interpreter where to find Java:

> saw -b <path to directory where Java lives> ffs_java.saw

AIG Export and Import

Most of the previous examples have used the abc tactic to discharge theorems. This tactic works by translating the given term to And-Inverter Graph (AIG) format, transforming the graph in various ways, and then using a SAT solver to complete the proof.

Alternatively, the write_aig command can be used to write an AIG directly to a file, in AIGER format, for later processing by external tools, as shown in code/ffs_gen_aig.saw.

cls <- java_load_class "FFS";
bc <- llvm_load_module "ffs.bc";
java_ffs_ref <- jvm_extract cls "ffs_ref";
java_ffs_imp <- jvm_extract cls "ffs_imp";
c_ffs_ref <- llvm_extract bc "ffs_ref";
c_ffs_imp <- llvm_extract bc "ffs_imp";
write_aig "java_ffs_ref.aig" java_ffs_ref;
write_aig "java_ffs_imp.aig" java_ffs_imp;
write_aig "c_ffs_ref.aig" c_ffs_ref;
write_aig "c_ffs_imp.aig" c_ffs_imp;
print "Done.";

Conversely, the read_aig command can construct an internal term from an existing AIG file, as shown in ffs_compare_aig.saw.

java_ffs_ref <- read_aig "java_ffs_ref.aig";
java_ffs_imp <- read_aig "java_ffs_imp.aig";
c_ffs_ref <- read_aig "c_ffs_ref.aig";
c_ffs_imp <- read_aig "c_ffs_imp.aig";

let thm1 = {{ \x -> java_ffs_ref x == java_ffs_imp x }};
prove_print abc thm1;

let thm2 = {{ \x -> c_ffs_ref x == c_ffs_imp x }};
prove_print abc thm2;

print "Done.";

We can use external AIGs to verify the equivalence as follows, generating the AIGs with the first script and comparing them with the second:

> saw -b <path to directory where Java lives> ffs_gen_aig.saw
> saw ffs_compare_aig.saw

Files in AIGER format can be produced and processed by several external tools, including ABC, Cryptol version 1, and various hardware synthesis and verification systems.

Tutorials and Publications

Tutorials

Publications

Program Verification with SAW

Contents:

Getting Started

Background

Notation

Exercises: Initial Setup

SAW

Cryptol

LLVM and Clang

Troubleshooting / Installation Alternatives

Using Vagrant to Install and Use SAW

Specifications and Verification

First Example: Counting Set Bits

The Code

Exercise: A Safe and a Broken pop_count

Testing Programs

Exercise: Testing popcount

Symbolic Execution

Running SAW

Exercise: Verifying Clever Versions of popcount

Exercise: Verifying Your pop_count Implementations

Memory Layouts and Pointers

Specifying Memory Layout

Cryptol

Exercises: Getting Started with SAW and Pointers

Exercise: Unsigned Arithmetic

Exercise: Alternative Implementations

Exercise: Swapping and Rotating

Exercise: Arrays

Compositional Verification and Salsa20

Salsa20 Verification Overview

A Cryptol Specification

SAW Specification and Verification

Comparing Compositional and Non-compositional Verification

Exercise: Rot13

Extended Exercise: HMAC Maintenance

Proof Maintenance Exercises: s2n HMAC

Background: The Updates to the Implementation

Exercise: Update the Cryptol Specification

Exercise: Update the SAW Specifications

Example Solution: HMAC Maintenance

Proof Maintenance Exercises: Solutions

Updating the Cryptol Specification

Updating the SAW Specifications

Further Reading

Specification with Cryptol

Verification with SAW

Glossary

compositional verification

Cryptol

proof maintenance

SAWCore

SAWScript

SetupValue

specification

symbolic execution

symbolic value

testing

Term

verification

Memory Layouts and Pointers

Specifying Memory Layout

Cryptol

Exercises: Getting Started with SAW and Pointers

Exercise: Unsigned Arithmetic

Exercise: Alternative Implementations

Exercise: Swapping and Rotating

Exercise: Arrays

Tutorials and Publications

Tutorials

Publications

Specifications and Verification

First Example: Counting Set Bits

The Code

Exercise: A Safe and a Broken pop_count

Testing Programs

Exercise: Testing popcount

Exercise: A Safe and a Broken `pop_count`

Exercise: Testing `popcount`

Exercise: Verifying Clever Versions of `popcount`

Exercise: Verifying Your `pop_count` Implementations

Exercise: A Safe and a Broken `pop_count`

Exercise: Testing `popcount`

Exercise: Verifying Clever Versions of `popcount`

Exercise: Verifying Your `pop_count` Implementations

About `mir-json`

A first example with `saw-rustc`

The `SAW_RUST_LIBRARY_PATH` environment variable

`Term`s and other types