Skip to content

regehr/ub-canaries

Repository files navigation

-------------------------------------------------------------------------------

UB Canaries: A collection of C/C++ programs that detect undefined
behavior exploitation by compilers.

-------------------------------------------------------------------------------

To run all tests, type:

  run-canaries

For a complete list of command line options:

  run-canaries --help

-------------------------------------------------------------------------------

Each directory documents and tests an expectation: something a developer might
-- perhaps unreasonably! -- expect the compiler to do when faced with undefined
behavior. For example, the "uninitialized-variable" directory tests the
expectation that an uninitialized scalar variable will consistently return some
value that is legal for the variable's type.

If any test program in a directory does not display the expected behavior (for a
given compiler version + flags), then that compiler is considered to violate
that expectation.

The toplevel script run-canaries tests the expectations for a
specified collection of compilers and compiler flags. For now, you
need to edit that file to change the set of compilers and flags.

The output of run-canaries is, for each compiler / flag / canary
combination, a bit that tells us whether that compiler has been
observed to exploit that particular UB (in other words, whether it has
been observed violating the expectation). So here, for example, clang
has been seen to exploit signed integer overflows and uninitialized
variables, but not signed left shifts:

clang -O3 signed-left-shift 0
clang -O3 signed-integer-overflow 1
clang -O3 uninitialized-variable 1

-------------------------------------------------------------------------------

Guidelines for writing tests:

- Each test program should be entirely contained (except for standard header
  files) in a single compilation unit.

- An expectation should only be tested by looking at a program's stdout, never
  by looking at its assembly code or observing its memory usage or execution
  time.

- A test program foo.c may have one or more outputs corresponding to the
  "expected" case where the compiler does not exploit that UB. If there is one
  such file it should be called foo.output. If there are multiple files they
  should be called foo.output1, foo.output2, etc. If the actual output does not
  match any of these files, the compiler is assumed to have exploited the UB.

- Every test program must test only a single UB. In other words, each test
  program is written in a dialect of C that is completely standard except that a
  single behavior (signed integer overflow or whatever) is actually defined
  instead of undefined.

- Reliance on implementation-defined behavior is unavoidable, but please avoid
  gratuitous reliance such as assuming a particular size for int or long. It is
  OK to use the fixed-width types such as int32_t.

- A tricky issue to how much to expose to the optimizer and how much to hide.
  There are no particular good rules of thumb that I am aware of, you just have
  to try different things.

- An easier issue is *how* to hide from the optimizer. I suggest introducing a
  dependency on argc or on the value loaded from a volatile. Tests may assume
  that argc == 1.

-------------------------------------------------------------------------------

About

collection of C/C++ programs that try to get compilers to exploit undefined behavior

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published