This three-part series will show you how to take your fuzzing targets beyond memory errors and crashes, to finding correctness and even efficiency issues, using property-based fuzzing. This technique is especially useful for memory-safe languages like Rust and Go.
Testing has become an integral part of modern software development. Tests are critical for correct and safe programs. However, few developers see writing tests as an exciting prospect. Writing tests can be burdensome and repetitive work.
For instance, let's look at the tests for a function that returns the longest common prefix of two strings:
This Go code was inspired by a test in Guava, Google's core library for Java. It is an example of what we call example-based testing, the most common way to write unit tests. An example-based test runs a function on example inputs and checks that the output matches the expected result for each one. The examples are created by the developer, who tries to think of the different ways the code could be used and the tricky edge cases that might break it. You could probably think of many more examples, some of which might exercise the function in somewhat different ways. However, coming up with examples is boring and repetitive work.
Property-based testing automates that work for us. Test automation allows us to generate better tests with less work and less code, so that we can focus our effort on the less mechanical work developers excel at.
Instead of writing tests with manually created examples, a property-based test defines the types of inputs it needs. In the example above with CommonPrefix, the input would be two strings. The property-based test framework will generate hundreds if not thousands of examples and feed them to your test function.
In pseudo code, a property-based test roughly looks like the following:
Not only is this test shorter, it also covers more cases: hundreds of thousands of them, in fact. The property-based test is parameterized: it takes two strings as arguments. As such, it provides a test template that can run on any two strings, instead of a pre-defined set of hard-coded examples. The property-based testing framework will repeatedly run this test on automatically generated inputs for you. A good framework will generate known pathological inputs that often lead to failures: empty strings, valid and invalid UTF-8 strings, non-printable characters, ...
A property-based test also checks outputs differently. While a unit test checks that the output is identical to a pre-computed expected result, a property-based test checks properties. It cannot rely on a pre-computed set of expected results, since it may run on millions of arbitrary inputs. This forces us to think about what defines correctness (and safety). In the example above, we check two properties of the output:
- The two input strings (s1 and s2) should start with the prefix.
- The prefix is the longest common prefix.
Coming up with those properties requires more thought than computing the expected result for one input. Good property-based testing is a learned skill. Luckily, even a beginner can get value out of property-based testing. The last section of this article will help you get started by giving you some common properties to check.
A generator is a function that returns an instance of a given type from a source of randomness. In the previous section, we wrote a parameterized test that took two strings as arguments. The property-based testing framework automatically detected that the function expected strings and, using one of its default input generators, produced random strings automatically.
Generators are a critical part of a property-based testing framework. Developers do the hard work of coming up with good properties to check, but how effectively those properties are exercised depends entirely on the inputs passed to them. A framework with bad generators will not uncover as many issues.
By default, property-based testing frameworks provide built-in generators for built-in types like strings, integers, floats, ... Sometimes they even provide generators for higher-level types like datetimes. Oftentimes, generators can be configured: a string generator might let you specify the alphabet, or a size restriction.
Eventually, you will want a way to define your own custom generators, either because you have a complex data type to generate in a reusable way, or because you would like to control the input distribution. For instance, if you are testing an XML parser, you might want to skew the distribution towards strings containing < and > characters. This is closely related to structure-aware fuzzing.
What makes a good generator?
Not all generators are equal; some are better than others. In this section, we describe what makes a good generator:
- A generator should be fast. A generator will be executed millions of times when combined with fuzzing. The longer it takes, the fewer inputs we can check.
- A generator should be deterministic, given a source of randomness. This one is a bit counter-intuitive, given that a generator produces random data. It means that a generator should draw all of its randomness from the given source. If it gets random data from somewhere else, it will not be deterministic, and the property-based testing framework will not be able to reliably reproduce failures by feeding the same source of randomness to the generator.
- A generator should not waste randomness, or at least minimize the waste. Imagine a generator having to generate 16 random bits. It could either call random.nextInt() 16 times, or call it once and extract the bits from the integer. The latter is preferable, as less random data from the source of randomness is thrown away. That way, the mutations of the property-based testing framework will be more effective. They do not have to waste time mutating bits that do not affect the output of the generator.
- A generator should cover the code under test, ideally uniformly. A generator's effectiveness is evaluated with respect to a given test or program. One string generator might be good for a given test, but bad for another. It’s generally useful to review the test coverage of a property-based test. If you see a big portion of the code uncovered, that probably means that you should tweak the generator.
In part two, I'll talk about coverage-guided and continuous property-based fuzz testing.