Earlier this year, I reported a security vulnerability in Rust’s linked-list-allocator to the RustOS Dev team, which was assigned CVE-2022-36086. This post explains how Mayhem helped me identify the vulnerability and outlines some of the techniques used.
The Target
The target of interest for this example comes from Philipp Oppermann's "BlogOS". This blog series is an introduction to both Rust programming and OS development, and it's absolutely fantastic. Please go read it if you haven't yet!
In particular, we will target the linked list allocator used for dynamic memory management. This library is designed for use in embedded and bootloader contexts, where we don't have the standard library's Vec or Box for dynamic memory.
This data structure manages a large, linear array of memory and hands out smaller, dynamically sized chunks of that memory to callers. The main interface looks roughly like this (signatures paraphrased from the crate's documentation; see the docs for the authoritative definitions):
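// Paraphrased from the linked_list_allocator documentation
// (signatures only, bodies omitted):
pub unsafe fn new(heap_bottom: *mut u8, heap_size: usize) -> Heap;
pub fn allocate_first_fit(&mut self, layout: Layout) -> Result<NonNull<u8>, ()>;
pub unsafe fn deallocate(&mut self, ptr: NonNull<u8>, layout: Layout);
pub unsafe fn extend(&mut self, by: usize);
pub fn top(&self) -> *mut u8;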
The Harness
The main idea with this harness is to initialize the heap object with a random size, then exercise the other functions at random. I will be highlighting the relevant parts of a cargo-fuzz harness in this section, but for the full source, see the full harness.
First, define the API functions we wish to call randomly as an Action enum, and derive the Arbitrary trait so it can be randomly generated from binary inputs:
use arbitrary::Arbitrary;

#[derive(Debug, Arbitrary)]
enum Action {
    // allocate a chunk with the size specified
    Alloc { size: u16, align_bit: u8 },
    // free the pointer at the index specified
    Free { index: u8 },
    // extend the heap by amount specified
    Extend { additional: u16 },
}

use Action::*;
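As a quick aside, the derived Arbitrary implementation is what lets the fuzzer decode raw bytes into Actions. cargo-fuzz does the equivalent of this for us behind the scenes; here is an illustrative sketch (not part of the harness):

use arbitrary::{Arbitrary, Unstructured};

// Decode one Action from raw fuzzer bytes, the way fuzz_target!
// does internally for any Arbitrary input type.
fn decode_action(raw: &[u8]) -> Option<Action> {
    let mut u = Unstructured::new(raw);
    Action::arbitrary(&mut u).ok()
}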
Next, specify the input types that this target will accept through the fuzz_target! macro, and then set up some linear memory for the heap to manage:
fuzz_target!(|data: (u16, Vec<Action>)| {
    let (size, actions) = data;
    fuzz(size, actions);
});

const MAX_HEAP_SIZE: usize = 5000;
static mut HEAP_MEM: [u8; MAX_HEAP_SIZE] = [0; MAX_HEAP_SIZE];
Next comes the actual fuzzing logic. Note that many of these functions are unsafe, so we must read the documentation and call them with correct inputs; if we don't, we would end up with false-positive defects being reported.

When fuzzer inputs are unsuitable for the functions being called, it's possible to truncate or take the remainder of those inputs and continue fuzzing. This may give a slight performance increase, but I chose to just return early. Doing this makes cargo fuzz fmt reflect more useful values, and it makes triage much easier.
We start by initializing the heap:
fn fuzz(size: u16, actions: Vec<Action>) {
    // init heap
    let mut heap = unsafe {
        let size = size as usize;
        // the heap needs room for its initial Hole metadata, so reject
        // sizes that are too small (or larger than our backing memory)
        if size > MAX_HEAP_SIZE || size < 3 * core::mem::size_of::<usize>() {
            return;
        }
        Heap::new(HEAP_MEM.as_mut_ptr(), size)
    };
Once the heap has been initialized, we iterate over the actions, and call the relevant APIs from the target:
    // keep track of live allocations here
    let mut ptrs: Vec<(NonNull<u8>, Layout)> = Vec::new();

    // process operations
    for action in actions {
        match action {
            Alloc { size, align_bit } => {
                let layout = {
                    // rotate_left wraps the shift amount, so every u8
                    // maps onto a power-of-two alignment
                    let align = 1_usize.rotate_left(align_bit as u32);
                    // skip the maximum alignment, which makes
                    // Layout::from_size_align fail for nonzero sizes
                    if align == 1 << 63 {
                        return;
                    }
                    Layout::from_size_align(size as usize, align).unwrap()
                };
                if let Ok(ptr) = heap.allocate_first_fit(layout) {
                    ptrs.push((ptr, layout));
                } else {
                    return;
                }
            }
Of course, we still have to abide by the documentation since these functions are marked unsafe. We can't just call Heap::deallocate, for example, with a randomly generated pointer! What we can do, however, is track which pointers have been allocated (with the ptrs Vec), and allow the fuzzer to choose an index to deallocate.
    for action in actions {
        match action {
            // ---------8<---------
            Free { index } => {
                if index as usize >= ptrs.len() {
                    return;
                }
                let (ptr, layout) = ptrs.swap_remove(index as usize);
                unsafe {
                    heap.deallocate(ptr, layout);
                }
            }
            Extend { additional } =>
            // safety: new heap size never exceeds MAX_HEAP_SIZE
            unsafe {
                let remaining_space = HEAP_MEM
                    .as_mut_ptr()
                    .add(MAX_HEAP_SIZE)
                    .offset_from(heap.top());
                assert!(remaining_space >= 0);
                // return if fuzzer wants to extend past what is valid
                if additional as isize > remaining_space {
                    return;
                }
                heap.extend(additional as usize);
            },
        }
    }
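As an aside, the remainder approach mentioned earlier would look something like this in the Free arm (a hypothetical alternative, not what the harness does):

// Hypothetical alternative to returning early: wrap the index so any
// fuzzer byte maps onto a live allocation.
Free { index } => {
    if ptrs.is_empty() {
        return;
    }
    let index = index as usize % ptrs.len();
    let (ptr, layout) = ptrs.swap_remove(index);
    unsafe {
        heap.deallocate(ptr, layout);
    }
}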
After iterating over actions, we can still exercise the deallocate API by deallocating any remaining allocations in ptrs:
    // free the remaining allocations
    for (ptr, layout) in ptrs {
        unsafe {
            heap.deallocate(ptr, layout);
        }
    }
CVE-2022-36086
After fuzzing for only a few seconds, a crash was reported with a stack trace leading to Heap::extend, and ASan reported an out-of-bounds write. At first, I expected to discover something in the documented API that I wasn't following, but no! This was indeed a new bug to triage.
Internally, the allocator stores metadata about a free chunk in a Hole struct:
/// A block containing free memory. It points to the next hole and thus forms a linked list.
pub(crate) struct Hole {
    pub size: usize,
    pub next: Option<NonNull<Hole>>,
}
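On a 64-bit target, this struct occupies exactly two machine words, which matters below. A quick standalone sanity check (my own sketch, not from the crate):

use core::{mem, ptr::NonNull};

struct Hole {
    size: usize,
    next: Option<NonNull<Hole>>,
}

fn main() {
    // Option<NonNull<T>> is pointer-sized thanks to the null niche,
    // so a Hole is two usizes: 16 bytes when usize is 8 bytes.
    assert_eq!(mem::size_of::<Hole>(), 2 * mem::size_of::<usize>());
}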
When Heap::extend was called, a new Hole was written at the current Heap::top address (the highest address the heap spans) and linked into the chain of free Holes. The root issue was that space for the new Hole structure (two usizes) was assumed to be available, but this was neither checked nor documented as a requirement. If a small extension was requested (even if unlikely in practice), memory beyond what was "given" to the Heap would be overwritten.
Here is a minimal reproduction of the issue:
fn out_of_bounds_write() {
    // only give 33 total bytes to the heap
    static mut HEAP: [u8; 33] = [0; 33];
    unsafe {
        // initialize a heap using the first 32 bytes
        let mut heap = Heap::new(HEAP.as_mut_ptr(), 32);
        // extend(1) writes a 16-byte Hole at the old top (offset 32),
        // but only 1 byte of HEAP remains: on 64-bit CPUs this writes
        // 15 bytes past the end of HEAP
        heap.extend(1);
    }
}
The Fix
I came up with a fix that simply delays the Hole placement until enough free space is available. This means arbitrary extensions are possible, with minimal API change for users (just a clarification of how Heap::size and Heap::top are defined).
/// A sorted list of holes. It uses the hole itself to store its nodes.
pub struct HoleList {
    pub(crate) first: Hole,
    pub(crate) bottom: *mut u8,
    pub(crate) top: *mut u8,
+   pub(crate) pending_extend: u8,
}
This internal pending_extend field tracks small extensions until a Hole-sized block is available for use. The top of the Heap now also remains aligned, so any alignment remainders are tracked here as well. Only up to 15 bytes are ever pending, so a u8 is large enough.
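To make that concrete, here is a minimal sketch of the idea, with simplified names and the actual hole insertion elided (this is not the crate's code):

use core::mem;

struct HoleList {
    top: *mut u8,
    pending_extend: u8,
    // ... remaining fields elided
}

impl HoleList {
    /// Sketch: defer placing a Hole until at least one Hole's worth
    /// of bytes (two usizes) has accumulated.
    unsafe fn extend(&mut self, by: usize) {
        let hole_size = 2 * mem::size_of::<usize>();
        let available = self.pending_extend as usize + by;
        if available < hole_size {
            // too small to hold a Hole: remember the bytes for later
            // (never more than hole_size - 1 = 15 bytes, so u8 fits)
            self.pending_extend = available as u8;
            return;
        }
        // enough space: write a Hole at `top` covering the aligned
        // portion, advance `top`, and keep the unaligned remainder
        // pending so `top` stays usize-aligned
        // (the hole insertion itself is elided here)
        self.pending_extend = (available % mem::size_of::<usize>()) as u8;
    }
}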
Patch Confidence
The fuzz harness shown above helped me a ton as a new contributor to the project, especially considering how much unsafe code I was working with! Following a sort of fuzz-driven development, I added assert! checks at the various places where I was making changes. I would then let the fuzzer run for a bit and see if my assumptions held. When I was wrong, a counter-example was immediately available!
During active development, counter-examples never took longer than a few seconds to find, so the dev cycle was pretty fast! Once the fuzzer stabilized on an attempted fix, I would then add individual unit tests with their own assertions. These served as fine-grained checks for specific edge cases I was aware of.
After fuzzing for dozens of CPU hours without finding new coverage, I had fairly high confidence in the patch. I also ran each file in the corpus through Miri to detect undefined behavior. The only UB found was a pointer operation in my harness's Extend action, which has since been fixed.
How Mayhem Keeps You Safe
I used Mayhem here both to identify the vulnerability and to verify that the workaround proposed above and the patched version aren't susceptible to the same weakness. If you're using linked-list-allocator 0.10.1 or below, Mayhem will identify the weakness during testing. As part of this, Mayhem will automatically create the regression test that can be used to confirm the CVE is no longer present after updating to 0.10.2 or above.
More Info
For more information on this vulnerability, and other similar issues discovered during this disclosure, see the GitHub advisory.