Leveraging Fuzz Testing to Achieve ED-203A / DO-356A

David Brumley

June 8, 2021

Aerospace has become a software industry. Software drives every area of flight, including flight control, ground-based systems, communication, weather, maintenance systems, infotainment and more. Like any software-based system, aerospace must continually and proactively find and fix security and safety issues before cyber-attackers can exploit them.

In 2018, the aerospace industry published ED-203A / DO-356A, Airworthiness Security Methods and Considerations, to provide updated guidance on airworthiness cybersecurity, that is, the protection of aircraft from intentional unauthorized electronic interaction. ED-203A and DO-356A are technically identical, consensus-based documents created jointly by a panel of aviation experts through the RTCA and EUROCAE organisations. They provide methods and considerations for showing compliance with the airworthiness security process defined in ED-202A / DO-326A during avionics design and development. The publications were ratified in June 2018 by the RTCA and EUROCAE councils and are widely regarded as the only Acceptable Means of Compliance with FAA and EASA cybersecurity airworthiness requirements.

What is ED-203A / DO-356A?

ED-203A and DO-356A introduce a new term, “refutation”, which describes an independent set of assurance activities beyond typical analysis and requirements verification. The goal of refutation is twofold:

To form part of vulnerability identification
To evaluate the effectiveness of the implemented aircraft security measures

“Refutation” was chosen because it conveys the sense of demonstrating the absence of a theoretical problem, as in refuting an allegation that a vulnerability exists. Even though the term is new, current versions of existing aviation standards and documents (e.g., DO-326A / ED-202A, ED-79A / ARP4754A, DO-178C / ED-12C, DO-254 / ED-80) already include many safety and security assurance activities that qualify as refutation activities. Refutation is also known as Security Evaluation in some contexts.

Why does refutation testing matter?

ED-202A and DO-326A define an Airworthiness Security Process that includes the identification of security assets that, if compromised, could lead to a safety hazard. Security measures are then implemented to protect the identified assets and an evaluation of the security effectiveness is performed via a security risk assessment. Refutation plays a key role in performing an evaluation of the effectiveness of the security measures.

Software can meet its requirements and still not be secure. This is a key concept, but one that is easy for a non-security expert to miss at first. For example, your web browser can meet the requirement that it correctly render images on a website while still being vulnerable to attackers who plant malicious images. Your car can get you from your house to work while still being vulnerable to an attacker disabling the brakes.

Verification activities check requirements, while refutation activities check security. Within aerospace, verification activities are requirements-based, while refutation activities are performed from an attacker perspective. Both verification and refutation are needed because they serve different purposes. Verification activities typically show that a system meets a functional requirement or specification. Refutation activities, on the other hand, show a system does or does not meet a security property by demonstrating the presence/absence of a vulnerability in a test scenario.
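The difference can be illustrated with a toy example. The sketch below is not taken from ED-203A / DO-356A; the frame format and every function name are hypothetical. It shows a parser that passes its requirements-based test yet fails when the input is built from an attacker's perspective.

```python
def parse_frame(frame: bytes) -> bytes:
    """Toy parser for a hypothetical format: [length][payload...][checksum]."""
    length = frame[0]
    payload = frame[1:1 + length]
    checksum = frame[1 + length]  # trusts the declared length
    if sum(payload) % 256 != checksum:
        raise ValueError("bad checksum")
    return payload


def verification_test() -> bool:
    """Requirements-based check: a well-formed frame yields its payload."""
    payload = b"\x01\x02\x03"
    frame = bytes([len(payload)]) + payload + bytes([sum(payload) % 256])
    return parse_frame(frame) == payload


def refutation_test() -> str:
    """Attacker-perspective check: declare more payload than is sent."""
    hostile = b"\x10\x01\x02"  # claims 16 payload bytes, supplies 2
    try:
        parse_frame(hostile)
    except ValueError:
        return "rejected cleanly"
    except IndexError:
        return "unhandled out-of-range access"
    return "accepted"
```

Here `verification_test()` returns `True`, while `refutation_test()` returns `"unhandled out-of-range access"`. In a memory-unsafe language the same mistake could become an out-of-bounds read rather than an exception.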

As noted in ED-203A / DO-356A, most of the vulnerabilities that get published are software vulnerabilities. Even when the overall system architecture and design are acceptably robust against adversaries, the software implementation may bring hidden vulnerabilities that allow bypass of the intended architecture. Traditional functional design and verification is insufficient to detect such vulnerabilities.

How are refutation testing and fuzz testing related?

Fuzz testing, also known as fuzzing, can be applied to satisfy refutation testing objectives for avionics software. Fuzz testing is a type of refutation testing that has long been employed by attackers and cybersecurity professionals to identify vulnerabilities in typical IT and OT systems. The term “fuzz testing” was coined by Prof. Barton Miller in 1990, when his research group fed random inputs to typical UNIX programs to test their reliability. Their first research paper showed that 25-33% of the UNIX utilities tested could be crashed with simple random input. While Prof. Miller coined the term, the idea of randomized testing goes back even earlier, to at least the 1950s.
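A minimal sketch of this style of purely random fuzzing is shown below. It is an illustration only, not the original tooling: the `target` function is a hypothetical stand-in for a program under test, with a simulated defect planted in it, and the iteration count and input length are arbitrary assumptions.

```python
import random


def target(data: bytes) -> None:
    """Hypothetical stand-in for a program under test."""
    table = [0] * 16
    if len(data) >= 2 and data[0] == 0x7F:
        # Simulated defect: the index comes straight from untrusted input.
        table[data[1]] += 1


def random_fuzz(run, iterations: int, seed: int = 0, max_len: int = 8):
    """Feed random byte strings to `run` and collect the inputs that raise."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    crashes = []
    for _ in range(iterations):
        length = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            run(data)
        except Exception as exc:
            # Keep the input itself: it is the evidence of the defect.
            crashes.append((data, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    for data, error in random_fuzz(target, iterations=20_000):
        print(error, data.hex())
```

Each recorded input reproduces the failure when passed to `target` again, which is what makes even this simple approach useful as evidence.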

Fuzz testing has matured enormously since Prof. Miller’s work, and the term “fuzzing” no longer means purely random testing. Modern fuzzing techniques use sophisticated dynamic analysis and formal verification techniques such as symbolic execution, and more than 3,170 publications mentioned the term fuzzing in 2020 alone. In 2016, the US agency DARPA held the “Cyber Grand Challenge” to test whether fully autonomous application security was possible. Every competitive entry, including the winning Mayhem system, built its overall system on fuzzing.
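One way modern fuzzers improve on purely random input is coverage feedback: inputs that reach new code are kept and mutated further. The toy sketch below illustrates that idea only and does not show symbolic execution. The target, its hand-placed coverage probes, and all function names are hypothetical; production tools such as AFL or libFuzzer typically rely on compiler instrumentation rather than manual probes.

```python
import random


def target(data: bytes, cov: set) -> None:
    """Hypothetical target with hand-placed coverage probes."""
    if len(data) > 0 and data[0] == ord("F"):
        cov.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            cov.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                cov.add(3)
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("simulated defect reached")


def mutate(rng: random.Random, data: bytes) -> bytes:
    """Apply one random edit: insert, overwrite, or delete a byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == 1:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def coverage_guided_fuzz(max_iterations: int = 1_000_000, seed: int = 0):
    """Return an input that triggers the defect, or None if none is found."""
    rng = random.Random(seed)
    corpus = [b""]  # inputs worth mutating further
    seen = set()    # coverage probes reached so far
    for _ in range(max_iterations):
        candidate = mutate(rng, rng.choice(corpus))
        cov = set()
        try:
            target(candidate, cov)
        except RuntimeError:
            return candidate
        if cov - seen:  # reached something new: keep this input
            seen |= cov
            corpus.append(candidate)
    return None
```

Purely random four-byte inputs would match the nested checks about once in 256^4 attempts. Keeping each partial match in the corpus lets the search advance one byte at a time instead.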

Fuzzing has also shifted from ad-hoc, post-development analysis to a key component of software development. For example, Microsoft includes fuzzing in its Security Development Lifecycle (SDL), and Google uses fuzzing on all components of the Chrome web browser. Teams at Google report that 80% of all bugs are found via fuzzing, that up to 98.6% of bugs found by fuzzing are fixed, and that fuzzing prevented 40% more regressions, that is, bugs introduced when a new commit breaks previously working code.

How to map fuzz testing to ED-203A / DO-356A

Avionics need higher reliability than typical software. Standards dictate that avionics that have been identified as safety critical, security critical, or mission critical have all vulnerabilities fixed or mitigated in order to be considered “airworthy”. ED-203A / DO-356A specifically calls out fuzzing as a refutation activity to support airworthiness, along with complementary techniques such as penetration testing, static analysis, and formal proofs. Indeed, fuzz testing satisfies all three Security Refutation Objectives:

O3.1 “Refutation analyses are performed to identify new vulnerabilities.” Fuzz testing identifies new vulnerabilities. Fuzz testing also provides what DARPA calls a “Proof of Vulnerability”, which is an input that demonstrates the vulnerability.
O3.2 “Refutation tests are performed to evaluate the exposure of vulnerabilities in the security environment and to challenge the vulnerability evaluation.” An application environment is parameterized by the particular runtime, configuration files, and application exposure to inputs. Fuzz testing is also parameterized by each of these settings. For example, if you have multiple configuration files, fuzzing can check whether the software is vulnerable under each configuration setting.
O3.3 “Refutation test plans are available. Refutation test results cover refutation test plans and performed tests. Refutation test results are analyzed, and discrepancies are justified and traced.” Fuzz testing produces reusable artifacts that can be incorporated into a test plan to ensure identified vulnerabilities are remediated and not re-introduced.
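Objectives O3.2 and O3.3 can be sketched together: inputs saved from earlier fuzzing runs are replayed against each configuration as a regression check. The example below is an assumption-laden illustration rather than a prescribed workflow. The `decode` function, its configuration keys, and the saved corpus are all hypothetical.

```python
from itertools import product


def decode(data: bytes, config: dict) -> bytes:
    """Hypothetical decoder whose behaviour depends on its configuration."""
    limit = config["max_len"]
    if config["strict"] and len(data) > limit:
        raise ValueError("frame too long")  # documented rejection
    buffer = bytearray(limit)
    for i, byte in enumerate(data):
        buffer[i] = byte  # out of range when non-strict and data is too long
    return bytes(buffer[:len(data)])


# Hypothetical configurations and inputs saved from earlier fuzzing runs.
CONFIGS = [
    {"strict": True, "max_len": 4},
    {"strict": False, "max_len": 4},
]
REGRESSION_CORPUS = [b"", b"abcd", b"abcde"]


def replay(corpus, configs):
    """Run every saved input under every configuration and report failures."""
    findings = []
    for config, data in product(configs, corpus):
        try:
            decode(data, config)
        except ValueError:
            pass  # the input was rejected as designed
        except Exception as exc:
            findings.append((config["strict"], data, type(exc).__name__))
    return findings
```

Calling `replay(REGRESSION_CORPUS, CONFIGS)` reports a single finding, `(False, b"abcde", "IndexError")`: the same input is rejected safely under the strict configuration but exposes the defect under the non-strict one. Once the defect is fixed, the same replay should report no findings, and it can then serve as a recurring regression test.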

Interested in learning more about how ForAllSecure Mayhem can help you achieve DO-356A / ED-203A compliance? Learn more about our work with safety critical applications here or contact us here.
