Key Takeaways From ForAllSecure's “Achieving Development Speed And Code Quality With Behavior Testing” Webinar

David Brumley
August 21, 2019

Security and speed are often perceived to be mutually exclusive, repelling away from each other like identical poles of a magnet. Dr. David Brumley, CEO of ForAllSecure and professor at CMU, posits that they don’t have to be. In ForAllSecure’s latest webinar on “Achieving Development Speed and Code Quality with Behavior Testing (Next-Generation Fuzzing)”, Brumley unveils a next-generation dynamic testing technique that security teams trust and developers can love.

This next-generation DAST technique is known as behavior testing. Its foundational technique, fuzzing (or fuzz testing), has been proven for nearly a decade. When guided fuzzing is coupled with a newer research area known as symbolic execution, this accepted technique becomes automated, even autonomous, which allows it to fit seamlessly into DevOps environments and boost, not hamper, developer productivity. The technique was battle-tested in the 2016 DARPA CGC, where it took first place, and has been deployed in the real world, solving some of the most critical software security challenges.

Missed the webinar? Not a problem. You can catch the recording here. We’ve also listed below the top 3 takeaways from Brumley’s webinar.

Accuracy and reproducibility are key to enhancing developer productivity. 

Developers are creative, brilliant people. They solve intricate problems by writing applications.

Although they are talented individuals who possess many skills, they are not security engineers.

Writing code and writing secure code require two separate skill sets. Many R&D teams have come to this realization and have armed their developers with Static Application Security Testing (SAST) tools. While SAST tools have their place in the SDLC and offer tremendous benefits, they unfortunately are not the ideal technique for automated and autonomous security testing.

Static testing directly analyzes the code for vulnerabilities and/or weaknesses. Because SAST is conducted on applications while they’re in a non-running state, it can only blindly apply coding best practices. SAST tools have zero context into how the application will behave in production environments and, as a result, frequently produce false positives. Grammar is an excellent analogy. Understanding the rules of English grammar sets the foundation for apt writing. However, the English language is all about exceptions, and the proper use of English grammar largely depends on context. Coding works similarly: the applicability of coding rules largely depends on context, which explains the high false-positive rates of SAST tools. Now, imagine automating inaccuracy at scale. The result is a paralyzing backlog of vulnerabilities for security and development teams to validate.

Thus, through ForAllSecure’s research on autonomous security, we’ve learned that accuracy and reproducibility are not only vital for DevOps environments, but a key requirement for automated and/or autonomous cyber reasoning systems. Accuracy lessens the number of security “speed bumps”, protecting developer productivity, while test cases for validating and reproducing issues accelerate time to fix.

“This little thing may seem small -- coming up with a test case-- but it's actually big, because if you can come up with a test case that demonstrates a problem, it's quicker to fix. But more importantly, you make sure that problem stays fixed forever,” Brumley shares.
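To illustrate the point about test cases, here is a minimal sketch of how an input saved by a fuzzer can be kept as a regression test. The `read_record` parser, its record format, and the saved input are all hypothetical and exist only for this example; they are not taken from the webinar or from Mayhem.

```python
def read_record(data: bytes) -> bytes:
    """Hypothetical parser: the first byte is a length, the rest is the payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        # The fix: reject truncated records instead of returning short data.
        raise ValueError("truncated record")
    return payload


# Assumed example of an input a fuzzer saved when it found the original defect:
# the header claims 5 payload bytes, but only 2 are present.
SAVED_CRASH_INPUT = b"\x05ab"


def truncated_record_is_rejected() -> bool:
    """Regression check: the saved input must be rejected cleanly."""
    try:
        read_record(SAVED_CRASH_INPUT)
    except ValueError:
        return True
    return False


print("regression check passed:", truncated_record_is_rejected())
```

Keeping the saved input in the test suite means that any later change that reintroduces the defect fails this check immediately.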

For more information on autonomous security, learn more here.

Thinking like an adversary is a part of defensive security.

There is a common misconception that security is a binary state. Brumley’s experience working with whitehat hackers has shown that security is neither “true” nor “false”. Brumley challenges the audience to see security as a continuous effort. Security is, in many ways, a game, and the goal is to outpace the attackers. The first step to beating attackers at their own game is understanding what makes them tick.

What hackers commonly do is look for bad behaviors in programs. In fact, if you talk with well-known hackers, they often say they found their vulnerability when they noticed the computer behaving unusually. They then explore further why it behaved that way. “It's the sort of unusual behaviors, those unintended behaviors, where the system crashes or outputs something it shouldn't. Those are the places they look for exploitable flaws.”

Brumley breaks down the hacker workflow in 3 simple steps:

  1. Generate a malformed input to see if it triggers anomalous behaviors
  2. Observe behaviors. Leverage the system’s behavior to influence new inputs and autonomously generate them.
  3. Repeat
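
The three steps above can be sketched as a small coverage-guided fuzzing loop. This is a toy illustration under stated assumptions, not how Mayhem or any production fuzzer is implemented: the `target` function, its hand-written branch IDs, and the single-byte `mutate` strategy are all hypothetical.

```python
import random


def target(data: bytes) -> set:
    """Toy program under test. Returns the IDs of the branches it executed
    and raises RuntimeError (the simulated crash) on inputs starting with b"BUG"."""
    hit = {"entry"}
    if len(data) > 0 and data[0] == ord("B"):
        hit.add("b0")
        if len(data) > 1 and data[1] == ord("U"):
            hit.add("b1")
            if len(data) > 2 and data[2] == ord("G"):
                raise RuntimeError("simulated crash")
    return hit


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Step 1: generate a malformed input by overwriting one random byte."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input: bytes, max_iters: int, rng_seed: int = 0):
    """Return an input that crashes the target, or None if none was found."""
    rng = random.Random(rng_seed)
    corpus = [seed_input]
    seen = set(target(seed_input))  # the seed input is assumed not to crash
    for _ in range(max_iters):      # Step 3: repeat
        candidate = mutate(rng.choice(corpus), rng)
        try:
            coverage = target(candidate)  # Step 2: observe behavior
        except RuntimeError:
            return candidate              # anomalous behavior found
        if not coverage <= seen:
            # New behavior observed: keep this input so it shapes future inputs.
            seen |= coverage
            corpus.append(candidate)
    return None


crash_input = fuzz(b"\x00\x00\x00\x00", max_iters=200_000)
print("crashing input:", crash_input)
```

The feedback in step 2 is what makes the search tractable here: each input that reaches a new branch is kept, so the loop only has to get one more byte right at a time instead of guessing all three at once.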

Having discovered this, Brumley and ForAllSecure have made it their mission to teach computers to do exactly what these hackers do, so organizations can finally get ahead of attackers. The result is a next-generation dynamic testing technique known as behavior testing and a solution known as Mayhem.

There’s a new DAST technique in town. It’s called behavior testing.

Behavior testing is a technique that unifies the tried-and-true method of guided fuzzing and the unmatched ingenuity of symbolic execution.

Fuzz testing is an accepted technique with a proven track record. “Google has used fuzz testing to find 27,000 bugs and vulnerabilities in both Chrome and open source software. Each of these findings weren’t theoretical; they’re proven vulnerabilities. Carnegie Mellon has shown in a research project that they found 11,687 bugs in Linux programs. Microsoft has found over 1,800 bugs in Microsoft Word, according to their latest report.” All three organizations leverage fuzzing as a part of their security testing. The unfortunate reality is that this advanced technique is typically exclusively wielded by organizations with abundant resources, both in budget and in personnel. Fuzzing demands technical developers who are fluent in a niche security skill and coding, a rare skill-set combination.

Behavior testing aims to lower the barrier to use and democratize fuzz testing by pairing it with symbolic execution. Symbolic execution reduces the technical fluency required to leverage this advanced technique effectively and efficiently. Although guided fuzzing is able to generate inputs quickly, its effectiveness is largely based on hope. It blindly and randomly generates inputs or test cases in hopes of triggering bad behaviors. Manual intervention is required to effectively guide the fuzzer across the application.

On the other hand, symbolic execution slowly yet methodically works out how to craft inputs intelligently. Simply put, symbolic execution abstracts binaries into mathematical expressions to determine the conditions on each conditional branch and discern what inputs to generate. This information is fed back into the guided fuzzer, which tests at scale.
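
As a rough illustration of the idea, the sketch below runs a toy function on a symbolic input, records the resulting expression, and solves the branch condition for a concrete input. It is a deliberately tiny stand-in: real symbolic executors operate on binaries and rely on constraint solvers, and the `Sym` class, `check` function, and `TARGET` value here are hypothetical names invented for this example.

```python
from fractions import Fraction


class Sym:
    """Linear symbolic expression a*x + b over a single unknown input x."""

    def __init__(self, a=1, b=0):
        self.a, self.b = a, b

    def __add__(self, k):
        return Sym(self.a, self.b + k)

    def __mul__(self, k):
        return Sym(self.a * k, self.b * k)


def check(x):
    """Toy computation whose result guards a branch (result == TARGET)."""
    return x * 3 + 7


def solve_equals(expr: Sym, value: int):
    """Solve a*x + b == value for an integer x; return None if there is none."""
    if expr.a == 0:
        return None
    x = Fraction(value - expr.b, expr.a)
    return int(x) if x.denominator == 1 else None


TARGET = 1336                           # the guarded branch needs check(x) == TARGET
expr = check(Sym())                     # run the program on a symbolic input
solved_input = solve_equals(expr, TARGET)
print("input that reaches the branch:", solved_input)
print("concrete check:", check(solved_input) == TARGET)
```

A random fuzzer would have to guess this value by chance, whereas solving the recorded expression yields it directly. In a combined setup, a solved input like this would be handed to the fuzzer as a new starting point.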

An analogy to consider for fuzzing and behavior testing is robot vacuum cleaners. Robot vacuum cleaners that follow a methodical cleaning pattern are able to ensure a cleaner home. They find dirt and dust that vacuum cleaners with a random approach may miss. “Smart” capabilities and mapping features may also influence the value and satisfaction customers extract. For example, laser technology guides robot vacuums to navigate systematically around, over, and through furniture without damaging your home. And, of course, the capability to save a mapping of each room enables faster cleaning times in future runs.

Evaluating the value and satisfaction customers are able to extract from fuzzing versus behavior testing is similar. Techniques that have a methodical testing pattern are able to deliver superior results, finding defects other techniques may miss. Solutions with “intelligence” built into their analysis engine are able to systematically navigate around, over, and through functions, sweeping for defects without breaking applications. And mappings in the form of saved test cases enable faster regression testing in future runs. Brumley shares a detailed technical deep dive of the technique and further supports his explanation with a demo in the webinar. You won’t want to miss it.


Several organizations, including the Department of Defense, have shared the benefits of using Mayhem, ForAllSecure’s behavior testing solution. More details on how the DoD has benefited from Mayhem are available in this blog and recording.

Want to learn more about Behavior Testing and Mayhem?  You can schedule a demo with a ForAllSecure representative here.
