The Hacker Mind Podcast: Fuzzing Hyper-V

Robert Vamosi

August 26, 2021

At Black Hat USA 2021, researchers presented how they used their own fuzzer designed for hypervisors to find a critical vulnerability in Microsoft Azure.

Ophir Harpaz and Peleg Hadar join The Hacker Mind to discuss their journey from designing a custom fuzzer to identifying a critical vulnerability within Hyper-V and how their new research tool, hAFL1, can benefit others looking to secure other cloud architectures.

Robert: When we hear that something is processed or stored in the cloud, we often think we understand what that means. At least I did. Sure, I know the data is not literally in a cloud in the sky. It's data sitting on a bare metal server in Virginia, or Poland, or somewhere like that. But what are those servers? They're composed of virtual machines that can be easily spun up or down as needed. Virtual machines allow the cloud to be elastic and ever changing, to grow as demand requires and to shrink if needed. So what's a virtual machine, and the hypervisor that runs it? A virtual machine is software that emulates a physical computer to run programs and operating systems, store data, connect to networks, and do other typical computing functions. You can create as many VMs as you want on one physical server, which is the point. So what does that really mean? Think about all the hardware devices and interfaces that are needed to connect your physical server to the network. Now recreate all or most of them as virtual services. So the physical network card, that backbone of any network communications, is now a VM network card. It functions the same way as a physical network card, except everything is done in software. And if you're like me, you already know that software can have vulnerabilities in it. Okay, okay, hardware can also have vulnerabilities, but software vulnerabilities are more subtle, and sometimes they take some time, or some research, to be exposed. In the last few years, major organizations have moved organizational workloads that used to run entirely on premises into the public cloud. Given this increasing value, a security researcher today needs to start proactively thinking like a criminal hacker and deliberately try to crash the cloud, or at least try to do so before the bad actors can leverage that same vulnerability. So it's a race.
In a moment I'll introduce you to two researchers from Black Hat USA 2021 who built their own fuzzing tool designed to handle the special needs of security testing hypervisors, and explain how, within two hours of running that tool for the very first time, they found a critical vulnerability, one that could have brought down whole regions of Microsoft Azure. It's important research that only underscores the immediate need for more tools like theirs, and much, much more attention on cloud security.

[Music]

Robert: Welcome to The Hacker Mind, an original podcast from ForAllSecure. It's about challenging our expectations about the people who hack for a living. I'm Robert Vamosi, and in this episode I'm introducing two security researchers who built a specialized fuzzer to handle hypervisors. They're releasing the tool on GitHub and hope to encourage others to find even more vulnerabilities in public clouds such as Microsoft Azure, Google, and Amazon AWS.

[Music]

Robert: While listening to this episode, there are two adages that might be important to keep in mind. One: complexity is the enemy of security. The other: obscurity is not security. I say these because in researching the story, I found that I thought I had a grasp on cloud architecture. And then I didn't. Microsoft defines server virtualization as the process of dividing a physical server into multiple unique and isolated virtual servers by means of a software application. Each virtual server can run its own operating system independently. Server virtualization is also a process that creates and abstracts multiple virtual instances on a single server. The underlying virtualization technology from Microsoft is known as Hyper-V. The public cloud, then, is very complex and very intertwined with both old and new technologies. So that's complexity. There's also a lot of stuff that simply isn't well documented. That's obscurity. This then becomes fertile ground for cutting edge security research.

Peleg: I'm Peleg Hadar. I'm a senior security researcher working at SafeBreach Labs.

Ophir: My name is Ophir Harpaz. I'm a senior security researcher at Guardicore.

Robert: So what was the initial concept that led Ophir and Peleg to even look at hypervisors?

Ophir: I think there are two aspects to this. The first is that hypervisors today are the basis of cloud infrastructures, so any security flaw in a hypervisor means a much broader impact, not only on one virtual machine but on any virtual machine running on a host. And the second aspect is the technological challenge. Hypervisors are a whole separate domain, and it requires knowledge and research, and both Peleg and I were extremely interested in this world. Hypervisors are enormous pieces of code, they have dozens of components, and simply digging into that was very exciting for the two of us.

Robert: Ophir and Peleg saw a need and understood the challenge of looking for vulnerabilities in the cloud. For one thing, they needed to design a tool for that. So, at Black Hat USA 2021, they presented a talk entitled "hAFL1: Our Journey of Fuzzing Hyper-V and Discovering a Zero Day." We'll get to the zero day in a moment. But why did they choose Hyper-V?

Peleg: According to Microsoft's official website, Azure is being used by 95% of the Fortune 500 companies. What it actually means is that an attacker might deploy a very cheap VM, a virtual machine, and by using a single instance, just send the packet which we found and crash the whole host, which hosts other companies' infrastructure as well. So it's very critical.

Robert: I'd be remiss if I didn't acknowledge that previous Black Hat USA talks have addressed various Hyper-V vulnerabilities. However, over time, there have been very few.

Peleg: If you take a look at Windows, each month during Patch Tuesday Microsoft patches between 80 and 120 vulnerabilities, which means that a lot of researchers are using fuzzing and static analysis and taking a look at this field. And in Hyper-V, I think the maximum number of vulnerabilities being patched during a single Patch Tuesday is one, correct me if I'm wrong, I think this is the number. And it means that not a lot of researchers are taking a look at it, but I think that Microsoft is trying to encourage a lot of researchers to take a look at it, because the bug bounty award, the maximum one, is 250k dollars for such a bug.

Robert: Whoa, so there's a Hyper-V bounty program with a payment of up to US$250,000. This can be a very lucrative line of research. Even so, this remains a research blind spot. There don't seem to be a lot of people coming forward with their work.

Ophir: I think people have thought of it and actually done it. Maybe they're not talking about it at Black Hat. It's a niche, but some people do it. Some people are reaching out to us to talk about the research because they do similar things. I don't think we're going to make a revolution in terms of hypervisor fuzzing, but we're making some small progress here. It's going to remain a niche in the coming years, but hopefully it'll grow, and more researchers will hopefully engage in this type of research.

[Music]

Robert: As we said, hypervisors recreate physical computers in software. By definition, they are massive binary blobs. A lot of software security testing today still begins with static analysis, which means parsing individual lines of code and identifying common weaknesses in coding practices. Although static analysis is largely automated, it's still time consuming on the other side for a researcher to pore over and then eliminate the many false positives that static analysis can produce. Obviously, poring over all that code in the cloud would be slow, if not impossible. Fuzzers, then, are tremendously important in the realm of vulnerability research. They're dynamic, for one thing, so they're looking at the running code and monitoring its behavior. By rapidly feeding a target with numerous inputs, they quickly automate the process of bug discovery.

Peleg: There are two main approaches, as you mentioned, to finding vulnerabilities. The first one is static analysis, which means that a security researcher will just deep dive into some piece of code, try to learn it and how it behaves, and try to look for certain flaws in the code. And the other is obviously fuzzing, which tries to achieve vulnerability research by doing it automatically. So I think that it really depends on the target that you're trying to fuzz.

Robert: So what exactly is a fuzzer? It's an engine that injects mutations of known valid input into an application and then monitors how that software handles it, quickly identifying anomalous behavior and even crashes.
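
As an illustration of that definition, here is a minimal sketch of a mutation fuzzer in Python. It is not code from hAFL1 or KAFL. The target function, the simulated bug, and the seed input are all hypothetical and exist only to show the loop of mutating a known valid input, running the target, and recording crashes.

```python
from __future__ import annotations

import random


def toy_target(data: bytes) -> None:
    """Hypothetical stand-in for the software under test."""
    # Simulated bug: a specific header byte combined with a zero in byte 3.
    if len(data) >= 4 and data[0] == 0xFF and data[3] == 0x00:
        raise RuntimeError("simulated crash")


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Return a copy of the seed with one randomly chosen byte replaced."""
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, iterations: int, rng: random.Random) -> list[bytes]:
    """Feed mutated inputs to the target and collect the ones that crash it."""
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            toy_target(candidate)
        except Exception:
            # The monitoring step: any exception counts as anomalous behavior.
            crashes.append(candidate)
    return crashes


if __name__ == "__main__":
    found = fuzz(b"\xff\x01\x02\x03", 20_000, random.Random(0))
    print(f"{len(found)} crashing inputs found")
```

A real fuzzer monitors a separate process or, in the hypervisor case, a whole machine rather than catching exceptions in the same interpreter, but the mutate, execute, observe cycle is the same.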

Ophir: So I would say fuzzing is a technique for testing some program, which basically sends countless inputs to a program that expects input, and it does so while monitoring crashes and unexpected behavior from the target. And by that, it identifies security vulnerabilities or code flaws in the target program.

Robert: So, who would be using fuzzing today?

Ophir: If I'm not mistaken, fuzzing actually started as a testing technique, so mostly quality assurance, but I think it evolved into being a tool for security researchers, just because, as Peleg said, it is capable of identifying problems in the code which can be leveraged into exploitation, maybe.

Peleg: Mostly security researchers will use fuzzing in order to find vulnerabilities in an automatic manner, and this is what we did in the project. And additionally, a lot of companies today are using fuzzing in order to find flaws and vulnerabilities inside their own code. So I think these are the main usages today.

Robert: Big companies such as Microsoft, Google, Apple, and Nvidia have all publicly disclosed their use of fuzzing as part of their software development lifecycle, and smaller companies are joining in as well. As code becomes more complex, as functions become more obscure, it just makes common sense to be fuzzing your code these days. However, not all fuzzing engines have really looked at the problem of hypervisors. So Ophir and Peleg researched a lot of the open source fuzzers available today, and they settled on KAFL, which is a flavor of American Fuzzy Lop, or AFL.

Peleg: We examined multiple fuzzers out there, the main ones. What seemed pretty useful for us at that time were Syzkaller and KAFL. And we actually read about each one, and we saw that KAFL has all the components we wanted to have within the fuzzer.

Robert: Again, not all fuzzers are built alike, and for this project, there were additional components which would make it much, much easier in terms of efficiency and performance.

Ophir: KAFL was simply suitable for our needs because, more specifically, we did not target Hyper-V as a whole. We targeted a specific driver in Hyper-V, the networking switch, and KAFL was easy to bootstrap with, I think. And the fact that it had code coverage, like Peleg said, was a very significant parameter in choosing it. Also, we did not fuzz with system calls, like Syzkaller supports. We fuzzed using very designated network packets that are proprietary to Hyper-V. So KAFL, I think, was easy to choose, and it also had the basis for what we needed, to which we added structure awareness and crash monitoring. Those were gaps which we knew we had to implement, but it didn't stop us. I mean, gaps such as structure awareness.

Robert: Structure awareness means leveraging knowledge of the input format to generate test cases for your target. This means that the fuzzer doesn't send arbitrary sequences of bytes to the target, but rather inputs that are structured in a way that could be valid. In Episode 24, I talked in detail about structure-aware fuzzing with Harrison Greene. We talked about creating valid test cases within the parameters of a PNG file, for example. We also talked about applying the concept of structure awareness to fuzzing whole libraries. For Ophir and Peleg, why was structure-aware fuzzing so important?

Ophir: Because your target may expect things that are not simply sequences of bytes but sequences of bytes that are meaningful, that are split into different fields, etc. In our case, these were packets, but you can fuzz a program that expects certain file formats: pictures, PDFs, symbol files, things like that. So in our case we integrated structure awareness to make sure that we send meaningful packets and not arbitrary bytes. We actually used Protocol Buffers by Google, which is a framework for representing data with structure. And we integrated a library called libprotobuf-mutator, which performs the mutation, or the changes, of the fuzzing payloads based on the fields. It does not simply set or unset a bit, but it takes a field as a meaningful unit and mutates it. If a field is an integer, for example, it will make sure it stays an integer. If a field is a string, it will make sure it's mutated as a string.
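
The field-level idea can be sketched without Protocol Buffers at all. The following Python example is a simplified stand-in, not libprotobuf-mutator and not the hAFL1 implementation. The `Packet` layout and its field names are hypothetical and do not describe any real Hyper-V message format. It picks one field and mutates it according to its type, so an integer stays an integer and a string stays a string.

```python
import random
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Packet:
    """Hypothetical structured input; not a real Hyper-V packet layout."""
    msg_type: int   # treated as an unsigned 32-bit value
    name: str
    payload: bytes


def mutate_field(pkt: Packet, rng: random.Random) -> Packet:
    """Mutate exactly one field, keeping the field's type intact."""
    field = rng.choice(fields(pkt))
    value = getattr(pkt, field.name)
    if isinstance(value, int):
        # Boundary values, or a single bit flip within 32 bits.
        new = rng.choice(
            [0, 1, 0x7FFFFFFF, 0xFFFFFFFF, value ^ (1 << rng.randrange(32))]
        )
    elif isinstance(value, str):
        # Append a character that parsers often mishandle.
        new = value + rng.choice(["A", "%", "\x00"])
    else:
        # bytes: overwrite one byte, or add one if the payload is empty.
        buf = bytearray(value)
        if buf:
            buf[rng.randrange(len(buf))] = rng.randrange(256)
        else:
            buf.append(rng.randrange(256))
        new = bytes(buf)
    return replace(pkt, **{field.name: new})


if __name__ == "__main__":
    seed = Packet(msg_type=1, name="init", payload=b"\x00\x01\x02")
    print(mutate_field(seed, random.Random(0)))
```

Because every mutant is still a well-formed `Packet`, the target's early format checks are more likely to pass, which lets the fuzzer spend its time on the deeper parsing logic.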

Robert: Another important property of a fuzzer is coverage guidance, the ability to mutate inputs based on your previously visited execution paths. When you're running and testing software, you want to make sure that you're pushing deeper and deeper into the code.

Peleg: So think that you have some kind of software, and this is your target, and you do the fuzzing. Obviously, if you send random packets each time, let's say I have one byte and I'll just send random bytes, it might do something in the software, it might call some new piece of code that you didn't trigger before, or it might do nothing, because you don't know if the software you fuzz actually parses the data and whether it makes any difference. So code coverage actually helps you understand whether a single packet that is sent during the fuzzing actually triggered interesting code, or triggered any code for that matter. And this is actually what makes the fuzzer efficient, because you can just send random inputs for, like, one week and it won't trigger any code. But if you do code coverage, you understand exactly whether you triggered some code, and if you did, you take the same payload, the same fuzzing input, and you just do some kind of mutation on it. And this actually gives you much more visibility, rather than just doing some kind of blind fuzzing and hoping that something good will happen.
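
The feedback loop described here can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how KAFL gathers coverage. The `toy_target` function and its branch IDs are hypothetical, and it simply reports which branches it reached, where a real fuzzer would rely on instrumentation or hardware tracing. Any input that reaches new code is kept and mutated further.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the IDs of branches reached."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add(3)  # the deepest code path
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def coverage_guided_fuzz(seed: bytes, iterations: int, rng: random.Random):
    """Keep every input that reaches a branch not seen before."""
    corpus = [seed]
    seen = set(toy_target(seed))
    for _ in range(iterations):
        child = mutate(rng.choice(corpus), rng)
        coverage = toy_target(child)
        if not coverage <= seen:      # the child triggered new code
            seen |= coverage
            corpus.append(child)      # so it becomes a parent for later mutations
    return corpus, seen


if __name__ == "__main__":
    corpus, seen = coverage_guided_fuzz(b"AAAA", 200_000, random.Random(1))
    print(corpus, sorted(seen))
```

In this toy, guessing the three magic bytes blindly succeeds once in 256^3 = 16,777,216 attempts on average, while the coverage-guided loop only has to get one byte right at a time.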

[Music]

Robert: So we more or less have a fuzzer, KAFL. Ophir and Peleg now needed to identify a target for that fuzzer. Remember, all the hardware devices and interfaces that are needed to connect to your physical server are now software in the cloud. So you have things like VM buses, VM switches, and so forth. So where do you start? More importantly, there's the architecture that reins it all in. You have nested virtual machines, and that gets us back to complexity.

Peleg: It's actually a problem that is related to all hypervisors, in that if you're using a fuzzer and you'd like to fuzz the hypervisor, obviously it runs other VMs as well.

Robert: Imagine a PowerPoint slide with a box that's your hypervisor, and within that hypervisor box are two other boxes. These are virtual machines. The way Hyper-V is architected, one of these VMs is the host with the operating system, and the others are the guest or child VMs, as many as you need. If a bare metal server is level zero, then the host is level one, and the child VMs are level two.

Peleg: So you have your VM within your fuzzer, and in our case the target was the Hyper-V hypervisor, and the hypervisor itself has more VMs within it. And this double layer actually made it harder to do, because you need to communicate between two hypervisors.

Robert: In other words, it initially appeared that Peleg and Ophir had to fuzz both level one and level two. And that felt kind of clunky, if not improbable.

Peleg: Today, not a lot of researchers are taking a look at hypervisors. These are, like, pretty new technology, I might say, and there are no doubt a lot of vulnerabilities which exist in this field, in particular in Hyper-V. So I think that this was one of the main reasons we actually wanted to dive in, because there is no prior documentation or prior technical detail about it, and we had to learn a lot of things from scratch and develop our own methods and techniques, so it was pretty challenging. And this is actually one of the main reasons we started the project.

Robert: It turns out there was a solid precedent for fuzzing Hyper-V components. In 2019, there was a Microsoft blog about fuzzing para-virtualized devices in Hyper-V, and this blog planted an important seed with Ophir and Peleg. Since it talked about the potential vulnerabilities found in inter-partition communication, they began to suspect that this communication channel could be fragile. For example, if a bad packet was passed from a guest VM to the host, it could crash the system. As they searched through documented and undocumented parts of Hyper-V, they settled upon vmswitch.sys, which is a VSP responsible for networking within Hyper-V. This virtual switch more or less emulates a switch, as a layer two network device would, communicating with the physical server and also with guest VMs. Think about it: it seems logical that if you could somehow crash the communication switch, you might also crash the host, if not some of the children as well. And though we've identified a target, vmswitch.sys, we still haven't identified how the fuzzer might address it.

Peleg: In our case, Hyper-V, which we'll talk about, is a very complex target, and there are a lot of challenges. In order to just develop the fuzzer, you have to understand exactly how it works, how you send inputs, and whether the inputs are being parsed by the target. And obviously there are much easier targets, where you just need to send any data and it will eventually crash. So it really depends on what the target of the project is.

Robert: Another of the core concepts in fuzzing is the harness, which is basically a couple of lines of code responsible for sending fuzzing inputs to the target software. Creating one for a hypervisor created a lot of work.

Peleg: So basically the harness, as Ophir mentioned during the talk, is the most important part of the fuzzer, because you have to understand exactly how to send inputs to the target. This is actually the main component of the fuzzer, because at the end the fuzzer just keeps sending inputs to the target. And it's not always so simple. There is software with a very simple API, where again you just send packets or input. With Hyper-V, we had to do a lot of reverse engineering to understand exactly how we send data. In our case, we called an undocumented function. We reverse engineered it and we just send our input by using it. After we completed this part, it was much easier to build all the infrastructure around it, because this was the main component.
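
To make the role of a harness concrete, here is a small Python sketch of the general shape: a thin wrapper that takes raw fuzzer input and hands it to whatever function actually delivers data to the target. It is only an illustration. The hAFL1 harness runs inside a Windows guest and calls a reverse engineered, undocumented function, which is not reproduced here. The `send` callback, `fake_send`, and the `MAX_PACKET_LEN` limit are hypothetical.

```python
from typing import Callable

MAX_PACKET_LEN = 1500  # hypothetical size limit, chosen only for this example


def make_harness(send: Callable[[bytes], None]) -> Callable[[bytes], bool]:
    """Wrap a delivery function so the fuzzer can call it with raw inputs."""

    def harness(fuzz_input: bytes) -> bool:
        packet = fuzz_input[:MAX_PACKET_LEN]  # keep the input within bounds
        try:
            send(packet)  # the only place that touches the target
        except Exception:
            return False  # report the failure back to the fuzzer
        return True

    return harness


def fake_send(packet: bytes) -> None:
    """Hypothetical target interface that fails on an empty packet."""
    if not packet:
        raise ValueError("empty packet rejected")


if __name__ == "__main__":
    harness = make_harness(fake_send)
    print(harness(b"hello"), harness(b""))
```

Keeping the delivery function behind one small interface is what lets the rest of the fuzzer stay unchanged when the target, or the way of reaching it, changes.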

[Music]

Robert: Having decided on a fuzzer and having built a successful harness, Ophir and Peleg tested their new tool on CVE-2019-0717, which is a Windows Hyper-V denial of service vulnerability that was discovered and patched about a year before they spoke at Black Hat. After working with that known vulnerability, they began to see how they could make their tool much more efficient.

Peleg: It took us a couple of months. Yeah, I think that the whole project, from the moment we started to work on it until we found a vulnerability, was like six months. And as I mentioned, the harness part took us the most time and was the hardest part, because we actually fuzz Hyper-V, a hypervisor, and it was not trivial to send data between two virtual machines, because it's like using two different computers. So we found a little trick. By doing some kind of workaround, we were actually able to fuzz Hyper-V within a single VM, a single computer. And this actually made it easier.

Robert: This is actually kind of brilliant, that aha moment when they realized that instead of trying to fuzz two virtual machine levels in Hyper-V, level one and level two, they could instead just concentrate on level one. After that, things progressed rapidly.

Peleg: We had a lot of versions. We deployed the fuzzer and then we improved it and added more components, and it wasn't always stable. But once we had the stable version which actually worked, it took two hours until I called her and told her we found a bug, which apparently was a remote code execution, but...

Ophir: I could not believe it as well. Yeah, still today, it's pretty fascinating.

Robert: Typically, calls for papers at major conferences are held months in advance of the actual conference. If I understand correctly, they spent several months just building the fuzzing tool, one that they had not yet stabilized and with which they had not yet found the vulnerability, yet they went ahead and started the submission process for Black Hat.

Peleg: That was the exact story, but I think that our chances were high even without finding a vulnerability, because the tool that we wrote is the first one, the first fuzzer which is open source and is capable of targeting Hyper-V. There is no other open source fuzzer, at least nowadays, that is capable of fuzzing any hypervisor, and Hyper-V as well, so I think this is revolutionary, with the vulnerability or without it. The vulnerability is extra. So our final target was Black Hat, and there was a due date for the CFP, the call for papers. And I think that we found it, like, a week before the due date. Yeah, so we had our fuzzer, and we started to think about what we were going to publish and how we were going to market it. And then we found a vulnerability, which was great.

Robert: They arrived at a name, and it's actually pretty logical.

Peleg: We call it hAFL1 because the H is for hypervisor, the AFL is because we use KAFL and the AFL fuzzer algorithm, and the 1 is because we used only one level of virtualization, only one VM. I think it has a lot of benefits, and the main one is actually that you don't need to send data between two separate VMs, which has a lot of overhead, because you need to send it, and then it will arrive at another VM, and it will be parsed. We only have one level of virtualization, and we send the data within it. So we actually made our fuzzer much faster, and I think that we made something like 22,000 executions per second, which is pretty fast.

Robert: 22,000 executions per second has the potential to find crashes fast, which it did. But even then, they needed to be careful.

Ophir: I'd like to emphasize that not in all targets is a crash a bug, just like you said, but being able to crash a hypervisor host is critical, especially when it's the basis of Azure, Microsoft's public cloud. If you manage to crash a Hyper-V host, this means you can crash Azure regions, potentially, and with this region, you take down all machines running on top of it. So, you know, a blue screen sounds maybe not that severe, but in the hypervisor context and the cloud context, this has tremendous impact.

Robert: So after additional testing, after validating that it was real, they engaged in responsible disclosure with Microsoft.

Ophir: We immediately sent disclosure emails to MSRC, the Microsoft Security Response Center. And they responded quite quickly, and they tested the vulnerability as well. They asked us for the tools that we used and the exploit itself that we developed. And yeah, they assigned it a nice CVE and gave it a critical score.

Peleg: 9.9 out of 10.

Robert: And it's not hyperbolic to say it is perhaps the most critical hypervisor vulnerability to date. What Peleg and Ophir found was CVE-2021-28476, which can be trivially exploited to achieve denial of service or remote code execution.

Ophir: On a virtual host, you have your own virtual machine. You crash the host, and this whole host potentially runs multiple other virtual machines that depend on it. So once the host is down, everything is down.

Robert: Yeah, given that it's scored at 9.9 out of 10 in severity, I would imagine something like that would be very hard to keep to yourself.

Ophir: When you disclose a vulnerability... This was my first, actually, and I was extremely thrilled about it. I think Peleg was too. And my instinct is to immediately go to Twitter, where all information security happens, and tweet about it. No, I didn't. Actually, MSRC helped me censor my tweet so that I could tell the world that we found a bug without actually disclosing details about it. So they were very cooperative, and they helped us get the message out without giving details to potential attackers. That was very nice.

Robert: I've heard from different hackers that sometimes reporting a vulnerability to a vendor leads to silence. That's not the case with Microsoft. In fact, they've gotten pretty good of late at reporting vulnerabilities out to the public. The result can be seen in subsequent Patch Tuesdays, where they give credit to the individuals who reported the vulnerabilities. In this case, Peleg and Ophir had a good experience, and this was because it was a critical vulnerability that Microsoft knew it had to fix.

Peleg: They actually patched it. It took a few months, but obviously when you're reporting a vulnerability in such a critical and complex environment as Hyper-V and the Azure cloud, I can only assume that it might take a long time to test the patches. And obviously if the patch is not valid, they can just crash a lot of infrastructure which depends on the Azure cloud. But it was pretty fast. I mean, it took like three months, I think, from the moment we reported until the patch, which is pretty fast, you know, in hypervisor terms.

Robert: So once they submitted to Black Hat, once they reported the vulnerability to Microsoft, once they got accepted for that talk, Peleg and Ophir took a nice long vacation. Right?

Peleg: No. So we actually gathered all of the details, and we wrote the call for papers document and we sent it. And then we had one month to, like, have a lot of rest. Then we got a message from Black Hat that we got in, got accepted. And then we started working again. We had to finalize the tool and do code review and refactor a lot of things, so I think that until last week, we worked nonstop.

Ophir: So I have to say that preparing the presentation itself took months of work, seriously. Creating the slides and rehearsing it, we did it so many times we can no longer hear ourselves saying those things.

[Music]

Robert: So at Black Hat, Peleg and Ophir released hAFL1 on GitHub as an open source fuzzing tool. They also released the harness as well, to help other researchers get started. So is this perhaps still too niche to be more than an academic tool?

Peleg: So I think it might be used by both. I mean, KAFL is an academic project which is today being used by a lot of other companies, maybe, I don't know, for commercial purposes, but a lot of security researchers are using it. So I really hope that we will have some kind of reference in academic work, but I also really hope that, even for commercial purposes, or even Microsoft itself, people will use some part of our code and just find more bugs, because this was actually our purpose. We wanted people to just learn about hypervisors, how you start research, and how you find more bugs, and I really hope we made it more approachable.

Robert: So what other targets might hAFL1 be used for?

Peleg: So with regard to your question about other targets: Hyper-V does contain a lot of components. These are called virtualization service providers, VSPs. Our target was the VM switch driver, which is in charge of the networking part. There are a lot of other components which I think might be great targets. We didn't have the time to actually take our fuzzer and target them, but I think it would be really interesting for other security researchers to take our tool, hAFL1, and try to target these VSPs. In addition, I want to add that hAFL1 we published today, and tomorrow I'm going to release hAFL2, which is a side product which actually supports nested virtualization as well, so it might be a great tool for fuzzing other hypervisors, not Hyper-V in particular, and I think it might be useful as well. So, as I said, we made it more approachable and we gave a lot of basic tools which help other researchers, and we hope to see in one year or five years many more researchers using it, and maybe even talking at Black Hat and referring to us.

Robert: And for Ophir and Peleg, what is the next target?

Ophir: I'm not sure we have decided yet. But we are both very interested in this type of target. Yeah, it could be Hyper-V, because we're already in it and we have some knowledge at this point, but it could also be VMware, or VirtualBox, which is open source. Wherever the future leads us.

Peleg: Yeah, in my opinion, hypervisor research is very complex, obviously, and it's only the beginning. I really believe that in, like, five or 10 years, most researchers will take a look at these kinds of targets, and this is why I hope that a lot of researchers will do some kind of pivot and start to research this field. I believe it's very interesting and very challenging, and it will be much more prominent in the future. Every cloud environment is based on this, so it has a lot of impact.

Robert: I'd really like to thank Ophir and Peleg for enduring the background noise of Black Hat USA 2021 during their interview about their journey in building a hypervisor fuzzing tool, and for explaining some of the complexity inherent in cloud systems. This is important research, so check out the hAFL1 and hAFL2 tools on GitHub. Remember, at least with regard to Hyper-V, Microsoft offers a pretty generous bug bounty program.

Hey, let's keep this conversation going. DM me at Robert Vamosi on Twitter, or join our subreddit or Discord channel. You can find the deets and invites at thehackermind.com.

The Hacker Mind podcast is brought to you every two weeks commercial free by ForAllSecure.

For The Hacker Mind, I remain the fully vaccinated, fuzz-tested, vulnerability-free Robert Vamosi.
