Day 45 – Outreachy internship

Timeline and expectations

When I first started my project I wasn’t sure how long it would really take, as I could only look at it from a high level. However, as the weeks have progressed I’ve come to realize that a lot of planning goes into even the simplest pieces of code. It recently took me about 2 weeks just to come up with an overall flow for how I wanted the fuzz testing framework to work. Of course my mentor’s input was much appreciated and very much needed, as they helped me understand things a lot better. Below is the data flow I came up with for the framework.

Before I go on, I think it’s best you know how a Hyper-V guest communicates with a Hyper-V host. The guest and host communicate using ring buffers, and there are two of them: one for guest-to-host communication and one for host-to-guest communication. Every Hyper-V device (Hyper-V driver) has these two ring buffers. They are wrapped in a channel, and these channels are called vmbus_channels. My project focuses on the host-to-guest direction. When the host places messages onto the ring buffer, an event is raised and the channel’s special mechanism (oncallback) is called to notify the guest that the buffer is non-empty. When this gets called, the entire buffer is read and emptied before the host can start putting messages onto the buffer again. My project will allow you to tap into the channel and run a test before the oncallback function gets called. This will allow you to manipulate or delay the messages in the buffer before the guest reads them.
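To make that flow a bit more concrete, here is a minimal user-space sketch in C. It is not the Hyper-V code: the names (struct ring, ring_put, ring_drain, fuzz_hook_t, flip_payload) and the fixed slot count are all made up for illustration, and a real vmbus ring buffer is more involved. It only shows the idea of a hook that runs on each message before the guest-side read sees it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8 /* hypothetical capacity; one slot is kept free */

struct msg {
    uint32_t id;
    uint32_t payload;
};

struct ring {
    struct msg slots[RING_SLOTS];
    size_t head; /* next slot the guest reads */
    size_t tail; /* next slot the host writes */
};

/* Hypothetical hook type: runs on a message before the guest reads it. */
typedef void (*fuzz_hook_t)(struct msg *m);

/* Host side: queue one message. Returns 0 on success, -1 if the ring is full. */
static int ring_put(struct ring *rb, struct msg m)
{
    size_t next;

    assert(rb != NULL);
    next = (rb->tail + 1) % RING_SLOTS;
    if (next == rb->head)
        return -1;
    rb->slots[rb->tail] = m;
    rb->tail = next;
    return 0;
}

/*
 * Guest side: read messages until the ring is empty (or out is full),
 * applying the optional hook to each message first.
 * Returns the number of messages copied into out.
 */
static size_t ring_drain(struct ring *rb, fuzz_hook_t hook,
                         struct msg *out, size_t out_len)
{
    size_t n = 0;

    assert(rb != NULL && out != NULL);
    while (rb->head != rb->tail && n < out_len) {
        struct msg *m = &rb->slots[rb->head];

        if (hook)
            hook(m); /* the "tap": manipulate the message before it is read */
        out[n++] = *m;
        rb->head = (rb->head + 1) % RING_SLOTS;
    }
    return n;
}

/* Example hook: invert the payload bits so the change is easy to spot. */
static void flip_payload(struct msg *m)
{
    m->payload = ~m->payload;
}
```

Passing NULL as the hook gives the untouched behaviour, so the same drain path can be used with testing switched off.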

As you can see from the PDF, the user will be able to enter input that affects what gets tested and which method of testing is used. So far we only have two types of testing that we want to implement, which are the following:

  1. Delay testing
    • delays, by a user-specified time between 1 and 1000 microseconds, the reading of the entire buffer, the reading of individual messages, or both.
  2. Error Injection
    • this goes into the buffer before the guest reads a message and edits the message by injecting a predefined error code into it. Every message has an error code, so this part requires specific knowledge of the driver/messages. For now the focus is on getting the framework in place for this.

The purpose of this project is to allow someone to test Hyper-V drivers. This means we need an interface between user space and kernel space. At first I was at a loss as to how we could do this, but my mentor quickly pointed out that we could use Sysfs to grab input from the user and write it to our variables in the kernel. Eureka! I had no clue about Sysfs and its many uses, so I’m grateful to my mentor for pointing that out, because that’s exactly what we need. Sysfs provides us with an interface between user space and kernel space through kobjects and attributes. Think of it this way: when the kernel boots up, Sysfs creates directories, sub-directories and files. These are made by creating kobjects (kernel objects) and creating the attributes for these kobjects. Usually an attribute relates to some kernel variable, and to read and write it you use the show() and store() functions. If you want to read more on Sysfs here’s a link to the Linux doc.

Here’s a snippet of how you can do this:

/* Called when user space writes to the attribute file. */
static ssize_t fuzz_test_state_store(struct device *dev,
                                     struct device_attribute *attr,
                                     const char *buf, size_t count)
{
     /*...hopefully bugless code */
}

/* Called when user space reads the attribute file. */
static ssize_t fuzz_test_state_show(struct device *dev,
                                    struct device_attribute *dev_attr, char *buf)
{
     /*...hopefully bugless code */
}
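Since the bodies above are left out, here is a user-space mock in C of what a store()/show() pair typically does: parse the text the user wrote, validate it, update a variable, and format that variable back as text on read. This is only a sketch under my own assumptions. The mock_ function names are hypothetical, the "state is 0 or 1" rule is invented, and it uses strtol() and snprintf() where kernel code would more likely use helpers such as kstrtoint() and sysfs_emit().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> /* ssize_t */

/* Stand-in for the kernel variable behind the attribute. */
static int fuzz_test_state;

/*
 * Mock store(): buf holds the NUL-terminated text written by the user.
 * Returns the number of bytes consumed, or a negative errno value.
 */
static ssize_t mock_state_store(const char *buf, size_t count)
{
    char *end;
    long val;

    assert(buf != NULL);
    errno = 0;
    val = strtol(buf, &end, 10);
    if (errno || end == buf || (*end != '\n' && *end != '\0'))
        return -EINVAL; /* not a clean decimal number */
    if (val != 0 && val != 1)
        return -EINVAL; /* assumption: state is only off (0) or on (1) */
    fuzz_test_state = (int)val;
    return (ssize_t)count;
}

/* Mock show(): format the current state as text, return its length. */
static ssize_t mock_state_show(char *buf, size_t size)
{
    assert(buf != NULL);
    return snprintf(buf, size, "%d\n", fuzz_test_state);
}
```

From the user's side this would look like writing "1" to a file and reading it back, which is what the planned Python tool would hide.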

For me personally, I think the main thing I had to wrap my head around was just understanding the Hyper-V codebase, and understanding how the data flow from host to guest was done. It’s hard, especially in an open source project that’s as big as the Linux kernel. What makes it even more of a struggle is that, since Hyper-V is a Microsoft product, there aren’t many articles or much documentation on the code specifically in the Linux kernel. This was daunting at first, and after realizing this I had to adjust my expectations and timeline. It’s more important to understand the code and what it’s trying to do first than to jump straight into the coding, as this will save you a lot of headaches in the future. What was good for me was that my mentor emphasizes planning and brainstorming, so this also helped me develop an action plan. The end goal is to mask all this with a Python tool that will take in the user input and use Sysfs in the background without the user knowing. This will be done last, once I get the underlying code up into the kernel tree.

Well, there you have it: halfway done, with the light at the end of the proverbial tunnel in sight. Remember, planning is key!

Till next time…

