A Rump Kernel Hypervisor for the Linux Kernel
Ever since I realized that the anykernel was the best way to construct a modern general purpose operating system kernel, I have been performing experiments by running unmodified NetBSD kernel drivers in rump kernels in various environments (nb. here "driver" does not mean a hardware device driver, but any driver, such as a file system driver or a TCP driver). These experiments have included the userspaces of various platforms, binary kernel modules on Linux and elsewhere, and compiling kernel drivers to JavaScript and running them natively in a web browser. I have also claimed that the anykernel allows harnessing drivers from a general purpose OS onto the more specialized embedded computing devices which are becoming the new norm. This is an attractive possibility because while writing drivers is easy, making them handle all the abnormal conditions of the real world is a time-consuming process. Since the above-mentioned experiments were done on POSIX platforms (yes, even the JavaScript one), they did not fully support the claim. The most interesting, decidedly non-POSIX platform I could think of for experimentation was the Linux kernel. Even though it had been several years since I last worked in the Linux kernel, my hypothesis was that it would be easy and fast to get unmodified NetBSD kernel drivers running in the Linux kernel as rump kernels.
A rump kernel runs on top of the rump kernel hypervisor. The hypervisor provides high-level interfaces to host features such as memory allocation and thread creation. In this case, the host is the Linux kernel. In principle, getting a rump kernel to run in a given environment takes three steps. In practice I prefer a more iterative approach, but the work can be divided into three steps all the same:
- implement generic rump kernel hypercalls, such as memory allocation, thread creation and synchronization (a minimal sketch follows this list)
- figure out how to compile and run the rump kernel plus hypervisor in the target environment
- implement I/O related hypercalls for whatever I/O you plan to do
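To make the first step concrete, below is a minimal sketch of how two generic hypercalls might be backed by Linux kernel primitives. The rumpuser_malloc()/rumpuser_free() names come from the rumpuser hypercall interface, but the exact signatures have varied between hypercall revisions, so treat the ones here as illustrative and check rumpuser(3) for the real thing; the Linux-side calls are plain kmalloc()/kfree().

```c
/*
 * Illustrative sketch only: generic hypercalls backed by Linux
 * kernel primitives.  Signatures approximate the rumpuser hypercall
 * interface of the era; consult rumpuser(3) for the authoritative ones.
 */
#include <linux/slab.h>
#include <linux/errno.h>
#include <linux/types.h>

int
rumpuser_malloc(size_t len, int alignment, void **memp)
{
	/* kmalloc guarantees only the arch minimum alignment; a real
	 * implementation must honor the requested alignment too */
	void *mem = kmalloc(len, GFP_KERNEL);

	if (mem == NULL)
		return ENOMEM;
	*memp = mem;
	return 0;
}

void
rumpuser_free(void *mem, size_t len)
{
	kfree(mem);
}
```

The thread and synchronization hypercalls map equally directly onto kthreads and mutexes, which is what makes the porting effort light.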
Getting basic functionality up and running was a relatively straightforward process. The only issue that required some thinking was an application binary interface (ABI) mismatch. I was testing on x86, where the Linux kernel ABI uses -mregparm=3, meaning that function arguments are passed in registers where possible. NetBSD always passes arguments on the stack. When the two ABIs collide, the code may run, but since function arguments passed between the two ABIs turn into garbage, eventually an error will be hit, perhaps in the form of an invalid memory access. The C code was easy enough to "fix" by applying the appropriate compiler flags. In addition to C code, a rump kernel uses a handful of assembly routines from NetBSD, mostly pertaining to optimizations (e.g. ffs()), but also to access the atomic memory operations of the platform. After the assembly routines had been handled, it was possible to load a Linux kernel module which bootstraps a rump kernel in the Linux kernel and performs some file system operations on the fictional kernfs file system. A screenshot of the resulting dmesg output is shown below.
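The following userspace illustration (hypothetical, not from the demo; build with gcc -m32 on x86) shows why mixed conventions misbehave: the callee expects its arguments in registers à la -mregparm=3, while the call site pushes them onto the stack.

```c
#include <stdio.h>

/* callee compiled with the register-passing convention, like a
 * Linux i386 kernel built with -mregparm=3 */
static int __attribute__((regparm(3)))
add3(int a, int b, int c)
{
	return a + b + c;
}

int
main(void)
{
	/* matched convention: the compiler sees the attribute */
	printf("matched ABI:    %d\n", add3(1, 2, 3));

	/* mismatched convention: the cast hides the attribute, so the
	 * arguments go onto the stack while add3() reads registers;
	 * this mimics stack-ABI code calling regparm code and prints
	 * garbage (it is, of course, undefined behavior) */
	int (*volatile f)(int, int, int) = (int (*)(int, int, int))add3;
	printf("mismatched ABI: %d\n", f(1, 2, 3));
	return 0;
}
```

The compiler-flag fix mentioned above amounts to making both sides agree on the convention.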
It is one thing to execute a computation and an entirely different thing to perform I/O. To test I/O capabilities, I ran a rump kernel providing a TCP/IP driver inside the Linux kernel. For a networking stack to be able to do anything sensible, the interface layer needs to be able to shuffle packets. The quickest way to implement the hypercalls for packet shuffling was to use the same method as a userspace virtual TCP/IP stack might use: read/write packets using the tap device. Some might say that doing this from inside the kernel is cheating, but given that the alternative was to copypaste the tuntap driver and edit it slightly, I call my approach constructive laziness.
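For the curious, here is a condensed sketch of what the tap-backed packet hypercalls might look like from inside a 2013-era Linux kernel. The function names are mine and error handling is trimmed; the set_fs() switch is needed because vfs_read() normally expects a userspace buffer.

```c
/* Hypothetical sketch of tap-based packet shuffling from kernel
 * context; not the actual hypervisor code. */
#include <linux/fs.h>
#include <linux/fcntl.h>
#include <linux/uaccess.h>
#include <linux/string.h>
#include <linux/err.h>
#include <linux/if.h>
#include <linux/if_tun.h>

static struct file *tapfile;

static int
tap_open(const char *ifname)
{
	struct ifreq ifr;
	mm_segment_t oldfs;
	long rv;

	tapfile = filp_open("/dev/net/tun", O_RDWR, 0);
	if (IS_ERR(tapfile))
		return PTR_ERR(tapfile);

	memset(&ifr, 0, sizeof(ifr));
	strlcpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name));
	ifr.ifr_flags = IFF_TAP | IFF_NO_PI;

	/* attach to the tap interface; the ioctl handler expects a
	 * user pointer, hence the set_fs() dance */
	oldfs = get_fs();
	set_fs(KERNEL_DS);
	rv = tapfile->f_op->unlocked_ioctl(tapfile, TUNSETIFF,
	    (unsigned long)&ifr);
	set_fs(oldfs);
	return (int)rv;
}

static ssize_t
tap_read_frame(void *buf, size_t len)
{
	mm_segment_t oldfs;
	loff_t off = 0;
	ssize_t rv;

	oldfs = get_fs();
	set_fs(KERNEL_DS);
	rv = vfs_read(tapfile, (char __user *)buf, len, &off);
	set_fs(oldfs);
	return rv;
}
```

Writing a frame is symmetric with vfs_write(), and the read side simply hands each frame to the rump kernel's interface layer.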
The demo itself opens a TCP socket to port 80 on vger.kernel.org (IP address 0x43b484d1 if you want to be really precise), does an HTTP GET for "/" and displays the last 500 bytes of the result. TCP/IP is handled by the rump kernel, not by the Linux kernel; think of it as the Linux kernel having two alternative TCP/IP stacks. Again, a screenshot of the resulting dmesg is shown below. Note that unlike in the first screenshot, there is no printout for the root file system, because the configuration used here does not include any file system support. Yes, you can ping 10.0.2.17.
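For reference, here is roughly what the socket side of such a demo looks like when expressed through the rump kernel's syscall interface. This userspace-flavored sketch is mine rather than the demo code, and it omits the interface creation and address configuration the TCP/IP stack needs before connecting (doable with the rump_pub_netconfig_*() routines).

```c
/* Illustrative sketch: an HTTP GET through a rump kernel TCP/IP
 * stack via the rump_sys_*() interface.  Not the demo code. */
#include <rump/rump.h>
#include <rump/rump_syscalls.h>
#include <rump/rumpdefs.h>	/* RUMP_AF_INET, RUMP_SOCK_STREAM */

#include <sys/socket.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* NetBSD's sockaddr_in layout, spelled out because the rump kernel
 * expects NetBSD structures (note the sin_len field), not the host's */
struct nb_sockaddr_in {
	uint8_t  sin_len;
	uint8_t  sin_family;
	uint16_t sin_port;
	uint32_t sin_addr;
	int8_t   sin_zero[8];
};

static const char getreq[] = "GET / HTTP/1.0\r\n\r\n";

int
fetch_root(uint32_t dstaddr)	/* dstaddr in network byte order */
{
	struct nb_sockaddr_in sin;
	char buf[512];
	ssize_t n;
	int s;

	rump_init();	/* bootstrap the rump kernel */

	s = rump_sys_socket(RUMP_AF_INET, RUMP_SOCK_STREAM, 0);
	if (s == -1)
		return -1;

	memset(&sin, 0, sizeof(sin));
	sin.sin_len = sizeof(sin);
	sin.sin_family = RUMP_AF_INET;
	sin.sin_port = htons(80);
	sin.sin_addr = dstaddr;

	if (rump_sys_connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		return -1;
	rump_sys_write(s, getreq, sizeof(getreq) - 1);
	while ((n = rump_sys_read(s, buf, sizeof(buf))) > 0)
		;	/* the demo displays the last 500 bytes */
	rump_sys_close(s);
	return 0;
}
```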
As hypothesized, a rump kernel hypervisor for the Linux kernel was easy and straightforward to implement. Furthermore, it could be done without making any changes to the existing hypercall interface, thereby reinforcing the belief that unmodified NetBSD kernel drivers can run on top of almost any embedded firmware just by implementing a light hypervisor layer.
There were no challenges in the experiment, only annoyances. Since Linux does not support rump kernels, I had to revert to the archaic full-OS approach to kernel development, with drawbacks such as suffering through multi-second reboot cycles during iterative development. The other, tangential issue I spent a disproportionate amount of time on was thinking about how releasing this code would affect existing NetBSD code due to GPL involvement. My conclusion was that it does not matter: all code used by the current demo is open source anyway, and if someone wants to use my code in a product, that is their problem, not mine.
For people interested in examining the implementation, I put the source code for the hypervisor along with the test code in a git repo here. The repository also contains the demos linked from this article. The NetBSD kernel drivers I used are available from ftp.netbsd.org or by getting buildrump.sh and running ./buildrump.sh checkout.