Category Archives: Technical

Xen Hypervisor to be Rewritten

The hypervisor team has come to the conclusion that using the C programming language, which is 45 years old as of writing, is not a good idea for the long-term success of the project.

C is, without doubt, riddled with quirks and undefined behaviours. Even the most experienced developers find this collection of powerful footguns difficult to use. We’re glad that the development of programming languages in the last decade has given us an abundance of better choices.

After a heated debate among committers, we’ve settled on picking two of the most popular languages on Hacker News to rewrite the Xen hypervisor project. Our winners are Rust and JavaScript.

Rust, although not old enough to drink, has attracted significant attention in recent years. The hypervisor maintainers have acquainted themselves with the ownership system, borrow checker, lifetimes and the cargo build system. We will soon start rewriting the x86 exception handler entry point, which has been a major source of security bugs in the past and looks like an easy starting point for the conversion to Rust.

JavaScript has been a cornerstone of web development since the early 2000s. With the advancement of React Native and Electron, plus the exemplary success of the Atom and Visual Studio Code editors, it now makes sense to start rebuilding the Xen hypervisor toolstack in JavaScript. We’re confident that Node.js will be of great help when it comes to performance. And we believe Node.js and the current libxenlight event model are a match made in heaven.

Due to the improved ergonomics of the two programming languages, we expect developer efficiency to be boosted by a factor of 10. We’re also quite optimistic that we can tap into the large talent pool of Rust and JavaScript developers and get significant help from them. We expect the rewrite to be finished and released within the year – by April 2018.

For those who want a more solid, tried and true technology, we are open to the idea of the toolstack middleware being written in PHP and the frontend in JavaScript. But since the maintainers are too busy playing with their shiny new toys, those who want PHP middleware will have to step up and help.

Stay tuned and get ready to embrace the most secure and easy-to-use Xen hypervisor ever, on April 1st 2018!

Note that this article was an April Fools’ joke and was entirely made up.

Xen on ARM interrupt latency

Xen on ARM is becoming more and more widespread in embedded environments. In these contexts, Xen is employed as a single solution to partition the system into multiple domains, fully isolated from each other, and with different levels of trust.

Every embedded scenario is different, but many require real-time guarantees. It comes down to interrupt latency: the hypervisor has to be able to deliver interrupts to virtual machines within a very small timeframe. The maximum tolerable lag changes on a case-by-case basis, but it should be in the realm of nanoseconds and microseconds, not milliseconds.

Xen on ARM meets these requirements in a few different ways. Firstly, Xen comes with a flexible scheduler architecture: it includes a set of virtual machine schedulers, including RTDS, a soft real-time scheduler, and ARINC653, a hard real-time scheduler. Users can pick the one that performs best for their use case. However, if they really care about latency, the best option is to have no scheduler at all and use a static assignment of virtual cpus to physical cpus instead. There is no automatic way to do that today, but it is quite easy to achieve with the vcpu-pin command:

Usage: xl vcpu-pin [domain-id] [vcpu-id] [pcpu-id]

For example, in a system with four physical cpus and two domains with two vcpus each, a user can get a static configuration with the following commands:

xl vcpu-pin 0 0 0
xl vcpu-pin 0 1 1
xl vcpu-pin 1 0 2
xl vcpu-pin 1 1 3

As a result, all vcpus are pinned to different physical cpus. In such a static configuration, the latency overhead introduced by Xen is minimal. Xen always configures interrupts to target the cpu that is running the virtual cpu that should receive the interrupt. Thus, the overhead is down to just the time that it takes to execute the code in Xen to handle the physical interrupt and inject the corresponding virtual interrupt to the vcpu.

For my measurements, I used a Xilinx Zynq Ultrascale+ MPSoC, an excellent board with four Cortex-A53 cores and a GICv2 interrupt controller. I installed Xen 4.9 unstable (changeset 55a04feaa1f8ab6ef7d723fbb1d39c6b96ad184a) and Linux 4.6.0 as Dom0. As a guest, I ran tbm, a tiny bare-metal application that programs timer events in the future and, after receiving them, checks the current time again to measure the latency. tbm uses the virtual timer for its measurements; however, the virtual timer interrupt is handled differently from all other interrupts in Xen. Thus, to make the results more generally applicable, I modified tbm to use the physical timer interrupt instead, and modified Xen to forward physical timer interrupts to guests.
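For readers who want to reproduce something similar, the core of the measurement is small. The following is a minimal sketch of the idea rather than tbm’s actual code: it programs the ARMv8 physical timer through its system registers and, in the interrupt handler, computes how late the interrupt arrived. All the names here are my own.

#include <stdint.h>

#define TICKS_AHEAD 10000ULL                /* fire 10000 ticks in the future */

static volatile uint64_t deadline;
static volatile uint64_t latency_ticks;

static inline uint64_t read_cntpct(void)
{
    uint64_t t;
    __asm__ __volatile__("mrs %0, cntpct_el0" : "=r"(t));
    return t;                               /* current physical counter value */
}

static void arm_timer(void)
{
    deadline = read_cntpct() + TICKS_AHEAD;
    __asm__ __volatile__("msr cntp_cval_el0, %0" :: "r"(deadline));
    __asm__ __volatile__("msr cntp_ctl_el0, %0" :: "r"(1ULL));  /* enable, unmasked */
}

/* Called from the IRQ vector on the physical timer interrupt. */
void timer_irq_handler(void)
{
    latency_ticks = read_cntpct() - deadline;                   /* delay in ticks */
    __asm__ __volatile__("msr cntp_ctl_el0, %0" :: "r"(0ULL));  /* disable until re-armed */
    /* nanoseconds = latency_ticks * 1000000000 / CNTFRQ_EL0 */
}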

Keeping in mind that the native interrupt latency is about 300ns on this board, these are the results on Xen in nanoseconds:

AVG  MIN  MAX  WARM_MAX
4850 4810 7030 4980

AVG is the average latency, MIN is the minimum, MAX is the maximum and WARM_MAX is the maximum latency observed after discarding the first few interrupts to warm the caches. For real-time considerations, the number to keep in mind is WARM_MAX, which is 5000ns (when using static vcpu assignments).

This excellent result is small enough for most use cases, including piloting a flying drone. However, it can be further improved by using the new vwfi Xen command line option. Specifically, when vcpus are statically assigned to physical cpus using vcpu-pin, it makes sense to pass vwfi=native to Xen: it tells the hypervisor not to trap the wfi and wfe instructions, which are the ARM instructions for sleeping. If no other vcpu can ever be scheduled on a given physical cpu, then we might as well let the guest put the cpu to sleep directly. Passing vwfi=native, the results are:

AVG  MIN  MAX  WARM_MAX
1850 1680 2650 1950

With this configuration, the latency is only about 2 microseconds, which is extremely close to the hardware limit and should be small enough for the vast majority of use cases. vwfi was introduced recently, but it has been backported to all the Xen stable trees.
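For reference, vwfi=native goes on the Xen command line. On an ARM board like this one, where Xen is typically booted with a device tree, that means appending it to the xen,xen-bootargs property of the chosen node; the other options in this fragment are just illustrative:

chosen {
    xen,xen-bootargs = "console=dtuart dtuart=serial0 dom0_mem=1G vwfi=native";
};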

In addition to vcpu pinning and vwfi, the third key parameter to reduce interrupt latency is unexpectedly simple: the DEBUG kconfig option in Xen. DEBUG is enabled by default in all cases except for releases. It adds many useful debug messages and checks, at the cost of increased latency. Make sure to disable it in production and when doing measurements.
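For source builds, the option lives in Xen’s kconfig under “Debugging Options” (menu names as of current trees; check your version). Something along these lines:

make -C xen menuconfig          # deselect "Debugging Options" -> "Developer Checks" (CONFIG_DEBUG)
grep CONFIG_DEBUG xen/.config   # expect: # CONFIG_DEBUG is not set
make -C xen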

Stealthy monitoring with Xen altp2m

One of the core features that differentiate Xen from other open-source hypervisors is its native support for stealthy and secure monitoring of guest internals (a.k.a. virtual machine introspection [1]). Xen 4.6, which was released last autumn, introduced several new features that make this subsystem better, a cleaned-up, optimized API and ARM support being just some of the biggest items on the list. As part of this release, a team from Intel also successfully added a new and unique feature that makes stealthy monitoring on Xen even better: altp2m. In this blog entry we will take a look at what it’s all about.

In Xen’s terminology, p2m stands for the memory management layer that handles the translation from guest physical memory to machine physical memory. This translation is critical for safely partitioning the real memory of the machine between Xen and the various VMs running, so as to ensure a VM can’t access the memory of another without permission. There are several implementations of this mechanism, including one with hardware support via Intel Extended Page Tables (EPT), available to HVM and PVH guests. In Xen’s terminology, this is called Hardware Assisted Paging (hap). In this implementation the hypervisor maintains a second pagetable, similar to the ones 64-bit operating systems use, dedicated to the p2m translation. All open-source hypervisors that use this hardware assisted paging method use a single EPT per virtual machine to handle this translation, as most of the time the memory of the guest is assigned at VM creation and doesn’t change much afterwards.

Xen altp2m is the first implementation that changes this setup by allowing Xen to create more than one EPT for each guest. Interestingly, Intel hardware has been capable of maintaining up to 512 EPT pointers in the VMCS since the Haswell generation of CPUs. However, no hypervisor made use of this capability until now. This changed in Xen 4.6, where up to 10 EPTs per guest can now be created. The primary reason for this feature is its use with the #VE and VMFUNC extensions.

It can also be used by external monitoring applications via the Xen vm_event system.
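Using it from the toolstack side first requires enabling altp2m for the guest. If memory serves, in Xen 4.6 this was a boolean in the xl domain configuration (the option has since been reworked, so check xl.cfg(5) for your version); a fragment along these lines:

builder = "hvm"
altp2mhvm = 1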

Why altp2m is a game-changer

Altp2m is a game-changer for applications performing purely external monitoring because it simplifies the monitoring of multi-vCPU guests. The EPT layer has been successfully used in stealthy monitoring applications to track the memory accesses made by the VM from a safe vantage point, by restricting the types of access the VM may perform on various memory pages. Since EPT permission violations trap into the hypervisor, the VM receives no indication that anything out of the ordinary has happened. While this method allows for stealthy tracing of the R/W/X memory accesses of the guest, the memory permissions need to be relaxed in order to allow the guest to continue execution. When a single EPT is shared across multiple running vCPUs, relaxing the permissions to allow one vCPU to continue may inadvertently allow another one to perform the very memory access we want to track. While such a race condition may rarely occur under normal circumstances, malicious code could easily use it to hide some of its actions from a monitoring application.

Solutions to this problem exist already. For example, we can pause all vCPUs while the one violating the access is single-stepped. This approach, however, introduces heavy overhead just to avoid a race condition that may rarely occur in practice. Alternatively, one could emulate the instruction that violated the EPT permission without relaxing the EPT access permissions, as Xen’s built-in emulator doesn’t use the EPT to access guest memory. This solution, while supported in Xen, is not ideal either, as Xen’s emulator is incomplete and known to have issues that can lead to guest instability [2]. Furthermore, over the years emulation has been a hotbed of security issues in many hypervisors (including Xen [3]), so building security tools on top of emulation is simply asking for trouble. It can be handy, but it should be used only when no other option is available.

Xen’s altp2m system changes this picture quite significantly. With multiple EPTs we can have different access permissions defined in each table, and these tables can easily be swapped around by changing the active EPT index in the VMCS. When the guest makes a monitored memory access, instead of having to relax the access permission, Xen can simply switch to an EPT (called a view) that allows the operation to continue. Afterwards, the permissive view can be switched back to the restricted view to continue monitoring. Since the switch is performed in each vCPU’s own VMCS, the monitoring can be performed per vCPU, without having to pause any of the other ones and without having to emulate the access. All without the guest noticing any of this switching at all. A truly simple and elegant solution.
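To make this more concrete, below is a rough sketch of how an external monitoring application can drive this machinery through libxc. The xc_altp2m_* call names are those found in xenctrl.h around Xen 4.6; error handling is omitted and the exact signatures should be checked against your headers.

#include <stdbool.h>
#include <xenctrl.h>

/* Revoke write access to one guest page in a dedicated altp2m view,
 * then make that view the active one. Error handling omitted. */
int monitor_page(xc_interface *xch, uint32_t domid, xen_pfn_t gfn)
{
    uint16_t view;

    xc_altp2m_set_domain_state(xch, domid, true);          /* enable altp2m */
    xc_altp2m_create_view(xch, domid, XENMEM_access_rwx, &view);
    xc_altp2m_set_mem_access(xch, domid, view, gfn, XENMEM_access_rx);
    xc_altp2m_switch_to_view(xch, domid, view);            /* activate the view */

    /* When a write to gfn is later reported through vm_event, the
     * response is to switch the faulting vCPU to the unrestricted
     * host view (index 0) for a single step, then flip back to
     * 'view': no permissions are relaxed and no vCPU is paused. */
    return 0;
}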

Other introspection methods for stealthy monitoring

EPT-based monitoring is not the only introspection technique used for stealthy monitoring. For example, the Xen-based DRAKVUF dynamic malware analysis system [4] uses it in combination with an additional technique to maximum effect. The main motivation is that EPT-based monitoring is known to introduce significant overhead, even with altp2m: the granularity of the monitoring is that of a memory page (4KB). For example, if the monitoring application is really just interested in when a function entry point is called, EPT-based monitoring creates a lot of “false” events whenever the rest of the code on that page is accessed.

This can be avoided by enabling the trapping of debug instructions into the hypervisor, a built-in feature of Intel CPUs that Xen exposes to third-party applications. DRAKVUF uses this method: it writes breakpoint instructions into the guest’s memory at code locations of interest. Since we only get an event for precisely the code location we are interested in, this method effectively reduces the overhead. The trade-off, however, is that unlike EPT permissions the breakpoints are visible to the guest. Thus, to hide the presence of the breakpoints, these pages are further protected by making them execute-only in the EPT. This allows DRAKVUF to remove the breakpoints before in-guest code-integrity checking mechanisms (like Windows Patchguard) can access the page. While with altp2m the EPT permissions can be safely used on multi-vCPU systems, breakpoints present a similar race condition: the breakpoint hit by one vCPU has to be removed to allow the guest to execute the instruction that was originally overwritten, potentially allowing another vCPU to execute it as well without notice.

Fortunately, altp2m has another neat feature that can be used to solve this problem. Besides allowing the memory permissions to differ in the different altp2m views, it also allows the mapping itself to be changed! The same guest physical memory can be set up to be backed by different pages in the different views. With this feature we can really think of guest physical memory as “virtual”: where it is mapped really depends on which view the vCPU is running on. This feature allows us to hide the presence of the breakpoints in a brand new way. First, we create a complete shadow copy of the memory page where a breakpoint is going to be written, and write the breakpoint only into this shadow copy. Then, using altp2m, we set up a view in which the guest physical page is mapped to the shadow copy. The guest continues to access its physical memory as before, but underneath it is now using the trapped shadow copy. When the breakpoint is hit, or if something tries to scan the code, we simply switch to the unaltered view for the duration of a single-step, then switch back to the trapped view. This allows us to hide the presence of the breakpoints separately for each vCPU, all without having to pause any of the other vCPUs and without having to emulate. The first open-source implementation of this tracing has already been merged into the DRAKVUF Malware Analysis System and is available as a reference implementation for those interested in more details.
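Sketched with the same libxc interface as before (and with the same caveat that signatures should be checked against xenctrl.h), the remapping is a single call. Here shadow_gfn is assumed to be a spare guest frame that we have already populated with a copy of the original page plus the breakpoint:

/* Same includes as the previous sketch. In 'view', accesses to
 * orig_gfn are now served from shadow_gfn instead. */
int hide_breakpoint(xc_interface *xch, uint32_t domid, uint16_t view,
                    xen_pfn_t orig_gfn, xen_pfn_t shadow_gfn)
{
    xc_altp2m_change_gfn(xch, domid, view, orig_gfn, shadow_gfn);

    /* Execute-only: a read (e.g. by an integrity checker) traps, and
     * we can switch to the unaltered view for one single-step before
     * switching back to the trapped view. */
    xc_altp2m_set_mem_access(xch, domid, view, orig_gfn, XENMEM_access_x);
    return 0;
}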

Conclusion

As we can see, Xen continues to be at the forefront of virtualization-based security development, allowing third-party tools to create some very exotic setups. This flexibility is what’s so great about Xen and why it will continue to be a trend-setter for the foreseeable future.

References

[1] Virtual Machine Introspection
[2] xen-devel@: Failed vm entry with heavy use of emulator
[3] Hardening Hypervisors Against VENOM-Style Attacks
[4] DRAKVUF Malware Analysis System (drakvuf.com)
[5] Stealthy, Hypervisor-based Malware Analysis (Presentation)

Xen Project Contributor Training v2

Two weeks ago, I embarked on a road trip to China with the aim of meeting Xen Project users as well as contributors. I visited a number of vendors in Hangzhou and Beijing on this trip. Part of the objective was to give training to new contributors and developers, and to strengthen existing relationships.

Hypervisor contributions from Chinese developers

A year ago I travelled to China and pioneered our developer training, because many of our Chinese developers had faced challenges working with the community. The good news is that the training activities have helped, which can be seen in the contribution statistics. This leads us to the “bad news”: a new group of developers has since joined the community who could also benefit from training. In addition, a lot of process and operational changes are currently being discussed or have recently taken place within our community.

What is remarkable is that many of the latest contributors to the project have only recently graduated from university (in 2014 or 2015). Working with the Xen Project and Linux was often their first experience with open source. Working with open source projects is not always easy, in particular when doing so in a non-native language and with a manager behind you who expects you to get a feature into an open source project by a certain time. In addition, as a community we need to balance the needs of different stakeholders (enterprise, cloud, embedded, security companies) and make informed decisions on the relative importance of new features vs. quality vs. security vs. …, which has led to increasingly strict criteria and more and more scrutiny when reviewing code contributions. This means that contributing to the project for the first time can sometimes feel like a real challenge. Part of the reason why I regularly travel to China is to explain what is happening in the community, to explain that all members of the community can influence and shape how the project is run, and to understand local community issues and address them as they occur.

Contributor Training v2

Since the creation of the training material last autumn, there have been a few changes in how the project operates, most notably in the Security Vulnerability Management Process and Release Management. Many other areas of how the project operates are also being reviewed and discussed. The goals behind these discussions and proposed changes are:

  • to make the community’s development processes more efficient and scalable.
  • to make conscious decisions about trade-offs, such as ease of feature contribution vs. quality and security.
  • to make it easier for newcomers to join the project.
  • to encourage more contributors to review other people’s code, test our software, write test code and make other non-code contributions to the project.

Thus, I updated our training material to reflect these changes and added new material. It is divided into 4 separate modules, each of which takes approximately 2.5 hours to deliver. The training decks are designed as reference material for self-study. Each training module has many examples and embedded links in it. The material is available from our Developer Intro Portal as slides or as PDFs, and I have embedded the updated and new training modules into this blog post for your convenience.

If you have any questions, feel free to ask by contacting me via community dot manager at xenproject dot org and I will improve the material based on feedback. My plan is to keep the training material up-to-date and to modify it as new questions and new challenges arise.

Will Docker Replace Virtual Machines?

Docker is certainly the most influential open source project of the moment. Why is Docker so successful? Is it going to replace Virtual Machines? Will there be a big switch? If so, when?

Let’s look at the past to understand the present and predict the future. Before virtual machines, system administrators used to provision physical boxes to their users. The process was cumbersome, not completely automated, and it took hours if not days. When something went wrong, they had to run to the server room to replace the physical box.

With the advent of virtual machines, DevOps teams could install a hypervisor on all their boxes, then simply provision new virtual machines upon request from their users. Provisioning a VM took minutes instead of hours and could be automated. The underlying hardware made less of a difference and was mostly commoditized. If one needed more resources, one would just create a new VM. If a physical machine broke, the admin just migrated or resumed her VMs on a different host.

Finer-grained deployment models became viable and convenient. Users were no longer forced to run all their applications on the same box to exploit the underlying hardware capabilities to the fullest. One could run a VM with the database, another with the middleware and a third with the webserver, without worrying about hardware utilization. The people buying the hardware and the people architecting the software stack could work independently in the same company, without interference. The new interface between the two teams had become the virtual machine. Solution architects could cheaply deploy each application on a different VM, reducing their maintenance costs significantly. Software engineers loved it. This might have been the biggest innovation introduced by hypervisors.

A few years passed and everybody in the business got accustomed to working with virtual machines. Startups don’t even buy server hardware anymore; they just shop on Amazon AWS. One virtual machine per application is the standard way to deploy software stacks.

Application deployment hadn’t changed much since the ’90s though. It still involved installing a Linux distro, mostly built for physical hardware, installing the required deb or rpm packages, and finally installing and configuring the application that one actually wanted to run.

In 2013 Docker came out with a simple, yet effective tool to create, distribute and deploy applications wrapped in a nice format to run in independent Linux containers. It comes with a registry that is like an app store for these applications, which I’ll call “cloud apps” for clarity. Deploying the Nginx webserver had just become one “docker pull nginx” away. This is much quicker and simpler than installing the latest Ubuntu LTS. Docker cloud apps come preconfigured and without any of the unnecessary packages that are unavoidably installed by Linux distros. In fact, the Nginx Docker cloud app is produced and distributed directly by the Nginx community, rather than by Canonical or Red Hat.
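For those who have not tried it, the whole deployment amounts to two commands (the port mapping below is just an example):

docker pull nginx
docker run -d -p 80:80 nginx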

Docker’s outstanding innovation is the introduction of a standard format for cloud applications, together with the registry. Instead of VMs, Linux containers are used to run cloud apps. Containers had been available for years, but they weren’t particularly popular outside Google and a few other circles. Although they offer very good performance, they have fewer features and weaker isolation than virtual machines. Docker, as a rising star, suddenly made Linux containers popular, but containers were not the reason behind Docker’s success: their use was incidental.

What is the problem with containers? Their live-migration support is still immature and they cannot run non-native workloads (Windows on Linux or Linux on Windows). But the primary challenge with containers is security: the attack surface is far larger than that of virtual machines. In fact, multi-tenant container deployments are strongly discouraged by Docker, CoreOS, and everybody else in the industry. With virtual machines you don’t have to worry about who is going to use them or how they will be used. With containers, on the other hand, only containers that belong to the same user should run on the same host. Amazon and Google offer container hosting, but they both run each container on top of a separate virtual machine for isolation and security. Maybe inefficient, but certainly simple and effective.

People are starting to notice this. At the beginning of the year a few high-profile projects launched to bring the benefits of virtual machines to Docker, in particular Clear Linux by Intel and Hyper. Both use conventional virtual machines to run Docker cloud applications directly (no Linux containers are involved). We did a few tests with Xen: tuning the hypervisor for this use case allowed us to reach the same startup times offered by Linux containers, while retaining all the other features. A similar effort by Intel for Xen is being presented at the Xen Developer Summit, and Hyper is also presenting their work.

This new direction has the potential to deliver the best of both worlds to our users: the convenience of Docker with the security of virtual machines. Soon Docker might not be fighting virtual machines at all; Docker could be the one deploying them.

A Chinese translation of the article is available here: http://dockone.io/article/598

On rump kernels and the Rumprun unikernel

The Rumprun unikernel, based on the driver components offered by rump kernels, provides a means to run existing POSIX applications as unikernels on Xen. This post explains how we got here (it matters!), what sort of things can be solved today, and also a bit of what is in store for the future. The assumption for this post is that you are already familiar with unikernels and their benefits, or at least checked out the above unikernel link, so we will skip a basic introduction to unikernels.

Pre-Xen history for rump kernels

The first line of code for rump kernels was written more than 8 years ago, in 2007, and the roots of the project can be traced to some years before that. The initial goal was to run unmodified NetBSD kernel drivers as userspace programs for testing and development purposes. Notably, in our terminology we use “driver” for any software component that acts as a protocol translator, e.g. a TCP/IP driver, a file system driver or a PCI NIC driver. Namely, the goal was to run the drivers in a minimal harness without the rest of the OS, so that the OS would not get in the way of development. The fact that most of the OS is not present also explains why the container the drivers run in is called a rump kernel. It did not take long to realize the additional potential of isolated, unmodified, production-quality drivers. “Pssst, want a portable, kernel-quality TCP/IP stack?” So, the goal of rump kernels was adjusted to providing portable, componentized drivers. Developing and testing NetBSD drivers as userspace programs was now one side effect enabled by that goal. Already in 2007 the first unikernel-like software stack built on rump kernels was sketched, using file system drivers in an mtools workalike (though truthfully it was not a unikernel, for reasons we can split hairs about). Later, in 2008, a proper implementation of that tool was done under the name fs-utils [Arnaud Ysmal].

The hard problem with running drivers in rump kernels was not figuring out how to make them work once; the hard problem was figuring out how to make the arrangement sustainable, so that you could simply pick any vintage of the OS source tree and use its drivers in rump kernels out of the box. It took about two weeks to make the first set of unmodified drivers run as rump kernels. It took four years, ca. 2007-2011, to figure out how to make things sustainable. During the process, the external dependencies on top of which rump kernels run were discovered to consist of a thread implementation, a memory allocator, and access to whatever I/O backends the drivers need. These requirements were codified into the rump kernel hypercall interface. Unnecessary dependencies on complications, such as interrupts and virtual memory, were explicitly avoided as part of the design process. It is not that supporting virtual memory, for example, was seen as impossible, but rather that the simplest form meant things would work the best and break the least. This post will not descend into the details or rationales of the internal architecture, so if you are interested in knowing more, have a look at book.rumpkernel.org.

In 2011, with rump kernels mostly figured out, I made the following prediction about them: “[…] can be seen as a gateway from current all-purpose operating systems to more specialized operating systems running on ASICs”. Since this is the Xen blog, we should unconventionally understand ASIC to stand for Application Specific Integrated Cloud.  The only remaining task was to make the prediction come true. In 2012-2013, I did some for-fun-and-hack-value work by making rump kernels run e.g. in a web browser and in the Linux kernel. Those experiments taught me a few more things about fitting rump kernels into other environments and confirmed that rump kernels could really run anywhere as long as one figured out build details and implemented the rump kernel hypercall interface.

Birth of the rump kernel-based unikernel

Now we get to the part where Xen enters the rump kernel story, and one might say it does so in a big way. A number of people suggested running rump kernels on top of Xen over the years. The intent was to build e.g. lightweight routers or firewalls as Xen guests, or anything else where most of the functionality was located in the kernel in traditional operating systems. At that time, there was no concept of offering userspace APIs on top of a rump kernel, just a syscall interface (yes, syscalls are drivers). The Xen hypervisor was a much lower-level entity than anything else rump kernels ran on back then. In summer 2013 I discovered Mini-OS, which provided essentially everything that rump kernels needed, and not too much extra stuff, so Xen support turned out to be more or less trivial. After announcing the result on the Xen lists, a number of people made the observation that a libc bolted on top of the rump kernel stack should just work; after all, rump kernels already offered the set of system calls expected by libc. Indeed, inspired by those remarks and after a few days of adventures with Makefiles and scripts, the ability to run unmodified POSIX-y software on top of the Xen hypervisor via the precursor of the Rumprun unikernel was born. Years of architectural effort on rump kernels had paid rich dividends.

So it was possible to run software. However, before you can run software, you have to build it for the target environment — obviously. Back in 2013, a convoluted process was required for building. The program that I used for testing during the libc-bolting development phase was netcat. That decision was mostly driven by the fact that netcat is typically built with cc netcat.c, so it was easy to adapt netcat’s build procedure. Hand-adapting more complex build systems was trickier. That limitation meant that the Rumprun unikernel was accessible only to people who had the know-how to adapt build systems and the time to do so — that set of people can be approximated as the empty set. What we wanted was for people to be able to deploy existing software as unikernels using the existing “make + config + run” skill set.

The first step in the above direction was creating toolchain wrappers for building applications on top of the Rumprun unikernel [Ian Jackson]. The second step was going over a set of pertinent real-world application programs so as to both verify that things really work, and also to create a set of master examples for common cases [Martin Lucina]. The third step was putting the existing examples into a nascent packaging system. The combined result is that anybody with a Xen-capable host is no more than a few documented commands away from deploying e.g. Nginx or PHP as unikernels. We are still in the process of making the details maximally flexible and user-friendly, but the end result works already. One noteworthy thing is that applications for the Rumprun unikernel are always cross-compiled. If you are an application author and wish to see your work run on top of the Rumprun unikernel, make sure your build system supports cross-compilation. For example, but not limited to, using standard GNU autotools will just work.
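To give a flavour of the workflow, building and booting an autotools-based application looks roughly like this. The command names follow the Rumprun documentation of the time and may have changed since, so treat this as illustrative rather than authoritative:

./configure --host=x86_64-rumprun-netbsd   # cross-compile via the wrapper toolchain
make
rumprun-bake xen_pv app.bin app            # bake the rump kernel components into the binary
rumprun xen -di app.bin                    # launch the result as a Xen guest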

Comparison with other unikernels

The goal of rump kernels was not to build a unikernel. It still is not. The mission of the rump kernel project is to provide reusable kernel-quality components which others may build upon. For example, the MirageOS project has already started work towards using rump kernels in this capacity [Martin Lucina]. We encourage any other project wishing to do the same to communicate with us especially if changes are needed. Everyone not having to reinvent the wheel is one thing; we are aiming for everyone not having to maintain the wheel.

So if the goal of the rump kernel project was not to build a unikernel, why are we doing one? At some point we simply noticed that we had the right components, and a basic unikernel built on top of rump kernels fell out in a matter of days. That said, and as indicated above, there has been and still is a lot of work to be done to provide the peripheral support infrastructure for unikernels. Since our components come unmodified from NetBSD, one might say that the Rumprun unikernel targets legacy applications. Of course, here “legacy” means “current reality”, even though I strongly believe that “legacy” will some day actually be legacy. But things change slowly. Again, due to unmodified NetBSD component reuse, we offer a POSIX-y API. Since there is no porting work which could introduce errors into the application runtime, libc or drivers, programs will not just superficially seem to work, they will actually work and be stable. In the programming language department, most languages with a POSIX-based runtime will also simply just work. In the name of the history aspect of this post, the first non-C language to run on top of rump kernels on Xen was LuaJIT [Justin Cormack].

The following figure illustrates the relationships of the concepts further. We have not discussed the anykernel, but for understanding the figure it is enough to know that the anykernel enables the use of unmodified kernel components from an existing OS kernel; it is not possible to use just any existing OS kernel to construct rump kernels (details at book.rumpkernel.org). Currently, NetBSD is the only anykernel in existence. The third set of boxes on the right is an example, and the Mirage + rump kernel amalgamation is another example of what could be depicted there.

Conclusions and future work

You can use rump kernels to deploy current-day software as unikernels on Xen. Those unikernels have a tendency to simply work, since we are using unmodified, non-ported drivers from an upstream OS. Experiments with running a reasonable number of varied programs as Rumprun unikernels confirm the previous statement. Once we figure out the final, stable usage of the full build-config-deploy chain, we will write a howto-oriented post here. Future posts will also be linked from the publications and talks page on the rump kernel wiki. Meanwhile, have a look at repo.rumpkernel.org/rumprun and the wiki tutorial section. If you want to help with figuring out e.g. the packaging system or launch tool usage, check the community page on the wiki for information on how to contribute.

There will be a number of talks around the Rumprun unikernel this month. At the Xen 2015 Developer Summit in Seattle, Wei Liu will be talking about Rump Kernel Based Upstream QEMU Stubdom and Martin Lucina will be talking about Deploying Real-World Software Today as Unikernels on Xen with Rumprun. Furthermore, at the co-located CloudOpen, Martin Lucina will be one of the panelists on the Unikernel vs. Container Panel Debate. At the Unikernel User Summit at Texas Linux Fest, Justin Cormack will present Get Started Using Rump Kernels.