At XenSummit 2012 in San Diego, Mukesh Rathor from Oracle presented his work on a new virtualization mode, called “PVH”. Adding this mode, there are now a rather dizzying array of different terms thrown about — “HVM”, “PV”, “PVHVM”, “PVH” — what do they all mean? And why do we have so many?
The reason we have all these terms is that virtualization is no longer binary; there is a spectrum of virtualization, and the different terms are different points along the spectrum. Part of the reason the terminology is a little unclear is the history; any language and terminology evolves over time in response to the changing situation. However, changing the terminology after-the-fact once certain usages become common is difficult.
So in this series of articles I will introduce just enough history to understand how the current situation came about, and (hopefully) introduce a consistent set of terminology which may help clear things up, while balancing this against the fact that people will still continue to use existing terminology.
This article will be divided into two parts. The first part will give a general introduction to virtualization, and to paravirtualization, Xen’s unique contribution to the field, as well as the advent of hardware virtualization extensions (HVM). It will also introduce the idea of adding paravirtualized drivers for disk and network. The second mode will cover the motivation and technical descriptions of two more modes which further mix elements of full virtualization and paravirtualization.
In the early days of virtualization (at least in the x86 world), the assumption was that you needed your hypervisor to provide a virtual machine that was functionally nearly identical to a real machine. This included the following aspects:
- Disk and network devices
- Interrupts and timers
- Emulated platform: motherboard, device buses, BIOS
- “Legacy” boot: i.e., starting in 16-bit mode and bootstrapping up to 64-bit mode
- Privileged instructions
- Pagetables (memory access)
In the early days of x86 virtualization, all of this needed to be virtualized: disk and network devices needed to be emulated, as did interrupts and timers, the motherboard and PCI buses, and so on. Guests needed to start in 16-bit mode and run a BIOS which loaded the guest kernel, which (again) ran in 16-bit mode, then bootstrapped its way up to 32-bit mode, and possibly then to 64-bit mode. All privileged instructions executed by the guest kernel needed to be emulated somehow; and the pagetables needed to be emulated in software.
This is mode, where all of the aspects the virtual machine must be functionally identical to real hardware, is what I will call fully virtualized mode.
Xen and paravirtualization
Unfortunately, particularly for x86, virtualizing privileged instructions is very complicated. Many instructions for x86 behave differently in kernel and user mode without generating a trap, meaning that your options for running kernel code were to do full software emulation (incredibly slow) or binary translation (incredibly complicated, and still very slow).
The key question of the original Xen research project at Cambridge University was, “What if instead of trying to fool the guest kernel into thinking it’s running on real hardware, you just let the guest know that it was running in a virtual machine, and changed the interface you provide to make it easier to implement?” To answer that question, they started from the ground up designing a new interface designed for virtualization. Working together with researchers at both the Intel and Microsoft labs, they took both Linux and Windows XP, and ripped out anything that was slow or difficult to virtualize, replacing it with calls into the hypervisor (hypercalls) or other virtualization-friendly techniques. (The Windows XP port to Xen 1.0, as you might imagine, never left Microsoft Research; but it was benchmarked in the original paper.)