
Xen on ARM and the Device Tree vs. ACPI debate

ACPI vs. Device Tree on ARM

Some of you may have seen the recent discussions on the linux-arm-kernel mailing list (and others) about the use of ACPI vs. DT on the ARM platform. As always, LWN has a pretty good summary (currently subscribers only; it becomes freely available on 5 December) of the situation with ACPI on ARM.

Device Tree (or DT) and the Advanced Configuration & Power Interface (or ACPI) are both standards used for describing a hardware platform, e.g. to an operating system kernel. At their core both technologies provide a tree-like data structure containing a hierarchy of devices, specifying what type each device is, together with a set of "bindings" for it. A binding is essentially a schema for specifying I/O regions, interrupt mappings, GPIOs, clocks and so on.
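To make the idea of nodes and bindings concrete: on a DT-based ARM Linux system you can decompile the live tree with dtc and look at individual nodes. The UART node below is purely illustrative (loosely modelled on an ARM PL011, with made-up addresses and values), but it shows the typical shape of a node and the properties its binding defines:

 # dtc -I fs -O dts /proc/device-tree | less
 ...
 serial@1c090000 {
         compatible = "arm,pl011", "arm,primecell";  /* which binding (and hence driver) applies */
         reg = <0x1c090000 0x1000>;                  /* I/O region: base address and size */
         interrupts = <0 5 4>;                       /* interrupt mapping */
         clocks = <&clk24mhz>, <&apb_pclk>;          /* clock inputs */
 };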

For the last few years Linux on ARM has been moving away from hardcoded "board files" (a bunch of C code for each platform) towards using Device Tree instead. In the ARM space ACPI is the new kid on the block and has many unknowns. Given this, the approach to ACPI which the Linux kernel maintainers appear to have reached, essentially to wait and see how the market pans out, seems sensible.

On the Xen side we started the port to ARM around the time that Linux's transition from board files to Device Tree was starting, and made the decision early on to go directly to Device Tree (ACPI wasn't even on the table at this point, at least not publicly). Xen uses DT to discover all of the hardware on the system, both that which it intends to use itself and that which it intends to pass to domain 0. As well as consuming DT itself, Xen also creates a filleted version of the host DT which it passes to the domain 0 kernel. DT is simple and yet powerful enough to allow us to do this relatively easily.

DT is also used by some of the BSD variants in their ARM ports.

My Position as Xen on ARM Maintainer

The platform configuration mechanism supported by Xen on ARM today is Device Tree. Device Tree is a good fit for our requirements and we will continue to support it as our primary hardware description mechanism.

Given that a number of operating system vendors and hardware vendors care about ACPI on ARM and are pushing hard for it, especially in the ARM server space, it is possible, perhaps even likely, that we will eventually find ourselves needing to support ACPI as well. On systems which support both ACPI and DT we will continue to prefer Device Tree. Once ARM hardware platforms that only support ACPI are available, we will obviously need to support ACPI.

The Xen Project works closely with the Linux kernel and other open source upstreams, as well as organisations such as Linaro. Before Xen on ARM can support ACPI I would like to see it gain some actual traction on ARM. In particular I would like to see it get to the point where it has been accepted by the Linux kernel maintainers. It is clearly not wise for Xen to be pioneering the use of ACPI before it becomes clear whether or not it is going to gain any traction in the wider ecosystem.

So if you are an ARM silicon or platform vendor and you care about virtualization and Xen in particular, I encourage you to provide a complete device tree for your platform.

Note that this only applies to Xen on ARM. I cannot speak for Xen on x86 but I think it is pretty clear that it will continue to support ACPI so long as it remains the dominant hardware description on that platform.

It should also be noted that ACPI on ARM is primarily a server space thing at this stage. Of course Xen and Linux are not just about servers: both projects have sizable communities of embedded vendors (on the Xen side we had several interesting presentations at the recent Xen Developer Summit on embedded uses of Xen on ARM). Essentially no one is suggesting that the embedded use cases should move from DT to ACPI and so, irrespective of what happens with ACPI, DT has a strong future on ARM.

ACPI and Type I Hypervisors

Our experience on x86 has shown that the ACPI model is not a good fit for Type I hypervisors such as Xen, and the same is true on ARM. ACPI essentially enforces a model where the hypervisor, the kernel, the OSPM (the ACPI term for the bit of an OS which speaks ACPI) and the device drivers all must reside in the same privileged entity. In other words it effectively mandates a single monolithic entity which controls everything about the system. This obviously precludes such things as dividing hardware into that which is owned and controlled by the hypervisor and that which is owned and controlled by a virtual machine such as dom0. This impedance mismatch is probably not insurmountable but experience with ACPI on x86 Xen suggests that the resulting architecture is not going to be very agreeable.


Due to their shared history on x86, ACPI and UEFI are often lumped together as a single thing, when in reality they are mostly independent. There is no reason why UEFI cannot also be used with Device Tree. We would expect Xen to support UEFI sooner rather than later.

RT-Xen: Real-Time Virtualization in Xen

The researchers at Washington University in St. Louis and the University of Pennsylvania are pleased to announce, here on this blog, the release of a new and greatly improved version of the RT-Xen project. Recent years have seen increasing demand for supporting real-time systems in virtualized environments (for example, the Xen-ARM projects and several other real-time enhancements to Xen), as virtualization enables greater flexibility and reduces cost, weight and energy by breaking the correspondence between logical systems and physical systems. As an example of this, check out the video below from the 2013 Xen Project Developer Summit.

The video describes how Xen could be used in an in-vehicle infotainment system.

In order to combine real-time and virtualization, a formally defined real-time scheduler at the hypervisor level is needed to provide timing guarantees to the guest virtual machines. RT-Xen bridges the gap between real-time scheduling theory and virtualization technology by providing a suite of multi-core real-time schedulers that deliver real-time performance to domains running on the Xen hypervisor.

Background: Scheduling in Xen

In Xen, each core of a domain is abstracted as a Virtual CPU (VCPU), and the hypervisor scheduler is responsible for scheduling VCPUs. For example, the default credit scheduler assigns a weight to each domain, which decides the proportional share of CPU cycles that the domain gets. The credit scheduler works great for general-purpose computing, but it is not suitable for real-time applications, for the following reasons:

  1. There is no reservation with the credit scheduler. For example, when two VCPUs run on a 2 GHz physical core, each gets 1 GHz. However, if another VCPU also boots on the same PCPU, the resource share shrinks to 0.66 GHz each. The system manager has to carefully configure the number of VMs/VCPUs to ensure that each domain gets an appropriate amount of CPU resource (see the sketch after this list);
  2. There is little timing predictability or real-time performance provided to the VM. If a VM is running a real-time workload (video decoding, voice processing, feedback control loops) that is periodically triggered and has a timing requirement (for example, the VM must be scheduled every 10 ms to process the data), there is no way for the VM to express this information to the underlying VMM scheduler. The existing SEDF scheduler can help with this, but it has poor support for multi-core.
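To illustrate point 1: all the credit scheduler lets you tune is a relative weight (plus an optional cap), so a domain's actual share still depends on how many other VCPUs happen to be runnable. A minimal sketch with xl, assuming a hypothetical domain called fedora20:

 # xl sched-credit -d fedora20          # show the domain's current weight and cap
 # xl sched-credit -d fedora20 -w 512   # twice the default weight (256): a bigger relative
                                        # share, but still no guaranteed amount of CPU time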

RT-Xen: Combining real-time and virtualization

RT-Xen aims to solve this problem by providing a suite of real-time schedulers. Users can specify (budget, period, CPU mask) for each VCPU individually. The budget represents the maximum CPU resource a VCPU will get during a period; the period represents the timing quantum of the CPU resources provided to the VCPU; the CPU mask defines the subset of physical cores a VCPU is allowed to run on. For each VCPU, the budget is reset at each starting point of the period (all in milliseconds), consumed while the VCPU is executing, and deferred when the VCPU has budget but no work to do.

Within each scheduler, users can switch between different priority schemes: earliest deadline first (EDF), where the VCPU with the earlier deadline has higher priority, or rate monotonic (RM), where the VCPU with the shorter period has higher priority. As a result, not only does each VCPU get a resource reservation (budget/period), it also gets explicit timing information for its CPU resources (the period). The real-time schedulers in RT-Xen deliver the desired real-time performance to the VMs based on these resource reservations.
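For readers who want to try this model themselves: the RT-Xen schedulers are being integrated into mainline Xen as the RTDS scheduler (see the last section below), and once that is available the same per-VCPU budget and period can be set with xl. A rough sketch, assuming a hypervisor booted with sched=rtds and a hypothetical domain name:

 # xl sched-rtds -d fedora20                    # show per-VCPU period and budget (in microseconds)
 # xl sched-rtds -d fedora20 -p 10000 -b 4000   # reserve 4 ms of CPU time in every 10 ms window,
                                                # i.e. 40% of one physical core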

To be more specific, the two multi-core schedulers in RT-Xen are:

  • RT-global: which uses a global run queue to hold all VCPUs (in runnable state). It is CPU-mask aware, and provides better resource utilization, as VCPUs can migrate freely between physical cores (within their CPU masks).
  • RT-partition: which uses a run queue per physical CPU. In this way, each physical CPU only looks at its own run queue to make scheduling decisions, which incurs less overhead and potentially yields better cache performance. However, load balancing between physical cores is not provided in the current release.

Source Code and References

The developers of RT-Xen are looking closely at how to integrate both schedulers into mainline Xen. In the meantime, please check out the publications [EMSOFT’14], [EMSOFT’11], [RTAS’12] and the source code.

Fedora 20 Virtualization Test Day Report

So, it was Fedora Virtualization Test Day last Tuesday and I actually went down and took the occasion to do some good testing of Xen on the next Fedora release (Fedora 20, codename Heisenbug). Fedora is going to ship Xen 4.3 (and there are not many other mainstream distributions doing that), so it is very important to try to make sure it will be as good as possible for Fedora users!

A lot of information on how to participate in such an event (well, how you should have… but that'll be for next time ;-P) is available on our Wiki. What I am up to, here, is reporting how some of the tests I did that day went. Hopefully, this will give an idea of where we stand regarding the integration of Xen in Fedora, as well as how well Xen itself works with Fedora's default virtualization toolstack, i.e., libvirt.

Setting up the testing environment

Well, you at least need a Fedora 20 installation, in order to test Xen on Fedora 20. For the details, have a look at the already mentioned wiki page. Here I’m only going to say that I decided to go for a PXE-boot based install, which I did by downloading the following files:

 $ wget
 $ wget

and by preparing an appropriate entry in my PXE server configuration (usually a file under the pxelinux.cfg/ directory):

label fedora-20btc1-amd64-s
    KERNEL fedora/20/x86_64/Beta-TC2/vmlinuz
    APPEND initrd=fedora/20/x86_64/Beta-TC2/initrd.img repo= console=ttyS0,115200n8 text serial

[Screenshot from 2013-10-08 15_19_37]

Mind the console=ttyS0,115200n8 text serial in case you want to run the install on a serial console, like I’m doing in this case.

On a second test box, I did a proper graphical install (still via PXE). No big difference, really: just follow the guided procedure, then grab a coffee and wait for this screen (on the right) to appear.

Installing Xen and rebooting into Dom0

After finishing installing the host, we need to install Xen, libvirt and some libvirt-related tools. It's all described in this other Wiki page, so let's skip the details here…
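Very roughly, it boils down to something like the following; the exact package set here is an assumption of mine, so do check the wiki page for the authoritative list:

 # yum install xen libvirt virt-install virt-manager
 # systemctl enable libvirtd.service
 # reboot

Installing the xen package should add a new hypervisor entry to the GRUB menu; pick that one when rebooting, and you should land into the following: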

Fedora release 20 (Heisenbug)
Kernel 3.11.2-301.fc20.x86_64 on an x86_64 (hvc0)

odyn login: root

# cat /etc/fedora-release 
Fedora release 20 (Heisenbug)
# sudo uname -a
Linux 3.11.3-301.fc20.x86_64 #1 SMP Thu Oct 3 00:57:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
# virt-what 

Well, I guess we can make a note of the first important fact of the Test Day:

  • Fedora 20 works quite nicely and straightforwardly as Xen Dom0!

Creating guests

Ok, let's move forward to creating some guests. In Fedora, when you want to create and install a guest, especially a Fedora guest, you do it with virt-install. Period. So, let's do that: a Fedora 20 PV guest, on a Fedora 20 Dom0, with virt-install, installed from the serial console too. It's actually more easily done than said:

 # lvcreate -nf20_64 -L10G /dev/fedora_odyn
 # virt-install --paravirt --name f20_64 --ram 2048 --vcpus 4 -f /dev/fedora_odyn/f20_64 --network bridge=virbr0 --location --graphics none

 1) [x] Installation source
        (b/alt/stage/20-Beta-TC1/Fedora/x86_64/os/)
 2) [!] Timezone settings
        (Timezone is not set.)
 3) [!] Install Destination
        (No disks selected)
 4) [!] Set root password
        (Password is not set.)
 5) [!] Create user
        (No user will be created)
 6) [!] Software selection
        (GNOME Desktop)
 7) [x] Network settings
        (Wired (eth0) connected)
  Please make your choice from above ['q' to quit | 'c' to continue |
  'r' to refresh]: 
[anaconda] 1:main* 2:shell  3:log  4:storage-log  5:program-log

Let's now head to my second test box, and do something similar, which leads us to the point this screenshot shows:

[Screenshot from 2013-10-08 16_50_31]

I also created another PV guest and an HVM guest there, with similar procedures. From all this, we can reasonably assess the following:

  • Fedora 20 works fine both as a PV and HVM Xen guest.

Playing with the guests with virsh

Now, what about seeing what we have running:

# virsh list
 Id    Name                           State
 39    F20-HVM                        running
 40    fedora20                       running

Pausing and resuming both the PV and HVM guests:

# virsh suspend F20-HVM
Domain F20-HVM suspended
# virsh suspend fedora20
Domain fedora20 suspended
# virsh list
 Id    Name                           State
 39    F20-HVM                        paused
 40    fedora20                       paused

# virsh resume F20-HVM
Domain F20-HVM resumed
# virsh resume fedora20
Domain fedora20 resumed
# virsh list
 Id    Name                           State
 39    F20-HVM                        running
 40    fedora20                       running

And, finally, saving & restoring one of them:

# virsh save fedora20 /tmp/savefile
Domain fedora20 saved to /tmp/savefile
# virsh list
 Id    Name                           State
 39    F20-HVM                        running

# virsh restore /tmp/savefile
Domain restored from /tmp/savefile
# virsh list
 Id    Name                           State
 39    F20-HVM                        running
 41    fedora20                       running

I also tried importing and cloning a VM, as described here and here, and it all worked.
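For reference, the commands involved look more or less like this (the domain name and disk path are just placeholders, see the linked pages for the actual procedure):

 # virt-clone -o fedora20 -n fedora20-clone --auto-clone
 # virt-install --paravirt --name imported --ram 1024 \
     --disk /dev/fedora_odyn/some_existing_lv --import --graphics none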

Any issues, then? There indeed was one. Basically, it looks like reaching the guest’s PV console via virsh does not work, while it is fine with xl console:

# virsh console fedora20
Connected to domain fedora20
Escape character is ^]
error: internal error: cannot find character device (null)

# xl console fedora20
Fedora release 20 (Heisenbug)
Kernel 3.11.2-301.fc20.x86_64 on an x86_64 (hvc0)

fedora20 login: root
[root@fedora20 ~]#

And I will of course report that to the appropriate mailing list/bugzilla.

What’s there, what’s missing

The previous section shows not only that Xen is straightforward to install and works quite well on Fedora 20 as a Dom0, but also that Fedora 20 works quite well as a Xen PV or HVM guest. It also shows how the basic VM lifecycle of a Xen guest, in Fedora 20, can be handled nicely enough with libvirt and the related tools (virt-install, virt-manager, virt-viewer, etc.). That of course does not exclude the possibility of using Xen's default command line toolstack (xl).

The only two relevant missing features, at the time of writing, in the libvirt libxl driver are:

  • PCI Passthrough
  • live migration

Yes, big ones, I know. However, consider the following:

  1. that does not mean that PCI Passthrough and migration do not work on Xen on Fedora at all. They do work via the xl toolstack (a quick sketch follows this list); they are just not available via libvirt;
  2. this is going to be solved soon, as the libvirt libxl driver maintainer Jim Fehlig reported recently on xen-devel. In fact, this is the patch series for PCI Passthrough, and this is the patch series implementing live migration, and there are pretty good chances that both these series make it into libvirt before Xen 4.4 release time (so, not in time for Fedora 20, but still not bad at all).
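For completeness, this is roughly what those two operations look like with plain xl; the PCI device address and destination host are of course just placeholders:

 # xl pci-assignable-add 0000:01:00.0    # make the PCI device assignable to guests
 # xl pci-attach fedora20 0000:01:00.0   # hot-plug it into the running domain
 # xl migrate fedora20 otherhost         # live migrate the domain to another Xen host (over ssh)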

So, stay tuned since, as Jim says, “Slowly, with each libvirt release, the libxl driver is improving”.


Fedora really does a great job with these Test Days. All of it: the planning, the managing, the reporting… An example that many other projects should look at and try to follow (and actually, the Xen Project is trying, as we have also started having Xen Test Days).

Participating in the latest Fedora Virtualization Test Day has been really nice, although we need to do a better job of convincing more Xen folks to be there and do some Xen-specific testing. Anyway, I am really glad to have had the chance to verify how well Xen 4.3 will work on Fedora 20.

It is actually quite important that we get a good Xen on Fedora test coverage, at least as far as running the next release of Fedora as a Xen DomU is concerned. In fact, being a functional Xen guest is one of the release blockers for a Fedora release, as in, if release X does not work as a Xen guest, it can’t be released!

Since testing Xen on Fedora is, for the most part, testing Xen integration with libvirt, what about producing some libvirt test cases for OSSTest? That would be very cool, and we are already working on it. Another interesting thing would be to have OSSTest try to build and run Xen on various distros (as the host), instead of using only Debian, as it does right now. This is a bit trickier than the above, but we are thinking about how to do that too (standalone mode could, perhaps, help).

Debconf 13

I’ve recently returned from Debconf 13, in Vaumarcus in Switzerland. My colleague Ian Campbell joined me there.

Debconf is the annual conference for contributors to Debian, with a few hundred attendees. There's a fairly standard conference format with a programme of talks and BoF sessions, but the best part of a Debconf is usually the ad-hoc conversations with other developers. Often thorny design problems involving multiple parts of the system can be tackled much more effectively in person, so there's quite a bit of vigorous handwaving and the odd whiteboard/flipchart session.

We had an excellent time and spent rather too much of it staring at the amazing view of Lake Neuchatel. Debian’s 20th birthday party was not to be missed either.

This year's Debconf saw a substantial offering of cloudy topics on the schedule. One major theme was the ways in which Debian are working on better integration with the big public clouds, for example by providing ready-to-use images and by better packaging of cloud-related software.

Of particular interest for Xen was Thomas Goirand’s talk on the integration between OpenStack’s various components. OpenStack is a complicated piece of software which has been difficult to install and get running. Thomas, who runs a Xen-based public cloud provider, has been working to make the installation process smoother using Debian’s configuration management systems.

For me, an interesting topic was the continuing difficulty of integration between the Debian archive (Debian’s primary software repository) and git, and after a session in the bar with Joey Hess and others I wrote a tool to help with that.

Debconf is always a highlight of my year and I look forward to next year’s in Portland.

Schrödinger's Cat in a (Xen) Virtualized 'Box'


Yes, apparently Schrödinger's cat is alive, as the latest release of Fedora (Fedora 19, codename Schrödinger's Cat) has been released on July 2nd, and that even happened quite on time.

So, apparently, putting the cat "in a box" and all that stuff was way too easy, and that's why we are bringing the challenge to the next level: do you dare put Schrödinger's cat "in a virtual box"?

In other words, do you dare install Fedora 19 within a Xen virtual machine? And if yes, how about doing that using Fedora 19 itself as Dom0?

Xen 4.3.0 RC6 TestDay on Friday, June 28

Xen 4.3.0 time is approaching and, to make sure we’re delivering the best possible release, we are having another Xen TestDay on Friday, June 28 2013. (RSVP and iCal here).

We will be testing Xen 4.3.0-RC6, which will be tagged on Thursday. It will ship two really important changes (as compared to RC5), about PCI passthrough and CPU hotplug. Help us make sure there are no issues left, both in those two areas specifically and in general!

In fact, about the former, we've had to change the way Xen handles some aspects of PCI passthrough, to work around an issue with qemu-xen. We think we've got everything right, but please test your own configuration to make sure that it still works for you. We particularly need graphics cards with large amounts of video RAM tested. About the latter, CPU hotplug support was missing in qemu-xen, and it has now been implemented, so go ahead and test it (CPU hot-unplug is still not supported, though).
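In case you are wondering what exercising those two areas looks like in practice, here is a minimal sketch; the device address, domain name and VCPU count are of course just placeholders:

 # grep pci /etc/xen/myguest.cfg
 pci = [ '0000:01:00.0' ]     # PCI passthrough: the device assigned in the guest config
 # xl vcpu-set myguest 4      # CPU hotplug: bring the running HVM guest up to 4 VCPUs
                              # (assuming maxvcpus in its config allows that many)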

We will announce on the xen-devel (and other relevant) mailing lists when RC6 becomes available. In the meanwhile, here are the Xen 4.3 RC6 test instructions, while more information about Xen TestDays is available here.

Join us on Friday on the #xentest channel on freenode!

If nothing relevant comes up during the TestDay, the plan is to have the release next week, probably on July 2nd.