If you are upgrading the domain 0 (dom0) Linux kernel from a non-pvops (classic, 2.6.18/2.6.32/etc.) kernel to a pvops one (3.0 or later), you may find that the amount of free memory inside dom0 has decreased significantly. This is because of changes in the way the kernel handles the memory given to it by Xen. With some updates and configuration changes, the “lost” memory can be recovered.
tl;dr: If you previously had ‘dom0_mem=2G’ as a command line option to Xen, change this to ‘dom0_mem=2G,max:2G’. If that didn’t help, read on.
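In practice the change goes into your boot loader configuration. A minimal sketch, assuming a Debian/Ubuntu-style GRUB 2 setup where the Xen hypervisor command line is set via `GRUB_CMDLINE_XEN_DEFAULT` (the variable name and regeneration command vary by distribution):

```shell
# /etc/default/grub -- variable name is distribution-specific (Debian/Ubuntu shown).
# Give dom0 2G and also cap it at 2G, so the kernel does not reserve page
# structures for memory it will never be handed:
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=2G,max:2G"

# Then regenerate the boot configuration and reboot, e.g. on Debian/Ubuntu:
# update-grub
```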
NUMA? What’s NUMA?
Having to deal with a Non-Uniform Memory Access (NUMA) machine is becoming more and more common. This is true whether you are part of an engineering research center with access to one of the first Intel SCC-based machines, a virtual machine hosting provider with a bunch of dual 2376 AMD Opteron pieces of iron in your server farm, or even just a Xen developer using a dual-socket E5620 Xeon based test-box (any reference to the recent experiences of the author of this post is purely accidental :-D).

Very quickly: NUMA means that the memory access times of a program running on a CPU depend on the relative distance between that CPU and that memory. In fact, most NUMA systems are built in such a way that each processor has its own local memory, on which it can operate very fast, while fetching and storing data from and to remote memory (that is, memory local to some other processor) is considerably slower and more complex. Therefore, while hardware engineers bump their heads against cache coherency protocols and routing strategies for the system buses of such machines, the most urgent issue for us, OS and hypervisor developers, is the following: how can we couple scheduling and memory management so that most of the accesses by most of our tasks/VMs stay local?
A quick round-up of Xen events in May and an update on Xen Summit in August: for more information see the Xen Events page.
Xen @ Ubuntu Developer Summit, May 7-11, Oakland, CA
The Xen and XCP teams will be participating in the Ubuntu Developer Summit in Oakland, CA. The agenda for technical discussions has not yet been finalized: when it is, we will let you know.
Xen @ Build a Cloud Day, May 10, San Francisco, CA
Learn how to build an open source cloud with CloudStack, RightScale, Ubuntu, Xen and Zenoss at the free Build a Cloud Day. Build a Cloud Days are about learning, best practices and industry insights into building elastic, scalable and profitable open source clouds. The event is held in conjunction with Citrix Synergy 2012 in San Francisco, and as a bonus you will get a free pass to the Synergy Cloud Keynotes and Solutions Expo.
Among the more distinctive features of Xen 4.2 is cpupools, designed and implemented by Jürgen Groß at Fujitsu. At its core it is a simple idea, but one that makes for a flexible and powerful solution to a number of different problems.
The core idea behind cpupools is to divide the physical cores on the machine into different pools. Each of these pools has an entirely separate cpu scheduler, and can be set with different scheduling parameters. At any time, a given logical cpu can be assigned to only one of these pools (or none). A VM is assigned to one pool at a time, but can be moved from pool to pool.
There are a number of things one can do with this functionality. Suppose you are a hosting or cloud provider, and you have a number of customers who have multiple VMs with you. Instead of selling based on CPU metering, you want to sell access to a fixed number of cpus for all of their VMs: e.g. a customer with 6 single-vcpu VMs might buy 2 cores’ worth of computing capacity which all of the VMs share.
You could solve this problem by using cpu masks to pin all of the customer’s vcpus to a single set of cores. However, cpu masks do not work well with the scheduler’s weight algorithm: the customer won’t be able to specify that VM A should get twice the cpu of VM B. Solving the weight issue in a general way is very difficult, since VMs can have any combination of overlapping cpu masks. Furthermore, this extra complication would be there for all users of the credit algorithm, regardless of whether they use this particular mode or not.
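With cpupools, the same scenario can be sketched with the xl toolstack roughly as follows. The pool name, VM names, cpu numbers and weights below are all made up for illustration:

```shell
# Define a pool with its own credit scheduler, holding physical cpus 2 and 3.
cat > customer-a-pool.cfg <<'EOF'
name  = "customer-a"
sched = "credit"
cpus  = ["2", "3"]
EOF
xl cpupool-create customer-a-pool.cfg

# Move the customer's VMs into the pool; from now on they compete only
# for the pool's two cpus, not the whole machine.
xl cpupool-migrate vm-a customer-a
xl cpupool-migrate vm-b customer-a

# Within the pool, ordinary credit weights work as expected:
# vm-a now gets roughly twice the cpu time of vm-b (default weight is 256).
xl sched-credit -d vm-a -w 512
```

Because each pool runs a completely separate scheduler instance, the weight calculation only ever sees the VMs inside the pool, which is what sidesteps the overlapping-cpu-mask problem described above.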
We have another Xen Document Day coming up next Monday. Xen Document Days are for people who care about Xen documentation and want to improve it. We introduced Documentation Days because working on documentation in parallel with like-minded people is just more fun than working alone! Everybody who can contribute is welcome to join!
For a list of items that need work, check out the community maintained TODO and wishlist. Of course, you can work on anything you like: the list just provides suggestions.
On March 18th, Linux 3.3 was released, and it includes a number of interesting Xen-related features.
- Re-engineered how tools perform hypercalls, using a standard interface (/dev/xen/privcmd instead of /proc/xen/privcmd)
- Backends (netback, blkback) can now function in HVM mode. This means that a device driver domain can be in charge of a device (say, a network card) and run the corresponding backend (netback) for it. What is exciting about this is that it allows security by isolation: if one domain is compromised, it does not affect the other domains. Both Qubes and the NSA Research Center have been focusing on this functionality, and it is exciting to see components of this goal taking shape!
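The privcmd change above is easy to observe from userspace: a toolstack simply prefers the new device node and falls back to the legacy proc entry. A minimal sketch; the `find_privcmd` helper and its prefix argument are made up here purely so the lookup order can be exercised:

```shell
# Print the privcmd node a toolstack would open, preferring the new
# standard device (/dev/xen/privcmd) over the legacy /proc/xen/privcmd.
# The optional prefix argument exists only so this can be tested
# against a scratch directory tree.
find_privcmd() {
    prefix=${1:-}
    for node in "$prefix/dev/xen/privcmd" "$prefix/proc/xen/privcmd"; do
        if [ -e "$node" ]; then
            echo "$node"
            return 0
        fi
    done
    return 1
}

find_privcmd || echo "no privcmd node found (not running as dom0?)"
```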