Monthly Archives: June 2014

The Docker exploit and the security of containers

We normally only cover news and information directly related to Xen in this channel, but we thought it might be useful to briefly expand our scope a bit to mention the recent discussion about the Docker security exploit.

What’s the news?

Well to begin with, a few weeks ago Docker 1.0 was released, just in time for DockerCon.

Then last week, with timing that seems rather spiteful, someone released an exploit that allows a process running as root within a Docker container to break out of a Docker container.

I say it was a bit spiteful, because as the post-mortem on Docker’s site demonstrates, it exploits a vulnerability which was only present until Docker 0.11; it was fixed in Docker 0.12, which eventually became Docker 1.0.

Nonetheless, this kicked off a bit of a discussion about Docker, Linux containers, and security in several places, including Hacker News, the Register, seclists.org, and the OSv blog.

There is a lot of interesting discussion there. I recommend skimming through the Hacker News discussion in particular, if you have the time. But I think the comment by Solomon Hykes of Docker on the Hacker News discussion puts it best:

As others already indicated this doesn’t work on 1.0. But it could have. Please remember that at this time, we don’t claim Docker out-of-the-box is suitable for containing untrusted programs with root privileges. So if you’re thinking “pfew, good thing we upgraded to 1.0 or we were toast”, you need to change your underlying configuration now. Add apparmor or selinux containment, map trust groups to separate machines, or ideally don’t grant root access to the application.

I’ve played around with Docker a little bit, and it seems like an excellent tool for packaging and deploying applications. Using containers to separate an application from the rest of the user-space of your distribution, exposing it only to the very forward-compatible Linux API is a really clever idea.

However, using containers for security isolation is not a good idea. In a blog last August, one of Docker’s engineers expressed optimism that containers would eventually catch up to virtual machines from a security standpoint. But in a presentation given in January, the same engineer said that the only way to have real isolation with Docker was to either run one Docker per host, or one Docker per VM. (Or, as Solomon Hykes says here, to use Dockers that trust each other in the same host or the same VM.)

We would concur with that assessment.

Release management: Risk, intuition, and freeze exceptions

I’ve been release coordinator for Xen’s 4.3 and 4.4 releases. For the 4.5 release, I’ve handed this role off to Konrad Wilk, from Oracle. In this blog, I try to capture some of my thoughts and experience about one aspect of release management: deciding what patches to accept during a freeze.

I have three goals when doing release management:

  1. A bug-free release
  2. An awesome release
  3. An on-time release

One of the most time-consuming periods for a release manager is any kind of freeze. You can read in detail about our release process elsewhere; I’ll just summarize it here. During normal development, any patch which has the approval of the relevant maintainer can be accepted. As the release approaches, however, we want to be more and more conservative in what patches we accept.

Obviously, no patch would ever be considered for acceptance unless it improved Xen in some way, making it more awesome. However, it’s a fact of software development that any change, no matter how simple or obvious, may introduce a bug. If this bug is discovered before the release, it may delay the release, making it not on-time; or it may not be found until after the release, making the release not bug-free. The job of helping decide whether to take a patch falls to the release coordinator.

But how do you actually make decisions? There are two general things to say about this.

Risk

The only simple rule to follow that will make sure that there are no new bugs introduced is to do no development at all. Since we must do development, we have to learn to deal with risk.

Making decisions about accepting or rejecting patches as release coordinator is about making calculated risks: look at the benefits, look at the potential costs, look at the probabilities, and try to balance them the best you can.

Part of making calculated risks is accepting that sometimes your gamble won’t pay off. You may approve a patch to go in, and it will then turn out to have a bug in it which delays the release. This is not necessarily a failure: if you can look back at your decision and say with honesty, “That was the best decision I could have made given what I knew at the time”, then your choice was the right one.

The extreme example of this kind of thinking is that of a poker player: a poker player may make a bet that she knows she has only a 1 in 4 chance of winning, if the pay-off is more than 4 to 1; say, 5 to 1. Even though she loses 75% of the time, the 25% of the time she does win will pay for the losses. And when she makes the bet and loses (as she will 75% of the time), she knows she didn’t make a mistake; taking risks is just a part of the game.
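The arithmetic behind that bet can be made concrete. Here is a short Python sketch of the expected-value calculation; the numbers are the ones from the example above, and the payout ratio is interpreted as profit per unit staked on a win:

```python
def expected_value(p_win, payout_ratio, stake=1.0):
    """Average profit per bet, given a win probability and a payout
    ratio expressed as profit per unit staked on a win."""
    win = p_win * payout_ratio * stake       # average profit from wins
    loss = (1 - p_win) * stake               # average loss from losses
    return win - loss

# The example from the text: 1-in-4 chance of winning, 5-to-1 payoff.
ev = expected_value(0.25, 5.0)
print(ev)  # prints 0.5: +0.5 units per bet, despite losing 75% of the time
```

At a payout of exactly 3 to 1 (in profit terms) the bet breaks even; anything above that is worth taking repeatedly, even though each individual bet usually loses.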

Obviously as release coordinator, the costs of a bug are generally higher than the benefits of a feature. But the general principle — of taking calculated risks, and accepting occasional failure as the inevitable consequence of doing so — applies.

Intuition

But how do we actually calculate the risks? A poker player frequently deals with known quantities: she can calculate that there is exactly a 25% chance of winning, exactly a 5x monetary pay-off, and do the math; the release coordinator’s decisions are not so quantifiable.

This is where intuition comes in. While there are a handful of metrics that can be applied to patches (e.g., the number of lines in the patch), for the most part the risk and benefit are not very quantifiable at all: expert judgement, or intuition, is the only thing we have.

Now, research has shown that the intuition of experts can, under the right circumstances, be very good. Intuition can quickly analyze hundreds of independent factors, and compare against thousands of instances, and give you a result which is often very close to the mark.

However, research has also shown that in other circumstances, expert intuition is worse than random guessing, and far worse than a simple algorithm. (For a reference, see the books listed at the bottom.)

One of the biggest ways intuition goes wrong is by only looking at part of the picture. It is very natural for programmers, when looking at a patch, to consider only the benefits. The programmer’s intuition then accurately gives them a good sense of the advantage of taking the patch; but doesn’t warn them about the risk because they haven’t thought about it. Since they have a positive feeling, then they may end up taking a patch even when it’s actually too risky.

The key then is to make sure that your intuition considers the risks properly, as well as the benefits. To help myself with this, during the 4.4 code freeze I developed a sort of checklist of things to think about and consider. They are as follows:

  1. What is the benefit of this patch?
  2. What is the probability this patch has a bug?
  3. If this patch had a bug, what kind of bug might it be?
  4. If this patch had a bug, what is the probability we would find it
    before the release?

When considering the probability of a bug, I look at two things:

  • The complexity of the patch
  • My confidence in my / the other reviewers’ judgement.

Sometimes you’re looking at code you’re very familiar with or is straightforward; sometimes you’re looking at code that is very complicated or you’re not that familiar with. If the patch looks good and it’s code you’re familiar with, it’s probably fine. If the patch looks good but it’s code you’re not familiar with, there’s a risk that your judgement may be off.

When trying to think of what kind of bug it might be, I look at the code that it’s modifying, and consider things on a spectrum:

  1. Error / tool crash on unexpected trusted input; or normal input to library-only commands
  2. Error / tool crash on normal input, secondary commands / new functionality
  3. Error / tool crash on normal input, core commands
  4. Performance
  5. Guest crash / hang
  6. Host crash / hang
  7. Security vulnerability
  8. Data loss

Usually you can tell right away where in the list a bug might be. Modifying xenpm or xenctx? 3 max. Modifying the scheduler? Probably #6. Modifying hypercalls available to guests? #7. And so on.

When asking whether we’d find the bug before the release, consider the kind of testing the codepath is likely to get. Is it tested in osstest? In XenRT? Or is it in a corner case that few people really use?

After thinking through those four questions, and going over the criteria in detail, then my intuition is probably about as well-formed as it’s going to get. Now I ask the fifth question: given the risks, is it worth it to accept this patch?
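None of this is mechanical, but the shape of the checklist can be sketched in code. The following Python fragment is purely illustrative: the weights, the scoring function and the example numbers are my own invention, not anything the Xen release process actually uses. It just makes explicit that the decision weighs benefit against probability of a bug, severity, and the likelihood of catching the bug before release:

```python
from dataclasses import dataclass

# The severity spectrum from the list above: higher index = worse consequence.
SEVERITY = [
    "error on unexpected input / library-only commands",
    "error on normal input, secondary commands",
    "error on normal input, core commands",
    "performance regression",
    "guest crash / hang",
    "host crash / hang",
    "security vulnerability",
    "data loss",
]

@dataclass
class PatchAssessment:
    benefit: float   # 0..1: how much more awesome the release gets
    p_bug: float     # 0..1: probability the patch introduces a bug
    severity: int    # index into SEVERITY (0..7)
    p_caught: float  # 0..1: probability testing finds the bug pre-release

    def expected_cost(self):
        # Cost grows with severity; bugs caught before release still
        # cost something (they can delay the release), but far less.
        sev_weight = (self.severity + 1) / len(SEVERITY)
        escaped = self.p_bug * (1 - self.p_caught) * sev_weight
        caught = self.p_bug * self.p_caught * sev_weight * 0.25
        return escaped + caught

    def accept(self):
        return self.benefit > self.expected_cost()

# A small, well-understood fix to a non-core tool: low risk, accept.
safe = PatchAssessment(benefit=0.3, p_bug=0.1, severity=1, p_caught=0.8)
# A complex scheduler change with modest benefit: plausible host hang, reject.
risky = PatchAssessment(benefit=0.2, p_bug=0.5, severity=5, p_caught=0.4)
print(safe.accept(), risky.accept())  # prints: True False
```

In practice the probabilities are exactly the quantities that only intuition can supply; the point of the checklist is to force the intuition to produce all four of them before a decision is made.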

After giving it some thought, I went with my best guess. Sometimes I was just not sure; in that case I would go away and do something else for a couple of hours, then come back to it (going over the four questions again to make sure they were fresh in my mind). The first few dozen times this took a very long time; as I gained experience, judgements came faster (although many were still painfully slow).

In some cases, I just didn’t have enough knowledge of the code to make the judgement myself; this happened once or twice with the ARM code in the 4.4 release. In that case, my goal was to try to make sure that those who did have the relevant knowledge were making sound decisions: thinking about both the benefits and the risks and weighing them appropriately.

For those who want to look further into risk and intuition, several books have had a pretty big influence on my thinking in this area. Probably the best one, but also the hardest (most dense), is Thinking, Fast and Slow, by Daniel Kahneman. It’s very well written and accessible, but it contains a huge amount of information that is different from the way you normally think. Not a light read. Another one I would recommend is The Black Swan, by Nassim Nicholas Taleb. And finally, Blink, by Malcolm Gladwell.

Developer Summit Line-up Announced

I am pleased to announce the schedule of the Xen Project Developer Summit. The event will take place in Chicago on August 18-19, 2014.

The Project’s second annual developer event highlights best practices, user testimonials and advancements with the industry-leading open source hypervisor. Powering many of the world’s largest clouds in production today, Xen Project developers are also leading the way in server density, million-node data centers, graphic-intensive workloads, cloud operating systems and sophisticated enterprise security.

This year’s summit will present the most relevant topics to Xen Project developers and users who are pushing the limits on virtualization, ranging from typical server virtualization and cloud computing on x86 servers to new developments with ARM servers, networking, automotive, cloud operating systems, enterprise security and mobility.

Following is a sampling of confirmed speakers and the presentations they will give in Chicago:

  • James Bielman, Research and Engineering at Galois, XenStore Mandatory Access Control — proposes additional security access features for Xen Project software;
  • Mihai Donțu, Technical Project Manager at Bitdefender, Zero-Footprint Guest Memory Introspection from Xen — discusses how the introspection API in the Xen Project hypervisor can be used to detect, prevent and take action on several categories of malware attacks;
  • James Fehlig, Software Engineer at SUSE Linux, libvirt support for libxenlight – covers the status of Xen Project libvirt integration and outlines planned improvements;
  • Lars Kurth, Xen Project Advisory Board Chairman, State of Xen Project Software – gives an overview of the Xen Project development community and community at large;
  • Jun Nakajima, Principal Engineer at Intel Open Source Technology Center, Xen as a High-Performance Network Functions Virtualization (NFV) Platform – introduces Xen as a NFV platform and outlines solutions to remove challenges for deploying the Xen Project hypervisor for NFV applications as well as shares best practices;
  • Nathan Studer, Technical Lead at DornerWorks, Xen and The Art of Certification – gives an overview of certification requirements in emerging use-cases such as automotive, medical, and avionics and lays out a path toward certifying Xen Project technology in these industries;
  • Don Slutz, Software Architect at Verizon Terremark, Overview of Verizon Cloud Architecture – presents Verizon Cloud’s architecture, design goals and planned contributions to the Xen Project community; and
  • Stefano Stabellini, Senior Principal Software Engineer at Citrix and Xen Project Contributor, Xen on ARM Status Update and Performance Benchmarks — gives the latest developments with the Xen Project hypervisor on ARM architecture.

Birds of a Feather session and Discussions

Besides presentations, the developer summit will also provide an opportunity for in-depth interactive discussions (Birds of a Feather sessions), which allow deep interaction and collaboration between Xen Project developers and community members. These will happen in a second track alongside the main event. To submit a BoF, please go to the BoF submission page.

For more information about Xen Project Developer Summit 2014, including how to register and to view the complete schedule, visit: events.linuxfoundation.org/events/xen-project-developer-summit.

How fast is Xen on ARM, really?

With Xen on ARM out of the early preview phase and becoming more mature, it is time to run a few benchmarks to check that the design choices paid off, the architecture is sound and the code base is solid. It is time to find out how much overhead Xen on ARM introduces and how it compares with Xen and other hypervisors on x86.
I measured the overhead by running the same benchmark on a virtual machine on Xen on ARM and on native Linux on the same hardware. It takes a bit longer to complete the benchmark inside a VM, but how much longer? The answer to this question is the virtualization overhead.

Setup

I chose the AppliedMicro X-Gene as the ARM platform to run the benchmarks on: it is an ARMv8 64-bit SoC with an 8-core CPU and 16GB of RAM. I had Dom0 running with 8 vcpus and 1GB of RAM; the virtual machine that ran the tests had 2GB of RAM and 8 vcpus. To make sure that the results are comparable, I restricted the amount of memory available to the native Linux run, so that Linux had all 8 physical cores at its disposal but only 2GB of RAM.

For the x86 tests, I used a Dell server with an Intel Xeon X5650, a 6-core CPU with HyperThreading. HyperThreading was disabled during the tests for more consistent results. Similarly to the ARM tests, I had Dom0 running with 6 vcpus and 1GB of RAM and the virtual machine running with 2GB of RAM and 6 vcpus. The native Linux run had 6 physical cores and 2GB of RAM. For the KVM tests I booted the host with 3GB of RAM, then assigned 2GB of RAM to the KVM virtual machine.

In terms of software on both ARMv8 and x86 I used:

  • Linux 3.13 as Dom0, DomU and native kernel
  • Xen 4.4
  • OpenSUSE 13.1
  • QEMU-KVM 1.6.2 (for the KVM tests on x86)

I could not test KVM on ARMv8 because KVM support for X-Gene is not upstream in Linux 3.13.

Benchmarks – lower is better

The y-axis shows the overhead as a percentage of native: “0%” means that it is as fast as native; “1%” means that it takes 1% longer than native Linux to complete the benchmark inside a virtual machine. Given that we are dealing with overheads, lower is better.
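As a concrete example of how the y-axis numbers are derived, here is the overhead calculation in Python; the timings are made-up placeholders, not results from the runs below:

```python
def overhead_percent(vm_time, native_time):
    """Virtualization overhead as a percentage of the native run time.

    0.0 means the VM completed the benchmark as fast as native;
    1.0 means it took 1% longer than native.
    """
    return (vm_time / native_time - 1.0) * 100.0

# Hypothetical timings: a benchmark takes 100 s natively, 102 s in a VM.
print(round(overhead_percent(102.0, 100.0), 2))  # prints 2.0
```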

Kernbench

Kernbench is a popular benchmark that measures the time it takes to compile the Linux kernel. It is a CPU- and memory-intensive benchmark.
[chart: Kernbench virtualization overhead, percentage of native]

PBzip2

PBzip2 is a parallel implementation of bzip2. This benchmark measures the time that it takes to compress a 4GB file.
[chart: PBzip2 virtualization overhead, percentage of native]

SPECjbb2005 (non-compliant)

SPECjbb2005 simulates a Java server workload. It is a cpu and memory bound benchmark.
The runs are non-compliant (therefore cannot be compared with compliant runs) and the overhead is calculated on the peak warehouse alone.
[chart: SPECjbb2005 virtualization overhead, percentage of native]

Next I ran a couple of disk IO benchmarks, but both X-Gene and the Dell server came with spinning disks for storage: the following tests showed that both disks were too slow to actually measure the virtualization overhead (it is lower than 1%).

FIO

FIO is a popular tool for measuring disk performance. This benchmark uses FIO to perform a combination of random reads and writes and measures the overhead in iops.
[chart: FIO virtualization overhead, percentage of native]

PGBench

PGBench is the PostgreSQL database benchmarking tool. This benchmark is disk IO bound.
[chart: PGBench virtualization overhead, percentage of native]

Conclusions

In developing Xen on ARM we have focused on correctness and feature completeness rather than performance. Nonetheless, it shows a very low overhead that is already on par with or lower than Xen’s on x86, which in turn is lower than KVM’s on x86. Given the benefits that virtualization brings to the table, including ease of deployment and lower downtime, it really makes sense to deploy Xen on your ARM-based cloud.

Welcoming Cavium as latest Xen Project Advisory Board member

We’re excited to welcome Cavium as our newest Xen Project Advisory Board member today. Since becoming a Linux Foundation Collaborative Project with eleven founding members, we have announced five new members, including ARM, NetApp, Rackspace and Verizon Terremark.

Today Cavium announced its ThunderX™ SoC Family, a scalable family of 64-bit ARMv8 processors incorporated into a highly differentiated SoC architecture optimized for cloud and data center applications. That announcement, together with Cavium’s deep engagement with hardware ecosystem vendors and with software vendors and communities, makes Cavium a valuable member and future contributor to the Xen Project. I am particularly excited about the potential of the ThunderX™ SoC Family, which is targeted at the volume server market in addition to hyperscale and cloud, and which goes well beyond the microserver concept we have seen in ARM SoCs so far. The Xen Project hypervisor shows great promise for this new class of SoCs.

Committed to ARMv8 standards, Cavium is driving key server capabilities, such as GICv3 with 48-core support along with multi-socket cache coherency of up to 96 cores in a dual-socket configuration. Cavium has already started working with the Xen Project community to enhance the hypervisor to support these technologies. These contributions will complement a growing community of Xen Project developers from ARM, AMD, Applied Micro, Broadcom, Citrix, Galois, GlobalLogic and Samsung, among others, who are already contributing to ARM support.

In addition, Cavium’s history in security and networking, both areas where Xen Project technology has an edge, will likely lead to interesting new innovations within our community in the future.

“Xen Project virtualization is at the heart of our new ThunderX™ SoC Family targeted at users needing easier deployment, lower downtime and improved management and utilization,” said Larry Wikelius, Director Thunder Ecosystems and Partner Enablement at Cavium and Cavium’s representative on the Xen Project Advisory Board. “Xen Project software is proven in the cloud and first to market with ARM support. Collaboration with the open Xen Project community will be extremely valuable as data centers evolve in the future.”

The following links provide more information on Cavium’s recent announcements: