
PV Calls: a new paravirtualized protocol for POSIX syscalls

Let’s take a step back and look at the current state of virtualization in the software industry. x86 hypervisors were built to run a few different operating systems on the same machine. Nowadays they are mostly used to execute several instances of the same OS (Linux), each running a single server application in isolation. Containers are a better fit for this use case, but they expose a very large attack surface. It is possible to reduce that attack surface, but doing so is a difficult task that requires detailed knowledge of the app running inside. At any scale it becomes a formidable challenge. The 15-year-old hypervisor technologies, principally designed for RHEL 5 and Windows XP, are more a workaround than a solution for this use case. We need to bring them to the present and take them into the future by modernizing their design.

The typical workload we need to support is a Linux server application packaged to be self-contained, complying with the OCI Image Format or the Docker Image Specification. The app comes with all required userspace dependencies, including its own libc. It makes syscalls to the Linux kernel to access resources and functionality. This is the only interface we must support.

Many of these syscalls closely correspond to function calls that are part of the POSIX family of standards. They have well-known parameters and return values. POSIX stands for “Portable Operating System Interface”: it defines an API available on all major Unixes today, including Linux. POSIX is large to begin with, and Linux adds its own set of non-standard calls on top of it. As a result, a Linux system exposes a very high number of calls and, inescapably, also a high number of vulnerabilities. It is wise to restrict syscalls by default. Linux containers struggle with this, but hypervisors are very accomplished in this respect. After all, hypervisors don’t need to provide full POSIX compatibility. By paravirtualizing hardware interfaces, Xen provides powerful functionality with a small attack surface. But PV devices are the wrong abstraction layer for Docker apps: they duplicate functionality between the guest and the host. For example, the network stack is traversed twice, first in the DomU, then in Dom0. This is unnecessary. It is better to raise the hypervisor’s abstractions by paravirtualizing a small set of syscalls directly.

PV Calls

It is far easier and more efficient to write paravirtualized drivers for syscalls than to emulate hardware, because syscalls sit at a higher level and are made for software. I wrote a protocol specification called PV Calls to forward POSIX calls from DomU to Dom0. I also wrote a couple of prototype Linux drivers for it that work at the syscall level. The initial set of calls covers socket, connect, accept, listen, recvmsg, sendmsg and poll. The frontend driver forwards syscall requests over a ring. The backend implements the syscalls, then returns success or failure to the caller. The protocol creates a new ring for each active socket, and the ring size is configurable on a per-socket basis. Received data is copied to the ring by the backend, while data to be sent is copied to the ring by the frontend. An event channel per ring is used to notify the other end of any activity. This tiny set of PV Calls is enough to provide networking capabilities to guests.
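
To make the request/response flow more concrete, below is a minimal, self-contained C sketch of a per-socket command ring of the kind described above. The structure names, fields, and ring handling are purely illustrative assumptions for exposition; the real wire format, shared-ring macros, grant references and event channel notifications are defined by the PV Calls specification on xen-devel and by the prototype drivers.

    /*
     * Illustrative sketch of a PV Calls-style command ring (not the real ABI).
     * A real frontend/backend pair would use Xen shared rings, grant tables
     * and event channels; here a plain array and indices stand in for them.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    enum pvcalls_cmd { CMD_SOCKET, CMD_CONNECT, CMD_ACCEPT, CMD_LISTEN,
                       CMD_SENDMSG, CMD_RECVMSG, CMD_POLL };

    struct pvcalls_req {              /* written by the frontend */
        uint32_t req_id;              /* echoed back in the response */
        uint32_t cmd;                 /* one of enum pvcalls_cmd */
        uint64_t sock_id;             /* guest-chosen socket identifier */
        uint8_t  addr[28];            /* opaque sockaddr bytes */
        uint32_t addr_len;
    };

    struct pvcalls_rsp {              /* written by the backend */
        uint32_t req_id;
        int32_t  ret;                 /* 0 on success, -errno on failure */
    };

    #define RING_SLOTS 8              /* command ring; data rings are separate */

    struct cmd_ring {
        struct pvcalls_req req[RING_SLOTS];
        struct pvcalls_rsp rsp[RING_SLOTS];
        uint32_t req_prod, req_cons;  /* request producer/consumer indices */
        uint32_t rsp_prod, rsp_cons;  /* response producer/consumer indices */
    };

    /* Frontend side: queue a connect request; a real driver would then
     * notify the backend over the ring's event channel. */
    static void frontend_connect(struct cmd_ring *r, uint64_t sock_id)
    {
        struct pvcalls_req *req = &r->req[r->req_prod % RING_SLOTS];

        memset(req, 0, sizeof(*req));
        req->req_id  = r->req_prod;
        req->cmd     = CMD_CONNECT;
        req->sock_id = sock_id;
        r->req_prod++;
    }

    /* Backend side: consume pending requests, perform the real syscall
     * in Dom0 (omitted here), and post a response with the outcome. */
    static void backend_service(struct cmd_ring *r)
    {
        while (r->req_cons != r->req_prod) {
            struct pvcalls_req *req = &r->req[r->req_cons % RING_SLOTS];
            struct pvcalls_rsp *rsp = &r->rsp[r->rsp_prod % RING_SLOTS];

            rsp->req_id = req->req_id;
            rsp->ret    = 0;          /* pretend the connect() succeeded */
            r->req_cons++;
            r->rsp_prod++;
        }
    }

    int main(void)
    {
        struct cmd_ring ring;

        memset(&ring, 0, sizeof(ring));
        frontend_connect(&ring, 1);
        backend_service(&ring);
        printf("response: req_id=%u ret=%d\n",
               (unsigned)ring.rsp[0].req_id, (int)ring.rsp[0].ret);
        return 0;
    }

The point the sketch tries to capture is the division of labour: the frontend only marshals a request and notifies the backend, while the backend actually performs the operation in Dom0 and posts the outcome back on the ring.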

We are still running virtual machines, but mainly to restrict the vast majority of the application’s syscalls to a safe and isolated environment. The guest operating system kernel, which is provided by the infrastructure (it doesn’t come with the app), implements syscalls for the benefit of the server application. Xen gives us the means to exploit hardware virtualization extensions to create strong security boundaries around the application. Xen PV VMs enable this approach to work even when virtualization extensions are not available, such as on top of Amazon EC2 or Google Compute Engine instances.

This solution is as secure as Xen VMs but efficiently tailored to container workloads. Early measurements show excellent performance. It also provides a couple of less obvious advantages. In Docker’s default networking model, containers’ communications appear to come from the host IP address and containers’ listening ports are explicitly bound to the host. PV Calls are a perfect match for it: outgoing communications are made from the host IP address directly, and listening ports are automatically bound to it. No additional configuration is required.

Another benefit is ease of monitoring. One of the key aspects of hardening Linux containers is keeping applications under constant observation with logging and monitoring. We should not ignore it even though Xen provides a safer environment by default. PV Calls forward networking calls made by the application to Dom0. In Dom0 we can trivially log them and detect misbehavior. More powerful (and expensive) monitoring techniques like memory introspection offer further opportunities for malware detection.
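
As a rough illustration of how natural this observation point is, here is a small, hypothetical C logging helper of the kind a backend could call for every request it dequeues, before executing the corresponding syscall in Dom0. The function names and arguments are assumptions made for this example, not part of the actual backend driver.

    /* Hypothetical Dom0-side logging hook: every networking call made by the
     * guest application arrives as a PV Calls request, so the backend can
     * observe and record it before executing it. Names are illustrative. */
    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    static const char *cmd_name(uint32_t cmd)
    {
        static const char *const names[] = { "socket", "connect", "accept",
                                             "listen", "sendmsg", "recvmsg",
                                             "poll" };
        return cmd < sizeof(names) / sizeof(names[0]) ? names[cmd] : "unknown";
    }

    /* Called by the backend for every request it dequeues. */
    static void log_request(unsigned int domid, uint32_t cmd, uint64_t sock_id)
    {
        fprintf(stderr, "[%lld] dom%u %s sock=%llu\n",
                (long long)time(NULL), domid, cmd_name(cmd),
                (unsigned long long)sock_id);
    }

    int main(void)
    {
        /* Example: what a logged connect from domain 1 might look like. */
        log_request(1, 1 /* connect */, 42);
        return 0;
    }
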

PV Calls are unobtrusive. No changes to Xen are required as the existing interfaces are enough. Changes to Linux are very limited as the drivers are self-contained. Moreover, PV Calls perform extremely well! Let’s take a look at a couple of iperf graphs (higher is better):

[Graph: iperf client bandwidth]

[Graph: iperf server bandwidth]

The first graph shows network bandwidth measured by running an iperf server in Dom0 and an iperf client inside the VM (or container, in the case of Docker). PV Calls reach 75 Gbit/s with 4 threads, far better than netfront/netback.

The second graph shows network bandwidth measured by running an iperf server in the guest (or container, in the case of Docker) and an iperf client in Dom0. In this scenario PV Calls reach 55 Gbit/s and outperform not just netfront/netback but even Docker.

The benchmarks have been run on an Intel Xeon D-1540 machine with 8 cores (16 threads) and 32 GB of RAM. Xen is 4.7.0-rc3 and Linux is 4.6-rc2. Dom0 and DomU have 4 vCPUs each, pinned. DomU has 4 GB of RAM.

For more information on PV Calls, read the full protocol specification on xen-devel. You are welcome to join us and participate in the review discussions. Contributions to the project are greatly appreciated!

Xen – KVM – Linux – and the Community

At Xen Summit last week, several community members and I discussed the issues around the recent launch of RHEL without Xen and its implications for Xen and the Xen.org community. I thought that I would share my opinions with a wider audience via this blog and hopefully get feedback from the Xen community on this important topic. So, feel free to comment on this post or send me mail privately if you wish to express your opinion to just me.

Firstly, I would like to offer my congratulations to the KVM community for the successful launch of their solution in Red Hat Enterprise Linux 6, shipping later this year. We in the Xen.org community are very supportive of all open source projects and believe that innovations made in the Linux kernel for virtualization can be shared by KVM and Xen developers alike to further improve open source hypervisors. I look forward to KVM and Xen working together to ensure interoperability, common formats, and management interfaces that give customers maximum flexibility in moving virtual machines between hypervisors as well as simplifying the overall virtualization management infrastructure. Xen.org is currently promoting the DMTF management standards for virtualization and cloud computing and welcomes the KVM community to join us by leveraging our OVF and DMTF SVPC implementations.

Many Linux community members and members of the technology press have spent the past few weeks writing off Xen as no longer relevant based on the launch of KVM. I have enjoyed reading the many articles written on the subject and thought I would add some insight to help customers, companies, and journalists better understand the differences between KVM and Xen. KVM is a type-2 hypervisor built into the Linux kernel as a module; it will ship with every Linux distribution going forward, as no work is required for the distributions to add it. Having a virtualization platform built into the Linux kernel will be valuable to many customers looking for virtualization within a Linux-based infrastructure; however, these customers lose the flexibility to run a bare-metal hypervisor, to configure the hypervisor independently of the host operating system, and to provide machine-level security, since a guest can bring down the operating system on KVM. Xen, on the other hand, is a type-1 hypervisor built independently of any operating system: it is a completely separate layer from the operating system and hardware, and the community and customers see it as an infrastructure virtualization platform to build their solutions upon. The Xen.org community is not in the business of building a complete solution, but rather a platform for companies and users to leverage for their virtualization and cloud solutions. In fact, the Xen hypervisor is found in many different solutions today, from standard server virtualization to cloud providers, grid computing platforms, networking devices, and more.

To get a better understanding of how Xen.org operates, you must understand the mission and goals of the Xen.org community:

  • Build the industry standard open source hypervisor
    • Core “engine” in multiple vendors’ products
  • Maintain Xen’s industry leading performance
    • First to exploit new hardware virtualization features
  • Help OS vendors paravirtualize their OSes
  • Maintain Xen’s reputation for stability and quality
  • Support multiple CPU types for large and small systems
  • Foster innovation
  • Drive interoperability

This mission statement has been in place for many years in Xen.org and is an accurate reflection of our community. Our most important mission is to create an industry-standard open source hypervisor that serves as a core engine in other vendors’ products. Clearly, Xen.org has succeeded in this mission, as many companies including Amazon, GoGrid, RackSpace, Novell, Oracle, Citrix, Avaya, Fujitsu, VA Linux, and others are leveraging our technology as a core feature in their solutions. It is not the intention of Xen.org to build a competitive packaged solution for the marketplace, but rather to create a best-of-breed open source technology that is available for anyone to leverage. This distinction is critical to understand, as many people are confused as to why Xen.org does not compete or market against other technologies such as VMware, Hyper-V, and KVM. Our goal is to create the best hypervisor possible without any focus on creating a complete packaged solution for customers. We embrace the open model of allowing customers to choose from various solutions to create their optimal solution.

Xen.org also spends a great deal of developer effort on performance testing, as well as on leveraging efforts from hardware companies such as AMD and Intel to support the latest available hardware technologies. For example, Xen 4.0 supports the latest SR-IOV cards, which are just now shipping to customers.

The third bullet of the mission statement can now be checked off, as Xen.org has been instrumental in the effort to upstream DomU paravirtualization support into the Linux kernel, so all Linux distributions can now be paravirtualized with no user changes required. Xen.org is also working to upstream the changes for our Dom0 kernel into Linux; this work is being led by Jeremy Fitzhardinge and Konrad Wilk, who recently updated the community on it at Xen Summit; slides here. As Xen is not written as a Linux module or specifically for Linux-only deployments, it takes additional effort to properly include Xen Dom0 support in the Linux kernel. The community is always open to new contributors to assist Jeremy and Konrad on their development project; anyone interested can contact me for next steps. Finally, it is worth remembering that the Dom0 for Xen can be NetBSD, FreeBSD, Solaris, or another operating system; Xen is not a Linux-only solution. Xen continues to embrace the customer-choice model in Dom0 operating system selection, which is part of our core mission.

The remaining bullets also reflect what you see in Xen.org, as we look to support customer choice in all computing elements and to ensure that Xen.org leads the industry in pushing the envelope on new hypervisor features.

As you can see, Xen.org’s mission is not to create a stand-alone, Linux-only competitive product that is a single packaged offering for end users. Instead, we focus exclusively on building the best open source hypervisor technology in the marketplace and allow others to leverage our technology in any manner they wish, with maximum flexibility in processor choice, Dom0 operating system, DomU virtualization, management tools, storage tools, etc. This flexibility, along with technology capability, is a competitive advantage for customers and companies that choose Xen. Going forward, the Xen.org community will continue to focus on these goals as we add our new Xen Cloud Platform project and Xen Client Initiative to the technology deliverables from our open source community.

Simon Crosby on RHEL Release…

It’s a fast-moving world. I turn my back for a moment to log onto my XenDesktop, and before you know it, the Planet turns to KVM for Cloud Virtualization. Suddenly all those Xen, VMware and Hyper-V users must have switched to KVM overnight! Impressive.

I quickly check the XenServer download stats and see that the needle is still rapidly rising… about 3,000 server licenses per week. Something doesn’t compute… And then I realize that it all depends what planet you’re on and your sense of perspective.

The news that Red Hat has pushed out a beta of RHEL 6 without Xen, and is focusing solely on KVM going forward, is an interesting moment in the constantly changing theory of RHEL-evance. It’s neither unexpected nor earth-shattering for those in the cloud or virtualization markets broadly, unless you happened to use Xen in RHEL 5. If you did, you’re in for a painful “upgrade” (aka: V2V for all Xen VMs, and different management).

Rest of Post…

Simon Crosby on Xen, KVM, Novell, etc

Can a Chameleon Change its Spots?

I had lunch today with veteran virtualization blogger Alessandro Perilli, who was in the Seattle area for the Microsoft MVP Summit. Alessandro has repeatedly been the first to spot key industry trends. He is truly plugged-in, and brings to his analysis a level of technical insight and honesty that I find refreshing, and he doesn’t sensationalize just to get clicks.

We discussed the recent flurry of reporting on the fact that Novell is also developing for KVM, and it was good to see that Alessandro found this as unsurprising as I do. Novell SUSE Linux is, after all, an enterprise Linux distribution, and KVM is just a kernel.org driver that comes with mainline Linux. So it’s logical for Novell’s customers to be aware of KVM and to expect Novell to ship and support it – like any other mainline feature. Indeed, Novell’s activity on KVM has never been a secret – they announced a preview of KVM support in SLE 11 and have a roadmap for offering full support in due course.

Full post at http://community.citrix.com/pages/viewpage.action?pageId=116034454.

Xen and KVM Discussion from Ian Pratt

I read a blog post today on ZDNet with comments from Ian about Xen and KVM. I don’t normally post any type of “political” thoughts on this blog, but I thought Ian’s comments were worth reading for the community. You can read the article at http://blogs.zdnet.com/virtualization/?p=415.

I think the title of the article is a bit overstated, but that is the usual case of the media looking for a controversy.