Tag Archives: security

Intel hosts OpenXT Summit on Xen Project based Client Virtualization, June 7-8 in Fairfax, VA, USA

This is a guest blog post by Rich Persaud, former member of the Citrix XenServer and XenClient engineering and business teams. He is currently a consultant to BAE Systems, working on the OpenXT project, which stands on the shoulders of the Xen Project, OpenEmbedded Linux and XenClient XT.

While the Xen Project is well known for servers and hosted infrastructure, Type-1 hypervisors have been used in client endpoints and network appliances, improving security and remote manageability. Virtualization-based security in Qubes and Windows 10 is also educating system administrators about hardware security (IOMMU and TPM) and application trust models.

Released as open-source software in 2014, OpenXT is a development toolkit for hardware-assisted security research and appliance integration. It includes hardened Linux VMs that can be configured as a user-facing software appliance for client devices, with virtualization of storage, network, input, sound, display and USB devices. Hardware targets include laptops, desktops and workstations.

OpenXT stands on the shoulders of the Xen Project, OpenEmbedded Linux and XenClient XT. It is optimized for hardware-assisted virtualization with an IOMMU and a TPM. It configures Xen network driver domains, Linux stub domains, Xen Security Modules, Intel TXT, SE Linux, GPU passthrough and VPNs. Guest operating systems include Windows, Linux and FreeBSD. VM storage options include encrypted VHD files with boot-time measurement and non-persistence.

The picture below shows a typical OpenXT software stack, including Xen, Linux and other components.

The picture above shows one of many configurations of the OpenXT software stack, including Xen, Linux and other components.

OpenXT enables loose coupling of open-source and proprietary software components, verifiable measurements of hardware and software, and verified launch of derivative products. It has been used to develop locally/centrally managed software appliances that isolate high-risk workloads, networks and devices.

The inaugural OpenXT Summit brings together developers and ecosystem participants for a 2-day conference in Fairfax, VA, USA on June 7-8, 2016. The event is hosted by Intel Corporation. The audience for this event includes kernel and application developers, hardware designers, system integrators and security architects.

The 2016 OpenXT Summit will chart the evolution of OpenXT from cross-domain endpoint virtualization to an extensible systems innovation platform, enabling derivative products to make security assurances for diverse hardware, markets and use cases.

The Summit includes one day of presentations, a networking reception and one day of moderated technical discussions. Presentation topics will include OpenXT architecture, TPM 2.0, Intel SGX, Xen security, measured launch, graphics virtualization and NSA research on virtualization and trusted computing.

For more information, please see the event website at http://openxt.org/summit.

For presentations and papers related to OpenXT, please see http://openxt.org/history.

Stealthy monitoring with Xen altp2m

One of the core features that differentiates Xen from other open-source hypervisors is its native support for stealthy and secure monitoring of guest internals (aka. virtual machine introspection [1]). In Xen 4.6 which was was released last autumn several new features have been introduced that make this subsystem better; a cleaned-up, optimized API and ARM support being just some of the biggest items on this list. As part of this release of Xen, a new and unique feature was also successfully added by a team from Intel that make stealthy monitoring even better on Xen: altp2m. In this blog entry we will take a look at what it’s all about.

p2mIn Xen’s terminology, p2m stands for the memory management layer that handles the translation from guest physical memory to machine physical. This translation is critical for safely partitioning the real memory of the machine between Xen and the various VMs running as to ensure a VM can’t access the memory of another without permission. There are several implementations of this mechanism, including one with hardware support via Intel Extended Page Tables (EPT) available to HVM guests and PVH . In Xen’s terminology, this is called Hardware Assisted Paging (hap). In this implementation the hypervisor maintains a second pagetable, similar to the one in 64-bit operating systems use, dedicated to running the p2m translation. All open-source hypervisors that use this hardware assisted paging method use a single EPT per virtual machine to handle this translation, as most of the time the memory of the guest is assigned at VM creation and doesn’t change much afterwards.

altp2mXen altp2m is the first implementation which changes this setup by allowing Xen to create more then one EPT for each guest. Interestingly, the Intel hardware has been capable of maintaining up to 512 EPT pointers from the VMCS since the Haswell generation of CPUs. However, no hypervisor made use of this capability until now. This changed in Xen 4.6, where we can now create of up to 10 EPTs per guest. The primary reason for this feature is to use it with the #VE and VMFUNC extensions.

It can also be used by external monitoring applications via the Xen vm_event system.

Why alt2pm is a game-changer

Alt2pm is a game-changer for applications performing purely external monitoring is because it simplifies the monitoring process of multi-vCPU guests. The EPT layer has been successfully used in stealthy monitoring applications to track the memory accesses made by the VM from a safe vantage point by restricting the type of access the VM may perform on various memory pages. Since EPT permission violations trap into the hypervisor, the VM would receive no indication that anything out of the ordinary has happened. While the method allowed for stealthy tracing of R/W/X memory accesses of the guest, the memory permission needs to be relaxed in order to allow the guest to continue execution. When a single EPT is shared across multiple running vCPUs, relaxing the permissions to allow one vCPU to continue may inadvertently allow another one to perform the memory access we would otherwise want to track. While under normal circumstances such race-condition may rarely occur, malicious code could easily use this to hide some of its actions from a monitoring application.

Solutions to this problem exist already. For example we can pause all vCPUs while the one violating the access is single-stepped. This approach however introduces heavy overhead just to avoid a race-condition that may rarely occur in practice. Alternatively, one could emulate the instruction that was violating the EPT permission without relaxing the EPT access permissions, as Xen’s built-in emulator doesn’t use EPT to access the guest memory. This solution, while supported in Xen, is not particularly ideal either as Xen’s emulator is incomplete and is known to have issues that can lead to guest instability [2]. Furthermore, over the years emulation has been a hotbed of various security issues in many hypervisors (including Xen [3]), thus building security tools based on emulation is simply asking for trouble. It can be handy but should be used only when no other option is available.

Xen’s altp2m system changes this problem quite significantly. By having multiple EPTs we can have differing access permissions defined in each table, which can be easily swapped around by changing the active EPT index in the VMCS. When the guest makes a memory access that is monitored, instead of having to relax the access permission, Xen can simply switch to an EPT (called a view) that allows the operation to continue. Afterwards the permissive view can be switched back to the restricted view to continue monitoring. Since each vCPU has its own VMCS where this switching is performed, this monitoring can be performed specific to each vCPU, without having to pause any of the other ones, or having to emulate the access. All without the guest noticing any of this switching at all. A truly simple and elegant solution.

Other introspection methods for stealthy monitoring

EPT based monitoring is not the only introspecting technique used for stealthy monitoring. For example, the Xen based DRAKVUF Dynamic Malware Analysis [4] uses it in combination with an additional technique to maximum effect. The main motivation for that is because EPT based monitoring is known to introduce significant overhead, even with altp2m: the granularity of the monitoring is that of a memory page (4KB). For example, if the monitoring application is really just interested in when a function-entry point is called, EPT based monitoring creates a lot of “false” events when that page is accessed for the rest of the function’s code.

This can be avoided by enabling the trapping of debug instructions into the hypervisor, a built-in feature of Intel CPUs that Xen exposes to third-party applications. This method is used in DRAKVUF, which writes breakpoint instructions into the guests’ memory at code-locations of interest. Since we will only get an event for precisely the code-location we are interested in this method effectively reduces the overhead. However, the trade-off is that unlike EPT permissions the breakpoints are now visible to the guest. Thus, to hide the presence of the breakpoints from the guest, these pages need to get further protected by restricting the pages to be execute-only in the EPT. This allows DRAKVUF to remove the breakpoints before in-guest code-integrity checking mechanisms (like Windows Patchguard) can access the page. While with altp2m the EPT permissions can be safely used with multi-vCPU systems, using breakpoints similarly presents a race-condition: the breakpoint hit by one vCPU has to be removed to allow the guest to execute the instruction that was originally overwritten, potentially allowing another vCPU to do so as well without notice.

altp2m-shadowFortunately, altp2m has another neat feature that can be used to solve this problem. Beside allowing for changing the memory permissions in the different altp2m views, it also allows to change the mapping itself! The same guest physical memory can be setup to be backed by different pages in the different views. With this feature we can really think of guest physical memory as “virtual”: where it is mapped really depends on which view the vCPU is running on. Using this feature allows us to hide the presence of the breakpoints in a brand new way. To do this, first we create a complete shadow copy of the memory page where a breakpoint is going to be written and only write the breakpoint into this shadow copy. Now, using altp2m, we setup a view where the guest physical memory of the page gets mapped to our shadow copy. The guest continues to access its physical memory as before, but underneath it is now using the trapped shadow copy. When the breakpoint is hit, or if something is trying to scan the code, we simply switch the view to the unaltered view for the duration of a single-step, then switch back to the trapped view. This allows us to hide the presence of the breakpoints specific to each vCPU! All without having pause any of the other vCPUs or having to emulate. The first open-source implementation of this tracing has been already merged into the DRAKVUF Malware Analysis System and is available as a reference implementation for those interested in more details.


As we can see, Xen continues to be on the forefront of advancing the development of virtualization based security application and allowing third-party tools to create some very exotic setups. This flexibility is what’s so great about Xen and why it will continue to be a trend-setter for the foreseeable future


[1] Virtual Machine Introspection
[2] xen-devel@: Failed vm entry with heavy use of emulator
[3] Hardening Hypervisors Against VENOM-Style Attacks
[4] DRAKVUF Malware Analysis System (drakvuf.com)
[5] Stealthy, Hypervisor-based Malware Analysis (Presentation)

Security vs Features

We’ve just released a rather interesting batch of Xen security advisories. This has given rise in some quarters to grumbling around Xen not taking security seriously.

I have a longstanding interest in computer security. Nowadays I am a member of the Xen Project Security Team (the team behind security@xenproject, which drafts the advisories and coordinates the response). But I’m going to put forward my personal opinions.

Of course Invisible Things are completely right that security isn’t taken seriously enough. The general state of computer security in almost all systems is very poor. The reason for this is quite simple: we all put up with it. We, collectively, choose convenience and functionality: both when we decide which software to run for ourselves, and when we decide what contributions to make to the projects we care about. For almost all software there is much stronger pressure (from all sides) to add features, than to improve security.

That’s not to say that many of us working on Xen aren’t working to improve matters. The first part of improving anything is to know what the real situation is. Unlike almost all corporations, and even most Free Software projects, the Xen Project properly discloses, via an advisory, every vulnerability discovered in supported configurations. We also often publish advisories about vulnerabilities in other relevant projects, such as Linux and QEMU.

Security bugs are bugs, and over the last few years the Xen Project’s code review process has become a lot more rigorous. As a result, the quality of code being newly introduced into Xen has improved a lot.

For researchers developing new analysis techniques, Xen is a prime target. A significant proportion of the reports to security@xenproject are the result of applying new scanning techniques to our codebase. So our existing code is being audited, with a focus on the areas and techniques likely to discover the most troublesome bugs.

As I say, the Xen Project is very transparent about disclosing security issues; much more so than other projects. This difference in approach to disclosure makes it difficult to compare the security bug density of competing systems. When I worked for a security hardware vendor I was constantly under pressure to explain why we needed to do a formal advisory for our bugs. That is what security-conscious users expect, but our competitors’ salesfolk would point to our advisories and say that our products were full of bugs. Their products had no publicly disclosed security bugs, so they would tell naive customers that their products had no bugs.

I do think Xen probably has fewer critical security bugs than other hypervisors (whether Free or proprietary). It’s the best available platform for building high security systems. The Xen Project’s transparency is very good for Xen’s users. But that doesn’t mean Xen is good enough.

Ultimately, of course, a Free Software project like Xen is what the whole community makes it. In the project as a whole we get a lot more submissions of new functionality than we get submissions aimed at improving the security.

So personally, I very much welcome the contributions made by security-focused contributors – even if that includes criticism.

Hardening Hypervisors Against VENOM-Style Attacks

This is a guest blog post by Tamas K. Lengyel, a long-time open source enthusiast and Xen contributor. Tamas works as a Senior Security Researcher at Novetta, while finishing his PhD on the topic of malware analysis and virtualization security at the University of Connecticut.

The recent disclosure of the VENOM bug affecting major open-source hypervisors, such as KVM and Xen, has many reevaluating their risks when using cloud infrastructures. That is a very good thing indeed. Complex software systems are prone to bugs and errors.  Virtualization and the cloud have been erroneously considered by many to be a silver bullet against intrusions and malware. The fact is the cloud is anything but a safe place. There are inherent risks in the cloud and it’s very important to put those risks in their proper context.


VENOM is one of many vulnerabilities involving hypervisors (e.g. [Qemu], [ESX], [KVM], [Xen]). And, while Venom is indeed a serious bug and can result in a VM escape, which in turn can compromise all VMs on the host, it doesn’t have to be. In fact, we’ve known about VENOM-style attacks for a long time. Yet, there are easy-to-deploy counter-measures to mitigate the risk of such exploits natively available in both Xen and KVM (see RedHat’s blog post on the same topic).

What is the root cause of VENOM? Emulation

While modern systems come with a plethora of virtualization extensions, many components are still being emulated, usually devices such as your network card, graphics card and your hard drive. While Linux comes with paravirtual (virtualization-aware) drivers to create such devices, emulation is often the only solution to run operating systems that do not have kernel drivers. This has been traditionally the case with Windows (note that Citrix recently open-sourced its Windows PV drivers so this is no longer the case). This emulation layer has been implemented in QEMU, which caused VENOM and a handful of other VM-escape bugs in recent years. However, QEMU is not the only place where emulation is used within Xen. For a variety of interrupts and timers, such as RTC, PIT, HPET, PMTimer, PIC, IOAPIC and LAPIC, there is a baked-in emulator in Xen itself for performance and scalability reasons. As with QEMU, this emulator has been just as exploitable. This is simply because programming emulators properly is really complex and error-prone.

Available Mitigations

We’ve known how complicated emulation is for some time. In 2011, this Black Hat presentation on Virtunoid demonstrated a VM escape attack against KVM through QEMU. The author of Virtunoid even cautioned there would be many bugs of that nature found by researchers. Fast-forward four years and we now have VENOM.

The good news is it’s easy to mitigate all present and future QEMU bugs, which the recent Xen Security Advisory emphasized as well. Stubdomains can nip the whole class of vulnerabilities exposed by QEMU in the bud by moving QEMU into a de-privileged domain of its own. Instead of having QEMU run as root in dom0, a stubdomain has access only to the VM it is providing emulation for. Thus, an escape through QEMU will only land an attacker in a stubdomain, without access to critical resources. Furthermore, QEMU in a stubdomain runs on MiniOS, so an attacker would only have a very limited environment to run code in (as in return-to-libc/ROP-style), having exactly the same level of privilege as in the domain where the attack started. Nothing is to be gained for a lot of work, effectively making the system as secure as it would be if only PV drivers were used.

While it might sound complex, it’s actually quite simple to take advantage of this protection. Simply add the following line to your Xen domain configuration, and you gain immediate protection against VENOM and the whole class of vulnerabilities brought to you by QEMU:

device_model_stubdomain_override = 1

However, as with most security systems, it comes at a cost. Running stubdomains requires a bit of extra memory as compared to the plain QEMU process in dom0. This in turn can limit the number of VMs a cloud provider may be able to sell, so they have little incentive to enable this feature by default on their end. However, this protection works best if all VMs on the server run with stubdomains. Otherwise you may end up protecting your neighbors against an escape attack from your domain, while not being protected from an escape attack from theirs. It’s no surprise that some users would not want to pay for this extra protection, even if it was available. But for private clouds and exclusive servers there is no reason why it shouldn’t be your default setting.


Why Unikernels Can Improve Internet Security

This is a reprint of a 3-part unikernel series published on Linux.com. In this post, Xen Project Advisory Board Chairman Lars Kurth explains how unikernels address security and allow for the careful management of particularly critical portions of an organization’s data and processing needs. (See part one, 7 Unikernel Projects to Take On Docker in 2015.)

Many industries are rapidly moving toward networked, scale-out designs with new and varying workloads and data types. Yet, pick any industry —  retail, banking, health care, social networking or entertainment —  and you’ll find security risks and vulnerabilities are highly problematic, costly and dangerous.

Adam Wick, creator of the The Haskell Lightweight Virtual Machine (HaLVM) and a research lead at Galois Inc., which counts the U.S. Department of Defense and DARPA as clients, says 2015 is already turning out to be a break-out year for security.

“Cloud computing has been a hot topic for several years now, and we’ve seen a wealth of projects and technologies that take advantage of the flexibility the cloud offers,” said Wick. “At the same time though, we’ve seen record-breaking security breach after record-breaking security breach.”

The names are more evocative and well-known thanks to online news and social media, but low-level bugs have always plagued network services, Wick said. So, why is security more important today than ever before?

Improving Security

The creator of MirageOS, Anil Madhavapeddy, says it’s “simply irresponsible to continue to knowingly provision code that is potentially unsafe, and especially so as we head into a year full of promise about smart cities and ubiquitous Internet of Things. We wouldn’t build a bridge on top of quicksand, and should treat our online infrastructure with the same level of respect and attention as we give our physical structures.”

In the hopes of improving security, performance and scalability, there’s a flurry of interesting work taking place around blocking out functionality into containers and lighter-weight unikernel alternatives. Galois, which specializes in R&D for new technologies, says enterprises are increasingly interested in the ability to cleanly separate functionality to limit the effect of a breach to just the component affected, rather than infecting the whole system.

For next-generation clouds and in-house clouds, unikernels make it possible to run thousands of small VMs per host. Galois, for example, uses this capability in their CyberChaff project, which uses minimal VMs to improve intrusion detection on sensitive networks, while others have used similar mechanisms to save considerable cost in hardware, electricity, and cooling; all while reducing the attack surface exposed to malicious hackers. These are welcome developments for anyone concerned with system and network security and help to explain why traditional hypervisors will remain relevant for a wide range of customers well into the future.

Madhavapeddy goes as far to say that certain unikernel architectures would have directly tackled last year’s Heartbleed and Shellshock bugs.

“For example, end-to-end memory safety prevents Heartbleed-style attacks in MirageOS and the HaLVM. And an emphasis on compile-time specialization eliminates complex runtime code such as Unix shells from the images that are deployed onto the cloud,” he said.

The MirageOS team has also put their stack to the test by releasing a “Bitcoin pinata,” which is a unikernel that guards a collection of Bitcoins.  The Bitcoins can only be claimed by breaking through the unikernel security (for example, by compromising the SSL/TLS stack) and then moving the coins.  If the Bitcoins are indeed transferred away, then the public transaction record will reflect that there is a security hole to be fixed.  The contest has been running since February 2015 and the Bitcoins have not yet been taken.


Linux container vs. unikernel security

Linux, as well as Linux containers and Docker images, rely on a fairly heavyweight core OS to provide critical services. Because of this, a vulnerability in the Linux kernel affects every Linux container, Wick said. Instead, using an approach similar to a la carte menus, unikernels only include the minimal functionality and systems needed to run an application or service, all of which makes writing an exploit to attack them much more difficult.

Cloudius Systems, which is running a private beta of OSv, which it tags as the operating system for the cloud, recognizes that progress is being made on this front.

“Rocket is indeed an improvement over Docker, but containers aren’t a multi-tenant solution by design,” said CEO Dor Laor. “No matter how many SELinux Linux policies you throw on containers, the attack surface will still span all aspects of the kernel.”

Martin Lucina, who is working on the Rump Kernel software stack, which enables running existing unmodified POSIX software without an operating system on various platforms, including bare metal embedded systems and unikernels on Xen, explains that unikernels running on the Xen Project hypervisor benefit from the strong isolation guarantees of hardware virtualization and a trusted computing base that is orders of magnitude smaller than that of container technologies.

“There is no shell, you cannot exec() a new process, and in some cases you don’t even need to include a full TCP stack. So there is very little exploit code can do to gain a permanent foothold in the system,” Lucina said.

The key takeaway for organizations worried about security is that they should treat their infrastructure in a less monolithic way. Unikernels allow for the careful management of particularly critical portions of an organization’s data and processing needs. While it does take some extra work, it’s getting easier every day as more developers work on solving challenges with orchestration, logging and monitoring. This means unikernels are coming of age just as many developers are getting serious about security as they begin to build scale-out, distributed systems.

For those interested in learning more about unikernels, the entire series is available as a white paper titled “The Next Generation Cloud: The Rise of the Unikernel.”

Read part 1: 7 Unikernel Projects to Take On Docker in 2015

Updates to Xen Project Security Process

lockBefore Christmas, the Xen Project ran a community consultation to refine its Security Problem Response Process.  We recently approved changes that, in essence, are tweaks to our existing process, which is based on a Responsible Disclosure philosophy.

Responsible Disclosure and our Security Problem Response Process are important components of keeping users of Xen Project-based products and services safe from security exploits. Both ensure that products and services can be patched by members of the pre-disclosure list before details of a vulnerability are published and before said vulnerabilities can be exploited by black hats.

The changes to our response process fall into a number of categories:

  • Clarify whether security updates can be deployed on publicly hosted systems (e.g. cloud or hosting providers) during embargo
  • Sharing of information among pre-disclosure list members
  • Applications procedure for pre-disclosure list membership

The complete discussion leading to the changes, the concrete changes to the process, and the voting records supporting the changes are tracked in Bug #44 -Security policy ambiguities. On February 11, 2015, the proposed changes were approved in accordance with Xen Project governance. Note that some process changes are already implemented, whereas others are waiting for implementation tasks (e.g. new secure mailing lists) before they can fully be put in place. We have however updated our Security Problem Response Process as the most important elements of the process are already in place.

Process Changes Already in Operation

The updated policy makes explicit whether or not patches related to a Xen Security Issue can be deployed by pre-disclosure list members. The concrete policy changes can be found here and here. In practice, every Xen Security Advisory will contain a section such as:


Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security

This section will clarify whether deploying fixed versions of Xen during the embargo is allowed. Any restrictions will also be stated in the embargoed advisory. The Security Team will impose deployment restrictions only to prevent the exposure of security vulnerability technicalities, which present a significant risk of vulnerability rediscovery (for example, by visible differences in behaviour). Such situations have been, and are expected, to be rare.

Changes to Application Procedure for Pre-disclosure List Membership

We also made additional changes related to streamlining and simplifying the process of applying for pre-disclosure list membership. Detailed policy changes can be found here and here. Moving forward, future applications to become members of the Xen Project pre-disclosure list have to be made publicly on the predisclosure-applications mailing list. This enables Xen Project community members to provide additional information and also is in line with one of our community’s core principles: transparency. In addition, we’ve clarified our eligibility criteria to make it easier for the Xen Project Security Team, as well as observers of the mailing list, to verify whether applicants are eligible to become members of the list.

Process Changes Not Fully Implemented:  Sharing of Information Among Pre-disclosure List Members

Finally, members of the pre-disclosure list will be explicitly allowed to share fixes to embargoed issues, analysis, and other relevant information with the security teams of other pre-disclosure members. Information sharing will happen on a private and secure mailing list hosted by the Xen Project.  Note the list is not yet in place; more details here.