
Stealthy monitoring with Xen altp2m

One of the core features that differentiates Xen from other open-source hypervisors is its native support for stealthy and secure monitoring of guest internals (also known as virtual machine introspection [1]). Xen 4.6, released last autumn, introduced several new features that improve this subsystem, a cleaned-up, optimized API and ARM support being just some of the biggest items on the list. As part of this release, a team from Intel also successfully added a new and unique feature that makes stealthy monitoring on Xen even better: altp2m. In this blog entry we will take a look at what it’s all about.

In Xen’s terminology, p2m stands for the memory management layer that handles the translation from guest physical memory to machine physical memory. This translation is critical for safely partitioning the real memory of the machine between Xen and the various running VMs, so as to ensure a VM can’t access the memory of another without permission. There are several implementations of this mechanism, including one with hardware support via Intel Extended Page Tables (EPT), available to HVM and PVH guests. In Xen’s terminology, this is called Hardware Assisted Paging (hap). In this implementation the hypervisor maintains a second pagetable, similar to the one 64-bit operating systems use, dedicated to the p2m translation. All open-source hypervisors that use this hardware assisted paging method use a single EPT per virtual machine to handle the translation, as most of the time the memory of the guest is assigned at VM creation and doesn’t change much afterwards.
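
To make this concrete, the snippet below is a minimal sketch of how a dom0 tool addresses guest memory by guest frame number, relying on the p2m layer to resolve the real machine frame. It assumes an HVM guest with an illustrative domain ID and frame number, and the libxc API as of Xen 4.6.

    /* Minimal sketch (assumptions: HVM guest with domain ID 1, libxc as of
     * Xen 4.6): read the first bytes of a guest frame from dom0. The p2m
     * layer resolves the guest frame number to the machine frame for us. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <xenctrl.h>

    int main(void)
    {
        uint32_t domid = 1;          /* illustrative domain ID */
        unsigned long gfn = 0x1000;  /* illustrative guest frame number */
        unsigned char buf[16];

        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        if (!xch)
            return 1;

        /* For HVM guests the last argument is a guest frame number; the
         * hypervisor walks the p2m/EPT to find the backing machine page. */
        void *page = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE,
                                          PROT_READ, gfn);
        if (page) {
            memcpy(buf, page, sizeof(buf));
            printf("gfn 0x%lx starts with byte 0x%02x\n", gfn, buf[0]);
            munmap(page, XC_PAGE_SIZE);
        }

        xc_interface_close(xch);
        return 0;
    }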

Xen altp2m is the first implementation that changes this setup by allowing Xen to create more than one EPT for each guest. Interestingly, Intel hardware has been capable of maintaining up to 512 EPT pointers in the VMCS since the Haswell generation of CPUs, but no hypervisor made use of this capability until now. This changed in Xen 4.6, where we can now create up to 10 EPTs per guest. The primary reason for this feature is to use it with the #VE and VMFUNC extensions.

It can also be used by external monitoring applications via the Xen vm_event system.
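
To give a feel for the API, here is a hedged sketch of how an external tool might turn altp2m on for a guest and create a restricted view using the libxc wrappers introduced in Xen 4.6. It assumes altp2m has been enabled for the guest at boot, and the domain ID is an illustrative placeholder.

    /* Hedged sketch: creating and activating an altp2m view with the
     * libxc wrappers from Xen 4.6. Assumes altp2m was enabled for this
     * guest; the domain ID is an illustrative placeholder. */
    #include <stdio.h>
    #include <xenctrl.h>

    int main(void)
    {
        uint32_t domid = 1;  /* illustrative domain ID */
        uint16_t view_id;

        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        if (!xch)
            return 1;

        /* Turn on the altp2m machinery for the domain. */
        if (xc_altp2m_set_domain_state(xch, domid, 1))
            goto out;

        /* Create a view whose pages default to read/execute, so any
         * write by the guest traps out to the monitor. */
        if (xc_altp2m_create_view(xch, domid, XENMEM_access_rx, &view_id))
            goto out;

        /* Make the restricted view the active one. */
        if (!xc_altp2m_switch_to_view(xch, domid, view_id))
            printf("guest now running on altp2m view %u\n", view_id);

    out:
        xc_interface_close(xch);
        return 0;
    }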

Why altp2m is a game-changer

Altp2m is a game-changer for applications performing purely external monitoring because it simplifies the monitoring of multi-vCPU guests. The EPT layer has been successfully used in stealthy monitoring applications to track the memory accesses made by the VM from a safe vantage point, by restricting the types of access the VM may perform on various memory pages. Since EPT permission violations trap into the hypervisor, the VM receives no indication that anything out of the ordinary has happened. While this method allows stealthy tracing of the guest’s R/W/X memory accesses, the memory permission needs to be relaxed afterwards in order to allow the guest to continue execution. When a single EPT is shared across multiple running vCPUs, relaxing the permissions to allow one vCPU to continue may inadvertently allow another one to perform the memory access we would otherwise want to track. While under normal circumstances such a race condition may rarely occur, malicious code could easily use it to hide some of its actions from a monitoring application.

Solutions to this problem already exist. For example, we can pause all vCPUs while the one violating the access is single-stepped. This approach, however, introduces heavy overhead just to avoid a race condition that may rarely occur in practice. Alternatively, one could emulate the instruction that violated the EPT permission without relaxing the EPT access permissions, as Xen’s built-in emulator doesn’t use the EPT to access guest memory. This solution, while supported in Xen, is not ideal either, as Xen’s emulator is incomplete and is known to have issues that can lead to guest instability [2]. Furthermore, over the years emulation has been a hotbed of security issues in many hypervisors (including Xen [3]), so building security tools on emulation is simply asking for trouble. It can be handy, but it should be used only when no other option is available.

Xen’s altp2m system changes this situation quite significantly. With multiple EPTs we can define different access permissions in each table, and these tables can be easily swapped around by changing the active EPT index in the VMCS. When the guest makes a monitored memory access, instead of having to relax the access permission, Xen can simply switch to an EPT (called a view) that allows the operation to continue. Afterwards, the permissive view can be switched back to the restricted view to continue monitoring. Since each vCPU has its own VMCS where this switching is performed, the monitoring can be made specific to each vCPU, without having to pause any of the others or having to emulate the access, and all without the guest noticing any of this switching. A truly simple and elegant solution.
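
As a rough illustration of this pattern, the sketch below restricts a single page in a dedicated view while view 0 stays fully permissive; the intended event-handling flow (switch the faulting vCPU to view 0, single-step once, switch back) is described in the comments, since full vm_event ring handling is beyond a short example. The calls shown are the Xen 4.6 libxc wrappers, but treat this as a sketch rather than a complete monitor.

    /* Hedged sketch of the switch-instead-of-relax pattern. View 0 is the
     * default, permissive view; `view` forbids writes to one page. */
    #include <xenctrl.h>

    int arm_write_trap(xc_interface *xch, uint32_t domid, xen_pfn_t gfn,
                       uint16_t *view)
    {
        if (xc_altp2m_set_domain_state(xch, domid, 1))
            return -1;
        if (xc_altp2m_create_view(xch, domid, XENMEM_access_rwx, view))
            return -1;
        /* Restrict the page in this view only; view 0 is untouched. */
        if (xc_altp2m_set_mem_access(xch, domid, *view, gfn,
                                     XENMEM_access_rx))
            return -1;
        return xc_altp2m_switch_to_view(xch, domid, *view);
    }

    /* When a violation event arrives for one vCPU, the monitor's vm_event
     * response can request the view switch for that vCPU alone (in Xen 4.6:
     * set VM_EVENT_FLAG_ALTERNATE_P2M with altp2m_idx = 0, together with
     * VM_EVENT_FLAG_TOGGLE_SINGLESTEP), and on the subsequent single-step
     * event switch back to the restricted view the same way. No permission
     * is ever relaxed, so no other vCPU can slip through. */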

Other introspection methods for stealthy monitoring

EPT-based monitoring is not the only introspection technique used for stealthy monitoring. For example, the Xen-based DRAKVUF Dynamic Malware Analysis System [4] uses it in combination with an additional technique to maximum effect. The main motivation is that EPT-based monitoring is known to introduce significant overhead, even with altp2m, because the granularity of the monitoring is a memory page (4KB). For example, if the monitoring application is really only interested in when a function entry point is called, EPT-based monitoring creates a lot of “false” events when the same page is accessed for the rest of the function’s code.

This can be avoided by enabling the trapping of debug instructions into the hypervisor, a built-in feature of Intel CPUs that Xen exposes to third-party applications. This method is used in DRAKVUF, which writes breakpoint instructions into the guest’s memory at code locations of interest. Since we only get an event for precisely the code location we are interested in, this method effectively reduces the overhead. The trade-off, however, is that unlike EPT permissions, the breakpoints are visible to the guest. Thus, to hide their presence, these pages need to be further protected by making them execute-only in the EPT. This allows DRAKVUF to remove the breakpoints before in-guest code-integrity checking mechanisms (like Windows PatchGuard) can access the page. While with altp2m the EPT permissions can be safely used on multi-vCPU systems, using breakpoints presents a similar race condition: the breakpoint hit by one vCPU has to be removed to allow the guest to execute the instruction that was originally overwritten, potentially allowing another vCPU to execute it as well without notice.

Fortunately, altp2m has another neat feature that can be used to solve this problem. Besides allowing the memory permissions to differ in the various altp2m views, it also allows the mapping itself to change! The same guest physical memory can be set up to be backed by different pages in different views. With this feature we can really think of guest physical memory as “virtual”: where it is mapped depends on which view the vCPU is running on. This feature allows us to hide the presence of the breakpoints in a brand new way. First, we create a complete shadow copy of the memory page where a breakpoint is going to be written, and write the breakpoint only into this shadow copy. Then, using altp2m, we set up a view in which the guest physical address of the page is mapped to our shadow copy. The guest continues to access its physical memory as before, but underneath it is now using the trapped shadow copy. When the breakpoint is hit, or if something tries to scan the code, we simply switch to the unaltered view for the duration of a single-step, then switch back to the trapped view. This allows us to hide the presence of the breakpoints specifically on each vCPU, all without having to pause any of the other vCPUs or having to emulate. The first open-source implementation of this tracing method has already been merged into the DRAKVUF Malware Analysis System and is available as a reference implementation for those interested in more details.
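
For those curious how this looks in practice, here is a hedged sketch of the remapping step using the Xen 4.6 libxc wrappers (xc_altp2m_change_gfn in particular), loosely modelled on what DRAKVUF does. The shadow frame is assumed to have been added to the guest beforehand, error handling is trimmed, and all names are illustrative.

    /* Hedged sketch: hide a breakpoint by backing orig_gfn with a shadow
     * copy in the monitored altp2m view. Assumes shadow_gfn was populated
     * into the guest beforehand (e.g. xc_domain_setmaxmem() followed by
     * xc_domain_populate_physmap_exact()). */
    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <xenctrl.h>

    int hide_breakpoint(xc_interface *xch, uint32_t domid, uint16_t view,
                        xen_pfn_t orig_gfn, xen_pfn_t shadow_gfn,
                        unsigned int offset)
    {
        void *orig = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE,
                                          PROT_READ, orig_gfn);
        void *shadow = xc_map_foreign_range(xch, domid, XC_PAGE_SIZE,
                                            PROT_READ | PROT_WRITE,
                                            shadow_gfn);
        if (!orig || !shadow)
            return -1;

        /* Copy the original code page and plant INT3 only in the copy. */
        memcpy(shadow, orig, XC_PAGE_SIZE);
        ((uint8_t *)shadow)[offset] = 0xCC;
        munmap(orig, XC_PAGE_SIZE);
        munmap(shadow, XC_PAGE_SIZE);

        /* In the monitored view, remap orig_gfn to the shadow copy... */
        if (xc_altp2m_change_gfn(xch, domid, view, orig_gfn, shadow_gfn))
            return -1;

        /* ...and make it execute-only there, so in-guest reads (e.g. a
         * code-integrity scan) trap before they can see the 0xCC byte. */
        return xc_altp2m_set_mem_access(xch, domid, view, orig_gfn,
                                        XENMEM_access_x);
    }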

Conclusion

As we can see, Xen continues to be at the forefront of advancing the development of virtualization-based security applications and allowing third-party tools to create some very exotic setups. This flexibility is what’s so great about Xen, and why it will continue to be a trend-setter for the foreseeable future.

References

[1] Virtual Machine Introspection
[2] xen-devel@: Failed vm entry with heavy use of emulator
[3] Hardening Hypervisors Against VENOM-Style Attacks
[4] DRAKVUF Malware Analysis System (drakvuf.com)
[5] Stealthy, Hypervisor-based Malware Analysis (Presentation)

Best Quality and Quantity of Contributions in the New Xen Project 4.6 Release

I’m pleased to announce the release of the Xen Project Hypervisor 4.6. This release focused on improving code quality, security hardening, enablement of security appliances, and release cycle predictability; this is the most punctual release we have ever had. We had a significant amount of contributions from cloud providers, software vendors, hardware vendors, academic researchers and individuals to help with this release. We continue to strive to make the Xen Project Hypervisor the most secure open source hypervisor, to match the security challenges in cloud computing and in embedded and IoT use cases. We are also continuing to improve performance and scalability for our users, and aim to continuously bring many new features to our users in a timely manner.

Despite an increase in new features compared to previous releases, the Xen Project Hypervisor codebase has grown by only 6KLOC compared to Xen 4.5. At the same time, we were able to increase the number of changesets integrated into Xen from 178/month (1812 in total) for Xen 4.5 to 259/month (2247 in total). In addition, the quality of Xen 4.6 was higher than in the past, enabling the CentOS 7 Virtualization SIG and XenServer to include Xen in their upcoming releases.

To make it easier to understand the changes during this release cycle, I have grouped the major updates into several categories:

  • Hypervisor
  • Toolstack
  • Xen Project Test Lab
  • Linux, FreeBSD and other OSes that utilise the new features
  • Greater Ecosystem

General Hypervisor Updates

  • The memory event subsystem has been reworked and extended into a new VM event subsystem. The new VM event subsystem supports both the ARM and x86 architectures. It can be used to intercept all sorts of VM events, such as memory accesses, register accesses and more. This enables security applications such as zero-footprint guest introspection, host-wide monitoring and many others; a minimal sketch of enabling it from a dom0 tool follows this list. Have a look at Tamas’ and Steve’s presentations on this topic for more insights.
  • The Xen Security Modules (XSM) now have a default policy that is regularly tested in the Xen Project Test Lab to make sure it is not broken by mistake. This will enable us to switch on XSM by default in the future.
  • vTPM 2.0 support has been contributed by Intel and BitDefender [1]. To learn more about how to use vTPM and how it can make your host more secure, go to our wiki.
  • Grant table scalability has been improved significantly by using finer-grained locks in grant tables. In some scenarios, aggregate intra-host network throughput has been shown to improve by 100%. Other I/O drivers in Xen should potentially show significant performance improvements as well.
  • We introduced ticket locks to improve fairness, which provides better support for massive workloads of up to hundreds or thousands of VMs on a single host.
  • The unused SEDF scheduler has been removed from the hypervisor and toolstack. The Xen Project is committed to actively remove unused code to keep the code base small and to minimize security risks.
  • We moved Mini-OS out of the Xen code base into its own tree. Mini-OS started as a demonstration OS, but has received significant contributions in recent years (e.g. it is used by many unikernels). We decided to treat it as a separately maintained independent project with its own mailing list and code tree to make it easier to consume. We hope this will help unikernel communities to more easily consume and contribute to Mini-OS, while reducing the Xen Project Hypervisor footprint.
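
As promised above, here is a hedged sketch of what registering for VM events looks like from a dom0 tool with the Xen 4.6 libxc interface; ring-buffer consumption is omitted and the domain ID is an illustrative placeholder.

    /* Hedged sketch: enabling VM event monitoring with Xen 4.6 libxc.
     * The event ring must then be consumed via libxenevtchn (omitted). */
    #include <stdio.h>
    #include <xenctrl.h>
    #include <xen/vm_event.h>

    int main(void)
    {
        uint32_t domid = 1, port;  /* illustrative domain ID */

        xc_interface *xch = xc_interface_open(NULL, NULL, 0);
        if (!xch)
            return 1;

        /* Map the shared vm_event ring and obtain its event channel. */
        void *ring = xc_monitor_enable(xch, domid, &port);
        if (!ring) {
            xc_interface_close(xch);
            return 1;
        }

        /* Ask for a synchronous event on every CR3 (address space) switch. */
        xc_monitor_write_ctrlreg(xch, domid, VM_EVENT_X86_CR3,
                                 1 /* enable */, 1 /* sync */,
                                 1 /* only on change */);

        /* Trap writes to the guest's first 256 pages. */
        xc_set_mem_access(xch, domid, XENMEM_access_rx, 0, 256);

        printf("vm_event ring ready, event channel port %u\n", port);
        /* ... bind `port`, consume requests from `ring`, send responses ... */

        xc_monitor_disable(xch, domid);
        xc_interface_close(xch);
        return 0;
    }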

x86-specific Hypervisor Updates

  • The Intel alternate p2m framework is a new capability for VM introspection, security and privacy in Xen that gives Xen the ability to host up to 10 alternate guest-physical-to-machine memory mappings for a specific guest domain. It is one of the key technologies enabling zero-footprint VM introspection. It can also help Xen to implement faster NFV applications.
  • Intel Page Modification Logging Technology offloads page dirty logging to hardware. Microbenchmarks show about a 7% improvement in SPECjbb, and it should be particularly beneficial for Live Migration.
  • Intel Cache Allocation Technology allows system administrators to assign more L3 cache capacity to individual VMs, resulting in lower latency and higher performance for high-priority workloads such as NFV, real-time and video-on-demand applications.
  • Intel Memory Bandwidth Monitoring allows system administrators to identify memory bandwidth saturation on a Xen host that may be caused by several memory-intensive VMs running on the same host. Taking corrective actions, such as migrating VMs to a different Xen host, increases scalability and performance in the data center.
  • Intel Reserved Memory Region (RMRR) reporting provides a mechanism to report and reserve memory regions for legacy devices to allow for safe device passthrough.
  • Virtual Performance Monitoring Unit support makes it possible to profile the Xen Project Hypervisor with the Linux perf tool. Note that some work still needs to be completed within Linux to make perf fully functional.
  • Virtual NUMA for HVM guest is a continuation of the NUMA work performed in Xen 4.5 and previous releases. In this release, we exposed the functionality through the XL toolstack and added firmware changes to make the feature fully functional.

ARM-specific Hypervisor Updates

  • The supported number of VCPUs has been increased from 8 to 128 VCPUs on ARM64 platforms.
  • Passthrough for non-PCI devices allows users to pass through devices via partial device trees. Full support for PCI device passthrough is currently being worked on.
  • ARM GICv2 on GICv3 support.
  • 32 bit userspace in 64 bit guest support.
  • OVMF for ARM contributed by Linaro.
  • 64K page ARM guest support.
  • Support for the following new Hardware Platforms has been added: Renesas R-Car Gen2, Thunder X, Huawei hip04-d04 and Xilinx ZynqMP SoC.

Toolstack Updates

  • Live Migration support in libxc/libxl has been replaced with a completely new implementation (Migration v2). The new version respects the different layers in the Xen software stack and has been designed to be more robust and extensible, to better support next-generation infrastructures and work planned for subsequent hypervisor releases.
  • Remus – our High Availability solution – has been reworked and is now based on Migration v2.
  • Libxl asynchronous operations can now be cancelled. This allows libxl users to cancel long-running asynchronous operations, which benefits tool stacks such as libvirt and integration with cloud orchestration stacks.
  • Improved SPICE/QXL support.
  • AHCI disk controller support.
  • A new host I/O topology query interface gives upper layers in the software stack the ability to identify the I/O topology of the underlying hardware platform.
  • Xenalyze, a tool for analyzing hypervisor trace buffers that can be used for debugging and optimization, has been added to the Xen Project codebase as a maintained feature.

Xen Project Test Lab Updates

During the Xen 4.6 release cycle, the Xen Project created an Advisory Board funded Continuous Integration Test Lab. It currently has 24 hosts and is going to be expanded in the future. This has led to significant improvements in Xen code quality and has allowed the project to expand automated test coverage. The number of test cases doubled during the 4.6 cycle. Some interesting new test cases that have been added are:

  • XSM testing.
  • Stub domain testing.
  • VM migration between two hosts using libvirt.
  • Live migration between hosts running different Xen versions, which will help identify any breakage in our migration code or specification.
  • Tests with different disk formats, such as QCOW2, VHD and raw.

More test cases are in the pipeline, including test cases for OpenStack’s devstack, performance and scalability tests, FreeBSD Dom0, and more.

Linux, FreeBSD and other OSes

During the Xen 4.6 release cycle, we made significant improvements to the major operating systems we rely on, to improve interoperability. Some highlights of Linux kernel development, spanning Linux 3.18 to 4.3, were:

  • Xen blkfront multiqueue and multipage ring support.
  • Xen SCSI frontend and backend support.
  • VPMU kernel support.
  • Performance improvements in the mmap call.
  • The P2M in PV guests can now address 512GB or more.

For FreeBSD there were these improvements:

  • Experimental PVH Dom0/DomU support.
  • Removal of classic i386 PV port by FreeBSD developer John Baldwin.
  • Blkfront indirect descriptor support by FreeBSD developer Colin Percival.
  • Removal of broken FreeBSD specific blkfront/back extensions.
  • ARM32 and ARM64 guest support are underway.

Greater Ecosystem

Summary

With dozens of major improvements, many more bug fixes and small enhancements, and efforts across other projects and the greater ecosystem, Xen 4.6 reflects a thriving community around the Xen Project Hypervisor. We are extremely proud of achieving our highest release quality yet while increasing development velocity. In particular, our latest security related features enable Xen to compete in the security appliance market and help answer some of the difficult questions regarding security in the cloud era.

We set out at the beginning of this release cycle to foster greater collaboration among vendors, individual developers, upstream maintainers, other projects and distributions. During this release cycle we continued to see an increasing influx of patches and newcomers. As the release manager, I would like to thank everyone for their contributions (either in the form of patches, bug reports or packaging efforts) to Xen. This release wouldn’t have happened without contributions from so many people around the world. Please check out our 4.6 contributor acknowledgement page.

The source can be found in the xen.git tree (tag RELEASE-4.6.0) or downloaded as a tarball from our website, where more information can also be found.


[1] Note that when this article was first published, the contribution was mistakenly attributed to the US National Security Agency instead of BitDefender.

Xen Project Test Day for 4.6 RC4 Scheduled for October 1

Our Fourth (and Possibly Final) 4.6 Release Candidate to be Tested This Thursday

Our Xen Project Test Days help ensure that upcoming releases are ready for production, beyond what our automated testing through our Test Lab can accomplish. It is particularly important that our users test out the upcoming release in their own environments. We rely on your functional testing of features, stress-testing, edge case testing, and performance testing to prove that the code is ready for consumption. This is also your opportunity to verify that the new code will continue to work well in your particular situation.

Xen Project 4.6 Release Candidate 4 Testing

Continuing our current release cycle, the Test Day for Xen Project 4.6 RC4 has been set for Thursday, October 1, 2015.

This may be the final RC before release, so the time to test the software is now!

Test Day Information

Additional information about Test Days can be found here:

Join us on Thursday in #xentest on Freenode IRC!
Test a Release Candidate! Help others, get help! And have fun!

Our Next Test Day is September 15: Xen Project 4.6 RC3

The Third 4.6 Release Candidate to be Tested on Tuesday

Our Xen Project Test Days help ensure that upcoming releases are ready for production, beyond what our automated testing through our Test Lab can accomplish. It is particularly important that our users test out the upcoming release in their own environments. We rely on your functional testing of features, stress-testing, edge case testing, and performance testing to prove that the code is ready for consumption. This is also your opportunity to verify that the new code will continue to work well in your particular situation.

Xen Project 4.6 Release Candidate 3 Testing

Continuing our current release cycle, the Test Day for Xen Project 4.6 RC3 has been set for Tuesday, September 15, 2015.

Additional Test Days are expected to be scheduled roughly every other week until Xen Project 4.6 is ready for release.

Test Day Information

Additional information about Test Days can be found here:

Join us on Tuesday in #xentest on Freenode IRC!
Test a Release Candidate! Help others, get help! And have fun!
If you can’t make Tuesday, remember that Test and Issue Reports are welcome any time.

Xen Project 4.6 RC2 Test Day is September 1, 2015

Join 4.6 Release Candidate Testing on September 1, 2015

Although the Xen Project performs automated testing through the project’s Test Lab, we also depend on manual testing of release candidates by our users. Our Test Days help ensure that upcoming releases are ready for production. It is particularly important that our users test out the upcoming release in their own environments. In addition, functional testing of features (in particular those which can’t be automated), stress-testing, edge case testing and performance testing are important for a new release.

Xen 4.6 Release Candidate Testing

A few weeks ago, Xen 4.6 went into code freeze, and Xen 4.6 RC2 is now ready for testing. With this in mind, the Test Day for Xen 4.6 RC2 has been set for next Tuesday, September 1, 2015.

Subsequent Test Days are expected to be scheduled roughly every other week until Xen 4.6 is ready for release.

Test Day Information

General Information about Test Days can be found here:

Join us on Tuesday in #xentest on Freenode IRC!
Test a Release Candidate! Help others, get help! And have fun!
If you can’t make Tuesday, remember that Test and Issue Reports are welcome any time.

Project Raisin – Raise Xen!

It all started with pvgrub2: it was March 2015 and I wanted to add grub2 to the Xen build system. We were already building grub-legacy as part of the Xen build, so that we could produce a pvgrub binary to be used to boot PV guests. After Vladimir ‘phcoder‘ Serbinenko’s good work on grub2, the latest and greatest upstream grub2 could be built with Xen support and used to boot PV guests. It made perfect sense to add grub2 to the Xen build system too, right? Maybe not. Unexpectedly some key Xen Project contributors pushed back: “there doesn’t seem to be a good reason for cloning and building yet another third-party project as part of the Xen build”, wrote David Vrabel.

Conflicting requirements

It was then that I realized that we have two contrasting sets of requirements: on one hand, we want to support users that clone xen-unstable, build everything from source, and expect the system to be fully ready after typing ./configure; make; make install. On the other hand, we also want to support distros and product groups that take Xen releases and integrate them into their Linux distros or enterprise build systems. The former want things like grub2 to be part of the xen-unstable build, because the grub2 package provided by their distro doesn’t necessarily come with Xen support enabled. The distro packagers, on the other hand, are already building a grub2 package and certainly don’t want xen-unstable to go and clone grub2 again. They probably abhor the whole idea of xen-unstable git-cloning external trees without their explicit assent. In fact, they had been carrying patches to make sure xen-unstable doesn’t clone anything else “behind their back”, until we provided build options to disable all the third-party builds.

Raisin: Xen’s DevStack

How to find a solution that would make both camps happy? Surely others must have had the same issue. Is there another open source project that has to build several separate components in order to be fully functional? Yes, of course, there are many. One of them is OpenStack and it solves the problem by providing a set of scripts called DevStack, which build and setup the system from scratch.

This is where “Raisin” comes from. I announced the new project on the 31st of March 2015. The idea is that Raisin takes care of building Xen and all the other components which are required for a fully functional Xen system but don’t belong to xen-unstable: for example QEMU, SeaBIOS and, of course, grub2. Users that build everything from source will find in Raisin a single place where they can build all the latest and greatest Xen stuff with a single command. Raisin can be very useful for setting up a development environment too. Distro people can refer to Raisin as the most common way to build, install and configure Xen and related components, but they are unlikely to actually use it to build their packages. Raisin helps Xen developers improve the boundaries and interfaces between Xen and external components, by making such boundaries clearer and more explicit. Things like QEMU and SeaBIOS, currently cloned and built by xen-unstable, will be moved out to Raisin, making both Xen maintainers and distro packagers happier. Other Xen-related components that are good to have but not actually required, such as libvirt, will find their place in Raisin too.

Raisin: where we are, what’s next

After a few busy months of development, we now have a set of bash scripts that can install dependencies and build Xen, QEMU, qemu-traditional, SeaBIOS, OVMF, Grub2, Libvirt and Linux with a single command. All you need to do is edit the config file, type raise -y build, go get a coffee, and everything will be ready when you come back. Raisin is not tied to a specific version of Xen. In fact, one can choose any git tag or commit id newer than Xen 4.5 (RELEASE-4.5.0 is the git tag for the Xen 4.5 release) and Raisin will build it. Other commands are available to install and configure the system with the most common settings. Take a look at the README for an introduction to the command line tool.

During the last few weeks I have been working on integrating Raisin into OSSTest, the automated testing framework run by the Xen Project. I am currently adding a new test to validate Raisin itself, but going forward it makes sense to actually use Raisin to build Xen, QEMU and anything else OSSTest needs, similarly to what DevStack does for the OpenStack gate.

Making testing easier and accessible to everybody

Speaking of tests, this is another area where Raisin can help greatly. I have always liked the idea of providing a set of unit and functional tests, quick and simple to run, that can be executed by any Xen contributor to validate their changes before sending a patch to xen-devel. However, we didn’t really have a place to put them. OSSTest is too big and too tightly coupled to the Xen Project Test Lab infrastructure for this use case, and the last thing xen-unstable needs is more scripts. Raisin, on the other hand, is at the right abstraction level to run functional tests for the components it already builds. I introduced a few simple tests, which can stack on top of each other, to create busybox-based PV and HVM guests. I plan to continue adding more tests, then expose them to OSSTest via Raisin, so that they will be continuously run by the Xen Project Test Lab. At the same time, anybody can still manually execute them on their test box with a single raise test command. I am hoping to be able to start asking contributors to run Raisin tests before submitting patches early in the next release cycle. If you use Xen and know bash scripting, you should consider writing a Raisin test to validate your favourite functionality today!

Raisin, you didn’t know you needed it, you can’t live without it ;-)

Resources

The Raisin git repository is available here. The README is up to date and describes the command line interface. We also have a quickstart guide on our wiki. Raisin patches are discussed on xen-devel and follow the regular Xen development process.