The Graphics Processing Unit (GPU) has become a fundamental building block in today’s computing environment, accelerating tasks from entertainment applications (gaming, video playback, etc.) to general purpose windowing (Windows* Aero*, Compiz Fusion, etc.) and high performance computing (medical image processing, weather forecasting, computer-aided design, etc.).
Today, we see a trend toward moving GPU-accelerated tasks to virtual machines (VMs). Desktop virtualization simplifies the IT management infrastructure by moving a worker’s desktop to the VM. At the same time, there is also demand for buying GPU computing resources from the cloud. Efficient GPU virtualization is required to meet these growing demands.
Enterprise applications (mail, browser, office, etc.) usually demand a moderate level of GPU acceleration capability. When they are moved to a virtual desktop, our integrated GPU can easily accommodate the acceleration requirements of multiple instances.
Let’s first look at the architecture of Intel Processor Graphics:
The render engine represents the GPU acceleration capabilities with fixed pipelines and execution units, which are used through GPU commands queued in command buffers. The display engine routes data from graphics memory to external monitors, and contains the state of display attributes (resolution, color depth, etc.). The global state represents all the remaining functionality, including initialization, power control, etc. Graphics memory holds the data used by the render engine and display engine.
Intel Processor Graphics uses system memory as the graphics memory, through the graphics translation table (GTT). A single 2GB global virtual memory (GVM) space is available to all GPU components through the global GTT (GGTT). In addition, multiple per-process virtual memory (PPVM) spaces are created through the per-process GTTs (PPGTTs), extending the limited GVM resource and enforcing process isolation.
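To make the role of the translation tables more concrete, here is a minimal sketch of how a graphics virtual address could be resolved to a system memory address. It is a toy model, not the hardware format: the class name `TranslationTable`, the single-level layout, the 4 KiB page size, the example addresses, and the choice to give each PPVM space the same size as the GVM space are all assumptions made for illustration.

```python
PAGE_SHIFT = 12                  # assumed 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
GVM_SIZE = 2 << 30               # the single 2 GB global virtual memory space


class TranslationTable:
    """Toy single-level table: graphics virtual page -> system memory page."""

    def __init__(self, size_bytes):
        self.size_bytes = size_bytes
        self.entries = {}        # sparse: only mapped pages are stored

    def map_page(self, gfx_addr, sys_addr):
        if gfx_addr % PAGE_SIZE or sys_addr % PAGE_SIZE:
            raise ValueError("addresses must be page aligned")
        if not 0 <= gfx_addr < self.size_bytes:
            raise ValueError("graphics address outside this address space")
        self.entries[gfx_addr >> PAGE_SHIFT] = sys_addr >> PAGE_SHIFT

    def translate(self, gfx_addr):
        if not 0 <= gfx_addr < self.size_bytes:
            raise ValueError("graphics address outside this address space")
        frame = self.entries.get(gfx_addr >> PAGE_SHIFT)
        if frame is None:
            raise LookupError("unmapped graphics address")
        # Keep the offset within the page, swap the page number for the frame.
        return (frame << PAGE_SHIFT) | (gfx_addr & (PAGE_SIZE - 1))


# One global table shared by all GPU components.
ggtt = TranslationTable(GVM_SIZE)
ggtt.map_page(0x0010_0000, 0x7654_3000)

# One table per process: the same graphics address can point at
# different system pages, which is what gives process isolation.
ppgtt_a = TranslationTable(GVM_SIZE)
ppgtt_b = TranslationTable(GVM_SIZE)
ppgtt_a.map_page(0x0000_1000, 0x1111_1000)
ppgtt_b.map_page(0x0000_1000, 0x2222_2000)
```

In this model, `ppgtt_a` and `ppgtt_b` resolve the same graphics address to different system pages, while every component that goes through `ggtt` sees one shared mapping.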
Graphics Virtualization Technologies
Several technologies achieve graphics virtualization, as illustrated in the image below, with more hardware acceleration toward the right.
Device emulation is mainly used in server virtualization, with emulation of an old VGA display card. QEMU is the most widely used vehicle. Full emulation of a modern GPU is impractical, because of the complexity involved and the extremely poor performance that results.
API forwarding implements a frontend/backend driver pair. The frontend driver forwards high-level DirectX/OpenGL API calls from the VM to the backend driver in the host through an optimized inter-VM channel. Multiple backend drivers behave like normal 3D applications in the host, so a single GPU can be multiplexed to accelerate multiple VMs. However, differences between the VM and host graphics stacks easily lead to reduced performance or compatibility issues. Because it is hardware-agnostic, this is the most widely used technology so far. Actual implementations vary, depending on the level where forwarding happens. For example, VMGL directly forwards GL commands, while VMware vGPU presents itself as a virtual device, with high-level DirectX calls translated to its private SVGA3D protocol. Another recent example is Virgil, with its experimental virtual 3D support for QEMU.
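The frontend/backend split can be sketched in a few lines. This is only an illustration of the forwarding pattern, not how VMGL, VMware vGPU, or Virgil are implemented: the class names, the JSON wire format, the `draw_triangles` call, and the use of a plain function call in place of the inter-VM channel are all hypothetical.

```python
import json


class BackendDriver:
    """Host side: decodes forwarded calls and replays them on the host stack."""

    def __init__(self, host_api):
        self.host_api = host_api      # dict: call name -> host function

    def handle(self, message):
        call = json.loads(message.decode("utf-8"))
        result = self.host_api[call["name"]](*call["args"])
        return json.dumps({"result": result}).encode("utf-8")


class FrontendDriver:
    """Guest side: serializes each high-level API call and forwards it."""

    def __init__(self, channel):
        self.channel = channel        # callable bytes -> bytes (stand-in channel)

    def call(self, name, *args):
        message = json.dumps({"name": name, "args": list(args)}).encode("utf-8")
        reply = self.channel(message)
        return json.loads(reply.decode("utf-8"))["result"]


# A fake host graphics API that only records what it was asked to draw.
draw_log = []


def draw_triangles(count):
    draw_log.append(("draw_triangles", count))
    return count


backend = BackendDriver({"draw_triangles": draw_triangles})
frontend = FrontendDriver(backend.handle)
drawn = frontend.call("draw_triangles", 3)
```

Because only call names and arguments cross the channel, nothing here depends on the GPU model. That is the hardware-agnostic property described above, and also the reason any mismatch between the guest and host API implementations shows up as a compatibility problem.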
Direct pass-through, based on VT-d, assigns the whole GPU exclusively to a single VM. While it achieves the best performance, it sacrifices sharing capability.
Mediated pass-through extends direct pass-through, using a software approach. Every VM is allowed to access part of the device resources without hypervisor intervention, while privileged operations are mediated through a software layer. It sustains the performance of direct pass-through, while still providing the sharing capability. XenGT adopts this technology.
XenGT is a full GPU virtualization solution with mediated pass-through, on Intel Processor Graphics. A virtual GPU instance is maintained for each VM, with part of the performance-critical resources directly assigned to it. The capability of running a native graphics driver inside a VM, without hypervisor intervention in performance-critical paths, achieves a good balance among performance, features, and sharing capability.
The figure above shows the overall XenGT architecture. Each VM is allowed to access part of the performance-critical resources without hypervisor intervention. Privileged operations are trapped by Xen and forwarded to the mediator for emulation. The mediator emulates a virtual GPU instance for each VM. Context switches are conducted by the mediator when switching the GPU between VMs. XenGT implements the mediator in dom0. This avoids adding complex device knowledge to Xen, and also permits a more flexible release model. At the same time, we want to have a unified architecture to mediate all the VMs, including dom0 itself. So, the mediator is implemented as a separate module from dom0’s graphics driver. This brings a new challenge: Xen must selectively trap the accesses from dom0’s driver while granting permission to the mediator. We call it a “de-privileged” dom0 mode.
Performance-critical resources are passed through to a VM:
- Part of the global virtual memory space
- VM’s own per-process virtual memory spaces
- VM’s own allocated command buffers (actually in graphics memory)
This minimizes hypervisor intervention in the critical rendering path. Even when a VM is not scheduled to use the render engine, it can continue to queue commands in parallel.
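As an illustration of passing through "part of the global virtual memory space", the sketch below splits the 2 GB space into one contiguous slice per VM and checks whether an address belongs to a given VM. The equal-size static split, the VM names, and the helper names `partition_gvm` and `owns` are assumptions for the example; the actual partitioning policy is not described here.

```python
GVM_SIZE = 2 << 30   # the single 2 GB global virtual memory space


def partition_gvm(vm_names, total=GVM_SIZE):
    """Split the space into equal contiguous slices, one per VM (assumed policy)."""
    slice_size = total // len(vm_names)
    return {
        name: range(index * slice_size, (index + 1) * slice_size)
        for index, name in enumerate(vm_names)
    }


def owns(partitions, vm, gfx_addr):
    """True if the address lies in the slice passed through to this VM."""
    return gfx_addr in partitions[vm]


partitions = partition_gvm(["dom0", "vm1", "vm2", "vm3"])
```

An access that falls inside a VM's own slice can go straight to the hardware; anything outside it is a candidate for rejection or mediation.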
Other operations are privileged, and must be trapped and emulated by the mediator, including:
- PCI configuration registers
- GTT tables
- Submission of queued GPU commands
The mediator maintains the virtual GPU instance based on the traps mentioned above, and schedules use of the render engine among VMs to ensure secure sharing of the single physical GPU.
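The interplay between the pass-through and trapped paths can be sketched as follows. This is a simplified model under stated assumptions, not the XenGT code: the class and method names, the round-robin policy, and the rule that one time slice drains all submitted commands are hypothetical choices made to keep the example short.

```python
from collections import deque


class VirtualGPU:
    """Per-VM state kept by the mediator (field names are illustrative)."""

    def __init__(self, name):
        self.name = name
        self.command_buffer = deque()  # passed through: the VM appends directly
        self.pending = 0               # commands made runnable by a trapped submission
        self.pci_config = {}           # emulated PCI configuration registers


class Mediator:
    def __init__(self, names):
        self.vgpus = {name: VirtualGPU(name) for name in names}
        self.order = deque(names)      # round-robin order (assumed policy)
        self.owner = None              # VM whose context is on the render engine
        self.context_switches = 0
        self.executed = []             # (vm, command) pairs in execution order

    def trap_pci_write(self, name, offset, value):
        """Privileged access: update the virtual device, not the hardware."""
        self.vgpus[name].pci_config[offset] = value

    def trap_submit(self, name):
        """Privileged access: the VM asks for its queued commands to run."""
        vgpu = self.vgpus[name]
        vgpu.pending = len(vgpu.command_buffer)

    def run_time_slice(self):
        """Give the render engine to the next VM that has submitted work."""
        for _ in range(len(self.order)):
            name = self.order[0]
            self.order.rotate(-1)
            vgpu = self.vgpus[name]
            if vgpu.pending == 0:
                continue
            if self.owner != name:
                self.context_switches += 1   # save/restore would happen here
                self.owner = name
            while vgpu.pending:
                self.executed.append((name, vgpu.command_buffer.popleft()))
                vgpu.pending -= 1
            return name
        return None


mediator = Mediator(["vm1", "vm2"])
mediator.vgpus["vm1"].command_buffer.extend(["clear", "draw"])
mediator.trap_submit("vm1")
mediator.vgpus["vm2"].command_buffer.append("blit")  # queued, not yet submitted
first = mediator.run_time_slice()
mediator.trap_submit("vm2")
second = mediator.run_time_slice()
```

Note that `vm2` fills its command buffer without involving the mediator at all; only the submission is trapped, which is the point at which the mediator can decide when that VM gets the render engine.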
The latest source code and the setup guide are available in the GitHub repositories:
(The first repository has a XenGT_Setup_Guide.pdf, which supplies step-by-step instructions for getting a system set up.)
Patches are welcome!
We plan to upstream this work, and are now preparing some cleanup.
XenGT was first announced in Sep 2013:
It was presented at the 2013 Xen Project Developer Summit, Edinburgh:
An update was announced recently in Feb 2014: