GPU virtualization refers to technologies that allow the use of a GPU to accelerate graphics or GPGPU applications running on a virtual machine. GPU virtualization is used in various applications such as desktop virtualization, cloud gaming and computational science (e.g. hydrodynamics simulations).

GPU virtualization implementations generally involve one or more of the following techniques: device emulation, API remoting, fixed pass-through and mediated pass-through. Each technique presents different trade-offs regarding virtual machine to GPU consolidation ratio, graphics acceleration, rendering fidelity and feature support, portability to different hardware, isolation between virtual machines, and support for suspending/resuming and live migration.

API remoting

In API remoting or API forwarding, calls to graphical APIs from guest applications are forwarded to the host by remote procedure call, and the host then executes graphical commands from multiple guests using the host's GPU as a single user. It may be considered a form of paravirtualization when combined with device emulation. This technique allows sharing GPU resources between multiple guests and the host when the GPU does not support hardware-assisted virtualization. It is conceptually simple to implement, but it has several disadvantages:

* In pure API remoting, there is little isolation between virtual machines when accessing graphical APIs; isolation can be improved using paravirtualization
* Performance ranges from 86% to as low as 12% of native performance in applications that issue a large number of drawing calls per frame
* A large number of API entry points must be forwarded, and partial implementation of entry points may decrease fidelity
* Applications on guest machines may be limited to a few available APIs

Hypervisors usually use shared memory between guest and host to maximize performance and minimize latency. Using a network interface instead (a common approach in distributed rendering), third-party software can add support for specific APIs (e.g. rCUDA for CUDA) or add support for typical APIs (e.g. VMGL for OpenGL) when they are not supported by the hypervisor's software package, although network delay and serialization overhead may outweigh the benefits.
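The forwarding path described above can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular hypervisor implements it: the class names `GuestGraphicsStub` and `HostRenderer`, the three entry points and the JSON wire format are all assumptions made for the example, and the "GPU" is just a list that records what was executed.

```python
import json
from typing import Any, Callable, Dict, List


class HostRenderer:
    """Host side: decodes forwarded calls and runs them on a stand-in GPU."""

    def __init__(self) -> None:
        self.executed: List[str] = []  # stand-in for real GPU work
        # Only the entry points registered here are forwarded. Anything else
        # is rejected, mirroring a partial API implementation.
        self._entry_points: Dict[str, Callable[..., None]] = {
            "clear": self._clear,
            "draw_triangles": self._draw_triangles,
        }

    def _clear(self, guest: str, r: float, g: float, b: float) -> None:
        self.executed.append(f"{guest}: clear({r}, {g}, {b})")

    def _draw_triangles(self, guest: str, count: int) -> None:
        self.executed.append(f"{guest}: draw_triangles({count})")

    def handle(self, message: bytes) -> Dict[str, Any]:
        """Deserialize one request and execute it on behalf of the guest."""
        request = json.loads(message.decode("utf-8"))
        func = self._entry_points.get(request["call"])
        if func is None:
            return {"ok": False, "error": "unsupported entry point"}
        func(request["guest"], *request["args"])
        return {"ok": True}


class GuestGraphicsStub:
    """Guest side: looks like a graphics API but only serializes and forwards."""

    def __init__(self, guest_id: str,
                 transport: Callable[[bytes], Dict[str, Any]]) -> None:
        self._guest_id = guest_id
        self._transport = transport  # shared memory or a socket in practice

    def _remote_call(self, name: str, *args: Any) -> Dict[str, Any]:
        payload = json.dumps(
            {"guest": self._guest_id, "call": name, "args": list(args)}
        )
        return self._transport(payload.encode("utf-8"))

    def clear(self, r: float, g: float, b: float) -> Dict[str, Any]:
        return self._remote_call("clear", r, g, b)

    def draw_triangles(self, count: int) -> Dict[str, Any]:
        return self._remote_call("draw_triangles", count)

    def compute_dispatch(self, groups: int) -> Dict[str, Any]:
        # Deliberately not implemented on the host in this sketch.
        return self._remote_call("compute_dispatch", groups)
```

Two stubs created with the same `host.handle` transport share one renderer, so calls from both guests end up in a single `executed` list, while `compute_dispatch` returns an error because the host never registered it. Every call also pays for a serialize and deserialize step, which is where the per-draw-call overhead mentioned above comes from.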

Fixed pass-through

In fixed pass-through or GPU pass-through (a special case of PCI pass-through), a GPU is accessed directly by a single virtual machine exclusively and permanently. This technique achieves 96–100% of native performance and high fidelity, but the acceleration provided by the GPU cannot be shared between multiple virtual machines. As such, it has the lowest consolidation ratio and the highest cost, as each graphics-accelerated virtual machine requires an additional physical GPU.

The following software technologies implement fixed pass-through:

* VMware Virtual Dedicated Graphics Acceleration (vDGA)
* Parallels Workstation Extreme
* Hyper-V Discrete Device Assignment (DDA)
* Citrix XenServer GPU pass-through
* Xen and QEMU/KVM with Intel GVT-d

VirtualBox removed support for PCI pass-through in version 6.1.0.
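The cost argument for fixed pass-through comes down to simple arithmetic: with one GPU per virtual machine, the number of physical GPUs grows linearly with the number of guests. The helper below is a hypothetical illustration; the function name and the figure of four guests per GPU used for comparison are assumptions for the example, not properties of any specific product.

```python
def gpus_required(vm_count: int, vms_per_gpu: int = 1) -> int:
    """Return the number of physical GPUs needed for `vm_count` guests.

    `vms_per_gpu` is the consolidation ratio. The default of 1 models
    fixed pass-through, where each GPU belongs to exactly one guest.
    """
    if vm_count < 0:
        raise ValueError("vm_count must not be negative")
    if vms_per_gpu < 1:
        raise ValueError("vms_per_gpu must be at least 1")
    # Ceiling division using integers only.
    return -(-vm_count // vms_per_gpu)
```

For ten graphics-accelerated guests, `gpus_required(10)` gives 10 GPUs under fixed pass-through, whereas a sharing technique that hypothetically fits four guests on each GPU would need `gpus_required(10, 4)`, which is 3.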

QEMU/KVM

For certain GPU models, Nvidia and AMD video card drivers attempt to detect whether the GPU is being accessed by a virtual machine and, if so, disable some or all GPU features. NVIDIA changed its virtualization rules for consumer GPUs by disabling this check in GeForce Game Ready driver 465.xx and later. For NVIDIA, various architectures of desktop and laptop consumer GPUs can be passed through in various ways. For desktop graphics cards, passthrough can be done via KVM using either a legacy BIOS or a UEFI configuration, provided by SeaBIOS and OVMF, respectively.

NVIDIA

Desktops

For desktops, most graphics cards can be passed through, although for graphics cards with the Pascal architecture or older, the VBIOS of the graphics card must be passed through to the virtual machine if the GPU is used to boot the host.
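As a rough illustration of the desktop case, the sketch below assembles a QEMU command line for GPU passthrough. The helper name `build_qemu_passthrough_args`, the PCI address, the OVMF firmware path and the VBIOS ROM path are all hypothetical placeholders. It assumes the GPU has already been bound to the `vfio-pci` driver on the host, and it omits memory, disk and other options a real guest would need.

```python
from typing import List, Optional


def build_qemu_passthrough_args(
    pci_address: str,
    firmware: str = "seabios",
    ovmf_code_path: str = "/usr/share/OVMF/OVMF_CODE.fd",  # assumed location
    vbios_rom_path: Optional[str] = None,
) -> List[str]:
    """Build a minimal QEMU/KVM argument list that passes one GPU through.

    firmware="seabios" keeps QEMU's default legacy BIOS.
    firmware="ovmf" adds a read-only pflash drive with the UEFI firmware.
    vbios_rom_path, if given, supplies a VBIOS image to the guest.
    """
    args = ["qemu-system-x86_64", "-enable-kvm", "-machine", "q35", "-cpu", "host"]

    if firmware == "ovmf":
        args += ["-drive", f"if=pflash,format=raw,readonly=on,file={ovmf_code_path}"]
    elif firmware != "seabios":
        raise ValueError("firmware must be 'seabios' or 'ovmf'")

    # Assign the host PCI device to the guest through VFIO.
    device = f"vfio-pci,host={pci_address}"
    if vbios_rom_path is not None:
        device += f",romfile={vbios_rom_path}"
    args += ["-device", device]
    return args
```

Calling `build_qemu_passthrough_args("01:00.0", firmware="ovmf", vbios_rom_path="/tmp/vbios.rom")` yields a list ending in `-device vfio-pci,host=01:00.0,romfile=/tmp/vbios.rom`. Whether a ROM file is needed depends on the card and host setup, as described above.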

Laptops

For laptops, the NVIDIA driver checks for the presence of a battery via ACPI and returns an error if none is found. To avoid this, an acpitable, created from text converted into Base64, is required to spoof a battery and bypass the check.

Pascal and earlier

For laptop graphics cards that are Pascal and older, passthrough varies widely depending on the configuration of the graphics card. For laptops that do not have NVIDIA Optimus, such as the MXM variants, passthrough can be achieved through traditional methods. For laptops that have NVIDIA Optimus enabled and render through the CPU's integrated graphics framebuffer rather than the GPU's own, passthrough is more complicated: it requires a remote rendering display or service, the use of Intel GVT-g, and integrating the VBIOS into the boot configuration, because the VBIOS resides in the laptop's system BIOS rather than on the GPU itself. For laptops that have a GPU with NVIDIA Optimus and a dedicated framebuffer, the configurations may vary. If NVIDIA Optimus can be switched off, then passthrough is possible through traditional means. However, if Optimus is the only configuration, then the VBIOS is most likely present in the laptop's system BIOS, requiring the same steps as for a laptop that renders only through the integrated graphics framebuffer, although using an external monitor is also possible.

Mediated pass-through

In mediated device pass-through or full GPU virtualization, the GPU hardware provides contexts with virtual memory ranges for each guest through IOMMU and the hypervisor sends graphical commands from guests directly to the GPU. This technique is a form of hardware-assisted virtualization and achieves near-native performance and high fidelity. If the hardware exposes contexts as full logical devices, then guests can use any API. Otherwise, APIs and drivers must manage the additional complexity of GPU contexts. As a disadvantage, there may be little isolation between virtual machines when accessing GPU resources.

The following software and hardware technologies implement mediated pass-through:

* VMware Virtual Shared Pass-Through Graphics Acceleration with Nvidia vGPU or AMD MxGPU
* Citrix XenServer shared GPU with Nvidia vGPU, AMD MxGPU or Intel GVT-g
* Xen and KVM with Intel GVT-g
* Thincast Workstation - Virtual 3D feature (DirectX 12 & Vulkan 3D API)

While API remoting is generally available for current and older GPUs, mediated pass-through requires hardware support available only on specific devices.
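The idea of giving each guest its own context with a private memory range can be modelled with a toy example. The sketch below is not a description of any vendor's implementation: the `Mediator` and `GpuContext` names are hypothetical, the memory is just a range of integers, and the bounds check that real hardware performs through the IOMMU is done here in plain Python.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class GpuContext:
    """A guest's slice of GPU memory: [base, base + size)."""
    guest: str
    base: int
    size: int

    def contains(self, address: int, length: int) -> bool:
        return self.base <= address and address + length <= self.base + self.size


class Mediator:
    """Hands out non-overlapping contexts and vets each guest command."""

    def __init__(self, total_memory: int) -> None:
        self._total = total_memory
        self._next_free = 0
        self._contexts: Dict[str, GpuContext] = {}

    def create_context(self, guest: str, size: int) -> GpuContext:
        if guest in self._contexts:
            raise ValueError(f"{guest} already has a context")
        if size <= 0 or self._next_free + size > self._total:
            raise MemoryError("not enough GPU memory for a new context")
        context = GpuContext(guest, self._next_free, size)
        self._next_free += size
        self._contexts[guest] = context
        return context

    def submit(self, guest: str, address: int, length: int) -> bool:
        """Return True if the access stays inside the guest's own range."""
        context = self._contexts[guest]
        return length > 0 and context.contains(address, length)
```

With a 1024-byte device split evenly between two guests, the first guest owns addresses 0 to 511 and the second owns 512 to 1023. A command from the first guest that reaches past address 511 is rejected, which is the isolation property that contexts are meant to provide.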

Device emulation

GPU architectures are very complex and change quickly, and their internal details are often kept secret. It is generally not feasible to fully virtualize new generations of GPUs, only older and simpler generations. For example, PCem, a specialized emulator of the IBM PC architecture, can emulate an S3 ViRGE/DX graphics device, which supports Direct3D 3, and a 3dfx Voodoo2, which supports Glide, among others.

When using a VGA or an SVGA virtual display adapter, the guest may not have 3D graphics acceleration, providing only minimal functionality to allow access to the machine via a graphics terminal. The emulated device may expose only basic 2D graphics modes to guests. The virtual machine manager may also provide common API implementations using software rendering to enable 3D graphics applications on the guest, albeit at speeds that may be as low as 3% of hardware-accelerated native performance.

The following software technologies implement graphics APIs using software rendering:

* VMware SVGA 3D software renderer
* VirtualBox VMSVGA graphics controller
* Citrix XenServer OpenGL Software Accelerator
* Windows Advanced Rasterization Platform
* Core OpenGL software renderer
* Mesa software renderer
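To give a sense of what software rendering involves, the following minimal sketch fills a triangle into a framebuffer using only the CPU. It is a textbook edge-function rasterizer written for illustration, not code from any of the renderers listed above; the function names and the one-bit framebuffer are assumptions for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def edge(a: Point, b: Point, p: Point) -> float:
    """Twice the signed area of triangle (a, b, p).

    The sign tells which side of the edge a->b the point p lies on.
    """
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_triangle(
    width: int, height: int, v0: Point, v1: Point, v2: Point
) -> List[List[int]]:
    """Return a height x width framebuffer with covered pixels set to 1."""
    framebuffer = [[0] * width for _ in range(height)]
    area = edge(v0, v1, v2)
    if area == 0:
        return framebuffer  # degenerate triangle covers nothing

    for y in range(height):
        for x in range(width):
            # Sample at the pixel center.
            p = (x + 0.5, y + 0.5)
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            # The pixel is inside when all three edge values share the
            # sign of the triangle's own area (either winding order).
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                framebuffer[y][x] = 1
    return framebuffer
```

Every pixel is tested on the CPU, one at a time, with no use of graphics hardware. A GPU performs the same kind of coverage test across many pixels in parallel, which helps explain the large performance gap noted above.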
