Deep learning super sampling (DLSS) is a family of real-time deep learning image enhancement and upscaling technologies developed by Nvidia that are exclusive to its RTX line of graphics cards and available in a number of video games. The goal of these technologies is to allow the majority of the graphics pipeline to run at a lower resolution for increased performance, and then to infer from that output a higher-resolution image containing the same level of detail as if it had been rendered at the higher resolution. This allows for higher graphical settings, higher frame rates, or both at a given output resolution, depending on user preference. As of September 2022, the first and second generations of DLSS are available on all RTX-branded cards from Nvidia in supported titles, while the third generation, unveiled at Nvidia's GTC 2022 event, is exclusive to Ada Lovelace generation RTX 4000 series graphics cards. Nvidia has also introduced deep learning dynamic super resolution (DLDSR), a related technology that works in the opposite direction: the graphics are rendered at a higher resolution and then downsampled to the native display resolution using an AI-assisted downsampling algorithm, to achieve higher image quality than rendering at native resolution.
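
The performance motivation can be illustrated with simple pixel arithmetic. The sketch below is only an illustration: the function name and the two resolutions are hypothetical choices, not values taken from any DLSS preset.

```python
def shaded_pixel_fraction(render_res, output_res):
    """Return the fraction of output pixels that are actually rendered."""
    render_w, render_h = render_res
    output_w, output_h = output_res
    return (render_w * render_h) / (output_w * output_h)


# Hypothetical example: render at 1920x1080, present at 3840x2160.
fraction = shaded_pixel_fraction((1920, 1080), (3840, 2160))
print(f"Rendered pixels: {fraction:.0%} of the output image")
```

In this example only a quarter of the output pixels pass through the full pipeline; the rest must be inferred.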

History

Nvidia advertised DLSS as a key feature of the GeForce RTX 20 series cards when they launched in September 2018. At that time, the results were limited to a few video games (namely ''Battlefield V'' and ''Metro Exodus'') because the algorithm had to be trained specifically on each game to which it was applied, and the results were usually not as good as simple resolution upscaling. In 2019, the video game ''Control'' shipped with ray tracing and an improved version of DLSS, which did not use the Tensor Cores.

In April 2020, Nvidia advertised and shipped an improved version of DLSS named DLSS 2.0 with driver version 445.75. DLSS 2.0 was available for a few existing games, including ''Wolfenstein: Youngblood'', and would later be added to many newly released games and game engines such as Unreal Engine and Unity. This time Nvidia said that it used the Tensor Cores again, and that the AI did not need to be trained specifically on each game. Despite sharing the DLSS branding, the two iterations of DLSS differ significantly and are not backwards-compatible (Edward Liu, NVIDIA, "DLSS 2.0 - Image Reconstruction for Real-time Rendering with Deep Learning").

Release history

Quality presets

Implementation

DLSS 1.0

The first iteration of DLSS is a predominantly spatial image upscaler with two stages, both relying on convolutional auto-encoder neural networks. The first stage is an image enhancement network which uses the current frame and motion vectors to perform edge enhancement and spatial anti-aliasing. The second stage is an image upscaling step which uses the single raw, low-resolution frame to upscale the image to the desired output resolution. Because only a single frame is used for upscaling, the neural network itself must generate a large amount of new information to produce the high-resolution output. This can result in slight hallucinations, such as leaves that differ in style from the source content.

The neural networks are trained on a per-game basis by generating a "perfect frame" using traditional supersampling to 64 samples per pixel, as well as the motion vectors for each frame. The data collected must be as comprehensive as possible, covering as many levels, times of day, graphical settings, resolutions, and so on as possible. This data is also augmented using common augmentations such as rotations, colour changes, and random noise to help the networks generalize. Training is performed on Nvidia's Saturn V supercomputer.

This first iteration received a mixed response, with many criticizing the often soft appearance and the artifacting in certain situations. These were likely a side effect of the limited data available from a single-frame input to neural networks that could not be trained to perform optimally in all scenarios and edge cases.

Nvidia also demonstrated that the auto-encoder networks could learn to recreate depth of field and motion blur, although this functionality has never been included in a publicly released product.
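
The "perfect frame" used as a training target relies on ordinary supersampling. The following is a minimal sketch of that idea only, not Nvidia's training pipeline: the `scene` function, the 8x8 image size, and the regular sub-pixel grid are all assumptions made for illustration.

```python
def scene(x, y):
    """Hypothetical scene: white (1.0) inside a disc, black (0.0) outside."""
    return 1.0 if (x - 4.0) ** 2 + (y - 4.0) ** 2 <= 9.0 else 0.0


def render(width, height, grid=1):
    """Render with grid x grid evenly spaced samples per pixel, averaged."""
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            total = 0.0
            for sy in range(grid):
                for sx in range(grid):
                    # Sample at the centre of each sub-pixel cell.
                    x = px + (sx + 0.5) / grid
                    y = py + (sy + 0.5) / grid
                    total += scene(x, y)
            row.append(total / (grid * grid))
        image.append(row)
    return image


aliased = render(8, 8, grid=1)    # 1 sample per pixel: hard, jagged edge
reference = render(8, 8, grid=8)  # 8 x 8 = 64 samples per pixel
```

With one sample per pixel every value is either 0.0 or 1.0, while the 64-sample image contains intermediate values along the edge of the disc, which is the smooth result a network would be trained to reproduce.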

DLSS 2.0

DLSS 2.0 is a temporal anti-aliasing upsampling (TAAU) implementation, using data from previous frames extensively through sub-pixel jittering to resolve fine detail and reduce aliasing. The data DLSS 2.0 collects includes the raw low-resolution input, motion vectors, depth buffers, and exposure/brightness information. It can also be used as a simpler TAA implementation in which the image is rendered at 100% resolution rather than being upsampled by DLSS; Nvidia brands this as DLAA (Deep Learning Anti-Aliasing).

TAA(U) is used in many modern video games and game engines; however, all previous implementations have used some form of manually written heuristics to prevent temporal artifacts such as ghosting and flickering. One example is neighborhood clamping, which forcefully prevents samples collected in previous frames from deviating too much from nearby pixels in newer frames. This helps to identify and fix many temporal artifacts, but deliberately removing fine details in this way is analogous to applying a blur filter, and thus the final image can appear blurry when using this method.

DLSS 2.0 instead uses a convolutional auto-encoder neural network trained to identify and fix temporal artifacts, rather than manually programmed heuristics. Because of this, DLSS 2.0 can generally resolve detail better than other TAA and TAAU implementations, while also removing most temporal artifacts. This is why DLSS 2.0 can sometimes produce a sharper image than rendering at higher, or even native, resolutions using traditional TAA. However, no temporal solution is perfect, and artifacts (ghosting in particular) are still visible in some scenarios when using DLSS 2.0.

Because temporal artifacts occur in most art styles and environments in broadly the same way, the neural network that powers DLSS 2.0 does not need to be retrained for different games. Despite this, Nvidia frequently ships new minor revisions of DLSS 2.0 with new titles, which could suggest that some minor training optimizations are performed as games are released, although Nvidia does not provide changelogs for these minor revisions to confirm this.

The main advancements compared to DLSS 1.0 include significantly improved detail retention, a generalized neural network that does not need to be retrained per game, and roughly half the overhead (~1-2 ms vs ~2-4 ms).
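
The neighborhood clamping heuristic used by conventional TAA, which DLSS 2.0 replaces with a trained network, can be sketched as follows. This is a simplified illustration on greyscale frames stored as nested lists; the function names, the 3x3 neighbourhood, and the blend weight of 0.1 are assumptions, and real implementations work on colour data and reproject the history with motion vectors first.

```python
def clamp_history(current, history, x, y):
    """Clamp a history sample to the min/max of the 3x3 neighbourhood
    around (x, y) in the current frame."""
    height, width = len(current), len(current[0])
    neighbours = [
        current[ny][nx]
        for ny in range(max(0, y - 1), min(height, y + 2))
        for nx in range(max(0, x - 1), min(width, x + 2))
    ]
    return min(max(history[y][x], min(neighbours)), max(neighbours))


def accumulate(current, history, blend=0.1):
    """Blend the clamped history with the current frame.
    `blend` is the weight of the new frame (0.1 is an arbitrary choice)."""
    return [
        [
            blend * current[y][x]
            + (1.0 - blend) * clamp_history(current, history, x, y)
            for x in range(len(current[0]))
        ]
        for y in range(len(current))
    ]


# A stale bright sample in the history (a "ghost") is pulled back to the
# range of values present in the current frame.
current = [[0.2] * 3 for _ in range(3)]
history = [row[:] for row in current]
history[1][1] = 1.0
print(clamp_history(current, history, 1, 1))  # 0.2
```

The same clamp would also discard a legitimate fine detail that happens to be missing from the current frame's neighbourhood, which is the source of the blur described above.
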
Forms of TAAU such as DLSS 2.0 are not upscalers in the same sense as techniques such as ESRGAN or DLSS 1.0, which attempt to create new information from a low-resolution source; instead, TAAU works to recover data from previous frames rather than creating new data. In practice, this means that low-resolution textures in games will still appear low-resolution when using current TAAU techniques. This is why Nvidia recommends that game developers use higher-resolution textures than they normally would for a given rendering resolution, by applying a mip-map bias when DLSS 2.0 is enabled.
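
The text does not state the exact bias Nvidia recommends. As an assumption for illustration, the sketch below uses a convention common among upscalers, in which the texture level-of-detail offset is the base-2 logarithm of the ratio between render width and display width; the function name is hypothetical.

```python
import math


def mip_bias(render_width, display_width):
    """Texture LOD offset for a given render/display width ratio.
    Negative values select sharper (higher-resolution) mip levels.
    Assumed convention, not a documented DLSS formula."""
    return math.log2(render_width / display_width)


print(mip_bias(1920, 3840))  # -1.0
```

A bias of -1.0 makes the renderer sample textures one mip level sharper than it otherwise would at the lower render resolution.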

DLSS 3.0

DLSS 3.0 augments DLSS 2.0 with an optical-flow frame generation technique. The DLSS frame generation algorithm takes two rendered frames from the rendering pipeline and generates a new frame that smoothly transitions between them, so for every frame rendered, one additional frame is generated. DLSS 3.0 makes use of a new-generation Optical Flow Accelerator (OFA) included in Ada Lovelace generation RTX GPUs. The new OFA is faster and more accurate than the OFA available in the previous Turing and Ampere RTX GPUs, which is why DLSS 3.0 is exclusive to the RTX 4000 series. At release, DLSS 3.0 did not work with VR displays.
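
The way generated frames are interleaved with rendered ones can be sketched as below. Note that the in-between frame here is a plain per-pixel average, used only as a stand-in: it is not optical flow and not what DLSS 3.0 does. The function names and the greyscale nested-list frame format are assumptions.

```python
def blend_midpoint(frame_a, frame_b):
    """Stand-in for frame generation: a plain per-pixel average.
    This is NOT optical flow; it only marks where a generated frame goes."""
    return [
        [(a + b) / 2.0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def with_generated_frames(rendered):
    """Interleave one generated frame between each pair of rendered frames."""
    if not rendered:
        return []
    output = [("rendered", rendered[0])]
    for previous, current in zip(rendered, rendered[1:]):
        output.append(("generated", blend_midpoint(previous, current)))
        output.append(("rendered", current))
    return output


frames = [[[0.0]], [[1.0]], [[0.0]]]  # three tiny 1x1 rendered frames
for kind, frame in with_generated_frames(frames):
    print(kind, frame)
```

Because a generated frame needs the rendered frames on both sides of it, the newer rendered frame must exist before the in-between frame can be shown.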

Anti-aliasing

DLSS requires and applies its own anti-aliasing method. It operates on similar principles to TAA. Like TAA, it uses information from past frames to produce the current frame. Unlike TAA, DLSS does not sample every pixel in every frame. Instead, it samples different pixels in different frames and uses pixels sampled in past frames to fill in the unsampled pixels in the current frame. DLSS uses machine learning to combine samples in the current frame and past frames, and it can be thought of as an advanced and superior TAA implementation made possible by the available tensor cores.

Nvidia also offers deep learning anti-aliasing (DLAA). DLAA provides the same AI-driven anti-aliasing that DLSS uses, but without any upscaling or downscaling functionality.
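
The idea of sampling different pixels in different frames and reusing earlier samples can be sketched for the simplest case, a static scene with no motion. The row-major offset order, the function names, and the `scene` callback are assumptions for illustration; the learned combination step that DLSS performs is replaced here by simply storing each sample.

```python
def sample_offsets(scale):
    """Sub-pixel positions visited in turn, one per frame.
    Row-major order is an assumption made for simplicity."""
    return [(dx, dy) for dy in range(scale) for dx in range(scale)]


def accumulate_frames(scene, low_w, low_h, scale, num_frames):
    """Fill a (low_w * scale) x (low_h * scale) image from low-resolution
    frames of a static scene, one sub-pixel offset per frame."""
    high = [[None] * (low_w * scale) for _ in range(low_h * scale)]
    offsets = sample_offsets(scale)
    for frame in range(num_frames):
        dx, dy = offsets[frame % len(offsets)]
        for y in range(low_h):
            for x in range(low_w):
                hx, hy = x * scale + dx, y * scale + dy
                high[hy][hx] = scene(hx, hy)  # only this pixel is sampled
    return high


def gradient(x, y):
    """Hypothetical scene used for the demonstration."""
    return x + 10 * y


after_one = accumulate_frames(gradient, 2, 2, 2, num_frames=1)
after_four = accumulate_frames(gradient, 2, 2, 2, num_frames=4)
```

After one frame only a quarter of the high-resolution pixels hold a sample; after four frames every pixel has been filled from some frame. With motion, past samples would have to be reprojected and validated, which is the part handled by heuristics in TAA and by the neural network in DLSS.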

Architecture

With the exception of the shader-core version implemented in ''Control'', DLSS is only available on GeForce RTX 20, GeForce RTX 30, GeForce RTX 40, and Quadro RTX series video cards, using dedicated AI accelerators called Tensor Cores. Tensor Cores have been available since the Nvidia Volta GPU microarchitecture, which was first used on the Tesla V100 line of products. They perform fused multiply-add (FMA) operations, which are used extensively in neural network calculations to apply a large series of multiplications on weights, followed by the addition of a bias. Tensor Cores can operate on FP16, INT8, INT4, and INT1 data types. Each core can perform 1024 bits of FMA operations per clock, that is, 1024 INT1, 256 INT4, 128 INT8, or 64 FP16 operations per clock per Tensor Core, and most Turing GPUs have a few hundred Tensor Cores. The Tensor Cores use CUDA Warp-Level Primitives on 32 parallel threads to take advantage of their parallel architecture. A warp is a set of 32 threads which are configured to execute the same instruction.
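
The per-clock figures above follow from dividing the 1024-bit budget by the width of each data type, and the multiply-add pattern is the one found in a neural network layer. The sketch below illustrates both; the names are hypothetical, and the Python arithmetic is an ordinary multiply followed by an add rather than a true fused operation.

```python
BITS_PER_CLOCK = 1024  # FMA throughput per Tensor Core per clock, from the text

BIT_WIDTHS = {"INT1": 1, "INT4": 4, "INT8": 8, "FP16": 16}


def ops_per_clock(data_type):
    """Operations per clock per Tensor Core for a given data type."""
    return BITS_PER_CLOCK // BIT_WIDTHS[data_type]


def neuron(inputs, weights, bias):
    """Weighted sum plus bias, written as a chain of multiply-adds."""
    acc = bias
    for x, w in zip(inputs, weights):
        acc = x * w + acc  # one multiply-add step per weight (not fused here)
    return acc


for name in BIT_WIDTHS:
    print(name, ops_per_clock(name))

print(neuron([1.0, 2.0], [3.0, 4.0], 0.5))  # 11.5
```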

References

External links

* DLSS on official Nvidia developer website