Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to both the Volta and Turing architectures, officially announced on May 14, 2020. It is named after French mathematician and physicist André-Marie Ampère. Nvidia announced the next-generation GeForce 30 series consumer GPUs at a GeForce Special Event on September 1, 2020, and the A100 80GB GPU at SC20 on November 16, 2020. Mobile RTX graphics cards and the RTX 3060 were revealed on January 12, 2021. Nvidia also announced Ampere's successor, Hopper, at GTC 2022, and "Ampere Next Next" for a 2024 release at GPU Technology Conference 2021.

Details

Architectural improvements of the Ampere architecture include the following:

* CUDA Compute Capability 8.0 for A100 and 8.6 for the GeForce 30 series
* TSMC 7 nm FinFET process for A100
* Custom version of Samsung's 8 nm process (8N) for the GeForce 30 series
* Third-generation Tensor Cores with FP16, bfloat16, TensorFloat-32 (TF32) and FP64 support and sparsity acceleration. Each Tensor Core performs 256 FP16 FMA operations per clock, giving 4x the processing power of previous-generation Tensor Cores (GA100 only; 2x on GA10x); the Tensor Core count is reduced to one per SM partition.
* Second-generation ray tracing cores; concurrent ray tracing, shading, and compute for the GeForce 30 series
* High Bandwidth Memory 2 (HBM2) on A100 40GB and A100 80GB
* GDDR6X memory for GeForce RTX 3090, RTX 3080 Ti, RTX 3080, RTX 3070 Ti
* Double FP32 cores per SM on GA10x GPUs
* NVLink 3.0 with a throughput of 50 Gbit/s per pair
* PCI Express 4.0 with SR-IOV support (SR-IOV is available only on A100)
* Multi-instance GPU (MIG) virtualization and GPU partitioning feature in A100, supporting up to seven instances
* PureVideo feature set K hardware video decoding, including AV1 hardware decoding, for the GeForce 30 series, and feature set J for A100
* 5 NVDEC for A100
* New hardware-based 5-core JPEG decoder (NVJPG) supporting YUV420, YUV422, YUV444, YUV400 and RGBA. It should not be confused with Nvidia NVJPEG, a GPU-accelerated library for JPEG encoding and decoding.
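The Tensor Core figure in the list above can be turned into a rough peak-throughput estimate. The following is a back-of-the-envelope sketch, not a vendor formula: it treats the 256 FP16 FMA figure as a per-clock, per-Tensor-Core rate and counts each FMA as two floating-point operations. The SM count (108), the four Tensor Cores per SM, and the 1410 MHz boost clock are assumptions taken from the A100's published specifications, not from the text above, and the function name is hypothetical.

```python
def peak_dense_tflops(fma_per_clock_per_core, cores_per_sm, sm_count, clock_hz):
    """Estimate peak dense throughput in TFLOPS.

    One fused multiply-add (FMA) is counted as two floating-point operations.
    """
    flops_per_clock = 2 * fma_per_clock_per_core * cores_per_sm * sm_count
    return flops_per_clock * clock_hz / 1e12


# A100 (GA100) FP16 Tensor Core estimate.
A100_FP16_TENSOR_TFLOPS = peak_dense_tflops(
    fma_per_clock_per_core=256,  # FP16 FMA rate quoted in the list above
    cores_per_sm=4,              # assumption: four Tensor Cores per SM
    sm_count=108,                # assumption: SMs enabled on the A100
    clock_hz=1.41e9,             # assumption: 1410 MHz boost clock
)

print(f"{A100_FP16_TENSOR_TFLOPS:.1f} TFLOPS")
```

Under these assumptions the estimate comes out at roughly 312 TFLOPS of dense FP16 Tensor Core throughput; sparsity acceleration and sustained clocks are not modeled.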

Chips

* GA100
* GA102
* GA103
* GA104
* GA106
* GA107

[Table: Comparison of Compute Capability: GP100 vs GV100 vs GA100]

[Table: Comparison of Precision Support Matrix]

Legend:
* FPnn: floating point with nn bits
* INTn: integer with n bits
* INT1: binary
* TF32: TensorFloat32
* BF16: bfloat16

[Table: Comparison of Decode Performance]
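To make the TF32 and BF16 entries in the legend more concrete, the sketch below emulates their reduced precision on an ordinary FP32 value. It assumes the commonly documented layouts (FP32: 23 explicit mantissa bits, TF32: 10, BF16: 7, all with an 8-bit exponent), which are not stated in the text above. It also simply truncates the low mantissa bits; real hardware may round differently, so this is an illustration of the precision loss only. The helper name `truncate_to` is hypothetical.

```python
import struct

# Assumed explicit mantissa widths; all three formats share an 8-bit exponent.
MANTISSA_BITS = {"FP32": 23, "TF32": 10, "BF16": 7}


def truncate_to(fmt, value):
    """Return `value` as FP32 with only the mantissa bits that `fmt` keeps.

    Low-order bits are cleared (truncation), which is a simplification of
    whatever rounding the hardware actually applies.
    """
    drop = 23 - MANTISSA_BITS[fmt]
    # Reinterpret the FP32 value as a 32-bit unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Clear the `drop` lowest mantissa bits.
    bits &= ~((1 << drop) - 1) & 0xFFFFFFFF
    (result,) = struct.unpack("<f", struct.pack("<I", bits))
    return result


for fmt in ("FP32", "TF32", "BF16"):
    print(fmt, truncate_to(fmt, 3.14159265))
```

Because the exponent field is untouched, TF32 and BF16 keep the dynamic range of FP32 in this model and lose only precision, which is the trade-off these formats are designed around.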

A100 accelerator and DGX A100

The Ampere-based A100 accelerator was announced and released on May 14, 2020. The A100 features 19.5 teraflops of FP32 performance, 6912 CUDA cores, 40 GB of graphics memory, and 1.6 TB/s of graphics memory bandwidth. The A100 accelerator was initially available only in the third generation of the DGX server, the DGX A100, which includes eight A100s. Also included in the DGX A100 are 15 TB of PCIe Gen 4 NVMe storage, two 64-core AMD Rome 7742 CPUs, 1 TB of RAM, and a Mellanox-powered HDR InfiniBand interconnect. The initial price for the DGX A100 was $199,000.
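The headline numbers above can be cross-checked with simple arithmetic. This is a minimal sketch: it assumes a 1410 MHz boost clock (not stated above) and that each CUDA core retires one FP32 FMA, counted as two operations, per clock. The aggregate DGX A100 GPU memory is derived only from the figures quoted in the paragraph.

```python
CUDA_CORES = 6912        # from the text
BOOST_CLOCK_HZ = 1.41e9  # assumption: 1410 MHz boost clock
FLOPS_PER_FMA = 2        # one multiply plus one add

# Peak FP32 throughput of a single A100, in teraflops.
fp32_tflops = FLOPS_PER_FMA * CUDA_CORES * BOOST_CLOCK_HZ / 1e12

GPUS_PER_DGX = 8         # from the text
GPU_MEMORY_GB = 40       # from the text

# Total GPU memory across one DGX A100 system, in GB.
dgx_total_gpu_memory_gb = GPUS_PER_DGX * GPU_MEMORY_GB

print(f"A100 peak FP32: {fp32_tflops:.2f} TFLOPS")
print(f"DGX A100 total GPU memory: {dgx_total_gpu_memory_gb} GB")
```

With the assumed clock, the result rounds to the quoted 19.5 teraflops, and eight 40 GB accelerators give 320 GB of GPU memory per system.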

Products using Ampere

* GeForce MX series
** GeForce MX570 (mobile) (GA107)
* GeForce 20 series
** GeForce RTX 2050 (mobile) (GA107)
* GeForce 30 series
** GeForce RTX 3050 (mobile) (GA107)
** GeForce RTX 3050 (GA106 or GA107)
** GeForce RTX 3050 Ti (mobile) (GA107)
** GeForce RTX 3060 (mobile) (GA106)
** GeForce RTX 3060 (GA106 or GA104)
** GeForce RTX 3060 Ti (GA104 or GA103)
** GeForce RTX 3070 (mobile) (GA104)
** GeForce RTX 3070 (GA104)
** GeForce RTX 3070 Ti (mobile) (GA104)
** GeForce RTX 3070 Ti (GA104)
** GeForce RTX 3080 (mobile) (GA104)
** GeForce RTX 3080 (GA102)
** GeForce RTX 3080 12GB (GA102)
** GeForce RTX 3080 Ti (mobile) (GA103)
** GeForce RTX 3080 Ti (GA102)
** GeForce RTX 3090 (GA102)
** GeForce RTX 3090 Ti (GA102)
* Nvidia Workstation GPUs (formerly Quadro)
** RTX A2000 (mobile) (GA107)
** RTX A2000 (GA106)
** RTX A3000 (mobile) (GA104)
** RTX A4000 (mobile) (GA104)
** RTX A4000 (GA104)
** RTX A4500 (GA102)
** RTX A5000 (mobile) (GA104)
** RTX A5000 (GA102)
** RTX A5500 (GA102)
** RTX A6000 (GA102)
* Nvidia Data Center GPUs (formerly Tesla)
** Nvidia A2 (GA107)
** Nvidia A10 (GA102)
** Nvidia A16 (4 × GA107)
** Nvidia A30 (GA100)
** Nvidia A40 (GA102)
** Nvidia A100 (GA100)
** Nvidia A100 80GB (GA100)

External links

NVIDIA A100 Tensor Core GPU Architecture whitepaper

Nvidia Ampere GA102 GPU Architecture whitepaper

Nvidia Ampere Architecture

Nvidia A100 Tensor Core GPU

NVIDIA Ampere Architecture In-Depth