DGX-1
   HOME
*



picture info

DGX-1
Nvidia DGX is a line of Nvidia-produced servers and workstations which specialize in using GPGPU to accelerate deep learning applications. The typical design of a DGX system is based upon a rackmount chassis with motherboard that carries high performance x86 server CPUs (Typically Intel Xeons, though recently the DGX A100 and DGX Station A100 utilize AMD EPYC CPUs). The main component of a DGX system is a set of 4 to 16 Nvidia Tesla GPU modules on an independent system board. DGX systems have large heatsinks and powerful fans to adequately cool thousands of watts of thermal output. The GPU modules are typically integrated into the system using a version of the SXM socket. Models Pascal - Volta DGX-1 DGX-1 servers feature 8 GPUs based on the Pascal or Volta daughter cards with 128GB of total HBM2 memory, connected by an NVLink mesh network. The DGX-1 was announced on the 6th of April in 2016. All models are based on a dual socket configuration of Intel Xeon E5 CPUs, and a ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




NVLink
NVLink is a wire-based serial multi-lane near-range communications link developed by Nvidia. Unlike PCI Express, a device can consist of multiple NVLinks, and devices use mesh networking to communicate instead of a central hub. The protocol was first announced in March 2014 and uses a proprietary high-speed signaling interconnect (NVHS). Principle NVLink is a wire-based communications protocol for near-range semiconductor communications developed by Nvidia that can be used for data and control code transfers in processor systems between CPUs and GPUs and solely between GPUs. NVLink specifies a point-to-point connection with data rates of 20, 25 and 50 Gbit/s (v1.0/v2.0/v3.0 resp.) per differential pair. Eight differential pairs form a "sub-link" and two "sub-links", one for each direction, form a "link". The total data rate for a sub-link is 25 GByte/s and the total data rate for a link is 50 GByte/s. Each V100 GPU supports up to six links. Thus, each GPU is capable of supporti ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

SXM (socket)
SXM is a high bandwidth socket solution for connecting Nvidia Compute Accelerators to a system. Each generation of Nvidia Tesla since P100 models, the DGX computer series and the HGX boards come with an SXM socket type that realizes high bandwidth, power delivery and more for the matching GPU daughter cards. Nvidia offers these combinations as an end-user product e.g. in their models of the DGX system series. Current socket generations are SXM for Pascal based GPUs, SXM2 and SXM3 for Volta based GPUs, SXM4 for Ampere based GPUs, and SXM5 for Hopper based GPUs. These sockets are used for specific models of these accelerators, and offer higher performance per card than PCIe equivalents. The DGX-1 system was the first to be equipped with SXM-2 sockets and thus was the first to carry the form factor compatible SXM modules with P100 GPUs and later was unveiled to be capable of allowing upgrading to (or being pre-equipped with) SXM2 modules with V100 GPUs. SXM boards are typical ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Volta (microarchitecture)
Volta is the codename for a GPU microarchitecture developed by Nvidia, succeeding Pascal. It was first announced on a roadmap in March 2013, although the first product was not announced until May 2017. The architecture is named after 18th–19th century Italian chemist and physicist Alessandro Volta. It was NVIDIA's first chip to feature Tensor Cores, specially designed cores that have superior deep learning performance over regular CUDA cores. The architecture is produced with TSMC's 12 nm FinFET process. The Ampere microarchitecture is the successor to Volta. The first graphics card to use it was the datacenter Tesla V100, e.g. as part of the Nvidia DGX-1 system. It has also been used in the Quadro GV100 and Titan V. There were no mainstream GeForce graphics cards based on Volta. Details Architectural improvements of the Volta architecture include the following: * CUDA Compute Capability 7.0 ** concurrent execution of integer and floating point operations * TSMC's 12& ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Nvidia
Nvidia CorporationOfficially written as NVIDIA and stylized in its logo as VIDIA with the lowercase "n" the same height as the uppercase "VIDIA"; formerly stylized as VIDIA with a large italicized lowercase "n" on products from the mid 1990s to early-mid 2000s. Though unofficial, second letter capitalization of NVIDIA, i.e. nVidia, may be found within enthusiast communities and publications. ( ) is an American multinational technology company incorporated in Delaware and based in Santa Clara, California. It is a software and fabless company which designs graphics processing units (GPUs), application programming interface (APIs) for data science and high-performance computing as well as system on a chip units (SoCs) for the mobile computing and automotive market. Nvidia is a global leader in artificial intelligence hardware and software. Its professional line of GPUs are used in workstations for applications in such fields as architecture, engineering and construction, media ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

PCI Express
PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe or PCI-e, is a high-speed serial computer expansion bus standard, designed to replace the older PCI, PCI-X and AGP bus standards. It is the common motherboard interface for personal computers' graphics cards, hard disk drive host adapters, SSDs, Wi-Fi and Ethernet hardware connections. PCIe has numerous improvements over the older standards, including higher maximum system bus throughput, lower I/O pin count and smaller physical footprint, better performance scaling for bus devices, a more detailed error detection and reporting mechanism (Advanced Error Reporting, AER), and native hot-swap functionality. More recent revisions of the PCIe standard provide hardware support for I/O virtualization. The PCI Express electrical interface is measured by the number of simultaneous lanes. (A lane is a single send/receive line of data. The analogy is a highway with traffic in both directions. ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Ampere (microarchitecture)
Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to both the Volta and Turing architectures, officially announced on May 14, 2020. It is named after French mathematician and physicist André-Marie Ampère. Nvidia announced the next-generation GeForce 30 series consumer GPUs at a GeForce Special Event on September 1, 2020. Nvidia announced A100 80GB GPU at SC20 on November 16, 2020. Mobile RTX graphics cards and the RTX 3060 were revealed on January 12, 2021. Nvidia also announced Ampere's successor, Hopper, at GTC 2022, and "Ampere Next Next" for a 2024 release at GPU Technology Conference 2021. Details Architectural improvements of the Ampere architecture include the following: * CUDA Compute Capability 8.0 for A100 and 8.6 for the GeForce 30 series * TSMC's 7 nm FinFET process for A100 * Custom version of Samsung's 8 nm process (8N) for the GeForce 30 series * Third-generation Tensor Cores with ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


InfiniBand
InfiniBand (IB) is a computer networking communications standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also used as either a direct or switched interconnect between servers and storage systems, as well as an interconnect between storage systems. It is designed to be scalable and uses a switched fabric network topology. By 2014, it was the most commonly used interconnect in the TOP500 list of supercomputers, until about 2016. Mellanox (acquired by Nvidia) manufactures InfiniBand host bus adapters and network switches, which are used by large computer system and database vendors in their product lines. As a computer cluster interconnect, IB competes with Ethernet, Fibre Channel, and Intel Omni-Path. The technology is promoted by the InfiniBand Trade Association. History InfiniBand originated in 1999 from the merger of two competing designs: F ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


High-performance Computing
High-performance computing (HPC) uses supercomputers and computer clusters to solve advanced computation problems. Overview HPC integrates systems administration (including network and security knowledge) and parallel programming into a multidisciplinary field that combines digital electronics, computer architecture, system software, programming languages, algorithms and computational techniques. HPC technologies are the tools and systems used to implement and create high performance computing systems. Recently, HPC systems have shifted from supercomputing to computing clusters and grids. Because of the need of networking in clusters and grids, High Performance Computing Technologies are being promoted by the use of a collapsed network backbone, because the collapsed backbone architecture is simple to troubleshoot and upgrades can be applied to a single router as opposed to multiple ones. The term is most commonly associated with computing used for scientific research or ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




FP16
In computing, half precision (sometimes called FP16) is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory. It is intended for storage of floating-point values in applications where higher precision is not essential, in particular image processing and neural networks. Almost all modern uses follow the IEEE 754-2008 standard, where the 16-bit base-2 format is referred to as binary16, and the exponent uses 5 bits. This can express values in the range ±65,504, with the minimum value above 1 being 1 + 1/1024. Depending on the computer, half-precision can be over an order of magnitude faster than double precision, e.g. 550 PFLOPS for half-precision vs 37 PFLOPS for double precision on one cloud provider. History Several earlier 16-bit floating point formats have existed including that of Hitachi's HD61810 DSP of 1982, Scott's WIF and the 3dfx Voodoo Graphics processor. ILM was searching for an image forma ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Water Cooling
Cooling tower and water discharge of a nuclear power plant Water cooling is a method of heat removal from components and industrial equipment. Evaporative cooling using water is often more efficient than air cooling. Water is inexpensive and non-toxic; however, it can contain impurities and cause corrosion. Water cooling is commonly used for cooling automobile internal combustion engines and power stations. Water coolers utilising convective heat transfer are used inside high-end personal computers to lower the temperature of CPUs. Other uses include the cooling of lubricant oil in pumps; for cooling purposes in heat exchangers; for cooling buildings in HVAC and in chillers. Mechanism Advantages Water is inexpensive, non-toxic, and available over most of the earth's surface. Liquid cooling offers higher thermal conductivity than air cooling. Water has unusually high specific heat capacity among commonly available liquids at room temperature and atmospheric pressure allowing ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

NVMe
NVM Express (NVMe) or Non-Volatile Memory Host Controller Interface Specification (NVMHCIS) is an open, logical-device interface specification for accessing a computer's non-volatile storage media usually attached via PCI Express (PCIe) bus. The initialism ''NVM'' stands for ''non-volatile memory'', which is often NAND flash memory that comes in several physical form factors, including solid-state drives (SSDs), PCIe add-in cards, and M.2 cards, the successor to mSATA cards. NVM Express, as a logical-device interface, has been designed to capitalize on the low latency and internal parallelism of solid-state storage devices. Architecturally, the logic for NVMe is physically stored within and executed by the NVMe controller chip that is physically co-located with the storage media, usually an SSD. Version changes for NVMe, e.g., 1.3 to 1.4, are incorporated within the storage media, and do not affect PCIe-compatible components such as motherboards and CPUs. By its design, NVM Exp ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

High Bandwidth Memory
High Bandwidth Memory (HBM) is a high-speed computer memory interface for 3D-stacked synchronous dynamic random-access memory (SDRAM) initially from Samsung, AMD and SK Hynix. It is used in conjunction with high-performance graphics accelerators, network devices, high-performance datacenter AI ASICs and FPGAs and in some supercomputers (such as the NEC SX-Aurora TSUBASA and Fujitsu A64FX). The first HBM memory chip was produced by SK Hynix in 2013, and the first devices to use HBM were the AMD Fiji GPUs in 2015. High Bandwidth Memory has been adopted by JEDEC as an industry standard in October 2013.High Bandwidth Memory (HBM) DRAM (JESD235)
JEDEC, October 2013
The second generation, HBM2, was accepted by JEDEC in January 2016.


Technology


[...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]