SXM2
   HOME

TheInfoList



OR:

SXM is a high bandwidth
socket Socket may refer to: Mechanics * Socket wrench, a type of wrench that uses separate, removable sockets to fit different sizes of nuts and bolts * Socket head screw, a screw (or bolt) with a cylindrical head containing a socket into which the hexag ...
solution for connecting
Nvidia Nvidia CorporationOfficially written as NVIDIA and stylized in its logo as VIDIA with the lowercase "n" the same height as the uppercase "VIDIA"; formerly stylized as VIDIA with a large italicized lowercase "n" on products from the mid 1990s to ...
Compute Accelerators to a system. Each generation of
Nvidia Tesla Nvidia Tesla was the name of Nvidia's line of products targeted at stream processing or general-purpose graphics processing units (GPGPU), named after pioneering electrical engineer Nikola Tesla. Its products began using GPUs from the G80 ser ...
since P100 models, the DGX computer series and the HGX boards come with an SXM socket type that realizes high bandwidth, power delivery and more for the matching GPU daughter cards. Nvidia offers these combinations as an end-user product e.g. in their models of the DGX system series. Current socket generations are SXM for
Pascal Pascal, Pascal's or PASCAL may refer to: People and fictional characters * Pascal (given name), including a list of people with the name * Pascal (surname), including a list of people and fictional characters with the name ** Blaise Pascal, Fren ...
based GPUs, SXM2 and SXM3 for Volta based GPUs, SXM4 for
Ampere The ampere (, ; symbol: A), often shortened to amp,SI supports only the use of symbols and deprecates the use of abbreviations for units. is the unit of electric current in the International System of Units (SI). One ampere is equal to elect ...
based GPUs, and SXM5 for
Hopper Hopper or hoppers may refer to: Places *Hopper, Illinois * Hopper, West Virginia * Hopper, a mountain and valley in the Hunza–Nagar District of Pakistan * Hopper (crater), a crater on Mercury People with the name * Hopper (surname) * Grace H ...
based GPUs. These sockets are used for specific models of these accelerators, and offer higher performance per card than
PCIe PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe or PCI-e, is a high-speed serial computer expansion bus standard, designed to replace the older PCI, PCI-X and AGP bus standards. It is the common mo ...
equivalents. The DGX-1 system was the first to be equipped with SXM-2 sockets and thus was the first to carry the form factor compatible SXM modules with P100 GPUs and later was unveiled to be capable of allowing upgrading to (or being pre-equipped with) SXM2 modules with V100 GPUs. SXM boards are typically built with four or eight GPU slots, although some solutions such as the Nvidia DGX-2 connect multiple boards to deliver high performance. While third party solutions for SXM boards exist, most System Integrators such as
Supermicro Super Micro Computer, Inc., dba Supermicro, is an information technology company based in San Jose, California. It has manufacturing operations in the Silicon Valley, the Netherlands and at its Science and Technology Park in Taiwan. Founded on ...
use prebuilt Nvidia HGX boards, which come in four or eight socket configurations. This solution greatly lowers the cost and difficulty of SXM based GPU servers, and enables compatibility and reliability across all boards of the same generation. SXM modules on e.g. HGX boards, particularly recent generations, may have
NVLink NVLink is a wire-based serial multi-lane near-range communications link developed by Nvidia. Unlike PCI Express, a device can consist of multiple NVLinks, and devices use mesh networking to communicate instead of a central hub. The protocol was f ...
switches to allow faster GPU-to-GPU communication. This as well reduces bottlenecks which would normally be located within CPU and
PCIe PCI Express (Peripheral Component Interconnect Express), officially abbreviated as PCIe or PCI-e, is a high-speed serial computer expansion bus standard, designed to replace the older PCI, PCI-X and AGP bus standards. It is the common mo ...
. The GPUs on the daughter cards are just using NVLink as their main communication protocol. For example a Hopper-based H100 SXM5 based GPU can use up to 900GB/s of bandwidth across 18 NVLink 4 channels, with each contributing a 50GB/s of bandwidth; This compared to PCIe 5.0, which can handle up to 64GB/s of bandwidth within a x16 slot. This high bandwidth also means that GPUs can share memory over the NVLink bus, allowing an entire HGX board to present to the host system as a single, massive GPU. Power delivery is also handled by the SXM socket, negating the need for external power cables such as those needed in PCIe equivalent cards. This, combined with the horizontal mounting allows cooling options of higher efficiency which in turn allows the SXM based GPUs to operate at a much higher TDP. The Hopper-based H100, for example, can draw up to 700W solely from the SXM socket. The lack of cabling also makes assembling and repairing of large systems much easier, and also reduces the possible points of failure. The early
Nvidia Tegra Tegra is a system on a chip (SoC) series developed by Nvidia for mobile devices such as smartphones, personal digital assistants, and mobile Internet devices. The Tegra integrates an ARM architecture central processing unit (CPU), graphics pro ...
automotive targeted evaluation board, 'Drive PX2', had two MXM (Mobile PCI Express Module) sockets on both sides of the card, this dual MXM design can be considered a predecessor to the Nvidia Tesla implementation of the SXM socket.


References

{{reflist


External links


Erlangen National High Performance Computing Center page on high performance computing with 4x and 8x A100 per computer node
also showing switch topology dumps. Nvidia hardware