Summit (supercomputer)
   HOME

TheInfoList



OR:

Summit or OLCF-4 is a supercomputer developed by IBM for use at
Oak Ridge Leadership Computing Facility The Oak Ridge Leadership Computing Facility (OLCF), formerly the National Leadership Computing Facility, is a designated user facility operated by Oak Ridge National Laboratory and the Department of Energy. It contains several supercomputers, th ...
(OLCF), a facility at the
Oak Ridge National Laboratory Oak Ridge National Laboratory (ORNL) is a U.S. multiprogram science and technology national laboratory sponsored by the U.S. Department of Energy (DOE) and administered, managed, and operated by UT–Battelle as a federally funded research an ...
, capable of 200
petaFLOPS In computing, floating point operations per second (FLOPS, flops or flop/s) is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations. For such cases, it is a more accurate meas ...
thus making it the 4th fastest supercomputer in the world after Frontier (OLCF-5), Fugaku, and
LUMI Lumi may refer to: Computing * Lumi (software), chemical analysis software * Lumi masking, a technique used by video compression software * LUMI, a supercomputer located in Finland Music * ''Lumi'' (album), a 1987 album by Edward Vesala ...
. It held the number 1 position from November 2018 to June 2020. Its current
LINPACK benchmark The LINPACK Benchmarks are a measure of a system's floating-point computing power. Introduced by Jack Dongarra, they measure how fast a computer solves a dense ''n'' by ''n'' system of linear equations ''Ax'' = ''b'', which is a common ...
is clocked at 148.6 petaFLOPS. As of November 2019, the supercomputer had ranked as the 5th most energy efficient in the world with a measured power efficiency of 14.668 gigaFLOPS/watt. Summit was the first supercomputer to reach exaflop (a quintillion operations per second) speed, achieving 1.88 exaflops during a genomic analysis and is expected to reach 3.3 exaflops using
mixed-precision Mixed-precision arithmetic is a form of floating-point arithmetic that uses numbers with varying widths in a single operation. Arithmetic A common usage of mixed-precision arithmetic is for operating on inaccurate numbers with a small width and ex ...
calculations.


History

The
United States Department of Energy The United States Department of Energy (DOE) is an executive department of the U.S. federal government that oversees U.S. national energy policy and manages the research and development of nuclear power and nuclear weapons in the United Stat ...
awarded a $325 million contract in November 2014 to IBM,
NVIDIA Nvidia CorporationOfficially written as NVIDIA and stylized in its logo as VIDIA with the lowercase "n" the same height as the uppercase "VIDIA"; formerly stylized as VIDIA with a large italicized lowercase "n" on products from the mid 1990s to ...
and
Mellanox Mellanox Technologies Ltd. ( he, מלאנוקס טכנולוגיות בע"מ) was an Israeli-American multinational supplier of computer networking products based on InfiniBand and Ethernet technology. Mellanox offered adapters, switches, softwa ...
. The effort resulted in construction of Summit and Sierra. Summit is tasked with civilian scientific research and is located at the Oak Ridge National Laboratory in Tennessee. Sierra is designed for nuclear weapons simulations and is located at the Lawrence Livermore National Laboratory in California. Summit is estimated to cover 873 square meters and require 219 kilometers of cabling. Researchers will utilize Summit for diverse fields such as
cosmology Cosmology () is a branch of physics and metaphysics dealing with the nature of the universe. The term ''cosmology'' was first used in English in 1656 in Thomas Blount's ''Glossographia'', and in 1731 taken up in Latin by German philosopher ...
,
medicine Medicine is the science and practice of caring for a patient, managing the diagnosis, prognosis, prevention, treatment, palliation of their injury or disease, and promoting their health. Medicine encompasses a variety of health care pr ...
and
climatology Climatology (from Greek , ''klima'', "place, zone"; and , ''-logia'') or climate science is the scientific study of Earth's climate, typically defined as weather conditions averaged over a period of at least 30 years. This modern field of study ...
. In 2015, the project called Collaboration of Oak Ridge, Argonne and Lawrence Livermore (CORAL) included a third supercomputer named
Aurora An aurora (plural: auroras or aurorae), also commonly known as the polar lights, is a natural light display in Earth's sky, predominantly seen in high-latitude regions (around the Arctic and Antarctic). Auroras display dynamic patterns of bri ...
and was planned for installation at Argonne National Laboratory. By 2018, Aurora was re-engineered with completion anticipated in 2021 as an exascale computing project along with Frontier and El Capitan to be completed shortly thereafter; Aurora is now planned for completion in late 2022.


Uses

The Summit supercomputer provides scientists and researchers the opportunity to solve complex tasks in the fields of energy, artificial intelligence, human health and other research areas. It has been used in Earthquake Simulation, Extreme Weather simulation using AI, Material science, Genomics and in predicting the lifetime of Neutrinos in physics.


Design

Each one of its 4,608 nodes (with 2 IBM
POWER9 POWER9 is a family of superscalar, multithreading, multi-core microprocessors produced by IBM, based on the Power ISA. It was announced in August 2016. The POWER9-based processors are being manufactured using a 14 nm FinFET process, in ...
CPUs and 6
Nvidia Tesla Nvidia Tesla was the name of Nvidia's line of products targeted at stream processing or general-purpose graphics processing units (GPGPU), named after pioneering electrical engineer Nikola Tesla. Its products began using GPUs from the G80 seri ...
GPU A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobi ...
s in each node) has over 600 GB of coherent memory (96 GB
HBM2 High Bandwidth Memory (HBM) is a high-speed computer memory interface for 3D-stacked synchronous dynamic random-access memory (SDRAM) initially from Samsung, AMD and SK Hynix. It is used in conjunction with high-performance graphics accelerato ...
plus 512 GB
DDR4 SDRAM Double Data Rate 4 Synchronous Dynamic Random-Access Memory (DDR4 SDRAM) is a type of synchronous dynamic random-access memory with a high bandwidth (" double data rate") interface. Released to the market in 2014, it is a variant of dynamic r ...
) which is addressable by all CPUs and GPUs plus 800 GB of non-volatile RAM that can be used as a burst buffer or as extended memory. The
POWER9 POWER9 is a family of superscalar, multithreading, multi-core microprocessors produced by IBM, based on the Power ISA. It was announced in August 2016. The POWER9-based processors are being manufactured using a 14 nm FinFET process, in ...
CPUs and Nvidia Volta GPUs are connected using NVIDIA's high speed
NVLink NVLink is a wire-based serial multi-lane near-range communications link developed by Nvidia. Unlike PCI Express, a device can consist of multiple NVLinks, and devices use mesh networking to communicate instead of a central hub. The protocol was f ...
. This allows for a
heterogeneous computing Heterogeneous computing refers to systems that use more than one kind of processor or cores. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar coprocessors, usually incor ...
model. To provide a high rate of data throughput, the nodes will be connected in a non-blocking fat-tree topology using a dual-rail Mellanox EDR
InfiniBand InfiniBand (IB) is a computer networking communications standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also use ...
interconnect for both storage and inter-process communications traffic which delivers both 200Gbit/s bandwidth between nodes and in-network computing acceleration for communications frameworks such as MPI and
SHMEM SHMEM (from Cray Research’s “shared memory” library) is a family of parallel programming libraries, providing one-sided, RDMA, parallel-processing interfaces for low-latency distributed-memory supercomputers. The SHMEM acronym was subsequent ...
/ PGAS.


See also

*
Titan (supercomputer) Titan or OLCF-3 was a supercomputer built by Cray at Oak Ridge National Laboratory for use in a variety of science projects. Titan was an upgrade of Jaguar, a previous supercomputer at Oak Ridge, that uses graphics processing units (GPUs) in a ...
– OLCF-3 * Frontier (supercomputer) - OLCF-5 *
TOP500 The TOP500 project ranks and details the 500 most powerful non- distributed computer systems in the world. The project was started in 1993 and publishes an updated list of the supercomputers twice a year. The first of these updates always coinci ...
*
OpenBMC The OpenBMC project is a Linux Foundation collaborative open-source project whose goal is to produce an open source implementation of the Baseboard Management Controllers (BMC) Firmware Stack. OpenBMC is a Linux distribution for BMCs meant to w ...
*
Red Hat Enterprise Linux Red Hat Enterprise Linux (RHEL) is a commercial open-source Linux distribution developed by Red Hat for the commercial market. Red Hat Enterprise Linux is released in server versions for x86-64, Power ISA, ARM64, and IBM Z and a desktop ...


References


External links


A time-lapse video of Summit construction
{{S-end GPGPU supercomputers IBM supercomputers Oak Ridge National Laboratory Petascale computers 64-bit computers