HPC Challenge Benchmark combines several

benchmark Benchmark may refer to: Business and economics * Benchmarking, evaluating performance within organizations * Benchmark price * Benchmark (crude oil), oil-specific practices Science and technology * Benchmark (surveying), a point of known elevati ...

s to test a number of independent attributes of the performance of high-performance

computer A computer is a machine that can be programmed to carry out sequences of arithmetic or logical operations ( computation) automatically. Modern digital electronic computers can perform generic sets of operations known as programs. These prog ...

(HPC) systems. The project has been co-sponsored by the

DARPA The Defense Advanced Research Projects Agency (DARPA) is a research and development agency of the United States Department of Defense responsible for the development of emerging technologies for use by the military. Originally known as the A ...

High Productivity Computing Systems program, the

United States Department of Energy The United States Department of Energy (DOE) is an executive department of the U.S. federal government that oversees U.S. national energy policy and manages the research and development of nuclear power and nuclear weapons in the United States ...

and the

National Science Foundation The National Science Foundation (NSF) is an independent agency of the United States government that supports fundamental research and education in all the non-medical fields of science and engineering. Its medical counterpart is the National ...

Context

The performance of complex applications on HPC systems can depend on a variety of independent performance attributes of the hardware. The HPC Challenge Benchmark is an effort to improve visibility into this multidimensional space by combining the measurement of several of these attributes into a single program. Although the performance attributes of interest are not specific to any particular computer architecture, the reference implementation of the HPC Challenge Benchmark in C and

MPI MPI or Mpi may refer to: Science and technology Biology and medicine * Magnetic particle imaging, an emerging non-invasive tomographic technique * Myocardial perfusion imaging, a nuclear medicine procedure that illustrates the function of the hear ...

assumes that the system under test is a

cluster may refer to: Science and technology Astronomy * Cluster (spacecraft), constellation of four European Space Agency spacecraft * Asteroid cluster, a small asteroid family * Cluster II (spacecraft), a European Space Agency mission to study th ...

of shared memory multiprocessor systems connected by a

network Network, networking and networked may refer to: Science and technology * Network theory, the study of graphs as a representation of relations between discrete objects * Network science, an academic field that studies complex networks Mathematic ...

. Due to this assumption of a hierarchical system structure most of the tests are run in several different modes of operation. Following the notation used by the benchmark reports, results labeled "single" mean that the test was run on one randomly chosen processor in the system, results labeled "star" mean that an independent copy of the test was run concurrently on each processor in the system, and results labeled "global" mean that all the processors were working in coordination to solve a single problem (with data distributed across the nodes of the system).

Components

The benchmark currently consists of 7 tests (with the modes of operation indicated for each): # HPL (High Performance LINPACK) – measures performance of a solver for a dense

system of linear equations In mathematics, a system of linear equations (or linear system) is a collection of one or more linear equations involving the same variables. For example, :\begin 3x+2y-z=1\\ 2x-2y+4z=-2\\ -x+\fracy-z=0 \end is a system of three equations in t ...

(global). #

DGEMM Basic Linear Algebra Subprograms (BLAS) is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matri ...

– measures performance for matrix-matrix multiplication (single, star). # STREAM – measures sustained memory bandwidth to/from memory (single, star). # PTRANS – measures the rate at which the system can

transpose In linear algebra, the transpose of a matrix is an operator which flips a matrix over its diagonal; that is, it switches the row and column indices of the matrix by producing another matrix, often denoted by (among other notations). The tr ...

a large array (global). # RandomAccess – measures the rate of 64-bit updates to randomly selected elements of a large table (single, star, global). # FFT – performs a

Fast Fourier Transform A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in ...

on a large one-dimensional vector using the generalized Cooley–Tukey algorithm (single, star, global). # Communication Bandwidth and Latency – MPI-centric performance measurements based on the b_eff bandwidth/latency benchmark.

Performance attributes

At a high level, the tests are intended to provide coverage of four important attributes of performance: double-precision floating-point arithmetic (DGEMM and HPL), local memory bandwidth (STREAM), network bandwidth for "large" messages (PTRANS, RandomAccess, FFT, b_eff), and network bandwidth for "small" messages (RandomAccess, b_eff). Some of the codes are more complex than others and can have additional performance sensitivities. For example, in some systems HPL performance can be limited by network bandwidth and/or network latency.

Competition

The annual HPC Challenge Award Competition at the

Supercomputing Conference SC (formerly Supercomputing), the International Conference for High Performance Computing, Networking, Storage and Analysis, is the annual conference established in 1988 by the Association for Computing Machinery and the IEEE Computer Society. In ...

focuses on four of the most challenging benchmarks in the suite: * Global HPL * Global RandomAccess (OR BSS Random Access Benchmark) * EP STREAM (Triad) per system * Global FFT There are two classes of awards: * Class 1: Best performance on a base or optimized run submitted to the HPC Challenge website. * Class 2: Most "elegant" implementation of four or five computational kernels including three or more of the HPC Challenge benchmarks.

References

External links