
The HPC Challenge Benchmark combines several benchmarks to test a number of independent attributes of the performance of high-performance computer (HPC) systems. The project has been co-sponsored by the DARPA High Productivity Computing Systems program, the United States Department of Energy and the National Science Foundation.


Context

The performance of complex applications on HPC systems can depend on a variety of independent performance attributes of the hardware. The HPC Challenge Benchmark is an effort to improve visibility into this multidimensional space by combining the measurement of several of these attributes into a single program. Although the performance attributes of interest are not specific to any particular computer architecture, the reference implementation of the HPC Challenge Benchmark in C and MPI assumes that the system under test is a cluster of shared memory multiprocessor systems connected by a network. Due to this assumption of a hierarchical system structure, most of the tests are run in several different modes of operation. Following the notation used by the benchmark reports, results labeled "single" mean that the test was run on one randomly chosen processor in the system, results labeled "star" mean that an independent copy of the test was run concurrently on each processor in the system, and results labeled "global" mean that all the processors were working in coordination to solve a single problem (with data distributed across the nodes of the system).
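The three modes can be illustrated with a small sequential simulation. This is only a sketch: it does not use MPI, the "processors" are plain integer ranks, and the names `kernel`, `run_single`, `run_star` and `run_global` are hypothetical and do not come from the reference implementation.

```python
import random


def kernel(data):
    """Toy stand-in for a benchmark kernel: sum of squares."""
    return sum(x * x for x in data)


def run_single(num_procs, data, rng):
    """'single': one randomly chosen processor runs the test, the rest idle."""
    rank = rng.randrange(num_procs)
    return {rank: kernel(data)}


def run_star(num_procs, data):
    """'star': every processor runs its own independent copy of the test."""
    return {rank: kernel(data) for rank in range(num_procs)}


def run_global(num_procs, data):
    """'global': one problem whose data is distributed across all processors."""
    # Cyclic distribution: rank r owns elements r, r + P, r + 2P, ...
    chunks = [data[rank::num_procs] for rank in range(num_procs)]
    partials = [kernel(chunk) for chunk in chunks]
    # Combining the partial results stands in for a network reduction.
    return sum(partials)
```

In a real run, "star" results are reported per processor because the concurrent copies compete for shared resources such as memory bandwidth, while "global" results also depend on the network that connects the nodes.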


Components

The benchmark currently consists of 7 tests (with the modes of operation indicated for each):
# HPL (High Performance LINPACK) – measures performance of a solver for a dense system of linear equations (global).
# DGEMM – measures performance for matrix-matrix multiplication (single, star).
# STREAM – measures sustained memory bandwidth (single, star).
# PTRANS – measures the rate at which the system can transpose a large array (global).
# RandomAccess – measures the rate of 64-bit updates to randomly selected elements of a large table (single, star, global).
# FFT – performs a Fast Fourier Transform on a large one-dimensional vector using the generalized Cooley–Tukey algorithm (single, star, global).
# Communication Bandwidth and Latency – MPI-centric performance measurements based on the b_eff bandwidth/latency benchmark.
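Two of the simpler kernels in the list can be sketched in a few lines. The code below is an illustration in pure Python, not the reference C implementation: `stream_triad` shows the Triad operation of STREAM, and `random_access` shows the XOR table update of RandomAccess with a shift-register random stream modeled on the one used by the benchmark. The function names, the default seed of 1 and the small table size in the usage note are assumptions made for this example.

```python
MASK64 = (1 << 64) - 1  # keep values within 64 bits, as in unsigned C arithmetic
POLY = 0x7              # feedback polynomial for the shift-register stream


def stream_triad(a, b, c, scalar):
    """STREAM Triad: a[i] = b[i] + scalar * c[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def next_random(ran):
    """Advance the 64-bit pseudo-random stream by one step."""
    feedback = POLY if ran >> 63 else 0  # apply POLY when the top bit is set
    return ((ran << 1) & MASK64) ^ feedback


def random_access(table, num_updates, seed=1):
    """XOR pseudo-random 64-bit values into pseudo-randomly chosen table slots."""
    size = len(table)
    if size == 0 or size & (size - 1):
        raise ValueError("table size must be a power of two")
    ran = seed
    for _ in range(num_updates):
        ran = next_random(ran)
        table[ran & (size - 1)] ^= ran  # low bits of the value select the slot
```

Because XOR is its own inverse, replaying the same update stream restores the table, which gives a cheap correctness check: after `table = list(range(16))`, calling `random_access(table, 64)` twice leaves `table` equal to `list(range(16))` again.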


Performance attributes

At a high level, the tests are intended to provide coverage of four important attributes of performance: double-precision floating-point arithmetic (DGEMM and HPL), local memory bandwidth (STREAM), network bandwidth for "large" messages (PTRANS, RandomAccess, FFT, b_eff), and network bandwidth for "small" messages (RandomAccess, b_eff). Some of the codes are more complex than others and can have additional performance sensitivities. For example, in some systems HPL performance can be limited by network bandwidth and/or network latency.
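Each attribute is reported as a rate: work done divided by elapsed time. The helpers below are a minimal sketch of the usual conversions, assuming the conventional operation counts (2n³ floating-point operations for an n×n matrix multiplication, and three 8-byte memory accesses per Triad element). The function names are hypothetical and the timings must be supplied by the caller.

```python
def dgemm_gflops(n, seconds):
    """GFLOP/s for an n x n matrix multiply, counting 2 * n**3 operations."""
    return 2.0 * n**3 / seconds / 1e9


def triad_gbytes_per_s(n, seconds, bytes_per_element=8):
    """GB/s for STREAM Triad: two loads and one store per element."""
    return 3 * bytes_per_element * n / seconds / 1e9


def gups(num_updates, seconds):
    """Giga-updates per second, the rate reported for RandomAccess."""
    return num_updates / seconds / 1e9
```

For example, a 1000×1000 multiplication that takes 2 seconds corresponds to `dgemm_gflops(1000, 2.0)`, which is 1 GFLOP/s.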


Competition

The annual HPC Challenge Award Competition at the Supercomputing Conference focuses on four of the most challenging benchmarks in the suite:
* Global HPL
* Global RandomAccess (or BSS Random Access Benchmark)
* EP STREAM (Triad) per system
* Global FFT
There are two classes of awards:
* Class 1: Best performance on a base or optimized run submitted to the HPC Challenge website.
* Class 2: Most "elegant" implementation of four or five computational kernels including three or more of the HPC Challenge benchmarks.


See also

* Locality of reference


External links


HPC Challenge Benchmark Official Website

HPC Challenge Award Competition Official Website

BSS Random Access Benchmark
Performance Evaluation and Optimization of Random Memory Access on Multicores with High Productivity (Best Paper Award) at ACM/IEEE HiPC 2010