HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
, array of structures (AoS), structure of arrays (SoA) and array of structures of arrays (AoSoA) refer to contrasting ways to arrange a sequence of
record A record, recording or records may refer to: An item or collection of data Computing * Record (computer science), a data structure ** Record, or row (database), a set of fields in a database related to one entity ** Boot sector or boot record, ...
s in
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
, with regard to
interleaving Interleaving may refer to: * Interleaving, a technique for making forward error correction more robust with respect to burst errors * An optical interleaver, a fiber-optic device to combine two sets of dense wavelength-division multiplexing (DW ...
, and are of interest in
SIMD Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and it can be directly accessible through an instruction set architecture (ISA), but it should ...
and SIMT programming.


Structure of arrays

Structure of arrays (SoA) is a layout separating elements of a
record A record, recording or records may refer to: An item or collection of data Computing * Record (computer science), a data structure ** Record, or row (database), a set of fields in a database related to one entity ** Boot sector or boot record, ...
(or 'struct' in the
C programming language ''The C Programming Language'' (sometimes termed ''K&R'', after its authors' initials) is a computer programming book written by Brian Kernighan and Dennis Ritchie, the latter of whom originally designed and implemented the language, as well as ...
) into one parallel array per
field Field may refer to: Expanses of open ground * Field (agriculture), an area of land used for agricultural purposes * Airfield, an aerodrome that lacks the infrastructure of an airport * Battlefield * Lawn, an area of mowed grass * Meadow, a grass ...
. The motivation is easier manipulation with packed
SIMD instruction In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ' ...
s in most
instruction set architecture In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ' ...
s, since a single
SIMD register A processor register is a quickly accessible location available to a computer's Processor (computing), processor. Registers usually consist of a small amount of fast Computer storage, storage, although some registers have specific hardware functi ...
can load
homogeneous data Homogeneity and heterogeneity are concepts often used in the sciences and statistics relating to the Uniformity (chemistry), uniformity of a Chemical substance, substance or organism. A material or image that is homogeneous is uniform in compos ...
, possibly transferred by a wide internal datapath (e.g.
128-bit While there are currently no mainstream general-purpose processors built to operate on 128-bit ''integers'' or addresses, a number of processors do have specialized ways to operate on 128-bit chunks of data. Representation 128-bit processors co ...
). If only a specific part of the record is needed, only those parts need to be iterated over, allowing more data to fit onto a single cache line. The downside is requiring more cache ways when traversing data, and inefficient
indexed addressing Addressing modes are an aspect of the instruction set architecture in most central processing unit (CPU) designs. The various addressing modes that are defined in a given instruction set architecture define how the machine language instructions i ...
. For example, to store N points in 3D space using a structure of arrays: struct pointlist3D ; struct pointlist3D points; float get_point_x(int i)


Array of structures

Array of structures (AoS) is the opposite (and more conventional) layout, in which data for different fields is interleaved. This is often more intuitive, and supported directly by most
programming languages A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
. For example, to store N points in 3D space using an array of structures: struct point3D ; struct point3D points float get_point_x(int i)


Array of structures of arrays

Array of structures of arrays (AoSoA) or tiled array of structs is a hybrid approach between the previous layouts, in which data for different fields is interleaved using tiles or blocks with size equal to the SIMD vector size. This is often less intuitive, but can achieve the memory throughput of the SoA approach, while being more friendly to the cache locality and load port architectures of modern processors. In particular, memory requests in modern processors have to be fulfilled in fixed width (e.g., size of a cacheline). The tiled storage of AoSoA aligns the memory access pattern to the requests' fixed width, leading to fewer access operations to complete a memory request and thus increasing the efficiency. For example, to store N points in 3D space using an array of structures of arrays with a SIMD register width of 8 floats (or 8×32 = 256 bits): struct point3Dx8 ; struct point3Dx8 points N+7)/8 float get_point_x(int i) A different width may be needed depending on the actual SIMD register width. The interior arrays may be replaced with SIMD types such as for languages with such support.


Alternatives

It is possible to split some subset of a structure (rather than each individual field) into a
parallel array In computing, a group of parallel arrays (also known as structure of arrays or SoA) is a form of implicit data structure that uses multiple arrays to represent a singular array of records. It keeps a separate, homogeneous data array for each field ...
and this can actually improve
locality of reference In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference localit ...
if different pieces of fields are used at different times in the program (see data oriented design). Some
SIMD Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and it can be directly accessible through an instruction set architecture (ISA), but it should ...
architectures provide strided load/store instructions to load homogeneous data from the SoA format. Yet another option used in some
Cell Cell most often refers to: * Cell (biology), the functional basic unit of life Cell may also refer to: Locations * Monastic cell, a small room, hut, or cave in which a religious recluse lives, alternatively the small precursor of a monastery w ...
libraries is to de-interleave data from the AoS format when loading sources into registers, and interleave when writing out results (facilitated by the
superscalar issue A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a supe ...
of
permute In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or proc ...
s). Some vector maths libraries align
floating point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can be ...
4D vectors with the SIMD register to leverage the associated data path and instructions, while still providing programmer convenience, although this does not scale to SIMD units wider than four lanes.


4D vectors

AoS vs. SoA presents a choice when considering 3D or 4D vector data on machines with four-lane SIMD hardware. SIMD ISAs are usually designed for homogeneous data, however some provide a
dot product In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a scalar as a result". It is also used sometimes for other symmetric bilinear forms, for example in a pseudo-Euclidean space. is an algebra ...
instruction and additional permutes, making the AoS case easier to handle. Although most
GPU A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobil ...
hardware has moved away from 4D instructions to scalar SIMT pipelines, modern
compute kernel In computing, a compute kernel is a routine compiled for high throughput accelerators (such as graphics processing units (GPUs), digital signal processors (DSPs) or field-programmable gate arrays (FPGAs)), separate from but used by a main progr ...
s using SoA instead of AoS can still give better performance due to memory coalescing.


Software support

Most languages support the AoS format more naturally by combining
records A record, recording or records may refer to: An item or collection of data Computing * Record (computer science), a data structure ** Record, or row (database), a set of fields in a database related to one entity ** Boot sector or boot record, ...
and various array
abstract data types In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
. SoA is mostly found in languages, libraries, or
metaprogramming Metaprogramming is a programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyze or transform other programs, and even modify itself ...
tools used to support a
data-oriented design In computing, data-oriented design is a program optimization approach motivated by efficient usage of the CPU cache, used in video game development. The approach is to focus on the data layout, separating and sorting fields according to when they ...
. Examples include: * The SIMD-oriented features of the experimental Jai programming language is a recent attempt to provide language level SoA support. * "Data frames," as implemented in R,
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
's Pandas package, and
Julia Julia is usually a feminine given name. It is a Latinate feminine form of the name Julio and Julius. (For further details on etymology, see the Wiktionary entry "Julius".) The given name ''Julia'' had been in use throughout Late Antiquity (e.g. ...
's DataFrames.jl package, are interfaces to access SoA like AoS. * The Julia package StructArrays.jl allows for accessing SoA as AoS to combine the performance of SoA with the intuitiveness of AoS. * Code generators for the C language, includin
Datadraw
and the X Macro technique. Automated creation of AoSoA is more complex. An example of AoSoA in metaprogramming is found in
LANL Los Alamos National Laboratory (often shortened as Los Alamos and LANL) is one of the sixteen research and development laboratories of the United States Department of Energy (DOE), located a short distance northwest of Santa Fe, New Mexico, ...
's Cabana library written in C++; it assumes a vector width of 16 lanes by default.


References

{{reflist SIMD computing