Array-of-Structures (AoS) And Structure-of-Arrays (SoA)
   HOME

TheInfoList



OR:

In computing, array of structures (AoS), structure of arrays (SoA) and array of structures of arrays (AoSoA) refer to contrasting ways to arrange a sequence of
record A record, recording or records may refer to: An item or collection of data Computing * Record (computer science), a data structure ** Record, or row (database), a set of fields in a database related to one entity ** Boot sector or boot record, ...
s in memory, with regard to
interleaving Interleaving may refer to: * Interleaving, a technique for making forward error correction more robust with respect to burst errors * An optical interleaver, a fiber-optic device to combine two sets of dense wavelength-division multiplexing (DW ...
, and are of interest in SIMD and SIMT programming.


Structure of arrays

Structure of arrays (SoA) is a layout separating elements of a
record A record, recording or records may refer to: An item or collection of data Computing * Record (computer science), a data structure ** Record, or row (database), a set of fields in a database related to one entity ** Boot sector or boot record, ...
(or 'struct' in the
C programming language ''The C Programming Language'' (sometimes termed ''K&R'', after its authors' initials) is a computer programming book written by Brian Kernighan and Dennis Ritchie, the latter of whom originally designed and implemented the language, as well as ...
) into one parallel array per field. The motivation is easier manipulation with packed
SIMD instruction In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ' ...
s in most
instruction set architecture In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an ' ...
s, since a single SIMD register can load
homogeneous data Homogeneity and heterogeneity are concepts often used in the sciences and statistics relating to the Uniformity (chemistry), uniformity of a Chemical substance, substance or organism. A material or image that is homogeneous is uniform in compos ...
, possibly transferred by a wide internal datapath (e.g.
128-bit While there are currently no mainstream general-purpose processors built to operate on 128-bit ''integers'' or addresses, a number of processors do have specialized ways to operate on 128-bit chunks of data. Representation 128-bit processors co ...
). If only a specific part of the record is needed, only those parts need to be iterated over, allowing more data to fit onto a single cache line. The downside is requiring more cache ways when traversing data, and inefficient indexed addressing. For example, to store N points in 3D space using a structure of arrays: struct pointlist3D ; struct pointlist3D points; float get_point_x(int i)


Array of structures

Array of structures (AoS) is the opposite (and more conventional) layout, in which data for different fields is interleaved. This is often more intuitive, and supported directly by most programming languages. For example, to store N points in 3D space using an array of structures: struct point3D ; struct point3D points float get_point_x(int i)


Array of structures of arrays

Array of structures of arrays (AoSoA) or tiled array of structs is a hybrid approach between the previous layouts, in which data for different fields is interleaved using tiles or blocks with size equal to the SIMD vector size. This is often less intuitive, but can achieve the memory throughput of the SoA approach, while being more friendly to the cache locality and load port architectures of modern processors. In particular, memory requests in modern processors have to be fulfilled in fixed width (e.g., size of a cacheline). The tiled storage of AoSoA aligns the memory access pattern to the requests' fixed width, leading to fewer access operations to complete a memory request and thus increasing the efficiency. For example, to store N points in 3D space using an array of structures of arrays with a SIMD register width of 8 floats (or 8×32 = 256 bits): struct point3Dx8 ; struct point3Dx8 points N+7)/8 float get_point_x(int i) A different width may be needed depending on the actual SIMD register width. The interior arrays may be replaced with SIMD types such as for languages with such support.


Alternatives

It is possible to split some subset of a structure (rather than each individual field) into a
parallel array In computing, a group of parallel arrays (also known as structure of arrays or SoA) is a form of implicit data structure that uses multiple arrays to represent a singular array of records. It keeps a separate, homogeneous data array for each field ...
and this can actually improve locality of reference if different pieces of fields are used at different times in the program (see
data oriented design In computing, data-oriented design is a program optimization approach motivated by efficient usage of the CPU cache, used in video game development. The approach is to focus on the data layout, separating and sorting fields according to when the ...
). Some SIMD architectures provide strided load/store instructions to load homogeneous data from the SoA format. Yet another option used in some Cell libraries is to de-interleave data from the AoS format when loading sources into registers, and interleave when writing out results (facilitated by the
superscalar issue A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a supe ...
of
permute In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or proc ...
s). Some vector maths libraries align floating point 4D vectors with the SIMD register to leverage the associated data path and instructions, while still providing programmer convenience, although this does not scale to SIMD units wider than four lanes.


4D vectors

AoS vs. SoA presents a choice when considering 3D or 4D vector data on machines with four-lane SIMD hardware. SIMD ISAs are usually designed for homogeneous data, however some provide a dot product instruction and additional permutes, making the AoS case easier to handle. Although most GPU hardware has moved away from 4D instructions to scalar SIMT pipelines, modern compute kernels using SoA instead of AoS can still give better performance due to memory coalescing.


Software support

Most languages support the AoS format more naturally by combining records and various array abstract data types. SoA is mostly found in languages, libraries, or metaprogramming tools used to support a
data-oriented design In computing, data-oriented design is a program optimization approach motivated by efficient usage of the CPU cache, used in video game development. The approach is to focus on the data layout, separating and sorting fields according to when they ...
. Examples include: * The SIMD-oriented features of the experimental Jai programming language is a recent attempt to provide language level SoA support. * "Data frames," as implemented in R, Python's Pandas package, and Julia's DataFrames.jl package, are interfaces to access SoA like AoS. * The Julia package StructArrays.jl allows for accessing SoA as AoS to combine the performance of SoA with the intuitiveness of AoS. * Code generators for the C language, includin
Datadraw
and the X Macro technique. Automated creation of AoSoA is more complex. An example of AoSoA in metaprogramming is found in
LANL Los Alamos National Laboratory (often shortened as Los Alamos and LANL) is one of the sixteen research and development laboratories of the United States Department of Energy (DOE), located a short distance northwest of Santa Fe, New Mexico, in ...
's Cabana library written in C++; it assumes a vector width of 16 lanes by default.


References

{{reflist SIMD computing