computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...

, array programming refers to solutions that allow the application of operations to an entire set of values at once. Such solutions are commonly used in

scientific Science is a systematic discipline that builds and organises knowledge in the form of testable hypotheses and predictions about the universe. Modern science is typically divided into twoor threemajor branches: the natural sciences, which stu ...

and engineering settings. Modern programming languages that support array programming (also known as

vector Vector most often refers to: * Euclidean vector, a quantity with a magnitude and a direction * Disease vector, an agent that carries and transmits an infectious pathogen into another living organism Vector may also refer to: Mathematics a ...

multidimensional In physics and mathematics, the dimension of a Space (mathematics), mathematical space (or Mathematical object, object) is informally defined as the minimum number of coordinates needed to specify any Point (geometry), point within it. Thus, a ...

languages) have been engineered specifically to generalize operations on scalars to apply transparently to

matrices Matrix (: matrices or matrixes) or MATRIX may refer to: Science and mathematics * Matrix (mathematics), a rectangular array of numbers, symbols or expressions * Matrix (logic), part of a formula in prenex normal form * Matrix (biology), the ...

, and higher-dimensional arrays. These include APL, J, Fortran,

MATLAB MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementat ...

, Analytica,

Octave In music, an octave (: eighth) or perfect octave (sometimes called the diapason) is an interval between two notes, one having twice the frequency of vibration of the other. The octave relationship is a natural phenomenon that has been referr ...

, R, Cilk Plus, Julia, Perl Data Language (PDL),

Raku (programming language) Raku is a member of the Perl family of programming languages. Formerly named Perl 6, it was renamed in October 2019. Raku introduces elements of many modern and historical languages. Compatibility with Perl was not a goal, though a compatibility ...

. In these languages, an operation that operates on entire arrays can be called a ''vectorized'' operation, regardless of whether it is executed on a

vector processor In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called ...

, which implements vector instructions. Array programming primitives concisely express broad ideas about data manipulation. The level of concision can be dramatic in certain cases: it is not uncommon to find array programming language one-liners that require several pages of object-oriented code.

Concepts of array

The fundamental idea behind array programming is that operations apply at once to an entire set of values. This makes it a high-level programming model as it allows the programmer to think and operate on whole aggregates of data, without having to resort to explicit loops of individual scalar operations. Kenneth E. Iverson described the rationale behind array programming (actually referring to APL) as follows: The basis behind array programming and thinking is to find and exploit the properties of data where individual elements are similar or adjacent. Unlike object orientation which implicitly breaks down data to its constituent parts (or scalar quantities), array orientation looks to group data and apply a uniform handling.

Function rank Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-orien ...

is an important concept to array programming languages in general, by analogy to

tensor In mathematics, a tensor is an algebraic object that describes a multilinear relationship between sets of algebraic objects associated with a vector space. Tensors may map between different objects such as vectors, scalars, and even other ...

rank in mathematics: functions that operate on data may be classified by the number of dimensions they act on. Ordinary multiplication, for example, is a scalar ranked function because it operates on zero-dimensional data (individual numbers). The

cross product In mathematics, the cross product or vector product (occasionally directed area product, to emphasize its geometric significance) is a binary operation on two vectors in a three-dimensional oriented Euclidean vector space (named here E), and ...

operation is an example of a vector rank function because it operates on vectors, not scalars.

Matrix multiplication In mathematics, specifically in linear algebra, matrix multiplication is a binary operation that produces a matrix (mathematics), matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the n ...

is an example of a 2-rank function, because it operates on 2-dimensional objects (matrices). Collapse operators reduce the dimensionality of an input data array by one or more dimensions. For example, summing over elements collapses the input array by 1 dimension.

Uses

Array programming is very well suited to

implicit parallelization In computer science, implicit parallelism is a characteristic of a programming language that allows a compiler or interpreter to automatically exploit the parallelism inherent to the computations expressed by some of the language's constructs ...

; a topic of much research nowadays. Further,

Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...

and compatible CPUs developed and produced after 1997 contained various instruction set extensions, starting from MMX and continuing through

SSSE3 Supplemental Streaming SIMD Extensions 3 (SSSE3 or SSE3S) is a SIMD instruction set created by Intel and is the fourth iteration of the SSE technology. History SSSE3 was first introduced with Intel processors based on the Core microarchitect ...

and

3DNow! 3DNow! is a deprecated extension to the x86 instruction set developed by Advanced Micro Devices (AMD). It adds single instruction multiple data (SIMD) instructions to the base x86 instruction set, enabling it to perform vector processing of float ...

, which include rudimentary

SIMD Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...

array capabilities. This has continued into the 2020s with instruction sets such as

AVX-512 AVX-512 are 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for x86 instruction set architecture (ISA) proposed by Intel in July 2013, and first implemented in the 2016 Intel Xeon Phi x200 (Knights Landing), and then ...

, making modern CPUs sophisticated vector processors. Array processing is distinct from parallel processing in that one physical processor performs operations on a group of items simultaneously while parallel processing aims to split a larger problem into smaller ones ( MIMD) to be solved piecemeal by numerous processors. Processors with multiple cores and GPUs with thousands of general computing cores are common as of 2023.

Languages

The canonical examples of array programming languages are Fortran, APL, and J. Others include: A+, Analytica,

Chapel A chapel (from , a diminutive of ''cappa'', meaning "little cape") is a Christianity, Christian place of prayer and worship that is usually relatively small. The term has several meanings. First, smaller spaces inside a church that have their o ...

, IDL, Julia, K, Klong, Q,

GNU Octave GNU Octave is a scientific programming language for scientific computing and numerical computation. Octave helps in solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly ...

Scilab Scilab is a free and open-source, cross-platform numerical computational package and a high-level, numerically oriented programming language. It can be used for signal processing, statistical analysis, image enhancement, fluid dynamics simul ...

, FreeMat, Perl Data Language (PDL), R, Raku,

S-Lang The S-Lang programming library is a software library for Unix, Windows, VMS, OS/2, and Mac OS X. It provides routines for embedding an interpreter for the S-Lang scripting language, and components to facilitate the creation of text-based applica ...

, SAC, Nial, ZPL,

Futhark Runes are the letters in a set of related alphabets, known as runic rows, runic alphabets or futharks (also, see '' futhark'' vs ''runic alphabet''), native to the Germanic peoples. Runes were primarily used to represent a sound value (a ...

, and

TI-BASIC TI-BASIC is the official name of a BASIC-like language built into Texas Instruments' graphing calculators. TI-BASIC is a language family of three different and incompatible versions, released on different products: * TI-BASIC 83 (on Z80 proces ...

Scalar languages

In scalar languages such as C and Pascal, operations apply only to single values, so ''a''+''b'' expresses the addition of two numbers. In such languages, adding one array to another requires indexing and looping, the coding of which is tedious. for (i = 0; i < n; i++) for (j = 0; j < n; j++) a j] += b j]; In array-based languages, for example in Fortran, the nested for-loop above can be written in array-format in one line, a = a + b or alternatively, to emphasize the array nature of the objects, a(:,:) = a(:,:) + b(:,:) While scalar languages like C do not have native array programming elements as part of the language proper, this does not mean programs written in these languages never take advantage of the underlying techniques of vectorization (i.e., utilizing a CPU's Single instruction, multiple data, vector-based instructions if it has them or by using multiple CPU cores). Some C compilers like GCC at some optimization levels detect and vectorize sections of code that its heuristics determine would benefit from it. Another approach is given by the

OpenMP OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, ...

API, which allows one to parallelize applicable sections of code by taking advantage of multiple CPU cores.

Array languages

In array languages, operations are generalized to apply to both scalars and arrays. Thus, ''a''+''b'' expresses the sum of two scalars if ''a'' and ''b'' are scalars, or the sum of two arrays if they are arrays. An array language simplifies programming but possibly at a cost known as the ''abstraction penalty''. Because the additions are performed in isolation from the rest of the coding, they may not produce the optimally most efficient code. (For example, additions of other elements of the same array may be subsequently encountered during the same execution, causing unnecessary repeated lookups.) Even the most sophisticated

optimizing compiler An optimizing compiler is a compiler designed to generate code that is optimized in aspects such as minimizing program execution time, memory usage, storage size, and power consumption. Optimization is generally implemented as a sequence of op ...

would have an extremely hard time amalgamating two or more apparently disparate functions which might appear in different program sections or sub-routines, even though a programmer could do this easily, aggregating sums on the same pass over the array to minimize overhead).

Ada

The previous C code would become the following in the Ada language, which supports array-programming syntax. A := A + B;

APL

APL uses single character Unicode symbols with no syntactic sugar. A ← A + B This operation works on arrays of any rank (including rank 0), and on a scalar and an array. Dyalog APL extends the original language with

augmented assignment Augment or augmentation may refer to: Language *Augment (Indo-European), a syllable added to the beginning of the word in certain Indo-European languages * Augment (Bantu languages), a morpheme that is prefixed to the noun class prefix of nouns ...

s: A +← B

Analytica

Analytica provides the same economy of expression as Ada.

A := A + B;

BASIC

Dartmouth BASIC Dartmouth BASIC is the original version of the BASIC programming language. It was designed by two professors at Dartmouth College, John G. Kemeny and Thomas E. Kurtz. With the underlying Dartmouth Time-Sharing System (DTSS), it offered an interac ...

had MAT statements for matrix and array manipulation in its third edition (1966). DIM A(4),B(4),C(4) MAT A = 1 MAT B = 2 * A MAT C = A + B MAT PRINT A,B,C

Mata

Stata Stata (, , alternatively , occasionally stylized as STATA) is a general-purpose Statistics, statistical software package developed by StataCorp for data manipulation, visualization, statistics, and automated reporting. It is used by researchers ...

's matrix programming language Mata supports array programming. Below, we illustrate addition, multiplication, addition of a matrix and a scalar, element by element multiplication, subscripting, and one of Mata's many inverse matrix functions. . mata: : A = (1,2,3) \(4,5,6) : A 1 2 3 +-------------+ 1 , 1 2 3 , 2 , 4 5 6 , +-------------+ : B = (2..4) \(1..3) : B 1 2 3 +-------------+ 1 , 2 3 4 , 2 , 1 2 3 , +-------------+ : C = J(3,2,1) // A 3 by 2 matrix of ones : C 1 2 +---------+ 1 , 1 1 , 2 , 1 1 , 3 , 1 1 , +---------+ : D = A + B : D 1 2 3 +-------------+ 1 , 3 5 7 , 2 , 5 7 9 , +-------------+ : E = A*C : E 1 2 +-----------+ 1 , 6 6 , 2 , 15 15 , +-----------+ : F = A:*B : F 1 2 3 +----------------+ 1 , 2 6 12 , 2 , 4 10 18 , +----------------+ : G = E :+ 3 : G 1 2 +-----------+ 1 , 9 9 , 2 , 18 18 , +-----------+ : H = F 2\1), (1, 2) // Subscripting to get a submatrix of F and : // switch row 1 and 2 : H 1 2 +-----------+ 1 , 4 10 , 2 , 2 6 , +-----------+ : I = invsym(F'*F) // Generalized inverse (F*F^(-1)F=F) of a : // symmetric positive semi-definite matrix : I ymmetric 1 2 3 +-------------------------------------------+ 1 , 0 , 2 , 0 3.25 , 3 , 0 -1.75 .9444444444 , +-------------------------------------------+ : end

MATLAB

The implementation in

allows the same economy allowed by using the Fortran language. A = A + B; A variant of the MATLAB language is the

language, which extends the original language with augmented assignments: A += B; Both MATLAB and GNU Octave natively support

linear algebra Linear algebra is the branch of mathematics concerning linear equations such as :a_1x_1+\cdots +a_nx_n=b, linear maps such as :(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n, and their representations in vector spaces and through matrix (mathemat ...

operations such as matrix multiplication,

matrix inversion In linear algebra, an invertible matrix (''non-singular'', ''non-degenarate'' or ''regular'') is a square matrix that has an inverse. In other words, if some other matrix is multiplied by the invertible matrix, the result can be multiplied by an ...

, and the numerical solution of

system of linear equations In mathematics, a system of linear equations (or linear system) is a collection of two or more linear equations involving the same variable (math), variables. For example, : \begin 3x+2y-z=1\\ 2x-2y+4z=-2\\ -x+\fracy-z=0 \end is a system of th ...

, even using the Moore–Penrose pseudoinverse. The Nial example of the inner product of two arrays can be implemented using the native matrix multiplication operator. If a is a row vector of size nand b is a corresponding column vector of size 1 a * b; By contrast, the entrywise product is implemented as: a .* b; The inner product between two matrices having the same number of elements can be implemented with the auxiliary operator (:), which reshapes a given matrix into a column vector, and the

transpose In linear algebra, the transpose of a Matrix (mathematics), matrix is an operator which flips a matrix over its diagonal; that is, it switches the row and column indices of the matrix by producing another matrix, often denoted by (among other ...

operator ': A(:)' * B(:);

rasql

The rasdaman query language is a database-oriented array-programming language. For example, two arrays could be added with the following query: SELECT A + B FROM A, B

R

The R language supports

array paradigm An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...

by default. The following example illustrates a process of multiplication of two matrices followed by an addition of a scalar (which is, in fact, a one-element vector) and a vector: > A <- matrix(1:6, nrow=2) # !!this has nrow=2 ... and A has 2 rows > A 1 2 3 , 1 3 5 , 2 4 6 > B <- t( matrix(6:1, nrow=2) ) # t() is a transpose operator !!this has nrow=2 ... and B has 3 rows --- a clear contradiction to the definition of A > B 1 2 , 6 5 , 4 3 , 2 1 > C <- A %*% B > C 1 2 , 28 19 , 40 28 > D <- C + 1 > D 1 2 , 29 20 , 41 29 > D + c(1, 1) # c() creates a vector 1 2 , 30 21 , 42 30

Raku

Raku supports the array paradigm via its Metaoperators. The following example demonstrates the addition of arrays @a and @b using the Hyper-operator in conjunction with the plus operator. > my @a = 1,1 ,2 ,3; 1 1 2 3 > my @b = 4,4 ,5 ,6; 4 4 5 6 > @a »+« @b; 5 5[7 7[9 9"> 7"> 5[7 7[9 9

Mathematical reasoning and language notation

The matrix left-division operator concisely expresses some semantic properties of matrices. As in the scalar equivalent, if the (determinant of the) coefficient (matrix) A is not null then it is possible to solve the (vectorial) equation A * x = b by left-multiplying both sides by the inverse matrix, inverse of A: A⁻¹ (in both MATLAB and GNU Octave languages: A^-1). The following mathematical statements hold when A is a full rank

square matrix In mathematics, a square matrix is a Matrix (mathematics), matrix with the same number of rows and columns. An ''n''-by-''n'' matrix is known as a square matrix of order Any two square matrices of the same order can be added and multiplied. Squ ...

: :

A^-1  *(A * x)
A^-1 * (b)

(A^-1 * A)* x 
A^-1 *  b

(matrix-multiplication

associativity In mathematics, the associative property is a property of some binary operations that rearranging the parentheses in an expression will not change the result. In propositional logic, associativity is a Validity (logic), valid rule of replaceme ...

) :x = A^-1 * b where is the equivalence

relational operator In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality (e.g., ) and inequalities (e.g., ). In programmi ...

. The previous statements are also valid MATLAB expressions if the third one is executed before the others (numerical comparisons may be false because of round-off errors). If the system is overdetermined – so that A has more rows than columns – the pseudoinverse A⁺ (in MATLAB and GNU Octave languages: pinv(A)) can replace the inverse A⁻¹, as follows: : : (matrix-multiplication associativity) : However, these solutions are neither the most concise ones (e.g. still remains the need to notationally differentiate overdetermined systems) nor the most computationally efficient. The latter point is easy to understand when considering again the scalar equivalent a * x = b, for which the solution x = a^-1 * b would require two operations instead of the more efficient x = b / a. The problem is that generally matrix multiplications are not

commutative In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Perhaps most familiar as a pr ...

as the extension of the scalar solution to the matrix case would require: :

(a * x)/ a 
b / a

(x * a)/ a 
b / a

(commutativity does not hold for matrices!) :

x * (a / a)
b / a

(associativity also holds for matrices) :x = b / a The MATLAB language introduces the left-division operator \ to maintain the essential part of the analogy with the scalar case, therefore simplifying the mathematical reasoning and preserving the conciseness: :

A \ (A * x)
A \ b

(A \ A)* x 
A \ b

(associativity also holds for matrices, commutativity is no more required) :x = A \ b This is not only an example of terse array programming from the coding point of view but also from the computational efficiency perspective, which in several array programming languages benefits from quite efficient linear algebra libraries such as

ATLAS An atlas is a collection of maps; it is typically a bundle of world map, maps of Earth or of a continent or region of Earth. Advances in astronomy have also resulted in atlases of the celestial sphere or of other planets. Atlases have traditio ...

LAPACK LAPACK ("Linear Algebra Package") is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It als ...

. Returning to the previous quotation of Iverson, the rationale behind it should now be evident:

Third-party libraries

The use of specialized and efficient libraries to provide more terse abstractions is also common in other programming languages. In C++ several linear algebra libraries exploit the language's ability to overload operators. In some cases a very terse abstraction in those languages is explicitly influenced by the array programming paradigm, as the

NumPy NumPy (pronounced ) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. The predeces ...

extension library to Python,

Armadillo Armadillos () are New World placental mammals in the order (biology), order Cingulata. They form part of the superorder Xenarthra, along with the anteaters and sloths. 21 extant species of armadillo have been described, some of which are dis ...

and Blitz++ libraries do.

References

External links

"No stinking loops" programming"Types of Arrays" programming
{{Types of programming languages Programming paradigms Articles with example MATLAB/Octave code Articles with example BASIC code Articles with example Ada code Articles with example R code