type MyTable = array ..4,1..2of integer
, defines a new array data type called MyTable
. The declaration var A: MyTable
then defines a variable A
of that type, which is an aggregate of eight elements, each being an integer variable identified by two indices. In the Pascal program, those elements are denoted A ,1/code>, A ,2/code>, A ,1/code>, …, A ,2/code>.K. Jensen and Niklaus Wirth, ''PASCAL User Manual and Report''. Springer. Paperback edition (2007) 184 pages, Special array types are often defined by the language's standard libraries
A library is a collection of Document, materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or electronic media, digital access (soft copies) materials, and may be a ...
.
Dynamic list
In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
s are also more common and easier to implement than dynamic arrays. Array types are distinguished from record types mainly because they allow the element indices to be computed at run time
Run(s) or RUN may refer to:
Places
* Run (island), one of the Banda Islands in Indonesia
* Run (stream), a stream in the Dutch province of North Brabant
People
* Run (rapper), Joseph Simmons, now known as "Reverend Run", from the hip-hop group ...
, as in the Pascal assignment
Assignment, assign or The Assignment may refer to:
* Homework
* Sex assignment
* The process of sending National Basketball Association players to its development league; see
Computing
* Assignment (computer science), a type of modification to ...
A ,J:= A -I,2*J/code>. Among other things, this feature allows a single iterative statement to process arbitrarily many elements of an array variable.
In more theoretical contexts, especially in type theory
In mathematics, logic, and computer science, a type theory is the formal presentation of a specific type system, and in general type theory is the academic study of type systems. Some type theories serve as alternatives to set theory as a fou ...
and in the description of abstract algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
s, the terms "array" and "array type" sometimes refer to an abstract data type
In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
(ADT) also called ''abstract array'' or may refer to an ''associative array
In computer science, an associative array, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In mathematical terms an ...
'', a mathematical
Mathematics is an area of knowledge that includes the topics of numbers, formulas and related structures, shapes and the spaces in which they are contained, and quantities and their changes. These topics are represented in modern mathematics ...
model with the basic operations and behavior of a typical array type in most languages — basically, a collection of elements that are selected by indices computed at run-time.
Depending on the language, array types may overlap (or be identified with) other data types that describe aggregates of values, such as lists and strings
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* ''Strings'' (1991 film), a Canadian anim ...
. Array types are often implemented by array data structures, but sometimes by other means, such as hash table
In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', ...
s, linked lists, or search tree
In computer science, a search tree is a tree data structure used for locating specific keys from within a set. In order for a tree to function as a search tree, the key for each node must be greater than any keys in subtrees on the left, and less ...
s.
History
Heinz Rutishauser
Heinz Rutishauser (30 January 1918 – 10 November 1970) was a Swiss mathematician and a pioneer of modern numerical mathematics and computer science.
Life
Rutishauser's father died when he was 13 years old and his mother died three years lat ...
's programming language Superplan (1949–1951) included multi-dimensional arrays. Rutishauser however although describing how a compiler for his language should be built, did not implement one.
Assembly languages and low-level languages like BCPL generally have no syntactic support for arrays.
Because of the importance of array structures for efficient computation, the earliest high-level programming languages, including FORTRAN (1957), COBOL (1960), and Algol 60 (1960), provided support for multi-dimensional arrays.
Abstract arrays
An array data structure can be mathematically modeled as an abstract data structure
In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a '' user'', of the data, specifically in terms of possible values, ...
(an ''abstract array'') with two operations
:''get''(''A'', ''I''): the data stored in the element of the array ''A'' whose indices are the integer tuple
In mathematics, a tuple is a finite ordered list (sequence) of elements. An -tuple is a sequence (or ordered list) of elements, where is a non-negative integer. There is only one 0-tuple, referred to as ''the empty tuple''. An -tuple is defi ...
''I''.
:''set''(''A'',''I'',''V''): the array that results by setting the value of that element to ''V''.
These operations are required to satisfy the axioms
:''get''(''set''(''A'',''I'', ''V''), ''I'') = ''V''
:''get''(''set''(''A'',''I'', ''V''), ''J'') = ''get''(''A'', ''J'') if ''I'' ≠ ''J''
for any array state ''A'', any value ''V'', and any tuples ''I'', ''J'' for which the operations are defined.
The first axiom means that each element behaves like a variable. The second axiom means that elements with distinct indices behave as disjoint variables, so that storing a value in one element does not affect the value of any other element.
These axioms do not place any constraints on the set of valid index tuples ''I'', therefore this abstract model can be used for triangular matrices and other oddly-shaped arrays.
Implementations
In order to effectively implement variables of such types as array structures (with indexing done by pointer arithmetic
In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer ''ref ...
), many languages restrict the indices to integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
data types (or other types that can be interpreted as integers, such as byte
The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable uni ...
s and enumerated type
In computer programming, an enumerated type (also called enumeration, enum, or factor in the R programming language, and a categorical variable in statistics) is a data type consisting of a set of named values called ''elements'', ''members'', '' ...
s), and require that all elements have the same data type and storage size. Most of those languages also restrict each index to a finite interval of integers, that remains fixed throughout the lifetime of the array variable. In some compiled
In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
languages, in fact, the index ranges may have to be known at compile time
In computer science, compile time (or compile-time) describes the time window during which a computer program is compiled.
The term is used as an adjective to describe concepts related to the context of program compilation, as opposed to concep ...
.
On the other hand, some programming languages provide more liberal array types, that allow indexing by arbitrary values, such as floating-point numbers
In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can ...
, strings
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* ''Strings'' (1991 film), a Canadian anim ...
, objects
Object may refer to:
General meanings
* Object (philosophy), a thing, being, or concept
** Object (abstract), an object which does not exist at any particular time or place
** Physical object, an identifiable collection of matter
* Goal, an ...
, references, etc.. Such index values cannot be restricted to an interval, much less a fixed interval. So, these languages usually allow arbitrary new elements to be created at any time. This choice precludes the implementation of array types as array data structures. That is, those languages use array-like syntax to implement a more general associative array
In computer science, an associative array, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In mathematical terms an ...
semantics, and must therefore be implemented by a hash table
In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', ...
or some other search data structure In computer science, a search data structure is any data structure that allows the efficient retrieval of specific items from a set of items, such as a specific record from a database.
The simplest, most general, and least efficient search struc ...
.
Language support
Multi-dimensional arrays
The number of indices needed to specify an element is called the ''dimension'', ''dimensionality'', or rank
Rank is the relative position, value, worth, complexity, power, importance, authority, level, etc. of a person or object within a ranking, such as:
Level or position in a hierarchical organization
* Academic rank
* Diplomatic rank
* Hierarchy
* ...
of the array type. (This nomenclature conflicts with the concept of dimension in linear algebra, where it is the number of elements. Thus, an array of numbers with 5 rows and 4 columns, hence 20 elements, is said to have dimension 2 in computing contexts, but represents a matrix with dimension 4-by-5 or 20 in mathematics. Also, the computer science meaning of "rank" is similar to its meaning in tensor algebra but not to the linear algebra concept of rank of a matrix
In linear algebra, the rank of a matrix is the dimension of the vector space generated (or spanned) by its columns. p. 48, § 1.16 This corresponds to the maximal number of linearly independent columns of . This, in turn, is identical to the dime ...
.)
Many languages support only one-dimensional arrays. In those languages, a multi-dimensional array is typically represented by an Iliffe vector
In computer programming, an Iliffe vector, also known as a display, is a data structure used to implement multi-dimensional arrays. An Iliffe vector for an ''n''-dimensional array (where ''n'' ≥ 2) consists of a vector (or 1-dimension ...
, a one-dimensional array of references to arrays of one dimension less. A two-dimensional array, in particular, would be implemented as a vector of pointers to its rows. Thus an element in row ''i'' and column ''j'' of an array ''A'' would be accessed by double indexing (''A'' 'i''''j''] in typical notation). This way of emulating multi-dimensional arrays allows the creation of jagged array
In computer science, a jagged array, also known as a ragged array, irregular array is an array of arrays of which the member arrays can be of different lengths, producing rows of jagged edges when visualized as output. In contrast, two-dimensio ...
s, where each row may have a different size — or, in general, where the valid range of each index depends on the values of all preceding indices.
This representation for multi-dimensional arrays is quite prevalent in C and C++ software. However, C and C++ will use a linear indexing formula for multi-dimensional arrays that are declared with compile time constant size, e.g. by int A 020]
or int A n]
, instead of the traditional int **A
.
Indexing notation
Most programming languages that support arrays support the ''store'' and ''select'' operations, and have special syntax for indexing. Early languages used parentheses, e.g. A(i,j)
, as in FORTRAN; others choose square brackets, e.g. A ,j/code> or A j]
, as in Algol 60 and Pascal (to distinguish from the use of parentheses for function call
In computer programming, a function or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed.
Functions may ...
s).
Index types
Array data types are most often implemented as array structures: with the indices restricted to integer (or totally ordered) values, index ranges fixed at array creation time, and multilinear element addressing. This was the case in most "third generation" languages, and is still the case of most systems programming language
A system programming language is a programming language used for system programming; such languages are designed for writing system software, which usually requires different development approaches when compared with application software. Edsger ...
s such as Ada
Ada may refer to:
Places
Africa
* Ada Foah, a town in Ghana
* Ada (Ghana parliament constituency)
* Ada, Osun, a town in Nigeria
Asia
* Ada, Urmia, a village in West Azerbaijan Province, Iran
* Ada, Karaman, a village in Karaman Province, ...
, C, and C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
. In some languages, however, array data types have the semantics of associative arrays, with indices of arbitrary type and dynamic element creation. This is the case in some scripting languages
A scripting language or script language is a programming language that is used to manipulate, customize, and automate the facilities of an existing system. Scripting languages are usually interpreted at runtime rather than compiled.
A scripting ...
such as Awk
AWK (''awk'') is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems.
The AWK lang ...
and Lua
Lua or LUA may refer to:
Science and technology
* Lua (programming language)
* Latvia University of Agriculture
* Last universal ancestor, in evolution
Ethnicity and language
* Lua people, of Laos
* Lawa people, of Thailand sometimes referred t ...
, and of some array types provided by standard C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
libraries.
Bounds checking
Some languages (like Pascal and Modula) perform bounds checking
In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as ...
on every access, raising an exception or aborting the program when any index is out of its valid range. Compilers may allow these checks to be turned off to trade safety for speed. Other languages (like FORTRAN and C) trust the programmer and perform no checks. Good compilers may also analyze the program to determine the range of possible values that the index may have, and this analysis may lead to bounds-checking elimination.
Index origin
Some languages, such as C, provide only zero-based array types, for which the minimum valid value for any index is 0. This choice is convenient for array implementation and address computations. With a language such as C, a pointer to the interior of any array can be defined that will symbolically act as a pseudo-array that accommodates negative indices. This works only because C does not check an index against bounds when used.
Other languages provide only ''one-based'' array types, where each index starts at 1; this is the traditional convention in mathematics for matrices and mathematical sequence
In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is calle ...
s. A few languages, such as Pascal and Lua, support ''n-based'' array types, whose minimum legal indices are chosen by the programmer. The relative merits of each choice have been the subject of heated debate. Zero-based indexing can avoid off-by-one or fencepost errors, Edsger W. Dijkstra,
Why numbering should start at zero
specifically a 0-based for(i=0;i<5;i+=1) iterates (5-0) times, whereas in the equivalent 1-based half-open range for(i=1;i<6;i+=1) the 6 is itself a potential such error, typically requiring length()+1, and the 1-based inclusive range for(i=1;i<=5;i+=1) iterates (5-1)+1 times.
Highest index
The relation between numbers appearing in an array declaration and the index of that array's last element also varies by language. In many languages (such as C), one should specify the number of elements contained in the array; whereas in others (such as Pascal and Visual Basic .NET
Visual Basic, originally called Visual Basic .NET (VB.NET), is a multi-paradigm, object-oriented programming language, implemented on .NET, Mono, and the .NET Framework. Microsoft launched VB.NET in 2002 as the successor to its original Visua ...
) one should specify the numeric value of the index of the last element. Needless to say, this distinction is immaterial in languages where the indices start at 1, such as Lua
Lua or LUA may refer to:
Science and technology
* Lua (programming language)
* Latvia University of Agriculture
* Last universal ancestor, in evolution
Ethnicity and language
* Lua people, of Laos
* Lawa people, of Thailand sometimes referred t ...
.
Array algebra
Some programming languages support array programming
In computer science, array programming refers to solutions which allow the application of operations to an entire set of values at once. Such solutions are commonly used in computational science, scientific and engineering settings.
Modern progr ...
, where operations and functions defined for certain data types are implicitly extended to arrays of elements of those types. Thus one can write ''A''+''B'' to add corresponding elements of two arrays ''A'' and ''B''. Usually these languages provide both the element-by-element multiplication and the standard matrix product
In mathematics, particularly in linear algebra, matrix multiplication is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the s ...
of linear algebra
Linear algebra is the branch of mathematics concerning linear equations such as:
:a_1x_1+\cdots +a_nx_n=b,
linear maps such as:
:(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n,
and their representations in vector spaces and through matrices ...
, and which of these is represented by the ''*'' operator varies by language.
Languages providing array programming capabilities have proliferated since the innovations in this area of APL. These are core capabilities of domain-specific languages such as
GAUSS
Johann Carl Friedrich Gauss (; german: Gauß ; la, Carolus Fridericus Gauss; 30 April 177723 February 1855) was a German mathematician and physicist who made significant contributions to many fields in mathematics and science. Sometimes refer ...
, IDL, Matlab
MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementa ...
, and Mathematica. They are a core facility in newer languages, such as Julia
Julia is usually a feminine given name. It is a Latinate feminine form of the name Julio and Julius. (For further details on etymology, see the Wiktionary entry "Julius".) The given name ''Julia'' had been in use throughout Late Antiquity (e.g ...
and recent versions of Fortran. These capabilities are also provided via standard extension libraries for other general purpose programming languages (such as the widely used NumPy library for Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
).
String types and arrays
Many languages provide a built-in string data type, with specialized notation ("string literal
A string literal or anonymous string is a string value in the source code of a computer program. Modern programming languages commonly use a quoted sequence of characters, formally " bracketed delimiters", as in x = "foo", where "foo" is a string ...
s") to build values of that type. In some languages (such as C), a string is just an array of characters, or is handled in much the same way. Other languages, like Pascal, may provide vastly different operations for strings and arrays.
Array index range queries
Some programming languages provide operations that return the size (number of elements) of a vector, or, more generally, range of each index of an array. In C and C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
arrays do not support the ''size'' function, so programmers often have to declare separate variable to hold the size, and pass it to procedures as a separate parameter.
Elements of a newly created array may have undefined values (as in C), or may be defined to have a specific "default" value such as 0 or a null pointer (as in Java).
In C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
a std::vector object supports the ''store'', ''select'', and ''append'' operations with the performance characteristics discussed above. Vectors can be queried for their size and can be resized. Slower operations like inserting an element in the middle are also supported.
Slicing
An array slicing
In computer programming, array slicing is an operation that extracts a subset of elements from an array and packages them as another array, possibly in a different dimension from the original.
Common examples of array slicing are extracting a su ...
operation takes a subset of the elements of an array-typed entity (value or variable) and then assembles them as another array-typed entity, possibly with other indices. If array types are implemented as array structures, many useful slicing operations (such as selecting a sub-array, swapping indices, or reversing the direction of the indices) can be performed very efficiently by manipulating the dope vector In computer programming, a dope vector is a data structure used to hold information about a data object, especially its memory layout.
Purpose
Dope vectors are most commonly used to describe arrays, which commonly store multiple instances of a pa ...
of the structure. The possible slicings depend on the implementation details: for example, FORTRAN allows slicing off one column of a matrix variable, but not a row, and treat it as a vector; whereas C allow slicing off a row from a matrix, but not a column.
On the other hand, other slicing operations are possible when array types are implemented in other ways.
Resizing
Some languages allow dynamic arrays (also called resizable, growable, or extensible): array variables whose index ranges may be expanded at any time after creation, without changing the values of its current elements.
For one-dimensional arrays, this facility may be provided as an operation "append
(''A'',''x'')" that increases the size of the array ''A'' by one and then sets the value of the last element to ''x''. Other array types (such as Pascal strings) provide a concatenation operator, which can be used together with slicing to achieve that effect and more. In some languages, assigning a value to an element of an array automatically extends the array, if necessary, to include that element. In other array types, a slice can be replaced by an array of different size, with subsequent elements being renumbered accordingly — as in Python's list assignment "''A'' :5= 0,20,30, that inserts three new elements (10,20, and 30) before element "''A'' . Resizable arrays are conceptually similar to lists, and the two concepts are synonymous in some languages.
An extensible array can be implemented as a fixed-size array, with a counter that records how many elements are actually in use. The append
operation merely increments the counter; until the whole array is used, when the append
operation may be defined to fail. This is an implementation of a dynamic array with a fixed capacity, as in the string
type of Pascal. Alternatively, the append
operation may re-allocate the underlying array with a larger size, and copy the old elements to the new area.
See also
* Array access analysis
* Array database management system
* Bounds-checking elimination
* Delimiter-separated values
Formats that use delimiter-separated values (also DSV)DSV stands for ''Delimiter Separated Values'' store two-dimensional arrays of data by separating the values in each row with specific delimiter characters. Most database and spreadsheet program ...
* Index checking
* Parallel array
* Sparse array
Sparse is a computer software tool designed to find possible coding faults in the Linux kernel. Unlike other such tools, this static analysis tool was initially designed to only flag constructs that were likely to be of interest to kernel deve ...
* Variable-length array In computer programming, a variable-length array (VLA), also called variable-sized or runtime-sized, is an array data structure whose length is determined at run time (instead of at compile time).
In C, the VLA is said to have a variably modified t ...
References
External links
NIST's Dictionary of Algorithms and Data Structures: Array
{{Data types
*
Data types
Composite data types