In
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, a data structure is a
data
Data ( , ) are a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted for ...
organization and storage format that is usually chosen for
efficient access
Access may refer to:
Companies and organizations
* ACCESS (Australia), an Australian youth network
* Access (credit card), a former credit card in the United Kingdom
* Access Co., a Japanese software company
* Access International Advisors, a hed ...
to data. More precisely, a data structure is a collection of data values, the relationships among them, and the
functions or
operations that can be applied to the data, i.e., it is an
algebraic structure about
data
Data ( , ) are a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted for ...
.
Usage
Data structures serve as the basis for
abstract data types (ADT). The ADT defines the logical form of the data type. The data structure implements the physical form of the
data type.
Different types of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example,
relational databases commonly use
B-tree indexes for data retrieval, while
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
implementations usually use
hash tables to look up
identifiers.
Data structures provide a means to manage large amounts of data efficiently for uses such as large
database
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
s and internet indexing services. Usually, efficient data structures are key to designing efficient
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
s. Some formal design methods and
programming language
A programming language is a system of notation for writing computer programs.
Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
s emphasize data structures, rather than algorithms, as the key organizing factor in software design. Data structures can be used to organize the storage and retrieval of information stored in both
main memory and
secondary memory.
Implementation
Data structures can be implemented using a variety of programming languages and techniques, but they all share the common goal of efficiently organizing and storing data. Data structures are generally based on the ability of a
computer
A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), carry out sequences of arithmetic or logical operations (''computation''). Modern digital electronic computers can perform generic set ...
to fetch and store data at any place in its memory, specified by a
pointer—a
bit string, representing a
memory address, that can be itself stored in memory and manipulated by the program. Thus, the
array and
record data structures are based on computing the addresses of data items with
arithmetic operations
Arithmetic is an elementary branch of mathematics that deals with numerical operations like addition, subtraction, multiplication, and Division (mathematics), division. In a wider sense, it also includes exponentiation, extraction of nth root, ...
, while the
linked data structures are based on storing addresses of data items within the structure itself. This approach to data structuring has profound implications for the efficiency and scalability of algorithms. For instance, the contiguous memory allocation in arrays facilitates rapid access and modification operations, leading to optimized performance in sequential data processing scenarios.
The implementation of a data structure usually requires writing a set of
procedures that create and manipulate instances of that structure. The efficiency of a data structure cannot be analyzed separately from those operations. This observation motivates the theoretical concept of an
abstract data type, a data structure that is defined indirectly by the operations that may be performed on it, and the mathematical properties of those operations (including their space and time cost).
Examples

There are numerous types of data structures, generally built upon simpler
primitive data types. Well known examples are:
* An
array is a number of elements in a specific order, typically all of the same type (depending on the language, individual elements may either all be forced to be the same type, or may be of almost any type). Elements are accessed using an integer index to specify which element is required. Typical implementations allocate contiguous memory words for the elements of arrays (but this is not always a necessity). Arrays may be fixed-length or resizable.
* A
linked list
In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...
(also just called ''list'') is a linear collection of data elements of any type, called nodes, where each node has itself a value, and points to the next node in the linked list. The principal advantage of a linked list over an array is that values can always be efficiently inserted and removed without relocating the rest of the list. Certain other operations, such as
random access to a certain element, are however slower on lists than on arrays.
* A
record (also called ''tuple'' or ''struct'') is an
aggregate data structure. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called ''fields'' or ''members''. In the context of
object-oriented programming
Object-oriented programming (OOP) is a programming paradigm based on the concept of '' objects''. Objects can contain data (called fields, attributes or properties) and have actions they can perform (called procedures or methods and impl ...
, records are known as
plain old data structures to distinguish them from objects.
*
Hash tables, also known as hash maps, are data structures that provide fast retrieval of values based on keys. They use a hashing function to map keys to indexes in an array, allowing for constant-time access in the average case. Hash tables are commonly used in dictionaries, caches, and database indexing. However, hash collisions can occur, which can impact their performance. Techniques like chaining and open addressing are employed to handle collisions.
*
Graphs are collections of nodes connected by edges, representing relationships between entities. Graphs can be used to model social networks, computer networks, and transportation networks, among other things. They consist of vertices (nodes) and edges (connections between nodes). Graphs can be directed or undirected, and they can have cycles or be acyclic. Graph traversal algorithms include breadth-first search and depth-first search.
*
Stacks and
queues are abstract data types that can be implemented using arrays or linked lists. A stack has two primary operations: push (adds an element to the top of the stack) and pop (removes the topmost element from the stack), that follow the Last In, First Out (LIFO) principle. Queues have two main operations: enqueue (adds an element to the rear of the queue) and dequeue (removes an element from the front of the queue) that follow the First In, First Out (FIFO) principle.
*
Trees
In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, e.g., including only woody plants with secondary growth, only p ...
represent a hierarchical organization of elements. A tree consists of nodes connected by edges, with one node being the root and all other nodes forming subtrees. Trees are widely used in various algorithms and data storage scenarios.
Binary tree
In computer science, a binary tree is a tree data structure in which each node has at most two children, referred to as the ''left child'' and the ''right child''. That is, it is a ''k''-ary tree with . A recursive definition using set theor ...
s (particularly
heaps),
AVL trees, and
B-trees are some popular types of trees. They enable efficient and optimal searching, sorting, and hierarchical representation of data.
A
trie
In computer science, a trie (, ), also known as a digital tree or prefix tree, is a specialized search tree data structure used to store and retrieve strings from a dictionary or set. Unlike a binary search tree, nodes in a trie do not store t ...
, or prefix tree, is a special type of tree used to efficiently retrieve strings. In a trie, each node represents a character of a string, and the edges between nodes represent the characters that connect them. This structure is especially useful for tasks like autocomplete, spell-checking, and creating dictionaries. Tries allow for quick searches and operations based on string prefixes.
Language support
Most
assembly language
In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
s and some
low-level languages, such as
BCPL
BCPL ("Basic Combined Programming Language") is a procedural, imperative, and structured programming language. Originally intended for writing compilers for other languages, BCPL is no longer in common use. However, its influence is still f ...
(Basic Combined Programming Language), lack built-in support for data structures. On the other hand, many
high-level programming language
A high-level programming language is a programming language with strong Abstraction (computer science), abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be ea ...
s and some higher-level assembly languages, such as
MASM, have special syntax or other built-in support for certain data structures, such as records and arrays. For example, the
C (a direct descendant of BCPL) and
Pascal languages support
structs and records, respectively, in addition to vectors (one-dimensional
arrays
An array is a systematic arrangement of similar objects, usually in rows and columns.
Things called an array include:
{{TOC right
Music
* In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
) and multi-dimensional arrays.
Most programming languages feature some sort of
library
A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
mechanism that allows data structure implementations to be reused by different programs. Modern languages usually come with standard libraries that implement the most common data structures. Examples are the
C++ Standard Template Library
The Standard Template Library (STL) is a software library originally designed by Alexander Stepanov for the C++ programming language that influenced many parts of the C++ Standard Library. It provides four components called ''algorithms'', '' ...
, the
Java Collections Framework
The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures.
Although referred to as a framework, it works in a manner of a library. The collections framework provides both inte ...
, and the
Microsoft
Microsoft Corporation is an American multinational corporation and technology company, technology conglomerate headquartered in Redmond, Washington. Founded in 1975, the company became influential in the History of personal computers#The ear ...
.NET Framework.
Modern languages also generally support
modular programming
Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect or "concern" of the d ...
, the separation between the
interface of a library module and its implementation. Some provide
opaque data types that allow clients to hide implementation details.
Object-oriented programming language
Object-oriented programming (OOP) is a programming paradigm based on the concept of '' objects''. Objects can contain data (called fields, attributes or properties) and have actions they can perform (called procedures or methods and impleme ...
s, such as
C++,
Java
Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
, and
Smalltalk
Smalltalk is a purely object oriented programming language (OOP) that was originally created in the 1970s for educational use, specifically for constructionist learning, but later found use in business. It was created at Xerox PARC by Learni ...
, typically use
classes for this purpose.
Many known data structures have
concurrent versions which allow multiple computing threads to access a single concrete instance of a data structure simultaneously.
See also
*
Abstract data type
In computer science, an abstract data type (ADT) is a mathematical model for data types, defined by its behavior (semantics) from the point of view of a '' user'' of the data, specifically in terms of possible values, possible operations on data ...
*
Concurrent data structure
*
Data model
A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be ...
*
Dynamization
*
Linked data structure
In computer science, a linked data structure is a data structure which consists of a set of data records ('' nodes'') linked together and organized by references (''links'' or '' pointers''). The link between data can also be called a connector. ...
*
List of data structures
This is a list of well-known data structures. For a wider list of terms, see list of terms relating to algorithms and data structures. For a comparison of running times for a subset of this list see comparison of data structures.
Data types
...
*
Persistent data structure
In computing, a persistent data structure or not ephemeral data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (v ...
*
Plain old data structure
In computer science and object-oriented programming, a passive data structure (PDS), also termed a plain old data structure or plain old data (POD), is a record, in contrast with objects. It is a data structure that is represented only as passive ...
*
Queap
*
Succinct data structure
*
Tree (data structure)
In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be co ...
References
Bibliography
* Peter Brass, ''Advanced Data Structures'',
Cambridge University Press
Cambridge University Press was the university press of the University of Cambridge. Granted a letters patent by King Henry VIII in 1534, it was the oldest university press in the world. Cambridge University Press merged with Cambridge Assessme ...
, 2008,
*
Donald Knuth
Donald Ervin Knuth ( ; born January 10, 1938) is an American computer scientist and mathematician. He is a professor emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the Nobel Prize of comp ...
, ''
The Art of Computer Programming
''The Art of Computer Programming'' (''TAOCP'') is a comprehensive multi-volume monograph written by the computer scientist Donald Knuth presenting programming algorithms and their analysis. it consists of published volumes 1, 2, 3, 4A, and 4 ...
'', vol. 1.
Addison-Wesley
Addison–Wesley is an American publisher of textbooks and computer literature. It is an imprint of Pearson plc, a global publishing and education company. In addition to publishing books, Addison–Wesley also distributes its technical titles ...
, 3rd edition, 1997,
* Dinesh Mehta and
Sartaj Sahni
Professor Sartaj Kumar Sahni (born July 22, 1949, in Pune, India) is a computer scientist based in the United States, and is one of the pioneers in the field of data structures. He is a distinguished professor in the Department of Computer and I ...
, ''Handbook of Data Structures and Applications'',
Chapman and Hall/
CRC Press
The CRC Press, LLC is an American publishing group that specializes in producing technical books. Many of their books relate to engineering, science and mathematics. Their scope also includes books on business, forensics and information technol ...
, 2004,
*
Niklaus Wirth
Niklaus Emil Wirth ( IPA: ) (15 February 1934 – 1 January 2024) was a Swiss computer scientist. He designed several programming languages, including Pascal, and pioneered several classic topics in software engineering. In 1984, he won the Tu ...
, ''Algorithms and Data Structures'',
Prentice Hall
Prentice Hall was a major American publishing#Textbook_publishing, educational publisher. It published print and digital content for the 6–12 and higher-education market. It was an independent company throughout the bulk of the twentieth cen ...
, 1985,
Further reading
Open Data Structures by Pat Morin*
G. H. Gonnet and
R. Baeza-Yates,
Handbook of Algorithms and Data Structures - in Pascal and C', second edition, Addison-Wesley, 1991,
*
Ellis Horowitz and Sartaj Sahni, ''Fundamentals of Data Structures in Pascal'',
Computer Science Press, 1984,
External links
Descriptionsfrom the
Dictionary of Algorithms and Data Structures
Data structures courseAn Examination of Data Structures from .NET perspectiveSchaffer, C. ''Data Structures and Algorithm Analysis''
{{DEFAULTSORT:Data Structure