Data Structure
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
, a data structure is a
data In the pursuit of knowledge, data (; ) is a collection of discrete Value_(semiotics), values that convey information, describing quantity, qualitative property, quality, fact, statistics, other basic units of meaning, or simply sequences of sy ...
organization, management, and storage format that is usually chosen for efficient
access Access may refer to: Companies and organizations * ACCESS (Australia), an Australian youth network * Access (credit card), a former credit card in the United Kingdom * Access Co., a Japanese software company * Access Healthcare, an Indian BPO se ...
to data. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data, i.e., it is an algebraic structure about
data In the pursuit of knowledge, data (; ) is a collection of discrete Value_(semiotics), values that convey information, describing quantity, qualitative property, quality, fact, statistics, other basic units of meaning, or simply sequences of sy ...
.


Usage

Data structures serve as the basis for
abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
s (ADT). The ADT defines the logical form of the data type. The data structure implements the physical form of the data type. Different types of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example,
relational databases A relational database is a (most commonly digital) database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relatio ...
commonly use
B-tree In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for ...
indexes for data retrieval, while
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
implementations usually use
hash table In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', ...
s to look up identifiers. Data structures provide a means to manage large amounts of data efficiently for uses such as large
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases s ...
s and internet indexing services. Usually, efficient data structures are key to designing efficient
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
s. Some formal design methods and
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s emphasize data structures, rather than algorithms, as the key organizing factor in software design. Data structures can be used to organize the storage and retrieval of information stored in both main memory and secondary memory.


Implementation

Data structures are generally based on the ability of a computer to fetch and store data at any place in its memory, specified by a pointer—a
bit The bit is the most basic unit of information in computing and digital communications. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represente ...
string, representing a
memory address In computing, a memory address is a reference to a specific memory location used at various levels by software and hardware. Memory addresses are fixed-length sequences of digits conventionally displayed and manipulated as unsigned integers. ...
, that can be itself stored in memory and manipulated by the program. Thus, the
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
and record data structures are based on computing the addresses of data items with
arithmetic operations Arithmetic () is an elementary part of mathematics that consists of the study of the properties of the traditional operations on numbers—addition, subtraction, multiplication, division, exponentiation, and extraction of roots. In the 19th ce ...
, while the
linked data structure In computer science, a linked data structure is a data structure which consists of a set of data records ('' nodes'') linked together and organized by references (''links'' or '' pointers''). The link between data can also be called a connector. I ...
s are based on storing addresses of data items within the structure itself. The implementation of a data structure usually requires writing a set of
procedures Procedure may refer to: * Medical procedure * Instructions or recipes, a set of commands that show how to achieve some result, such as to prepare or make something * Procedure (business), specifying parts of a business process * Standard opera ...
that create and manipulate instances of that structure. The efficiency of a data structure cannot be analyzed separately from those operations. This observation motivates the theoretical concept of an
abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
, a data structure that is defined indirectly by the operations that may be performed on it, and the mathematical properties of those operations (including their space and time cost).


Examples

There are numerous types of data structures, generally built upon simpler
primitive data type In computer science, primitive data types are a set of basic data types from which all other data types are constructed. Specifically it often refers to the limited set of data representations in use by a particular processor, which all compiled pr ...
s. Well known examples are: * An ''array'' is a number of elements in a specific order, typically all of the same type (depending on the language, individual elements may either all be forced to be the same type, or may be of almost any type). Elements are accessed using an integer index to specify which element is required. Typical implementations allocate contiguous memory words for the elements of arrays (but this is not always a necessity). Arrays may be fixed-length or resizable. * A '' linked list'' (also just called ''list'') is a linear collection of data elements of any type, called nodes, where each node has itself a value, and points to the next node in the linked list. The principal advantage of a linked list over an array is that values can always be efficiently inserted and removed without relocating the rest of the list. Certain other operations, such as random access to a certain element, are however slower on lists than on arrays. * A ''record'' (also called ''tuple'' or ''struct'') is an
aggregate data Aggregate data is high-level data which is acquired by combining individual-level data. For instance, the output of an industry is an aggregate of the firms’ individual outputs within that industry. Aggregate data are applied in statistics, d ...
structure. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called ''fields'' or ''members''. In the context of
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
, records are known as
plain old data structure In computer science and object-oriented programming, a passive data structure (PDS, also termed a plain old data structure, or plain old data, POD) is a term for a record, to contrast with objects. It is a data structure that is represented only ...
s to distinguish them from objects. * ''
Hash table In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', ...
s'', ''graphs'' and '' binary trees''.


Language support

Most assembly languages and some low-level languages, such as BCPL (Basic Combined Programming Language), lack built-in support for data structures. On the other hand, many
high-level programming language In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to us ...
s and some higher-level assembly languages, such as
MASM The Microsoft Macro Assembler (MASM) is an x86 assembler that uses the Intel syntax for MS-DOS and Microsoft Windows. Beginning with MASM 8.0, there are two versions of the assembler: One for 16-bit & 32-bit assembly sources, and another (ML64) ...
, have special syntax or other built-in support for certain data structures, such as records and arrays. For example, the C (a direct descendant of BCPL) and Pascal languages support structs and records, respectively, in addition to vectors (one-dimensional
arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
) and multi-dimensional arrays. Most programming languages feature some sort of
library A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vir ...
mechanism that allows data structure implementations to be reused by different programs. Modern languages usually come with standard libraries that implement the most common data structures. Examples are the
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
Standard Template Library The Standard Template Library (STL) is a software library originally designed by Alexander Stepanov for the C++ programming language that influenced many parts of the C++ Standard Library. It provides four components called ''algorithms'', '' ...
, the
Java Collections Framework The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures. Although referred to as a framework, it works in a manner of a library. The collections framework provides both interf ...
, and the
Microsoft Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washin ...
.NET Framework The .NET Framework (pronounced as "''dot net"'') is a proprietary software framework developed by Microsoft that runs primarily on Microsoft Windows. It was the predominant implementation of the Common Language Infrastructure (CLI) until bein ...
. Modern languages also generally support
modular programming Modular programming is a software design technique that emphasizes separating the functionality of a Computer program, program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of th ...
, the separation between the
interface Interface or interfacing may refer to: Academic journals * ''Interface'' (journal), by the Electrochemical Society * '' Interface, Journal of Applied Linguistics'', now merged with ''ITL International Journal of Applied Linguistics'' * '' Int ...
of a library module and its implementation. Some provide
opaque data type In computer science, an opaque data type is a data type whose concrete data structure is not defined in an interface. This enforces information hiding, since its values can only be manipulated by calling subroutines that have access to the missing ...
s that allow clients to hide implementation details.
Object-oriented programming language Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
s, such as
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
,
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
, and Smalltalk, typically use classes for this purpose. Many known data structures have
concurrent Concurrent means happening at the same time. Concurrency, concurrent, or concurrence may refer to: Law * Concurrence, in jurisprudence, the need to prove both ''actus reus'' and ''mens rea'' * Concurring opinion (also called a "concurrence"), a ...
versions which allow multiple computing threads to access a single concrete instance of a data structure simultaneously.


See also

*
Abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
*
Concurrent data structure In computer science, a concurrent data structure is a particular way of storing and organizing data for access by multiple computing threads (or processes) on a computer. Historically, such data structures were used on uniprocessor machines w ...
*
Data model A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be c ...
*
Dynamization In computer science, dynamization is the process of transforming a static data structure into a dynamic one. Although static data structures may provide very good functionality and fast queries, their utility is limited because of their inability t ...
*
Linked data structure In computer science, a linked data structure is a data structure which consists of a set of data records ('' nodes'') linked together and organized by references (''links'' or '' pointers''). The link between data can also be called a connector. I ...
*
List of data structures This is a list of well-known data structures. For a wider list of terms, see list of terms relating to algorithms and data structures. For a comparison of running times for a subset of this list see comparison of data structures. Data types ...
*
Persistent data structure In computing, a persistent data structure or not ephemeral data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (v ...
*
Plain old data structure In computer science and object-oriented programming, a passive data structure (PDS, also termed a plain old data structure, or plain old data, POD) is a term for a record, to contrast with objects. It is a data structure that is represented only ...
* Queap *
Succinct data structure In computer science, a succinct data structure is a data structure which uses an amount of space that is "close" to the information-theoretic lower bound, but (unlike other compressed representations) still allows for efficient query operations. Th ...
*
Tree (data structure) In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be c ...


References


Bibliography

* Peter Brass, ''Advanced Data Structures'',
Cambridge University Press Cambridge University Press is the university press of the University of Cambridge. Granted letters patent by Henry VIII of England, King Henry VIII in 1534, it is the oldest university press A university press is an academic publishing hou ...
, 2008, *
Donald Knuth Donald Ervin Knuth ( ; born January 10, 1938) is an American computer scientist, mathematician, and professor emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the Nobel Prize of computer sc ...
, ''
The Art of Computer Programming ''The Art of Computer Programming'' (''TAOCP'') is a comprehensive monograph written by the computer scientist Donald Knuth presenting programming algorithms and their analysis. Volumes 1–5 are intended to represent the central core of com ...
'', vol. 1.
Addison-Wesley Addison-Wesley is an American publisher of textbooks and computer literature. It is an imprint of Pearson PLC, a global publishing and education company. In addition to publishing books, Addison-Wesley also distributes its technical titles throu ...
, 3rd edition, 1997, * Dinesh Mehta and
Sartaj Sahni Professor Sartaj Kumar Sahni (born July 22, 1949, in Pune, India) is a computer scientist based in the United States, and is one of the pioneers in the field of data structures. He is a distinguished professor in the Department of Computer and I ...
, ''Handbook of Data Structures and Applications'',
Chapman and Hall Chapman & Hall is an imprint owned by CRC Press, originally founded as a British publishing house in London in the first half of the 19th century by Edward Chapman and William Hall. Chapman & Hall were publishers for Charles Dickens (from 1840 ...
/
CRC Press The CRC Press, LLC is an American publishing group that specializes in producing technical books. Many of their books relate to engineering, science and mathematics. Their scope also includes books on business, forensics and information tec ...
, 2004, *
Niklaus Wirth Niklaus Emil Wirth (born 15 February 1934) is a Swiss computer scientist. He has designed several programming languages, including Pascal, and pioneered several classic topics in software engineering. In 1984, he won the Turing Award, generally ...
, ''Algorithms and Data Structures'', Prentice Hall, 1985,


Further reading

*
Alfred Aho Alfred Vaino Aho (born August 9, 1941) is a Canadian computer scientist best known for his work on programming languages, compilers, and related algorithms, and his textbooks on the art and science of computer programming. Aho was elected into ...
,
John Hopcroft John Edward Hopcroft (born October 7, 1939) is an American theoretical computer scientist. His textbooks on theory of computation (also known as the Cinderella book) and data structures are regarded as standards in their fields. He is the IBM P ...
, and
Jeffrey Ullman Jeffrey David Ullman (born November 22, 1942) is an American computer scientist and the Stanford W. Ascherman Professor of Engineering, Emeritus, at Stanford University. His textbooks on compilers (various editions are popularly known as the d ...
, ''Data Structures and Algorithms'', Addison-Wesley, 1983, * G. H. Gonnet and R. Baeza-Yates,
Handbook of Algorithms and Data Structures - in Pascal and C
', second edition, Addison-Wesley, 1991, *
Ellis Horowitz Ellis Horowitz is an American computer scientist and Professor of Computer Science and Electrical Engineering at the University of Southern California (USC). Horowitz is best known for his computer science textbooks on data structures and algor ...
and Sartaj Sahni, ''Fundamentals of Data Structures in Pascal'',
Computer Science Press W. H. Freeman and Company is an imprint of Macmillan Higher Education, a division of Macmillan Publishers. Macmillan publishes monographs and textbooks for the sciences under the imprint. History The company was founded in 1946 by William H. ...
, 1984,


External links


Descriptions
from the
Dictionary of Algorithms and Data Structures The NIST ''Dictionary of Algorithms and Data Structures'' is a reference work maintained by the U.S. National Institute of Standards and Technology. It defines a large number of terms relating to algorithms and data structures. For algorithms and ...

Data structures course

An Examination of Data Structures from .NET perspective

Schaffer, C. ''Data Structures and Algorithm Analysis''
{{DEFAULTSORT:Data Structure