In computer science, a record (also called a structure, struct, or compound data) is a basic data structure. Records in a database or spreadsheet are usually called "rows".

A record is a collection of ''fields'', possibly of different data types, typically in a fixed number and sequence. The fields of a record may also be called ''members'', particularly in object-oriented programming; fields may also be called ''elements'', though this risks confusion with the elements of a collection.

For example, a date could be stored as a record containing a numeric year field, a month field represented as a string, and a numeric day-of-month field. A personnel record might contain a name, a salary, and a rank. A Circle record might contain a center and a radius; in this instance, the center itself might be represented as a point record containing x and y coordinates.

Records are distinguished from arrays by the fact that their number of fields is determined in the definition of the record, and by the fact that records are a heterogeneous data type: not all of the fields must contain the same type of data.

A ''record type'' is a data type that describes such values and variables. Most modern computer languages allow the programmer to define new record types. The definition includes specifying the data type of each field and an identifier (name or label) by which it can be accessed. In type theory, product types (with no field names) are generally preferred due to their simplicity, but proper record types are studied in languages such as System F-sub. Since type-theoretical records may contain first-class function-typed fields in addition to data, they can express many features of object-oriented programming.

Records can exist in any storage medium, including main memory and mass storage devices such as magnetic tapes or hard disks. Records are a fundamental component of most data structures, especially linked data structures. Many computer files are organized as arrays of logical records, often grouped into larger physical records or blocks for efficiency.

The parameters of a function or procedure can often be viewed as the fields of a record variable, and the arguments passed to that function can be viewed as a record value that gets assigned to that variable at the time of the call. Also, in the call stack that is often used to implement procedure calls, each entry is an ''activation record'' or ''call frame'', containing the procedure parameters and local variables, the return address, and other internal fields.

An object in an object-oriented language is essentially a record that contains procedures specialized to handle that record, and object types are an elaboration of record types. Indeed, in most object-oriented languages, records are just special cases of objects, and are known as plain old data structures (PODSs), to contrast with objects that use OO features.

A record can be viewed as the computer analog of a mathematical tuple, although a tuple may or may not be considered a record, and vice versa, depending on conventions and the specific programming language. In the same vein, a record type can be viewed as the computer language analog of the Cartesian product of two or more mathematical sets, or the implementation of an abstract product type in a specific language.
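
As a minimal sketch of the date and circle examples above, the records can be written as Python dataclasses. The class names, field names, and sample values are illustrative assumptions; other languages would use a struct or record declaration instead.

```python
from dataclasses import dataclass


@dataclass
class Date:
    year: int    # numeric year field
    month: str   # month represented as a string
    day: int     # numeric day-of-month field


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Circle:
    center: Point   # a field whose value is itself a record
    radius: float


moon_landing = Date(year=1969, month="July", day=20)
unit_circle = Circle(center=Point(0.0, 0.0), radius=1.0)

print(moon_landing.month)      # field selected by name; prints July
print(unit_circle.center.x)    # nested field selection; prints 0.0
```

Each record type fixes the number, names, and types of its fields in its definition, which is the property that separates it from an array.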


Keys

A record may have zero or more ''keys''. A key maps an expression to a value, or a set of values, in the record. A primary key is a key that is unique throughout all stored records; each value of this key occurs in at most one record. In other words, no duplicate may exist for any primary key. For example, an employee file might contain employee number, name, department, and salary. The employee number will be unique in the organization and would be the primary key. Depending on the storage medium and file organization, the employee number might be ''indexed'', that is, also stored in a separate file to make lookup faster. The department code is not necessarily unique; it may also be indexed, in which case it would be considered a ''secondary key'', or ''alternate key''. If it is not indexed, the entire employee file would have to be scanned to produce a listing of all employees in a specific department. Keys are usually chosen in a way that minimizes the chances of multiple values being feasibly mapped to by one key. For example, the salary field would not normally be considered usable as a key, since many employees will likely have the same salary. Indexing is one factor considered when designing a file.
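
The employee example can be sketched in Python, with in-memory dictionaries standing in for the separate index files. The record layout, the sample data, and the helper name `scan_department` are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    number: int       # primary key: unique per employee
    name: str
    department: str   # candidate secondary key: not unique
    salary: int


employees = [
    Employee(1001, "Ada", "ENG", 5000),
    Employee(1002, "Grace", "ENG", 5000),
    Employee(1003, "Edgar", "OPS", 4200),
]

# Primary index: one record per key value, duplicates rejected.
by_number = {}
for e in employees:
    if e.number in by_number:
        raise ValueError(f"duplicate primary key {e.number}")
    by_number[e.number] = e

# Secondary index: one key value may map to several records.
by_department = defaultdict(list)
for e in employees:
    by_department[e.department].append(e)


def scan_department(records, department):
    """Without an index, every record has to be examined."""
    return [e for e in records if e.department == department]


print(by_number[1002].name)                        # prints Grace
print([e.number for e in by_department["ENG"]])    # prints [1001, 1002]
```

Both the indexed lookup and the full scan return the same employees; the index only avoids touching every record.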


History

The concept of a record can be traced to various types of tables and ledgers used in accounting since remote times. The modern notion of records in computer science, with fields of well-defined type and size, was already implicit in 19th-century mechanical calculators, such as Babbage's Analytical Engine.

The original machine-readable medium used for data (as opposed to control) was the punch card used for records in the 1890 United States Census: each punch card was a single record. Compare the journal entry from 1880 and the punch card from 1895. Records were well established in the first half of the 20th century, when most data processing was done using punched cards. Typically, each record of a data file would be recorded in one punched card, with specific columns assigned to specific fields. Generally, a record was the smallest unit that could be read in from external storage (e.g. card reader, tape or disk). The contents of punchcard-style records were originally called "unit records" because punchcards had pre-determined document lengths. When storage systems became more advanced with the use of hard drives and magnetic tape, variable-length records became the standard. A variable-length record is a record in which the size of the record in bytes is approximately equal to the sum of the sizes of its fields. This was not possible before more advanced storage hardware was invented, because all of the punchcards had to conform to pre-determined document lengths that the computer could read, since at the time the cards had to be physically fed into a machine.

Most machine language implementations and early assembly languages did not have special syntax for records, but the concept was available (and extensively used) through the use of index registers, indirect addressing, and self-modifying code. Some early computers, such as the IBM 1620, had hardware support for delimiting records and fields, and special instructions for copying such records.

The concept of records and fields was central in some early file sorting and tabulating utilities, such as IBM's Report Program Generator (RPG).

COBOL was the first widespread programming language to support record types, and its record definition facilities were quite sophisticated at the time. The language allows for the definition of nested records with alphanumeric, integer, and fractional fields of arbitrary size and precision, as well as fields that automatically format any value assigned to them (e.g., insertion of currency signs, decimal points, and digit group separators). Each file is associated with a record variable where data is read into or written from. COBOL also provides a MOVE CORRESPONDING statement that assigns corresponding fields of two records according to their names.

The early languages developed for numeric computing, such as FORTRAN (up to FORTRAN IV) and Algol 60, did not have support for record types; but later versions of those languages, such as Fortran 90 and Algol 68, did add them. The original Lisp programming language too was lacking records (except for the built-in cons cell), but its S-expressions provided an adequate surrogate. The Pascal programming language was one of the first languages to fully integrate record types with other basic types into a logically consistent type system. The PL/I programming language provided for COBOL-style records. The C programming language initially provided the record concept as a kind of template (struct) that could be laid on top of a memory area, rather than a true record data type. The latter were provided eventually (by the typedef declaration), but the two concepts are still distinct in the language. Most languages designed after Pascal (such as Ada, Modula, and Java) also supported records.

Although records are not often used in their original context anymore (i.e. being used solely for the purpose of containing data), records influenced newer object-oriented programming languages and relational database management systems. Since records provided more modularity in the way data was stored and handled, they are better suited to representing complex, real-world concepts than the primitive data types provided by default in languages. This influenced later languages such as C++, Python, JavaScript, and Objective-C, which address the same modularity concerns of programmers. Objects in these languages are essentially records with the addition of methods and inheritance, which allow programmers to manipulate the way data behaves instead of only the contents of a record. Many programmers regard records as obsolete now since object-oriented languages have features that far surpass what records are capable of. On the other hand, many programmers argue that the low overhead and ability to use records in assembly language make records still relevant when programming with low levels of abstraction. Today, the most popular languages on the TIOBE index, an indicator of the popularity of programming languages, have been influenced in some way by records due to the fact that they are object oriented. Query languages such as SQL and Object Query Language were also influenced by the concept of records. These languages allow the programmer to store sets of data, which are essentially records, in tables. This data can then be retrieved using a primary key. The tables themselves are also records which may have a foreign key: a key that references data in another table.


Operations

A programming language that supports record types usually provides some or all of the following operations:

* Declaration of a new record type, including the position, type, and (possibly) name of each field;
* Declaration of variables and values as having a given record type;
* Construction of a record value from given field values and (sometimes) with given field names;
* Selection of a field of a record with an explicit name;
* Assignment of a record value to a record variable;
* Comparison of two records for equality;
* Computation of a standard hash value for the record.

The selection of a field from a record value yields a value.

Some languages may provide facilities that enumerate all fields of a record, or at least the fields that are references. This facility is needed to implement certain services such as debuggers, garbage collectors, and serialization. It requires some degree of type polymorphism.

In systems with record subtyping, operations on values of record type may also include:

* Adding a new field to a record, setting the value of the new field.
* Removing a field from a record.

In such settings, a specific record type implies that a specific set of fields are present, but values of that type may contain additional fields. A record with fields ''x'', ''y'', and ''z'' would thus belong to the type of records with fields ''x'' and ''y'', as would a record with fields ''x'', ''y'', and ''r''. The rationale is that passing an (''x'',''y'',''z'') record to a function that expects an (''x'',''y'') record as argument should work, since that function will find all the fields it requires within the record. Many ways of practically implementing records in programming languages would have trouble with allowing such variability, but the matter is a central characteristic of record types in more theoretical contexts.
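
The operations listed above can be illustrated with Python dataclasses. This is a sketch under assumptions: the type names `Point2` and `Point3` and the function `manhattan` are invented for the example, and the subtyping shown relies on Python's duck typing rather than on a statically checked record subtype.

```python
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)          # declaration of a record type
class Point2:
    x: int
    y: int


@dataclass(frozen=True)
class Point3:
    x: int
    y: int
    z: int


p = Point2(x=1, y=2)             # construction from named field values
q = p                            # assignment of a record value to a variable
px = p.x                         # selection of a field by name
equal = (p == Point2(1, 2))      # equality compares the fields pairwise
digest = hash(p)                 # hash value derived from the field values
names = [f.name for f in fields(p)]   # enumeration of all fields

# "Adding a field": build a wider record from a narrower one.
p3 = Point3(**asdict(p), z=5)


def manhattan(pt):
    """Uses only fields x and y, so any record that has them is accepted."""
    return abs(pt.x) + abs(pt.y)


print(names)                          # prints ['x', 'y']
print(manhattan(p), manhattan(p3))    # prints 3 3
```

Passing the three-field record to `manhattan` works because the function finds the two fields it needs, which mirrors the rationale given for record subtyping.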


Assignment and comparison

Most languages allow assignment between records that have exactly the same record type (including same field types and names, in the same order). Depending on the language, however, two record data types defined separately may be regarded as distinct types even if they have exactly the same fields.

Some languages may also allow assignment between records whose fields have different names, matching each field value with the corresponding field variable by their positions within the record; so that, for example, a complex number with fields called real and imag can be assigned to a 2D point record variable with fields X and Y. In this alternative, the two operands are still required to have the same sequence of field types. Some languages may also require that corresponding types have the same size and encoding as well, so that the whole record can be assigned as an uninterpreted bit string. Other languages may be more flexible in this regard, and require only that each value field can be legally assigned to the corresponding variable field; so that, for example, a short integer field can be assigned to a long integer field, or vice versa.

Other languages (such as COBOL) may match fields and values by their names, rather than positions.

These same possibilities apply to the comparison of two record values for equality. Some languages may also allow order comparisons ('<' and '>'), using the lexicographic order based on the comparison of individual fields.

PL/I allows both of the preceding types of assignment, and also allows ''structure expressions'', such as a = a+1; where "a" is a record, or structure in PL/I terminology.
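
The matching rules described in this section can be sketched in Python. The record types below and the helper `move_corresponding` are assumptions for illustration; the helper is only loosely modeled on COBOL's MOVE CORRESPONDING and is not a faithful reimplementation.

```python
from dataclasses import astuple, dataclass, fields


@dataclass(order=True)
class Complex:
    real: float
    imag: float


@dataclass(order=True)
class Point:
    X: float
    Y: float


@dataclass
class Employee:
    name: str
    salary: int
    rank: str


@dataclass
class Badge:
    name: str
    rank: str


def move_corresponding(src, dst):
    """Copy each field of src whose name also exists in dst."""
    dst_names = {f.name for f in fields(dst)}
    for f in fields(src):
        if f.name in dst_names:
            setattr(dst, f.name, getattr(src, f.name))


c = Complex(3.0, 4.0)
p = Point(*astuple(c))            # positional matching: real -> X, imag -> Y
distinct = (c != p)               # separately defined types stay distinct

badge = Badge(name="", rank="")
move_corresponding(Employee("Ada", 5000, "Engineer"), badge)   # match by name

# order=True compares field by field, i.e. lexicographically.
first_field_decides = Point(1.0, 9.0) < Point(2.0, 0.0)
tie_broken_by_second = Point(1.0, 2.0) < Point(1.0, 3.0)

print(p, badge)
```

Here `Complex` and `Point` have the same sequence of field types, so the positional copy is legal, yet the two values never compare equal because their types are distinct.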


Algol 68's distributive field selection

In Algol 68, if Pts was an array of records, each with integer fields X and Y, one could write Y of Pts to obtain an array of integers, consisting of the Y fields of all the elements of Pts. As a result, the statements Y of Pts[3] := 7 and (Y of Pts)[3] := 7 would have the same effect.
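
A rough Python analogue of selecting one field across an array of records is a list comprehension. The names `Pt`, `pts`, and `ys` are illustrative assumptions. Unlike the Algol 68 selection, the list built here is a copy, so a field is updated through the record itself.

```python
from dataclasses import dataclass


@dataclass
class Pt:
    X: int
    Y: int


pts = [Pt(1, 10), Pt(2, 20), Pt(3, 30)]

# Select the Y field of every element, yielding a list of integers.
ys = [pt.Y for pt in pts]

# Update the Y field of the third element (index 2, since Python counts from 0).
pts[2].Y = 7
ys_after = [pt.Y for pt in pts]

print(ys, ys_after)   # prints [10, 20, 30] [10, 20, 7]
```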


Pascal's "with" statement

In the Pascal programming language, the command with R do S would execute the command sequence S as if all the fields of record R had been declared as variables. Similarly to entering a different namespace in an object-oriented language like C#, it is no longer necessary to use the record name as a prefix to access the fields. So, instead of writing Pt.X := 5; Pt.Y := Pt.X + 3 one could write with Pt do begin X := 5; Y := X + 3 end.


Representation in memory

The representation of records in memory varies depending on the programming language. Usually the fields are stored in consecutive positions in memory, in the same order as they are declared in the record type. This may result in two or more fields stored into the same word of memory; indeed, this feature is often used in systems programming to access specific bits of a word. On the other hand, most compilers will add padding fields, mostly invisible to the programmer, in order to comply with alignment constraints imposed by the machine, for example that a floating-point field must occupy a single word. Some languages may implement a record as an array of addresses pointing to the fields (and, possibly, to their names and/or types). Objects in object-oriented languages are often implemented in rather complicated ways, especially in languages that allow multiple class inheritance.
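
Field order, padding, and fields sharing one word can be observed from Python through the standard ctypes module, which lays structures out the way the platform's C compiler would. The structure names and fields below are assumptions for illustration, and the exact offsets and amount of padding depend on the platform.

```python
import ctypes


class Sample(ctypes.Structure):
    # Fields are laid out in declaration order.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("value", ctypes.c_double),
        ("count", ctypes.c_uint16),
    ]


class Flags(ctypes.Structure):
    # Three bit fields that share one 32-bit storage unit.
    _fields_ = [
        ("ready", ctypes.c_uint32, 1),
        ("mode", ctypes.c_uint32, 3),
        ("length", ctypes.c_uint32, 12),
    ]


# Bytes occupied by the declared fields versus the size of the whole record.
payload = sum(getattr(Sample, name).size for name, _ in Sample._fields_)
padding = ctypes.sizeof(Sample) - payload

for name, _ in Sample._fields_:
    field = getattr(Sample, name)
    print(name, "offset", field.offset, "size", field.size)
print("payload bytes:", payload, "padding bytes:", padding)

flags = Flags(ready=1, mode=5, length=100)
print("Flags size:", ctypes.sizeof(Flags), "mode:", flags.mode)
```

On typical platforms the double is placed at an offset that is a multiple of its alignment, so invisible padding bytes follow the one-byte flag field.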


Self-defining records

A ''self-defining record'' is a type of record which contains information to identify the record type and to locate information within the record. It may contain the offsets of elements; the elements can therefore be stored in any order or may be omitted. The information stored in a self-defining record can be interpreted as metadata for the record, similar to the metadata that UNIX keeps about a file, containing information such as the record's creation time and the size of the record in bytes. Alternatively, various elements of the record, each including an element identifier, can simply follow one another in any order.
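
The second layout, in which each element carries its own identifier, can be sketched with Python's struct module. The three-byte header format (a one-byte identifier and a two-byte length), the identifier constants, and the function names are assumptions chosen for this example and do not correspond to any particular file format.

```python
import struct

# Hypothetical element header: 1-byte identifier, 2-byte payload length.
HEADER = struct.Struct(">BH")

NAME, SALARY, RANK = 1, 2, 3   # illustrative element identifiers


def encode(elements):
    """Serialize (identifier, payload) pairs one after another."""
    out = bytearray()
    for element_id, payload in elements:
        out += HEADER.pack(element_id, len(payload))
        out += payload
    return bytes(out)


def decode(blob):
    """Walk the record, using each header to locate the next element."""
    found = {}
    pos = 0
    while pos < len(blob):
        element_id, length = HEADER.unpack_from(blob, pos)
        pos += HEADER.size
        found[element_id] = blob[pos:pos + length]
        pos += length
    return found


salary = struct.pack(">I", 5000)
first = encode([(NAME, b"Ada"), (SALARY, salary)])
second = encode([(SALARY, salary), (NAME, b"Ada")])   # same elements, other order

print(decode(first) == decode(second))   # prints True
print(RANK in decode(first))             # prints False: the element was omitted
```

Because every element names itself, the reader does not depend on element order and can tell when an element is absent.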


See also

* Block (data storage)
* Composite data type
* Data hierarchy
* Object composition
* Passive data structure
* Union type

