HOME
        TheInfoList






In computer science, a reference is a value that enables a program to indirectly access a particular datum, such as a variable's value or a record, in the computer's memory or in some other storage device. The reference is said to refer to the datum, and accessing the datum is called dereferencing the reference.

A reference is distinct from the datum itself. Typically, for references to data stored in memory on a given system, a reference is implemented as the physical address of where the data is stored in memory or in the storage device. For this reason, a reference is often erroneously confused with a pointer or address, and is said to "point to" the data. However, a reference may also be implemented in other ways, such as the offset (difference) between the datum's address and some fixed "base" address, as an index into an array, or more abstractly as a handle. More broadly, in networking, references may be network addresses, such as URLs.

The concept of reference must not be confused with other values (keys or identifiers) that uniquely identify the data item, but give access to it only through a non-trivial lookup operation in some table data structure.

References are widely used in programming, especially to efficiently pass large or mutable data as arguments to procedures, or to share such data among various uses. In particular, a reference may point to a variable or record that contains references to other data. This idea is the basis of indirect addressing and of many linked data structures, such as linked lists. References can cause significant complexity in a program, partially due to the possibility of dangling and wild references and partially because the topology of data with references is a directed graph, whose analysis can be quite complicated.

Benefits

References increase flexibility in where objects can be stored, how they are allocated, and how they are passed between areas of code. As long as one can access a reference to the data, one can access the data through it, and the data itself need not be moved. They also make sharing of data between different code areas easier; each keeps a reference to it.

The mechanism of references, if varying in implementation, is a fundamental programming language feature common to nearly all modern programming languages. Even some languages that support no direct use of references have some internal or implicit use. For example, the call by reference calling convention can be implemented with either explicit or implicit use of references.

Examples

Pointers are the most primitive type of reference. Due to their intimate relationship with the underlying hardware, they are one of the most powerful and efficient types of references. However, also due to this relationship, pointers require a strong understanding by the programmer of the details of memory architecture. Because pointers store a memory location's address, instead of a value directly, inappropriate use of pointers can lead to undefined behavior in a program, particularly due to dangling pointers or wild pointers. Smart pointers are opaque data structures that act like pointers but can only be accessed through particular methods.

A handle is an abstract reference, and may be represented in various ways. A common example are file handles (the FILE data structure in the C standard I/O library), used to abstract file content. It usually represents both the file itself, as when requesting a lock on the file, and a specific position within the file's content, as when reading a file.

In distributed computing, the reference may contain more than an address or identifier; it may also include an embedded specification of the network protocols used to locate and access the referenced object, the way information is encoded or serialized. Thus, for example, a WSDL description of a remote web service can be viewed as a form of reference; it includes a complete specification of ho

A reference is distinct from the datum itself. Typically, for references to data stored in memory on a given system, a reference is implemented as the physical address of where the data is stored in memory or in the storage device. For this reason, a reference is often erroneously confused with a pointer or address, and is said to "point to" the data. However, a reference may also be implemented in other ways, such as the offset (difference) between the datum's address and some fixed "base" address, as an index into an array, or more abstractly as a handle. More broadly, in networking, references may be network addresses, such as URLs.

The concept of reference must not be confused with other values (keys or identifiers) that uniquely identify the data item, but give access to it only through a non-trivial lookup operation in some table data structure.

References are widely used in programming, especially to efficiently pass large or mutable data as arguments to procedures, or to share such data among various uses. In particular, a reference may point to a variable or record that contains references to other data. This idea is the basis of indirect addressing and of many linked data structures, such as linked lists. References can cause significant complexity in a program, partially due to the possibility of dangling and wild references and partially because the topology of data with references is a directed graph, whose analysis can be quite complicated.

References increase flexibility in where objects can be stored, how they are allocated, and how they are passed between areas of code. As long as one can access a reference to the data, one can access the data through it, and the data itself need not be moved. They also make sharing of data between different code areas easier; each keeps a reference to it.

The mechanism of references, if varying in implementation, is a fundamental programming language feature common to nearly all modern programming languages. Even some languages that support no direct use of references have some internal or implicit use. For example, the call by reference calling convention can be implemented with either explicit or implicit use of references.

Examples

Pointers are the most primitive type of reference. Due to their intimate relationship with the underlying hardware, they are one of the most powerful and efficient types of references. However, also due to this relationship, pointers require a strong understanding by the programmer of the details of memory architecture. Because pointers store a memory location's address, instead of a value directly, inappropriate use of pointers can lead to undefined behavior in a program, particularly due to dangling pointers or wild pointers. Smart pointers are opaque data structures that act like pointers but can only be accessed through particular methods.

A handle is an abstract reference, and may be represented in various ways. A common example are file handles (the FILE data structure in the C standard I/O library), used to abstract file content. It usually represents both the file itself, as when requesting a lock on the file, and a specific position within the file's content, as when reading a file.

In distributed comput

The mechanism of references, if varying in implementation, is a fundamental programming language feature common to nearly all modern programming languages. Even some languages that support no direct use of references have some internal or implicit use. For example, the call by reference calling convention can be implemented with either explicit or implicit use of references.

Pointers are the most primitive type of reference. Due to their intimate relationship with the underlying hardware, they are one of the most powerful and efficient types of references. However, also due to this relationship, pointers require a strong understanding by the programmer of the details of memory architecture. Because pointers store a memory location's address, instead of a value directly, inappropriate use of pointers can lead to undefined behavior in a program, particularly due to dangling pointers or wild pointers. Smart pointers are opaque data structures that act like pointers but can only be accessed through particular methods.

A handle is an abstract reference, and may be represented in various ways. A common example are file handles (the FILE data structure in the handle is an abstract reference, and may be represented in various ways. A common example are file handles (the FILE data structure in the C standard I/O library), used to abstract file content. It usually represents both the file itself, as when requesting a lock on the file, and a specific position within the file's content, as when reading a file.

In distributed computing, the reference may contain more than an address or identifier; it may also include an embedded specification of the network protocols used to locate and access the referenced object, the way information is encoded or serialized. Thus, for example, a WSDL description of a remote web service can be viewed as a form of reference; it includes a complete specification of how to locate and bind to a particular web service. A reference to a live distributed object is another example: it is a complete specification for how to construct a small software component called a proxy that will subsequently engage in a peer-to-peer interaction, and through which the local machine may gain access to data that is replicated or exists only as a weakly consistent message stream. In all these cases, the reference includes the full set of instructions, or a recipe, for how to access the data; in this sense, it serves the same purpose as an identifier or address in memory.

More generally, a reference can be considered as a piece of data that allows unique retrieval of another piece of data. This includes primary keys in databases and keys in an associative array. If we have a set of keys K and a set of data objects D, any well-defined (single-valued) function from K to D ∪ {null} defines a type of reference, where null is the image of a key not referring to anything meaningful.

An alternative representation of such a function is a directed graph called a reachability graph. Here, each datum is represented by a vertex and there is an edge from u to v if the datum in uAn alternative representation of such a function is a directed graph called a reachability graph. Here, each datum is represented by a vertex and there is an edge from u to v if the datum in u refers to the datum in v. The maximum out-degree is one. These graphs are valuable in garbage collection, where they can be used to separate accessible from inaccessible objects.

In many data structures, large, complex objects are composed of smaller objects. These objects are typically stored in one of two ways:

  1. With internal storage, the contents of the smaller object are stored inside the larger object.
  2. With external storage, the smaller objects are allocated in their own location, and the larger object only stores references to them.

Internal storage is usually more efficient, because there is a space cost for the references and dynamic allocation metadata, and a time cost associated with dereferencing a reference and with allocating the memory for the smaller objects. Internal storage also enhances locality of reference by keeping different parts of the same large object close together in memory. However, there are a variety of situations in which external storage is preferred:

  • If the data structure is recursive, meaning it may contain itself. This cannot be represented in the internal way.
  • If the larger object is being stored in an area with lim

    Some languages, such as Java, Smalltalk, Python, and Scheme, do not support internal storage. In these languages, all objects are uniformly accessed through references.

    Language support

    In assembly languages, the first languages used, it is typical to express references using either raw memory addresses or indexes into tables. These work, but are somewhat tricky to use, because an address tells you nothing about the value it points to, not even how la

    In assembly languages, the first languages used, it is typical to express references using either raw memory addresses or indexes into tables. These work, but are somewhat tricky to use, because an address tells you nothing about the value it points to, not even how large it is or how to interpret it; such information is encoded in the program logic. The result is that misinterpretations can occur in incorrect programs, causing bewildering errors.

    One of the earliest opaque references was that of the Lisp language cons cell, which is simply a record containing two references to other Lisp objects,

    One of the earliest opaque references was that of the Lisp language cons cell, which is simply a record containing two references to other Lisp objects, including possibly other cons cells. This simple structure is most commonly used to build singly linked lists, but can also be used to build simple binary trees and so-called "dotted lists", which terminate not with a null reference but a value.

    Another early language, Fortran, does not have an explicit representation of references, but does use them implicitly in its call-by-reference calling semantics.

    The pointer is still one of the most popular types of references today. It is similar to the assembly representation of a raw address, except that it carries a static datatype which can be used at compile-time to ensure that the data it refers to is not misinterpreted. However, because C has a weak type system which can be violated using casts (explicit conversions between various pointer types and between pointer types and integers), misinterpretation is still possible, if more difficult. Its successor C++ tried to increase type safety of pointers with new cast operators and smart pointers in its standard library, but still retained the ability to circumvent these safety mechanisms for compatibility.

    A number of popular mainstream languages today such as Eiffel, Java, C#, and Visual Basic have adopted a much more opaque type of reference, usually referred to as simply a reference. These references have types like C pointers indicating how to interpret the data they reference, but they are typesafe in that they cannot be interpreted as a raw address and unsafe conversions are not permitted.

    A Fortran reference is best thought of as an alias of another object, such as a scalar variable or a row or column of an array. There is no syntax to dereference the reference or manipulate the contents of the referent directly. Fortran references can be null. As in other languages, these references facilitate the processing of dynamic structures, such as linked lists, queues, and trees.

    Functional languages