In computer programming, a dope vector is a data structure used to hold information about a data object, especially its memory layout.


Purpose

Dope vectors are most commonly used to describe arrays, which typically store multiple instances of a particular datatype as a contiguous block of memory. For example, an array containing 100 elements, each of which occupies 32 bytes, requires 100 × 32 = 3200 bytes. By itself, such a memory block has no place to keep track of how large the array (or other object) is overall, how large each element within it is, or how many elements it contains. A dope vector is a place to store such information. Dope vectors can also describe structures which may contain arrays or variable elements.

If such an array is stored contiguously, with the first byte at memory location ''M'', then its last byte is at location ''M'' + 3199. A major advantage of this arrangement is that locating item ''N'' is easy: counting from zero, it begins at location ''M'' + (''N'' × 32). Of course, the value 32 must be known (this value is commonly called the "stride" of the array or the "width" of the array's elements). Navigating an array data structure using an index is called dead reckoning.

Without dope vectors, however, having the location of item ''N'' is not enough to discover the index ''N'' itself, the stride, or whether there are elements at ''N'' − 1 or ''N'' + 1. For example, a function or method may iterate over all the items in an array and pass each one to another function or method, which does not know the item is part of an array at all, much less where or how large the array is.

Without a dope vector, even knowing the address of the entire array does not tell you how big it is. This is important because writing to element ''N'' + 1 of an array that contains only ''N'' elements will likely destroy some other data. Because many programming languages treat character strings as a kind of array, this leads directly to the infamous buffer overflow problem.

A dope vector reduces these problems by storing a small amount of metadata along with an array (or other object). With dope vectors, a compiler can easily (and optionally) insert code that prevents accidentally writing beyond the end of an array or other object. Alternatively, the programmer can access the dope vector when desired, for safety or other purposes.


Description

The exact set of metadata included in a dope vector varies from one language and/or operating system to another, but a dope vector for an array might contain:
* a pointer to the location in memory where the array elements begin. This is normally identical to the location of the zeroth element of the array (the element with all subscripts 0), which might not be the first actual element if subscripts do not start at zero.
* the type of each array element (integer, Boolean, a particular class, etc.).
* the rank of the array.
* the extent of the array (its range of indices). In many languages the starting index for arrays is fixed at zero or one, but the ending index is set when the array is (re-)allocated.
* for arrays where the extent in use at a given time may change, both the maximum and the current extents.
* the stride of the array, or the amount of memory occupied by each element of the array.

A program can then refer to the array (or other dope-vector-using object) by referring to the dope vector. This is commonly automatic in high-level languages. Getting to an element of the array costs a tiny bit more (commonly one instruction, which fetches the pointer to the actual data out of the dope vector). On the other hand, many other common operations become easier and/or faster:
* Without a dope vector, the array itself does not record how many elements it contains. Thus it is common to add an extra element to the end of an array, with a "reserved" value (such as NULL). The length can then be determined by scanning forward through the array, counting elements until this "end-marker" is reached. This makes length-checking much slower than looking up the length directly in a dope vector.
* Without knowing the extent of an array, it is not possible to free() (deallocate) that memory when it is no longer needed. Thus, without dope vectors, something must store that length somewhere else. For example, asking a particular OS to allocate space for a 3200-byte array might cause it to allocate 3204 bytes at some location ''M''; it would then store the size in the first 4 bytes, and tell the requesting program that the allocated space starts at ''M'' + 4 (so that the caller will not treat the extra 4 bytes as part of the array proper). This extra data is not considered a dope vector, but achieves some of the same goals.
* Without dope vectors, extra information must also be kept about the stride (or width) of array elements. In C, this information is handled by the compiler, which must keep track of a datatype distinction between "pointer to an array of 20-byte-wide elements" and "pointer to an array of 1000-byte-wide elements". This means that a pointer to an element in either kind of array can be incremented or decremented in order to reach the next or previous element; but it also means that array widths must be fixed at an earlier stage.
Even with a dope vector, having only a pointer to a particular member of an array is not enough to find that member's position in the array, or the location of the array or of the dope vector itself. If that is desired, such information can be added to each element within the array. Such per-element information can be useful, but is not part of the dope vector.

Dope vectors can be a general facility, shared across multiple datatypes (not just arrays and/or strings).


See also

* Data descriptor
* Iliffe vector
* Reflection (computer programming)

