computer programming Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...

, a dope vector is a

data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...

used to hold information about a

data object In software development, an object is an entity that has state, behavior, and identity. An object can model some part of reality or can be an invention of the design process whose collaborations with other such objects serve as the mechanisms t ...

, especially its

memory layout Data structure alignment is the way data is arranged and accessed in computer memory. It consists of three separate but related issues: data alignment, data structure padding, and packing. The CPU in modern computer hardware performs reads a ...

Purpose

Dope vectors are most commonly used to describe

arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...

, which commonly store multiple instances of a particular datatype as a contiguous block of memory. For example, an array containing 100 elements, each of which occupies 32 bytes, requires 100 × 32 bytes. By itself, such a memory block has no place to keep track of how large the array (or other object) is overall, how large each element within it is, or how many elements it contains. A dope vector is a place to store such information. Dope vectors can also describe structures which may contain arrays or variable elements. If such an array is stored contiguously, with the first byte at memory location ''M'', then its last byte is at location . A major advantage of this arrangement is that locating item ''N'' is easy: it begins at location . Of course, the value 32 must be known (this value is commonly called the "stride" of the array or the "width" of the array's elements). Navigating an array data structure using an index is called

dead reckoning In navigation, dead reckoning is the process of calculating the current position of a moving object by using a previously determined position, or fix, and incorporating estimates of speed, heading (or direction or course), and elapsed time. T ...

. This arrangement, however (without adding dope vectors) means that having the location of item N is not enough to discover the index N itself; or the stride; or whether there are elements at or . For example, a function or method may iterate over all the items in an array and pass each one to another function or method, which does not know the item is part of an array at all, much less where or how large the array is. Without a dope vector, even knowing the address of the entire array does not tell you how big it is. This is important because writing to the element in an array that only contains ''N'' elements, will likely destroy some other data. Because many programming languages treat character strings as a kind of array, this leads directly to the infamous buffer overflow problem. A dope vector reduces these problems by storing a small amount of

metadata Metadata (or metainformation) is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive ...

along with an array (or other object). With dope vectors, a compiler can easily (and optionally) insert code that prevents accidentally writing beyond the end of an array or other object. Alternatively, the programmer can access the dope vector when desired, for safety or other purposes.

Description

The exact set of metadata included in a dope vector varies from one language and/or operating system to another, but a dope vector for an array might contain: * a pointer to the location in memory where the array elements begin (this is normally identical to the location of the zeroth element of the array (element with all subscripts 0). (This might not be the first actual element if subscripts do not start at zero.) * the type of each array element (integer, Boolean, a particular

class Class, Classes, or The Class may refer to: Common uses not otherwise categorized * Class (biology), a taxonomic rank * Class (knowledge representation), a collection of individuals or objects * Class (philosophy), an analytical concept used d ...

, etc.). * the rank of an array. * the extent of an array (its range of indices). (In many languages the starting index for arrays is fixed at zero, or one, but the ending index is set when the array is (re-)allocated.) * for arrays where the extent in use at a given time may change, the maximum and current extents may both be stored. * the stride of an array, or the amount of memory occupied by each element of the array. A program then can refer to the array (or other dope-vector-using object) by referring to the dope vector. This is commonly automatic in

high-level language A high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to use, or may automate (or ...

s. Getting to an element of the array costs a tiny bit more (commonly one instruction, which fetches the pointer to the actual data from out of the dope vector). On the other hand, doing many other common operations are easier and/or faster: * Without a dope vector, determining the number of elements in the array is impossible. Thus it is common to add an extra element to the end of an array, with a "reserved" value (such as NULL). The length can then be determined by scanning forward through the array, counting elements until this "end-marker" is reached. Of course, this makes length-checking much slower than looking up the length directly in a dope vector. * Without knowing the extent of an array, it is not possible to free() (unallocate) that memory when it is no longer needed. Thus, without dope vectors, something must store that length somewhere else. For example, asking a particular OS to allocate space for a 3200-byte array, might cause it to allocate 3204 bytes at some location M; it would then store the size in the first 4 bytes, and tell the requesting program the allocated space starts at M+4 (so that the caller will not treat the extra 4 bytes as part of the array proper). This extra data is not considered a dope vector, but achieves some of the same goals. * Without dope vectors, extra information must also be kept about the stride (or width) of array elements. In C, this information is handled by the compiler, which must keep track of a datatype distinction between "pointer to an array of 20-byte-wide elements", and "pointer to an array of 1000-byte-wide elements". This means that a pointer to an element in either kind of array can be incremented or decremented in order to reach the next or previous element; but it also means that array widths must be fixed at an earlier stage. Even with a dope vector, having (only) a pointer to a particular member of an array does not enable finding the position in the array, or the location of the array or the dope vector itself. If that is desired, such information can be added to each element within the array. Such per-element information can be useful, but is not part of the dope vector. Dope vectors can be a general facility, shared across multiple datatypes (not just arrays and/or strings).

References

Arrays {{compu-prog-stub

Purpose

Description

See also

References