Digital data

Digital data, in information theory and information systems, is information represented as a string of discrete symbols, each of which can take on one of only a finite number of values from some alphabet, such as letters or digits. An example is a text document, which consists of a string of alphanumeric characters. The most common form of digital data in modern information systems is ''binary data'', which is represented by a string of binary digits (bits), each of which can have one of two values, either 0 or 1.

Digital data can be contrasted with ''analog data'', which is represented by a value from a continuous range of real numbers. Analog data is transmitted by an analog signal, which not only takes on continuous values but can also vary continuously with time, making it a continuous real-valued function of time. An example is the air pressure variation in a sound wave.

The word ''digital'' comes from the same source as the words ''digit'' and ''digitus'' (the Latin word for ''finger''), as fingers are often used for counting. Mathematician George Stibitz of Bell Telephone Laboratories used the word ''digital'' in 1942 in reference to the fast electric pulses emitted by a device designed to aim and fire anti-aircraft guns. The term is most commonly used in computing and electronics, especially where real-world information is converted to binary numeric form, as in digital audio and digital photography.
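As a minimal sketch of the binary representation described above, the following Python snippet turns a short text into a string of bits and back. The function names `text_to_bits` and `bits_to_text` are illustrative, and the choice of UTF-8 with 8 bits per byte is an assumption made for the example, not something the text prescribes.

```python
def text_to_bits(text: str, encoding: str = "utf-8") -> str:
    """Return the encoded bytes of `text` as a string of 0s and 1s (8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode(encoding))


def bits_to_text(bits: str, encoding: str = "utf-8") -> str:
    """Inverse of text_to_bits: regroup the bits into bytes and decode them."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


bits = text_to_bits("Hi")
print(bits)                # 0100100001101001
print(bits_to_text(bits))  # Hi
```

Each symbol of the text is drawn from a finite alphabet, and each bit of the result is drawn from the two-symbol alphabet {0, 1}, so both forms are digital in the sense used here.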

# Symbol to digital conversion

Since symbols (for example, alphanumeric characters) are not continuous, representing symbols digitally is rather simpler than converting continuous or analog information to digital. Instead of sampling and quantization, as in analog-to-digital conversion, techniques such as polling and encoding are used.

A symbol input device usually consists of a group of switches that are polled at regular intervals to see which switches are pressed. Data will be lost if, within a single polling interval, two switches are pressed, or a switch is pressed, released, and pressed again. This polling can be done by a specialized processor in the device to avoid burdening the main CPU. When a new symbol has been entered, the device typically sends an interrupt, in a specialized format, so that the CPU can read it.

For devices with only a few switches (such as the buttons on a joystick), the status of each can be encoded as bits (usually 0 for released and 1 for pressed) in a single word. This is useful when combinations of key presses are meaningful, and is sometimes used for passing the status of modifier keys on a keyboard (such as shift and control). However, it does not scale to support more keys than the number of bits in a single byte or word.

Devices with many switches (such as a computer keyboard) usually arrange these switches in a scan matrix, with the individual switches at the intersections of x and y lines. When a switch is pressed, it connects the corresponding x and y lines together. Polling (often called scanning in this case) is done by activating each x line in sequence and detecting which y lines then have a signal, and thus which keys are pressed. When the keyboard processor detects that a key has changed state, it sends a signal to the CPU indicating the scan code of the key and its new state. The symbol is then encoded or converted into a number based on the status of modifier keys and the desired character encoding.

A custom encoding can be used for a specific application with no loss of data. However, using a standard encoding such as ASCII is problematic if a symbol such as 'ß' needs to be converted but is not in the standard.

It is estimated that in 1986 less than 1% of the world's technological capacity to store information was digital, and that by 2007 the figure was already 94% ("The World's Technological Capacity to Store, Communicate, and Compute Information", especially the supporting online material, Martin Hilbert and Priscila López (2011), Science, 332(6025), 60–65; free access to the article through martinhilbert.net/WorldInfoCapacity.html). The year 2002 is assumed to be the year when humankind was able to store more information in digital than in analog format (the "beginning of the digital age").
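The scan-matrix polling and the bit-per-switch encoding described in this section can be sketched as follows. This is a simulation under stated assumptions, not a driver for any real device: `read_column` is a hypothetical stand-in for the hardware step of activating one x line and reading the y lines, the `(x, y)` pair stands in for a device-specific scan code, and all function names are illustrative.

```python
from typing import Callable, List, Set, Tuple

Key = Tuple[int, int]  # (x line, y line); stands in for a real scan code


def scan_matrix(read_column: Callable[[int], List[bool]], n_columns: int) -> Set[Key]:
    """Activate each x line in turn and collect the y lines that carry a signal."""
    pressed: Set[Key] = set()
    for x in range(n_columns):
        for y, active in enumerate(read_column(x)):
            if active:
                pressed.add((x, y))
    return pressed


def key_events(previous: Set[Key], current: Set[Key]) -> List[Tuple[Key, bool]]:
    """Report keys whose state changed between two scans (True = pressed)."""
    events = [(key, True) for key in sorted(current - previous)]
    events += [(key, False) for key in sorted(previous - current)]
    return events


def pack_buttons(states: List[bool]) -> int:
    """Encode a few switches as bits of one word: bit i is 1 if switch i is pressed."""
    word = 0
    for bit, is_pressed in enumerate(states):
        if is_pressed:
            word |= 1 << bit
    return word


# Simulated hardware: the switches currently held down on a 3 x 2 matrix.
held = {(0, 1), (2, 0)}


def fake_read_column(x: int) -> List[bool]:
    return [(x, y) in held for y in range(2)]


first_scan = scan_matrix(fake_read_column, n_columns=3)
events = key_events(set(), first_scan)
print(events)                                         # [((0, 1), True), ((2, 0), True)]
print(bin(pack_buttons([True, False, True, True])))   # 0b1101
```

Because `key_events` only compares two consecutive scans, a switch that is pressed and released between scans never appears in either set, which is the data-loss case mentioned above.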

# States

Digital data come in three states: data at rest, data in transit and data in use. The confidentiality, integrity and availability of the data have to be managed during its entire lifecycle, from 'birth' to destruction.

# Properties of digital information

All digital information possesses common properties that distinguish it from analog data with respect to communications:

* Synchronization: Since digital information is conveyed by the sequence in which symbols are ordered, all digital schemes have some method for determining the beginning of a sequence. In written or spoken human languages, synchronization is typically provided by pauses (spaces), capitalization, and punctuation. Machine communications typically use special synchronization sequences.
* Language: All digital communications require a ''formal language'', which in this context consists of all the information that the sender and receiver of the digital communication must both possess, in advance, for the communication to be successful. Languages are generally arbitrary and specify the meaning to be assigned to particular symbol sequences, the allowed range of values, the methods to be used for synchronization, and so on.
* Errors: Disturbances (noise) in analog communications invariably introduce some, generally small, deviation or error between the intended and actual communication. Disturbances in digital communication result in errors only when the disturbance is so large that a symbol is misinterpreted as another symbol or the sequence of symbols is disturbed. It is generally possible to have near error-free digital communication. Further, techniques such as check codes may be used to detect errors and correct them through redundancy or re-transmission. Errors in digital communications can take the form of ''substitution errors'', in which a symbol is replaced by another symbol, or ''insertion/deletion'' errors, in which an extra incorrect symbol is inserted into or deleted from a digital message. Uncorrected errors in digital communications have an unpredictable and generally large impact on the information content of the communication.
* Copying: Because of the inevitable presence of noise, making many successive copies of an analog communication is infeasible, since each generation increases the noise. Because digital communications are generally error-free, copies of copies can be made indefinitely.
* Granularity: The digital representation of a continuously variable analog value typically involves a selection of the number of symbols to be assigned to that value. The number of symbols determines the precision or resolution of the resulting datum. The difference between the actual analog value and the digital representation is known as ''quantization error''. For example, if the actual temperature is 23.234456544453 degrees but only two digits (23) are assigned to this parameter in a particular digital representation, the quantization error is 0.234456544453. This property of digital communication is known as ''granularity''.
* Compressible: According to Miller, "Uncompressed digital data is very large, and in its raw form, it would actually produce a larger signal (therefore be more difficult to transfer) than analog data. However, digital data can be compressed. Compression reduces the amount of bandwidth space needed to send information. Data can be compressed, sent and then decompressed at the site of consumption. This makes it possible to send much more information and result in, for example, digital television signals offering more room on the airwave spectrum for more television channels."
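Three of the properties listed above (errors, granularity and compressibility) can be illustrated with a short Python sketch. The choices here are assumptions made for the example: a single even-parity bit stands in for the "check codes" the text mentions, quantization is done by truncation to match the temperature example, and `zlib` from the standard library stands in for whatever compression scheme a real system would use. The function names are illustrative.

```python
import zlib
from decimal import Decimal


def add_parity(bits: str) -> str:
    """Append one even-parity bit so the total number of 1s is even."""
    return bits + str(bits.count("1") % 2)


def parity_ok(word: str) -> bool:
    """True if the word (data bits plus parity bit) has an even number of 1s."""
    return word.count("1") % 2 == 0


def quantization_error(value: Decimal, step: Decimal = Decimal(1)) -> Decimal:
    """Truncate a non-negative value to a multiple of `step`; return what was lost."""
    quantized = (value // step) * step
    return value - quantized


# Errors: a single substituted symbol is detected by the parity check.
sent = add_parity("1011001")      # four 1s, so the parity bit is 0
corrupted = "0" + sent[1:]        # first bit flipped from 1 to 0
print(parity_ok(sent), parity_ok(corrupted))           # True False

# Granularity: the temperature example from the text, in whole degrees.
print(quantization_error(Decimal("23.234456544453")))  # 0.234456544453

# Compression: repetitive data shrinks and is restored exactly.
raw = b"0123456789" * 1000
packed = zlib.compress(raw)
print(len(packed) < len(raw), zlib.decompress(packed) == raw)  # True True
```

A single parity bit only detects an odd number of flipped bits and cannot correct anything; practical check codes add more redundancy so that errors can also be located and repaired.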

# Historical digital systems

Even though digital signals are generally associated with the binary electronic digital systems used in modern electronics and computing, digital systems are actually ancient, and need not be binary or electronic.

* The DNA genetic code is a naturally occurring form of digital data storage.
* Written text is digital, owing to its limited character set and its use of discrete symbols (the alphabet in most cases).
* The ''abacus'' was created sometime between 1000 BC and 500 BC and later became a widely used calculating tool. It can still be used as a basic yet capable digital calculator that uses beads on rows to represent numbers. Beads only have meaning in discrete up and down states, not in analog in-between states.
* A ''beacon'' is perhaps the simplest non-electronic digital signal, with just two states (on and off). In particular, ''smoke signals'' are one of the oldest examples of a digital signal, where an analog "carrier" (smoke) is modulated with a blanket to generate a digital signal (puffs) that conveys information.
* Morse code uses six digital states (dot, dash, intra-character gap between each dot or dash, short gap between letters, medium gap between words, and long gap between sentences) to send messages via a variety of potential carriers such as electricity or light, for example using an electrical telegraph or a flashing light.
* Braille uses a six-bit code rendered as dot patterns.
* Flag semaphore uses rods or flags held in particular positions to send messages to a receiver watching them from some distance away.
* International maritime signal flags have distinctive markings that represent letters of the alphabet, allowing ships to send messages to each other.
* More recently invented, a modem modulates an analog "carrier" signal (such as sound) to encode binary electrical digital information as a series of binary digital sound pulses. A slightly earlier, surprisingly reliable version of the same concept was to record a sequence of audio digital "signal" and "no signal" information (i.e. "sound" and "silence") on magnetic cassette tape for use with early home computers.
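The six-bit Braille code mentioned in the list above lends itself to a small worked example. The sketch below assumes the conventional dot numbering (dots 1 to 3 down the left column, 4 to 6 down the right) and uses the Unicode Braille Patterns block, in which dot n corresponds to bit n-1 added to the base code point U+2800. The function names and the three-letter table are illustrative only.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def dots_to_code(dots) -> int:
    """Turn a collection of raised dot numbers (1..6) into a six-bit integer."""
    code = 0
    for dot in dots:
        if not 1 <= dot <= 6:
            raise ValueError(f"six-dot Braille has no dot {dot}")
        code |= 1 << (dot - 1)
    return code


def code_to_char(code: int) -> str:
    """Render a six-bit code as the matching Unicode Braille pattern character."""
    return chr(BRAILLE_BASE + code)


# Raised dots for the first three letters of the alphabet.
LETTER_DOTS = {"a": (1,), "b": (1, 2), "c": (1, 4)}

for letter, dots in LETTER_DOTS.items():
    code = dots_to_code(dots)
    print(letter, f"{code:06b}", code_to_char(code))
```

Six dots, each either raised or flat, give 2^6 = 64 possible cells, which is why the code fits in six bits.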

# See also

* Analog-to-digital converter
* Barker code
* Binary number
* Comparison of analog and digital recording
* Data (computer science)
* Data remanence
* Digital architecture
* Digital art
* Digital control
* Digital divide
* Digital electronics
* Digital infinity
* Digital native
* Digital physics
* Digital recording
* Digital Revolution
* Digital video
* Digital-to-analog converter
* Internet forum

# Further reading

* Tocci, R. 2006. Digital Systems: Principles and Applications (10th Edition). Prentice Hall.