The bit is the most basic unit of information in computing and digital communications. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represented as either "1" or "0", but other representations such as ''true''/''false'', ''yes''/''no'', ''on''/''off'', or ''+''/''−'' are also commonly used.
The relation between these values and the physical states of the underlying storage or device is a matter of convention, and different assignments may be used even within the same device or program. A bit may be physically implemented with a two-state device.
The symbol for the binary digit is either "bit", as recommended by the IEC 80000-13:2008 standard, or the lowercase character "b", as recommended by the IEEE 1541-2002 standard.
A contiguous group of binary digits is commonly called a ''bit string'', a bit vector, or a single-dimensional (or multi-dimensional) ''bit array''.
A group of eight bits is called one ''byte'', but historically the size of the byte is not strictly defined. Frequently, half, full, double and quadruple words consist of a number of bytes which is a low power of two. A string of four bits is a ''nibble''.
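The grouping of bits into bytes and nibbles can be illustrated with a short Python sketch. It assumes the common eight-bit byte; the helper names `split_nibbles` and `bit_string` are hypothetical and chosen only for this example.

```python
def split_nibbles(byte):
    """Return (high_nibble, low_nibble) for a value that fits in eight bits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in eight bits")
    # Shift the upper four bits down, and mask off the lower four bits.
    return (byte >> 4) & 0xF, byte & 0xF


def bit_string(value, width=8):
    """Render a non-negative integer as a zero-padded string of binary digits."""
    return format(value, "0{}b".format(width))


high, low = split_nibbles(0xA7)
print(bit_string(0xA7), bit_string(high, 4), bit_string(low, 4))
```

Here the byte `0xA7` is shown as the eight-digit bit string followed by its two four-bit nibbles.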
In information theory, one bit is the information entropy of a random binary variable that is 0 or 1 with equal probability, or the information that is gained when the value of such a variable becomes known. As a unit of information, the bit is also known as a ''shannon'', named after Claude E. Shannon.
The encoding of data by discrete bits was used in the punched cards invented by Basile Bouchon and Jean-Baptiste Falcon (1732), developed by Joseph Marie Jacquard (1804), and later adopted by Semyon Korsakov, Charles Babbage, Hermann Hollerith, and early computer manufacturers like IBM. A variant of that idea was the perforated paper tape. In all those systems, the medium (card or tape) conceptually carried an array of hole positions; each position could be either punched through or not, thus carrying one bit of information. The encoding of text by bits was also used in Morse code (1844) and early digital communications machines such as teletypes and stock ticker machines (1870).
Ralph Hartley suggested the use of a logarithmic measure of information in 1928. Claude E. Shannon first used the word "bit" in his seminal 1948 paper "A Mathematical Theory of Communication". He attributed its origin to John W. Tukey, who had written a Bell Labs memo on 9 January 1947 in which he contracted "binary information digit" to simply "bit". Vannevar Bush had written in 1936 of "bits of information" that could be stored on the punched cards used in the mechanical computers of that time. The first programmable computer, built by Konrad Zuse, used binary notation for numbers.
Physical representation
A bit can be stored by a digital device or other physical system that exists in either of two possible distinct states. These may be the two stable states of a flip-flop, two positions of an electrical switch, two distinct voltage or current levels allowed by a circuit, two distinct levels of light intensity, two directions of magnetization or polarization, the orientation of reversible double stranded DNA, etc.
Bits can be implemented in several forms. In most modern computing devices, a bit is usually represented by an electrical voltage or current pulse, or by the electrical state of a flip-flop circuit.
For devices using positive logic, a digit value of 1 (or a logical value of true) is represented by a more positive voltage relative to the representation of 0. The specific voltages are different for different logic families and variations are permitted to allow for component aging and noise immunity. For example, in transistor–transistor logic (TTL) and compatible circuits, digit values 0 and 1 at the output of a device are represented by no higher than 0.4 volts and no lower than 2.6 volts, respectively; while TTL inputs are specified to recognize 0.8 volts or below as 0 and 2.2 volts or above as 1.
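As a rough illustration of the positive-logic convention, the following Python sketch classifies an input voltage using the TTL input thresholds quoted above. The function name `read_ttl_input` and the use of `None` for the region between the two thresholds are assumptions made for this example; real devices behave unpredictably in that region.

```python
from typing import Optional

# Input thresholds taken from the TTL figures in the text (volts).
TTL_INPUT_LOW_MAX = 0.8   # at or below this, an input is recognized as 0
TTL_INPUT_HIGH_MIN = 2.2  # at or above this, an input is recognized as 1


def read_ttl_input(voltage: float) -> Optional[int]:
    """Map a voltage to a bit under positive logic, or None if undefined."""
    if voltage <= TTL_INPUT_LOW_MAX:
        return 0
    if voltage >= TTL_INPUT_HIGH_MIN:
        return 1
    # Between the thresholds the logic level is not guaranteed.
    return None


for volts in (0.4, 1.5, 2.6):
    print(volts, read_ttl_input(volts))
```

The gap between the output limits (0.4 and 2.6 volts) and the input thresholds (0.8 and 2.2 volts) is what leaves room for noise and component variation.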
Transmission and processing
Bits are transmitted one at a time in serial transmission, and several at a time in parallel transmission. A bitwise operation optionally processes bits one at a time. Data transfer rates are usually measured in decimal SI multiples of the unit bit per second (bit/s), such as kbit/s.
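Serial transmission can be sketched in Python as a generator that emits the bits of a byte one at a time, with a matching function that reassembles them. Sending the least significant bit first is an assumption of this example (real links use either order), and the function names are hypothetical.

```python
from typing import Iterable, Iterator


def serialize_lsb_first(byte: int) -> Iterator[int]:
    """Yield the eight bits of a byte one at a time, least significant first."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in eight bits")
    for position in range(8):
        yield (byte >> position) & 1


def deserialize_lsb_first(bits: Iterable[int]) -> int:
    """Rebuild an integer from bits received least significant first."""
    value = 0
    for position, bit in enumerate(bits):
        value |= (bit & 1) << position
    return value


sent = list(serialize_lsb_first(0b00000110))
print(sent, deserialize_lsb_first(sent))
```

A parallel link would instead place all eight bits on separate conductors at the same time.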
Storage
In the earliest non-electronic information processing devices, such as Jacquard's loom or Babbage's Analytical Engine, a bit was often stored as the position of a mechanical lever or gear, or the presence or absence of a hole at a specific point of a paper card or tape. The first electrical devices for discrete logic (such as elevator and traffic light control circuits, telephone switches, and Konrad Zuse's computer) represented bits as the states of electrical relays which could be either "open" or "closed". When relays were replaced by vacuum tubes, starting in the 1940s, computer builders experimented with a variety of storage methods, such as pressure pulses traveling down a mercury delay line, charges stored on the inside surface of a cathode-ray tube, or opaque spots printed on glass discs by photolithographic techniques.
In the 1950s and 1960s, these methods were largely supplanted by magnetic storage devices such as magnetic-core memory, magnetic tapes, drums, and disks, where a bit was represented by the polarity of magnetization of a certain area of a ferromagnetic film, or by a change in polarity from one direction to the other. The same principle was later used in the magnetic bubble memory developed in the 1980s, and is still found in various magnetic strip items such as metro tickets and some credit cards.
In modern semiconductor memory, such as dynamic random-access memory, the two values of a bit may be represented by two levels of electric charge stored in a capacitor. In certain types of read-only memory, a bit may be represented by the presence or absence of a conducting path at a certain point of a circuit. In optical discs, a bit is encoded as the presence or absence of a microscopic pit on a reflective surface. In one-dimensional bar codes, bits are encoded as the thickness of alternating black and white lines.
The International Electrotechnical Commission's standard IEC 60027 specifies that the symbol for binary digit should be 'bit', and that this should be used in all multiples, such as 'kbit' for kilobit. However, the lower-case letter 'b' is widely used as well and was recommended by the IEEE 1541 Standard (2002). In contrast, the upper-case letter 'B' is the standard and customary symbol for byte.
Multiple bits
Multiple bits may be expressed and represented in several ways. For convenience of representing commonly recurring groups of bits in information technology, several units of information have traditionally been used. The most common is the unit byte, coined by Werner Buchholz in June 1956, which historically was used to represent the group of bits used to encode a single character of text (until UTF-8 multibyte encoding took over) in a computer and for this reason it was used as the basic addressable element in many computer architectures. The trend in hardware design converged on the most common implementation of using eight bits per byte, as it is widely used today. However, because of the ambiguity of relying on the underlying hardware design, the unit octet was defined to explicitly denote a sequence of eight bits.
Computers usually manipulate bits in groups of a fixed size, conventionally named "words". Like the byte, the number of bits in a word also varies with the hardware design, and is typically between 8 and 80 bits, or even more in some specialized computers. In the 21st century, retail personal or server computers have a word size of 32 or 64 bits.
The International System of Units defines a series of decimal prefixes for multiples of standardized units which are commonly also used with the bit and the byte. The prefixes kilo (10^3) through yotta (10^24) increment by multiples of one thousand, and the corresponding units are the kilobit (kbit) through the yottabit (Ybit).
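The decimal prefixes can be written out as a small lookup table. The Python sketch below converts a quantity such as 2.5 Mbit into plain bits; the table name `SI_PREFIXES` and the function `to_bits` are hypothetical, and only the prefixes from kilo through yotta mentioned in the text are included.

```python
# Decimal SI prefixes from kilo (10^3) to yotta (10^24), in steps of one thousand.
SI_PREFIXES = {
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "P": 10**15, "E": 10**18, "Z": 10**21, "Y": 10**24,
}


def to_bits(quantity, unit):
    """Convert a quantity in 'bit', 'kbit', 'Mbit', ... 'Ybit' to bits."""
    if unit == "bit":
        return quantity
    prefix, base = unit[:1], unit[1:]
    if base != "bit" or prefix not in SI_PREFIXES:
        raise ValueError("unsupported unit: {!r}".format(unit))
    return quantity * SI_PREFIXES[prefix]


print(to_bits(1, "kbit"), to_bits(2.5, "Mbit"))
```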
Information capacity and information compression
When the information capacity of a storage system or a communication channel is presented in ''bits'' or ''bits per second'', this often refers to binary digits, which is a computer hardware capacity to store binary data (0 or 1, up or down, current or not, etc.). Information capacity of a storage system is only an upper bound to the quantity of information stored therein. If the two possible values of one bit of storage are not equally likely, that bit of storage contains less than one bit of information. If the value is completely predictable, then the reading of that value provides no information at all (zero entropic bits, because no resolution of uncertainty occurs and therefore no information is available). If a computer file that uses ''n'' bits of storage contains only ''m'' < ''n'' bits of information, then that information can in principle be encoded in about ''m'' bits, at least on the average. This principle is the basis of data compression technology. Using an analogy, the hardware binary digits refer to the amount of storage space available (like the number of buckets available to store things), and the information content is the filling, which comes in different levels of granularity (fine or coarse, that is, compressed or uncompressed information). When the granularity is finer (that is, when information is more compressed), the same bucket can hold more.
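The claim that an unevenly distributed bit of storage carries less than one bit of information can be checked numerically. The Python sketch below uses the standard binary entropy function, H(p) = −p log2 p − (1 − p) log2(1 − p), which is not spelled out in the text and is introduced here as an assumption; the function name `binary_entropy` is hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a binary variable that equals 1 with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if p in (0.0, 1.0):
        # A completely predictable value resolves no uncertainty.
        return 0.0
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


for p in (0.5, 0.9, 1.0):
    print(p, binary_entropy(p))
```

An equally likely bit carries one full bit of information, a bit that is 1 nine times out of ten carries a little under half a bit, and a fully predictable bit carries none.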
For example, it is estimated that the combined technological capacity of the world to store information provides 1,300 exabytes of hardware digits. However, when this storage space is filled and the corresponding content is optimally compressed, this only represents 295 exabytes of information. When optimally compressed, the resulting carrying capacity approaches Shannon information or information entropy.
Certain bitwise computer processor instructions (such as ''bit set'') operate at the level of manipulating bits rather than manipulating data interpreted as an aggregate of bits.
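What a ''bit set'' style instruction does can be mimicked in Python with shifts and masks. This is only a sketch of the idea, not a model of any particular processor's instruction set; the function names are hypothetical and positions are counted from the least significant bit.

```python
def set_bit(word: int, position: int) -> int:
    """Return word with the bit at the given position forced to 1."""
    return word | (1 << position)


def clear_bit(word: int, position: int) -> int:
    """Return word with the bit at the given position forced to 0."""
    return word & ~(1 << position)


def is_bit_set(word: int, position: int) -> bool:
    """Report whether the bit at the given position is 1."""
    return bool((word >> position) & 1)


word = 0b0000
word = set_bit(word, 2)    # now 0b0100
print(format(word, "04b"), is_bit_set(word, 2), is_bit_set(word, 1))
```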
In the 1980s, when bitmapped computer displays became popular, some computers provided specialized bit block transfer instructions to set or copy the bits that corresponded to a given rectangular area on the screen.
In most computers and programming languages, when a bit within a group of bits, such as a byte or word, is referred to, it is usually specified by a number from 0 upwards corresponding to its position within the byte or word. However, 0 can refer to either the most or least significant bit depending on the context.
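Bit positions are numbered from 0, but which end of a byte or word counts as position 0 is a convention. The Python sketch below reads the same position under both conventions; the names `bit_lsb0` and `bit_msb0` and the default width of eight bits are assumptions made for this example.

```python
def bit_lsb0(value: int, index: int) -> int:
    """Read a bit where index 0 is the least significant bit."""
    return (value >> index) & 1


def bit_msb0(value: int, index: int, width: int = 8) -> int:
    """Read a bit where index 0 is the most significant bit of a fixed width."""
    return (value >> (width - 1 - index)) & 1


value = 0b10000000
# The same index names opposite ends of the byte under the two conventions.
print(bit_lsb0(value, 0), bit_msb0(value, 0))
```

For the value `0b10000000`, index 0 yields 0 under the first convention and 1 under the second, which is why documentation usually has to state which one it uses.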
Similar to torque and energy in physics, information-theoretic information and data storage size have the same dimensionality of units of measurement, but there is in general no meaning to adding, subtracting or otherwise combining the units mathematically, although one may act as a bound on the other.
Units of information used in information theory include the ''shannon'' (Sh), the ''natural unit of information'' (nat) and the ''hartley'' (Hart). One shannon is the maximum amount of information needed to specify the state of one bit of storage. These are related by 1 Sh ≈ 0.693 nat ≈ 0.301 Hart.
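The conversion factors follow from the logarithm bases of the three units: base 2 for the shannon, base e for the nat and base 10 for the hartley. A short Python check, with hypothetical constant and function names:

```python
import math

# One shannon expressed in the other two units.
NAT_PER_SHANNON = math.log(2)        # natural logarithm of 2
HARTLEY_PER_SHANNON = math.log10(2)  # base-10 logarithm of 2


def shannons_to_nats(shannons: float) -> float:
    return shannons * NAT_PER_SHANNON


def shannons_to_hartleys(shannons: float) -> float:
    return shannons * HARTLEY_PER_SHANNON


print(round(shannons_to_nats(1), 3), round(shannons_to_hartleys(1), 3))
```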
Some authors also define a binit as an arbitrary information unit equivalent to some fixed but unspecified number of bits.
See also
* Byte
* Integer (computer science)
* Primitive data type
* Qubit (quantum bit)
* Bitstream
* Entropy (information theory)
* Baud rate
* Ternary numeral system
* Shannon (unit)