Shannon (unit)
The shannon (symbol: Sh) is a unit of information named after Claude Shannon, the founder of information theory. IEC 80000-13 defines the shannon as the information content associated with an event when the probability of the event occurring is 1/2. It is understood as such within the realm of information theory, and is conceptually distinct from the bit, a term used in data processing and storage to denote a single instance of a binary signal. A sequence of ''n'' binary symbols (such as contained in computer memory or a binary data transmission) is properly described as consisting of ''n'' bits, but the information content of those ''n'' symbols may be more or less than ''n'' shannons, according to the ''a priori'' probability of the actual sequence of symbols.

The shannon also serves as a unit of the information entropy of an event, which is defined as the expected value of the information content of the event (i.e., the probability-weighted average over all potential outcomes). Unlike the information content, the entropy of a set of possible outcomes has an upper bound, reached when the outcomes are equiprobable; the maximum entropy of ''n'' bits is ''n'' Sh. A further quantity for which the shannon is used is channel capacity (usually per unit time), the maximum expected information content that can be transferred over a channel with negligible probability of error.

Nevertheless, the term "bits of information" or simply "bits" is heard more often than "shannons", even in the fields of information and communication theory; saying just "bits" can therefore be ambiguous. Using the unit ''shannon'' is an explicit reference to a quantity of information content, information entropy or channel capacity, and is not restricted to binary data, whereas "bits" can equally well refer to the number of binary symbols involved, as is the usage in fields such as data processing.
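
As an illustrative sketch (Python, standard library only; the function names are chosen here for clarity and are not part of any standard), the information content of an event with probability ''p'' is −log2(''p'') shannons, and entropy is the probability-weighted average of the information content over all outcomes:

    import math

    def information_content_sh(p):
        """Information content (self-information) of an event with probability p, in shannons."""
        return -math.log2(p)

    def entropy_sh(probs):
        """Entropy of a discrete distribution, in shannons; zero-probability outcomes contribute nothing."""
        return sum(p * information_content_sh(p) for p in probs if p > 0)

    print(information_content_sh(0.5))   # 1.0 Sh -- the defining case: probability 1/2
    print(entropy_sh([0.5, 0.5]))        # 1.0 Sh -- one fair binary symbol attains the maximum
    print(entropy_sh([0.9, 0.1]))        # ~0.47 Sh -- a biased binary symbol carries less than 1 Sh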


Similar units

The shannon is connected through constants of proportionality to two other units of information. The ''hartley'', a seldom-used unit, is named after Ralph Hartley, an electronics engineer interested in the capacity of communications channels. Although of a more limited nature, his early work, preceding that of Shannon, makes him recognized also as a pioneer of information theory. Just as the shannon describes the maximum possible information capacity of a binary symbol, the hartley describes the information that can be contained in a 10-ary symbol, that is, a digit between 0 and 9 when the ''a priori'' probability of each digit is 1/10. Thus 1 Hart = log2(10) Sh ≈ 3.322 Sh, or equivalently 1 Sh = log10(2) Hart ≈ 0.301 Hart. From a mathematical perspective, the nat is a more natural unit of information, with 1 Sh = ln(2) nat ≈ 0.693 nat, but 1 nat does not correspond to a case in which all possibilities are equiprobable, unlike with the shannon and hartley. In each case, formulae for the quantification of information capacity or entropy involve taking the logarithm of an expression involving probabilities. If base-2 logarithms are employed, the result is expressed in shannons; if base-10 (common logarithms), the result is in hartleys; and if natural logarithms (base ''e''), the result is in nats. For instance, the information capacity of a 16-bit sequence (achieved when all 65536 possible sequences are equally probable) is log2(65536) = 16 Sh, log10(65536) ≈ 4.82 Hart, or ln(65536) ≈ 11.09 nat.
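
These relationships can be checked with a short Python sketch (standard library only; the constant names are ad hoc):

    import math

    SH_PER_HART = math.log2(10)      # 1 Hart = log2(10) Sh, about 3.322 Sh
    SH_PER_NAT = 1 / math.log(2)     # 1 nat = 1/ln(2) Sh, about 1.443 Sh

    # Information capacity of a 16-bit sequence with all 65536 outcomes equiprobable
    capacity_sh = math.log2(65536)     # 16.0 Sh
    capacity_hart = math.log10(65536)  # about 4.82 Hart
    capacity_nat = math.log(65536)     # about 11.09 nat

    # The three figures express the same quantity in different units
    assert abs(capacity_hart * SH_PER_HART - capacity_sh) < 1e-9
    assert abs(capacity_nat * SH_PER_NAT - capacity_sh) < 1e-9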


Information measures

In information theory and derivative fields such as coding theory, one cannot quantify the "information" in a single message (sequence of symbols) out of context; rather, reference is made to the model of a channel (such as bit error rate) or to the underlying statistics of an information source. There are thus various measures of or related to information, all of which may use the shannon as a unit. For instance, in the above example, a 16-bit channel could be said to have a channel capacity of 16 Sh, but when connected to a particular information source that sends only one of 8 possible messages, one would compute the entropy of its output as no more than 3 Sh. And if one had already been informed through a side channel that the message must be among one or the other set of 4 possible messages, then one could calculate the mutual information of the new message (having 8 possible states) as no more than 2 Sh. Although there are infinitely many possibilities for a real number chosen between 0 and 1, so-called differential entropy can be used to quantify the information content of an analog signal, such as related to the enhancement of signal-to-noise ratio or the confidence of a hypothesis test. The shannon (or nat, or hartley) is thus a unit of information used for quite different quantities and in various contexts, always dependent on a stated model, rather than having a context-free and unambiguous significance such as the gram has as a unit of mass.
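
The worked example above can be sketched numerically (Python, standard library; the equiprobable distributions are an assumption made only to reach the stated upper bounds):

    import math

    def entropy_sh(probs):
        """Entropy of a discrete distribution, in shannons."""
        return sum(-p * math.log2(p) for p in probs if p > 0)

    # A source emitting one of 8 equiprobable messages: its entropy is at most 3 Sh,
    # even though the 16-bit channel carrying it has a capacity of 16 Sh.
    print(entropy_sh([1/8] * 8))   # 3.0 Sh

    # If a side channel has already narrowed the message to one of 4 equiprobable
    # candidates, the remaining uncertainty is 2 Sh, so a new message can convey
    # (as mutual information) at most 2 Sh about it.
    print(entropy_sh([1/4] * 4))   # 2.0 Sh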

