In information theory, the information content, self-information, surprisal, or Shannon information is a basic quantity derived from the probability of a particular event occurring from a random variable. It can be thought of as an alternative way of expressing probability, much like odds or log-odds, but which has particular mathematical advantages in the setting of information theory.
The Shannon information can be interpreted as quantifying the level of "surprise" of a particular outcome. As it is such a basic quantity, it also appears in several other settings, such as the length of a message needed to transmit the event given an optimal source coding of the random variable.
The Shannon information is closely related to ''entropy'', which is the expected value of the self-information of a random variable, quantifying how surprising the random variable is "on average". This is the average amount of self-information an observer would expect to gain about a random variable when measuring it.
The information content can be expressed in various units of information, of which the most common is the "bit" (more correctly called the ''shannon''), as explained below.
Definition
Claude Shannon's definition of self-information was chosen to meet several axioms:
# An event with probability 100% is perfectly unsurprising and yields no information.
# The less probable an event is, the more surprising it is and the more information it yields.
# If two independent events are measured separately, the total amount of information is the sum of the self-informations of the individual events.
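The three axioms above can be checked numerically against the logarithmic form they single out. A minimal sketch (base 2 is an assumed choice; the probabilities are arbitrary illustrative values):

```python
import math

def self_information(p, base=2):
    """-log_b(p): the form (unique up to scale) satisfying the axioms."""
    return -math.log(p, base)

# Axiom 1: an event with probability 100% yields no information.
assert self_information(1.0) == 0.0

# Axiom 2: a less probable event yields more information.
assert self_information(0.1) > self_information(0.5)

# Axiom 3: for independent events, self-informations add.
p, q = 0.5, 0.25
assert math.isclose(self_information(p * q),
                    self_information(p) + self_information(q))
```

The additivity axiom is what forces a logarithm: it turns the product of probabilities of independent events into a sum of information contents.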
The detailed derivation is below, but it can be shown that there is a unique function of probability that meets these three axioms, up to a multiplicative scaling factor. Broadly, given a real number ''b'' > 1 and an event ''x'' with probability ''P'', the information content is defined as follows:
I(x) := −log_b[Pr(x)] = −log_b(P).
The base ''b'' corresponds to the scaling factor above. Different choices of ''b'' correspond to different units of information: when ''b'' = 2, the unit is the shannon (symbol Sh), often called a 'bit'; when ''b'' = ''e'', the unit is the natural unit of information (symbol nat); and when ''b'' = 10, the unit is the hartley (symbol Hart).
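Since the bases differ only by a constant rescaling, the same event's information content in shannons, nats, and hartleys differs only by a multiplicative factor. A short sketch (the probability 0.25 is an arbitrary example):

```python
import math

def self_information(p, base=2):
    """Information content -log_b(p) of an event with probability p."""
    return -math.log(p, base)

p = 0.25
bits = self_information(p, base=2)       # ≈ 2.0 shannons
nats = self_information(p, base=math.e)  # ≈ 1.386 nats
harts = self_information(p, base=10)     # ≈ 0.602 hartleys

# Changing the base only rescales by a constant: I_nat = I_Sh * ln(2).
assert math.isclose(nats, bits * math.log(2))
```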
Formally, given a random variable X with probability mass function p_X(x), the self-information of measuring X as outcome x is defined as
I_X(x) := −log[p_X(x)] = log(1 / p_X(x)).
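As a minimal illustration of the standard definition I_X(x) = −log₂ p_X(x) (the fair six-sided die is an assumed example, not from the text), every equally likely outcome carries the same information content, log₂(6) shannons:

```python
import math

# A fair six-sided die: p_X(x) = 1/6 for each outcome x.
pmf = {x: 1 / 6 for x in range(1, 7)}

def self_information(p, base=2):
    """Self-information -log_b(p) of an outcome with probability p."""
    return -math.log(p, base)

# Information content of each outcome, in shannons.
info = {x: self_information(p) for x, p in pmf.items()}
print(info[1])  # log2(6) ≈ 2.585 shannons
```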
The use of the notation I_X(x) for self-information above is not universal. Since the notation I(X; Y) is also often used for the related quantity of mutual information, many authors use a lowercase h_X(x) for self-entropy instead, mirroring the use of the capital H(X) for the entropy.
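Tying this back to the relationship with entropy noted earlier: the entropy H(X) is the probability-weighted average of the self-information of the outcomes. A sketch using an assumed biased-coin distribution (not from the text):

```python
import math

def self_information(p, base=2):
    """Self-information -log_b(p), in shannons for base 2."""
    return -math.log(p, base)

# An illustrative biased coin: P(heads) = 0.9, P(tails) = 0.1.
pmf = {"heads": 0.9, "tails": 0.1}

# Entropy = expected value of the self-information.
entropy = sum(p * self_information(p) for p in pmf.values())

print(self_information(0.9))  # common outcome: little information
print(self_information(0.1))  # rare outcome: more information
print(entropy)                # ≈ 0.469 shannons "on average"
```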
Properties
Monotonically decreasing function of probability
For a given probability space, the measurement of rarer events is intuitively more "surprising", and yields more information content, than more common values. Thus, self-information is a strictly decreasing monotonic function of the probability, sometimes called an "antitonic" function.
While standard probabilities are represented by real numbers in the interval