Algorithmic information theory (AIT) is a branch of
theoretical computer science
Theoretical computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the theory of computation, lambda calculus, and type theory.
It is difficult to circumsc ...
that concerns itself with the relationship between
computation
Computation is any type of arithmetic or non-arithmetic calculation that follows a well-defined model (e.g., an algorithm).
Mechanical or electronic devices (or, historically, people) that perform computations are known as ''computers''. An es ...
and
information
Information is an abstract concept that refers to that which has the power to inform. At the most fundamental level information pertains to the interpretation of that which may be sensed. Any natural process that is not completely random ...
of computably generated objects (as opposed to
stochastically generated), such as
strings or any other
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" (except for a constant that only depends on the chosen universal programming language) the relations or inequalities found in
information theory
Information theory is the scientific study of the quantification (science), quantification, computer data storage, storage, and telecommunication, communication of information. The field was originally established by the works of Harry Nyquist a ...
.
According to
Gregory Chaitin
Gregory John Chaitin ( ; born 25 June 1947) is an Argentine-American mathematician and computer scientist. Beginning in the late 1960s, Chaitin made contributions to algorithmic information theory and metamathematics, in particular a computer-t ...
, it is "the result of putting
Shannon's
information theory
Information theory is the scientific study of the quantification (science), quantification, computer data storage, storage, and telecommunication, communication of information. The field was originally established by the works of Harry Nyquist a ...
and
Turing
Alan Mathison Turing (; 23 June 1912 – 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. Turing was highly influential in the development of theoretical co ...
's
computability theory
Computability theory, also known as recursion theory, is a branch of mathematical logic, computer science, and the theory of computation that originated in the 1930s with the study of computable functions and Turing degrees. The field has since e ...
into a cocktail shaker and shaking vigorously."
Besides the formalization of a universal measure for irreducible information content of computably generated objects, some main achievements of AIT were to show that: in fact algorithmic complexity follows (in the
self-delimited case) the same inequalities (except for a constant) that
entropy
Entropy is a scientific concept, as well as a measurable physical property, that is most commonly associated with a state of disorder, randomness, or uncertainty. The term and the concept are used in diverse fields, from classical thermodynam ...
does, as in classical information theory;
randomness is incompressibility;
and, within the realm of randomly generated software, the probability of occurrence of any data structure is of the order of the shortest program that generates it when running on a universal machine.
AIT principally studies measures of irreducible information content of
strings
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* ''Strings'' (1991 film), a Canadian anim ...
(or other
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
s). Because most mathematical objects can be described in terms of strings, or as the
limit of a sequence
As the positive integer n becomes larger and larger, the value n\cdot \sin\left(\tfrac1\right) becomes arbitrarily close to 1. We say that "the limit of the sequence n\cdot \sin\left(\tfrac1\right) equals 1."
In mathematics, the limi ...
of strings, it can be used to study a wide variety of mathematical objects, including
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
s. One of the main motivations behind AIT is the very study of the information carried by mathematical objects as in the field of
metamathematics
Metamathematics is the study of mathematics itself using mathematical methods. This study produces metatheories, which are mathematical theories about other mathematical theories. Emphasis on metamathematics (and perhaps the creation of the ter ...
, e.g., as shown by the incompleteness results mentioned below. Other main motivations came from surpassing the limitations of
classical information theory for single and fixed objects, formalizing the concept of
randomness
In common usage, randomness is the apparent or actual lack of pattern or predictability in events. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual rand ...
, and finding a meaningful
probabilistic inference without prior knowledge of the
probability distribution
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon i ...
(e.g., whether it is
independent and identically distributed
In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is usua ...
,
Markovian, or even
stationary). In this way, AIT is known to be basically founded upon three main mathematical concepts and the relations between them:
algorithmic complexity
Algorithmic may refer to:
*Algorithm, step-by-step instructions for a calculation
**Algorithmic art, art made by an algorithm
**Algorithmic composition, music made by an algorithm
** Algorithmic trading, trading decisions made by an algorithm
**Alg ...
,
algorithmic randomness
Intuitively, an algorithmically random sequence (or random sequence) is a sequence of binary digits that appears random to any algorithm running on a (prefix-free or not) universal Turing machine. The notion can be applied analogously to sequenc ...
, and
algorithmic probability In algorithmic information theory, algorithmic probability, also known as Solomonoff probability, is a mathematical method of assigning a prior probability to a given observation. It was invented by Ray Solomonoff in the 1960s.
It is used in induc ...
.
Overview
Algorithmic information theory principally studies
complexity
Complexity characterises the behaviour of a system or model whose components interaction, interact in multiple ways and follow local rules, leading to nonlinearity, randomness, collective dynamics, hierarchy, and emergence.
The term is generall ...
measures on
strings (or other
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
s). Because most mathematical objects can be described in terms of strings, or as the
limit of a sequence
As the positive integer n becomes larger and larger, the value n\cdot \sin\left(\tfrac1\right) becomes arbitrarily close to 1. We say that "the limit of the sequence n\cdot \sin\left(\tfrac1\right) equals 1."
In mathematics, the limi ...
of strings, it can be used to study a wide variety of mathematical objects, including
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
s.
Informally, from the point of view of algorithmic information theory, the information content of a string is equivalent to the length of the most-
compressed possible self-contained representation of that string. A self-contained representation is essentially a
program
Program, programme, programmer, or programming may refer to:
Business and management
* Program management, the process of managing several related projects
* Time management
* Program, a part of planning
Arts and entertainment Audio
* Progra ...
—in some fixed but otherwise irrelevant universal
programming language
A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language.
The description of a programming ...
—that, when run, outputs the original string.
From this point of view, a 3000-page encyclopedia actually contains less information than 3000 pages of completely random letters, despite the fact that the encyclopedia is much more useful. This is because to reconstruct the entire sequence of random letters, one must know what every single letter is. On the other hand, if every vowel were removed from the encyclopedia, someone with reasonable knowledge of the English language could reconstruct it, just as one could likely reconstruct the sentence "Ths sntnc hs lw nfrmtn cntnt" from the context and consonants present.
Unlike classical information theory, algorithmic information theory gives
formal,
rigorous
Rigour (British English) or rigor (American English; American and British English spelling differences#-our, -or, see spelling differences) describes a condition of stiffness or strictness. These constraints may be environmentally imposed, su ...
definitions of a
random string and a
random infinite sequence that do not depend on physical or philosophical
intuition
Intuition is the ability to acquire knowledge without recourse to conscious reasoning. Different fields use the word "intuition" in very different ways, including but not limited to: direct access to unconscious knowledge; unconscious cognition; ...
s about
nondeterminism
Nondeterminism or nondeterministic may refer to:
Computer science
*Nondeterministic programming
*Nondeterministic algorithm
*Nondeterministic model of computation
**Nondeterministic finite automaton
**Nondeterministic Turing machine
*Indeterminacy ...
or
likelihood
The likelihood function (often simply called the likelihood) represents the probability of random variable realizations conditional on particular values of the statistical parameters. Thus, when evaluated on a given sample, the likelihood funct ...
. (The set of random strings depends on the choice of the
universal Turing machine
In computer science, a universal Turing machine (UTM) is a Turing machine that can simulate an arbitrary Turing machine on arbitrary input. The universal machine essentially achieves this by reading both the description of the machine to be simu ...
used to define
Kolmogorov complexity
In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produ ...
, but any choice
gives identical asymptotic results because the Kolmogorov complexity of a string is invariant up to an additive constant depending only on the choice of universal Turing machine. For this reason the set of random infinite sequences is independent of the choice of universal machine.)
Some of the results of algorithmic information theory, such as
Chaitin's incompleteness theorem
In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produ ...
, appear to challenge common mathematical and philosophical intuitions. Most notable among these is the construction of
Chaitin's constant
In the computer science subfield of algorithmic information theory, a Chaitin constant (Chaitin omega number) or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will ...
Ω, a real number that expresses the probability that a self-delimiting universal Turing machine will
halt when its input is supplied by flips of a fair coin (sometimes thought of as the probability that a random computer program will eventually halt). Although Ω is easily defined, in any
consistent
In classical deductive logic, a consistent theory is one that does not lead to a logical contradiction. The lack of contradiction can be defined in either semantic or syntactic terms. The semantic definition states that a theory is consistent i ...
axiom
An axiom, postulate, or assumption is a statement that is taken to be true, to serve as a premise or starting point for further reasoning and arguments. The word comes from the Ancient Greek word (), meaning 'that which is thought worthy or f ...
atizable
theory
A theory is a rational type of abstract thinking about a phenomenon, or the results of such thinking. The process of contemplative and rational thinking is often associated with such processes as observational study or research. Theories may be s ...
one can only compute finitely many digits of Ω, so it is in some sense ''unknowable'', providing an absolute limit on knowledge that is reminiscent of
Gödel's incompleteness theorems. Although the digits of Ω cannot be determined, many properties of Ω are known; for example, it is an
algorithmically random sequence
Intuitively, an algorithmically random sequence (or random sequence) is a Sequence#Infinite sequences in theoretical computer science, sequence of binary digits that appears random to any algorithm running on a (prefix-free or not) universal Turi ...
and thus its binary digits are evenly distributed (in fact it is
normal Normal(s) or The Normal(s) may refer to:
Film and television
* ''Normal'' (2003 film), starring Jessica Lange and Tom Wilkinson
* ''Normal'' (2007 film), starring Carrie-Anne Moss, Kevin Zegers, Callum Keith Rennie, and Andrew Airlie
* ''Norma ...
).
History
Algorithmic information theory was founded by
Ray Solomonoff
Ray Solomonoff (July 25, 1926 – December 7, 2009) was the inventor of algorithmic probability, his General Theory of Inductive Inference (also known as Universal Inductive Inference),Samuel Rathmanner and Marcus Hutter. A philosophical treatise ...
,
[Vitanyi, P.]
Obituary: Ray Solomonoff, Founding Father of Algorithmic Information Theory"
/ref> who published the basic ideas on which the field is based as part of his invention of algorithmic probability In algorithmic information theory, algorithmic probability, also known as Solomonoff probability, is a mathematical method of assigning a prior probability to a given observation. It was invented by Ray Solomonoff in the 1960s.
It is used in induc ...
—a way to overcome serious problems associated with the application of Bayes' rule
In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule), named after Thomas Bayes, describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For examp ...
s in statistics. He first described his results at a Conference at Caltech
The California Institute of Technology (branded as Caltech or CIT)The university itself only spells its short form as "Caltech"; the institution considers other spellings such a"Cal Tech" and "CalTech" incorrect. The institute is also occasional ...
in 1960,[Paper from conference on "Cerebral Systems and Computers", California Institute of Technology, February 8–11, 1960, cited in "A Formal Theory of Inductive Inference, Part 1, 1964, p. 1] and in a report, February 1960, "A Preliminary Report on a General Theory of Inductive Inference."[Solomonoff, R.,]
A Preliminary Report on a General Theory of Inductive Inference
, Report V-131, Zator Co., Cambridge, Ma., (November Revision of February 4, 1960 report.) Algorithmic information theory was later developed independently by Andrey Kolmogorov
Andrey Nikolaevich Kolmogorov ( rus, Андре́й Никола́евич Колмого́ров, p=ɐnˈdrʲej nʲɪkɐˈlajɪvʲɪtɕ kəlmɐˈɡorəf, a=Ru-Andrey Nikolaevich Kolmogorov.ogg, 25 April 1903 – 20 October 1987) was a Sovi ...
, in 1965 and Gregory Chaitin
Gregory John Chaitin ( ; born 25 June 1947) is an Argentine-American mathematician and computer scientist. Beginning in the late 1960s, Chaitin made contributions to algorithmic information theory and metamathematics, in particular a computer-t ...
, around 1966.
There are several variants of Kolmogorov complexity or algorithmic information; the most widely used one is based on self-delimiting programs and is mainly due to Leonid Levin
Leonid Anatolievich Levin ( ; russian: Леони́д Анато́льевич Ле́вин; uk, Леоні́д Анато́лійович Ле́він; born November 2, 1948) is a Soviet-American mathematician and computer scientist.
He is kn ...
(1974). Per Martin-Löf
Per Erik Rutger Martin-Löf (; ; born 8 May 1942) is a Swedish logician, philosopher, and mathematical statistician. He is internationally renowned for his work on the foundations of probability, statistics, mathematical logic, and computer scie ...
also contributed significantly to the information theory of infinite sequences. An axiomatic approach to algorithmic information theory based on the Blum axioms In computational complexity theory the Blum axioms or Blum complexity axioms are axioms that specify desirable properties of complexity measures on the set of computable functions. The axioms were first defined by Manuel Blum in 1967.
Importantly ...
(Blum 1967) was introduced by Mark Burgin in a paper presented for publication by Andrey Kolmogorov
Andrey Nikolaevich Kolmogorov ( rus, Андре́й Никола́евич Колмого́ров, p=ɐnˈdrʲej nʲɪkɐˈlajɪvʲɪtɕ kəlmɐˈɡorəf, a=Ru-Andrey Nikolaevich Kolmogorov.ogg, 25 April 1903 – 20 October 1987) was a Sovi ...
(Burgin 1982). The axiomatic approach encompasses other approaches in the algorithmic information theory. It is possible to treat different measures of algorithmic information as particular cases of axiomatically defined measures of algorithmic information. Instead of proving similar theorems, such as the basic invariance theorem, for each particular measure, it is possible to easily deduce all such results from one corresponding theorem proved in the axiomatic setting. This is a general advantage of the axiomatic approach in mathematics. The axiomatic approach to algorithmic information theory was further developed in the book (Burgin 2005) and applied to software metrics (Burgin and Debnath, 2003; Debnath and Burgin, 2003).
Precise definitions
A binary string is said to be random if the Kolmogorov complexity
In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produ ...
of the string is at least the length of the string. A simple counting argument shows that some strings of any given length are random, and almost all strings are very close to being random. Since Kolmogorov complexity depends on a fixed choice of universal Turing machine (informally, a fixed "description language" in which the "descriptions" are given), the collection of random strings does depend on the choice of fixed universal machine. Nevertheless, the collection of random strings, as a whole, has similar properties regardless of the fixed machine, so one can (and often does) talk about the properties of random strings as a group without having to first specify a universal machine.
An infinite binary sequence is said to be random if, for some constant ''c'', for all ''n'', the Kolmogorov complexity
In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produ ...
of the initial segment of length ''n'' of the sequence is at least ''n'' − ''c''. It can be shown that almost every sequence (from the point of view of the standard measure
Measure may refer to:
* Measurement, the assignment of a number to a characteristic of an object or event
Law
* Ballot measure, proposed legislation in the United States
* Church of England Measure, legislation of the Church of England
* Mea ...
—"fair coin" or Lebesgue measure—on the space of infinite binary sequences) is random. Also, since it can be shown that the Kolmogorov complexity relative to two different universal machines differs by at most a constant, the collection of random infinite sequences does not depend on the choice of universal machine (in contrast to finite strings). This definition of randomness is usually called ''Martin-Löf'' randomness, after Per Martin-Löf
Per Erik Rutger Martin-Löf (; ; born 8 May 1942) is a Swedish logician, philosopher, and mathematical statistician. He is internationally renowned for his work on the foundations of probability, statistics, mathematical logic, and computer scie ...
, to distinguish it from other similar notions of randomness. It is also sometimes called ''1-randomness'' to distinguish it from other stronger notions of randomness (2-randomness, 3-randomness, etc.). In addition to Martin-Löf randomness concepts, there are also recursive randomness, Schnorr randomness, and Kurtz randomness etc. Yongge Wang Yongge Wang (born 1967) is a computer science professor at the University of North Carolina at Charlotte specialized in algorithmic complexity and cryptography. He is the inventor of IEEE P1363 cryptographic standards SRP5 and WANG-KE and has cont ...
showed that all of these randomness concepts are different.
(Related definitions can be made for alphabets other than the set .)
Specific sequence
Algorithmic information theory (AIT) is the information theory of individual objects, using computer science, and concerns itself with the relationship between computation, information, and randomness.
The information content or complexity of an object can be measured by the length of its shortest description. For instance the string
"0101010101010101010101010101010101010101010101010101010101010101"
has the short description "32 repetitions of '01'", while
"1100100001100001110111101110110011111010010000100101011110010110"
presumably has no simple description other than writing down the string itself.
More formally, the algorithmic complexity (AC) of a string ''x'' is defined as the length of the shortest program that computes or outputs ''x'', where the program is run on some fixed reference universal computer.
A closely related notion is the probability that a universal computer outputs some string ''x'' when fed with a program chosen at random. This algorithmic "Solomonoff" probability (AP) is key in addressing the old philosophical problem of induction in a formal way.
The major drawback of AC and AP are their incomputability. Time-bounded "Levin" complexity penalizes a slow program by adding the logarithm of its running time to its length. This leads to computable variants of AC and AP, and universal "Levin" search (US) solves all inversion problems in optimal time (apart from some unrealistically large multiplicative constant).
AC and AP also allow a formal and rigorous definition of randomness of individual strings to not depend on physical or philosophical intuitions about non-determinism or likelihood. Roughly, a string is algorithmic "Martin-Löf" random (AR) if it is incompressible in the sense that its algorithmic complexity is equal to its length.
AC, AP, and AR are the core sub-disciplines of AIT, but AIT spawns into many other areas. It serves as the foundation of the Minimum Description Length (MDL) principle, can simplify proofs in computational complexity theory
In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by ...
, has been used to define a universal similarity metric between objects, solves the Maxwell daemon problem, and many others.
See also
References
External links
Algorithmic Information Theory
at Scholarpedia
''Scholarpedia'' is an English-language wiki-based online encyclopedia with features commonly associated with open-access online academic journals, which aims to have quality content in science and medicine.
''Scholarpedia'' articles are written ...
Chaitin's account of the history of AIT
Further reading
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
{{DEFAULTSORT:Algorithmic Information Theory
Information theory
Randomized algorithms