algorithmic probability
   HOME

TheInfoList



OR:

In
algorithmic information theory Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects (as opposed to stochastically generated), such as str ...
, algorithmic probability, also known as Solomonoff probability, is a mathematical method of assigning a prior
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
to a given observation. It was invented by Ray Solomonoff in the 1960s. It is used in inductive inference theory and analyses of algorithms. In his general theory of inductive inference, Solomonoff uses the method together with
Bayes' rule In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule), named after Thomas Bayes, describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For exa ...
to obtain probabilities of prediction for an algorithm's future outputs. In the mathematical formalism used, the observations have the form of finite binary strings viewed as outputs of
Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer alg ...
s, and the universal prior is a probability distribution over the set of finite binary strings calculated from a probability distribution over programs (that is, inputs to a universal Turing machine). The prior is universal in the Turing-computability sense, i.e. no string has zero probability. It is not computable, but it can be approximated.


Overview

Algorithmic probability is the main ingredient of
Solomonoff's theory of inductive inference Solomonoff's theory of inductive inference is a mathematical proof that if a universe is generated by an algorithm, then observations of that universe, encoded as a dataset, are best predicted by the smallest executable archive of that dataset. T ...
, the theory of prediction based on observations; it was invented with the goal of using it for machine learning; given a sequence of symbols, which one will come next? Solomonoff's theory provides an answer that is optimal in a certain sense, although it is incomputable. Unlike, for example,
Karl Popper Sir Karl Raimund Popper (28 July 1902 – 17 September 1994) was an Austrian-British philosopher, academic and social commentator. One of the 20th century's most influential philosophers of science, Popper is known for his rejection of the ...
's informal inductive inference theory, Solomonoff's is mathematically rigorous. Four principal inspirations for Solomonoff's algorithmic probability were:
Occam's razor Occam's razor, Ockham's razor, or Ocham's razor ( la, novacula Occami), also known as the principle of parsimony or the law of parsimony ( la, lex parsimoniae), is the problem-solving principle that "entities should not be multiplied beyond neces ...
, Epicurus' principle of multiple explanations, modern computing theory (e.g. use of a universal Turing machine) and Bayes’ rule for prediction. Occam's razor and Epicurus' principle are essentially two different non-mathematical approximations of the
universal prior Universal is the adjective for universe. Universal may also refer to: Companies * NBCUniversal, a media and entertainment company ** Universal Animation Studios, an American Animation studio, and a subsidiary of NBCUniversal ** Universal TV, ...
. * Occam's razor: ''among the theories that are consistent with the observed phenomena, one should select the simplest theory''. * Epicurus' principle of multiple explanations: ''if more than one theory is consistent with the observations, keep all such theories''. At the heart of the universal prior is an abstract model of a computer, such as a universal Turing machine. Any abstract computer will do, as long as it is Turing-complete, i.e. every computable function has at least one program that will compute its application on the abstract computer. The abstract computer is used to give precise meaning to the phrase "simple explanation". In the formalism used, explanations, or theories of phenomena, are computer programs that generate observation strings when run on the abstract computer. Each computer program is assigned a weight corresponding to its length. The universal probability distribution is the probability distribution on all possible output strings with random input, assigning for each finite output prefix ''q'' the sum of the probabilities of the programs that compute something starting with ''q''. Thus, a simple explanation is a short computer program. A complex explanation is a long computer program. Simple explanations are more likely, so a high-probability observation string is one generated by a short computer program, or perhaps by any of a large number of slightly longer computer programs. A low-probability observation string is one that can only be generated by a long computer program. Algorithmic probability is closely related to the concept of
Kolmogorov complexity In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produ ...
. Kolmogorov's introduction of complexity was motivated by information theory and problems in randomness, while Solomonoff introduced algorithmic complexity for a different reason: inductive reasoning. A single universal prior probability that can be substituted for each actual prior probability in Bayes’s rule was invented by Solomonoff with Kolmogorov complexity as a side product. It predicts the most likely continuation of that observation, and provides a measure of how likely this continuation will be. Solomonoff's enumerable measure is
universal Universal is the adjective for universe. Universal may also refer to: Companies * NBCUniversal, a media and entertainment company ** Universal Animation Studios, an American Animation studio, and a subsidiary of NBCUniversal ** Universal TV, a t ...
in a certain powerful sense, but the computation time can be infinite. One way of dealing with this issue is a variant of Leonid Levin's Search Algorithm, which limits the time spent computing the success of possible programs, with shorter programs given more time. When run for longer and longer periods of time, it will generate a sequence of approximations which converge to the universal probability distribution. Other methods of dealing with the issue include limiting the search space by including training sequences. Solomonoff proved this distribution to be machine-invariant within a constant factor (called the invariance theorem).


History

Solomonoff invented the concept of algorithmic probability with its associated invariance theorem around 1960, publishing a report on it: "A Preliminary Report on a General Theory of Inductive Inference." He clarified these ideas more fully in 1964 with "A Formal Theory of Inductive Inference," Part I and Part II.Solomonoff, R.,
A Formal Theory of Inductive Inference, Part II
''Information and Control'', Vol 7, No. 2 pp 224–254, June 1964.


Examples

These ideas can be made specific.


Key people

* Ray Solomonoff *
Andrey Kolmogorov Andrey Nikolaevich Kolmogorov ( rus, Андре́й Никола́евич Колмого́ров, p=ɐnˈdrʲej nʲɪkɐˈlajɪvʲɪtɕ kəlmɐˈɡorəf, a=Ru-Andrey Nikolaevich Kolmogorov.ogg, 25 April 1903 – 20 October 1987) was a Sovi ...
* Leonid Levin


See also

*
Solomonoff's theory of inductive inference Solomonoff's theory of inductive inference is a mathematical proof that if a universe is generated by an algorithm, then observations of that universe, encoded as a dataset, are best predicted by the smallest executable archive of that dataset. T ...
*
Algorithmic information theory Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects (as opposed to stochastically generated), such as str ...
*
Bayesian inference Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics, and ...
* Inductive inference *
Inductive probability Inductive probability attempts to give the probability of future events based on past events. It is the basis for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a source of knowledge about ...
*
Kolmogorov complexity In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produ ...
* Universal Turing machine * Information-based complexity


References


Sources

* Li, M. and Vitanyi, P., ''An Introduction to Kolmogorov Complexity and Its Applications'', 3rd Edition, Springer Science and Business Media, N.Y., 2008


Further reading

* Rathmanner, S and Hutter, M.,
A Philosophical Treatise of Universal Induction
in Entropy 2011, 13, 1076-1136: A very clear philosophical and mathematical analysis of Solomonoff's Theory of Inductive Inference


External links


Algorithmic Probability
at
Scholarpedia ''Scholarpedia'' is an English-language wiki-based online encyclopedia with features commonly associated with open-access online academic journals, which aims to have quality content in science and medicine. ''Scholarpedia'' articles are writ ...

Solomonoff's publications
{{DEFAULTSORT:Algorithmic Probability Algorithmic information theory Probability interpretations Artificial intelligence