HOME

TheInfoList



OR:

Algorithmic learning theory is a mathematical framework for analyzing
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
problems and algorithms. Synonyms include formal learning theory and algorithmic inductive inference. Algorithmic learning theory is different from
statistical learning theory Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. Statistical learning theory deals with the statistical inference problem of finding a predictive function based on dat ...
in that it does not make use of statistical assumptions and analysis. Both algorithmic and statistical learning theory are concerned with machine learning and can thus be viewed as branches of
computational learning theory In computer science, computational learning theory (or just learning theory) is a subfield of artificial intelligence devoted to studying the design and analysis of machine learning algorithms. Overview Theoretical results in machine learning m ...
.


Distinguishing characteristics

Unlike statistical learning theory and most statistical theory in general, algorithmic learning theory does not assume that data are random samples, that is, that data points are independent of each other. This makes the theory suitable for domains where observations are (relatively) noise-free but not random, such as language learning and automated scientific discovery. The fundamental concept of algorithmic learning theory is learning in the limit: as the number of data points increases, a learning algorithm should converge to a correct hypothesis on ''every'' possible data sequence consistent with the problem space. This is a non-probabilistic version of statistical consistency, which also requires convergence to a correct model in the limit, but allows a learner to fail on data sequences with probability measure 0 . Algorithmic learning theory investigates the learning power of
Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algori ...
s. Other frameworks consider a much more restricted class of learning algorithms than Turing machines, for example, learners that compute hypotheses more quickly, for instance in
polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
. An example of such a framework is
probably approximately correct learning In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant.L. Valiant. A theory of the learnable.' Communications of the A ...
.


Learning in the limit

The concept was introduced in E. Mark Gold's seminal paper "
Language identification in the limit Language identification in the limit is a formal model for inductive inference of formal languages, mainly by computers (see machine learning and induction of regular languages). It was introduced by E. Mark Gold in a technical report and a journal ...
". The objective of
language identification In natural language processing, language identification or language guessing is the problem of determining which natural language given content is in. Computational approaches to this problem view it as a special case of text categorization, solv ...
is for a machine running one program to be capable of developing another program by which any given sentence can be tested to determine whether it is "grammatical" or "ungrammatical". The language being learned need not be
English English usually refers to: * English language * English people English may also refer to: Peoples, culture, and language * ''English'', an adjective for something of, from, or related to England ** English national ide ...
or any other
natural language In neuropsychology, linguistics, and philosophy of language, a natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation. Natural languages ...
- in fact the definition of "grammatical" can be absolutely anything known to the tester. In Gold's learning model, the tester gives the learner an example sentence at each step, and the learner responds with a
hypothesis A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon. For a hypothesis to be a scientific hypothesis, the scientific method requires that one can test it. Scientists generally base scientific hypotheses on previous obse ...
, which is a suggested
program Program, programme, programmer, or programming may refer to: Business and management * Program management, the process of managing several related projects * Time management * Program, a part of planning Arts and entertainment Audio * Progra ...
to determine grammatical correctness. It is required of the tester that every possible sentence (grammatical or not) appears in the list eventually, but no particular order is required. It is required of the learner that at each step the hypothesis must be correct for all the sentences so far. A particular learner is said to be able to "learn a language in the limit" if there is a certain number of steps beyond which its hypothesis no longer changes. At this point it has indeed learned the language, because every possible sentence appears somewhere in the sequence of inputs (past or future), and the hypothesis is correct for all inputs (past or future), so the hypothesis is correct for every sentence. The learner is not required to be able to tell when it has reached a correct hypothesis, all that is required is that it be true. Gold showed that any language which is defined by a
Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algori ...
program can be learned in the limit by another
Turing-complete In computability theory, a system of data-manipulation rules (such as a computer's instruction set, a programming language, or a cellular automaton) is said to be Turing-complete or computationally universal if it can be used to simulate any Tur ...
machine using
enumeration An enumeration is a complete, ordered listing of all the items in a collection. The term is commonly used in mathematics and computer science to refer to a listing of all of the elements of a set. The precise requirements for an enumeration (fo ...
. This is done by the learner testing all possible Turing machine programs in turn until one is found which is correct so far - this forms the hypothesis for the current step. Eventually, the correct program will be reached, after which the hypothesis will never change again (but note that the learner does not know that it won't need to change). Gold also showed that if the learner is given only positive examples (that is, only grammatical sentences appear in the input, not ungrammatical sentences), then the language can only be guaranteed to be learned in the limit if there are only a
finite Finite is the opposite of infinite. It may refer to: * Finite number (disambiguation) * Finite set, a set whose cardinality (number of elements) is some natural number * Finite verb, a verb form that has a subject, usually being inflected or marked ...
number of possible sentences in the language (this is possible if, for example, sentences are known to be of limited length). Language identification in the limit is a highly abstract model. It does not allow for limits of runtime or
computer memory In computing, memory is a device or system that is used to store information for immediate use in a computer or related computer hardware and digital electronic devices. The term ''memory'' is often synonymous with the term ''primary storage ...
which can occur in practice, and the enumeration method may fail if there are errors in the input. However the framework is very powerful, because if these strict conditions are maintained, it allows the learning of any program known to be computable. This is because a Turing machine program can be written to mimic any program in any conventional
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
. See Church-Turing thesis.


Other identification criteria

Learning theorists have investigated other learning criteria, such as the following. * ''Efficiency'': minimizing the number of data points required before convergence to a correct hypothesis. * ''Mind Changes'': minimizing the number of hypothesis changes that occur before convergence. Mind change bounds are closely related to mistake bounds that are studied in
statistical learning theory Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. Statistical learning theory deals with the statistical inference problem of finding a predictive function based on dat ...
. Kevin Kelly has suggested that minimizing mind changes is closely related to choosing maximally simple hypotheses in the sense of
Occam’s Razor Occam's razor, Ockham's razor, or Ocham's razor ( la, novacula Occami), also known as the principle of parsimony or the law of parsimony ( la, lex parsimoniae), is the problem-solving principle that "entities should not be multiplied beyond neces ...
.


Annual conference

Since 1990, there is an ''International Conference on Algorithmic Learning Theory (ALT)'', called ''Workshop'' in its first years (1990–1997). Between 1992 and 2016, proceedings were published in the
LNCS ''Lecture Notes in Computer Science'' is a series of computer science books published by Springer Science+Business Media since 1973. Overview The series contains proceedings, post-proceedings, monographs, and Festschrifts. In addition, tutorials, ...
series. Starting from 2017, they are published by the Proceedings of Machine Learning Research. The 34th conference will be held in
Singapore Singapore (), officially the Republic of Singapore, is a sovereign island country and city-state in maritime Southeast Asia. It lies about one degree of latitude () north of the equator, off the southern tip of the Malay Peninsula, borde ...
in Feb 2023.ALT'23 home page
/ref> The topics of the conference cover all of theoretical machine learning, including statistical and computational learning theory, online learning, active learning, reinforcement learning, and deep learning.


See also

*
Formal epistemology Formal epistemology uses formal methods from decision theory, logic, probability theory and computability theory to model and reason about issues of epistemological interest. Work in this area spans several academic fields, including philosophy, ...
*
Sample exclusion dimension In computational learning theory, sample exclusion dimensions arise in the study of exact concept learning with queries. In algorithmic learning theory, a ''concept'' over a domain ''X'' is a Boolean function In mathematics, a Boolean function ...


References


External links


Learning Theory in Computer Science.

The Stanford Encyclopaedia of Philosophy
provides a highly accessible introduction to key concepts in algorithmic learning theory, especially as they apply to the philosophical problems of inductive inference. {{DEFAULTSORT:Algorithmic Learning Theory Computational learning theory Learning theory (education) Formal languages