HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
, computational learning theory (or just learning theory) is a subfield of
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machine A machine is a physical system using Power (physics), power to apply Force, forces and control Motion, moveme ...
devoted to studying the design and analysis of
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
algorithms.


Overview

Theoretical results in machine learning mainly deal with a type of inductive learning called
supervised learning Supervised learning (SL) is a machine learning paradigm for problems where the available data consists of labelled examples, meaning that each data point contains features (covariates) and an associated label. The goal of supervised learning alg ...
. In supervised learning, an algorithm is given samples that are labeled in some useful way. For example, the samples might be descriptions of mushrooms, and the labels could be whether or not the mushrooms are edible. The algorithm takes these previously labeled samples and uses them to induce a classifier. This classifier is a function that assigns labels to samples, including samples that have not been seen previously by the algorithm. The goal of the supervised learning algorithm is to optimize some measure of performance such as minimizing the number of mistakes made on new samples. In addition to performance bounds, computational learning theory studies the time complexity and feasibility of learning. In computational learning theory, a computation is considered feasible if it can be done in
polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
. There are two kinds of time complexity results: * Positive resultsShowing that a certain class of functions is learnable in polynomial time. * Negative resultsShowing that certain classes cannot be learned in polynomial time. Negative results often rely on commonly believed, but yet unproven assumptions, such as: * Computational complexity – P ≠ NP (the P versus NP problem); *
Cryptographic Cryptography, or cryptology (from grc, , translit=kryptós "hidden, secret"; and ''graphein'', "to write", or ''-logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of adver ...
One-way functions exist. There are several different approaches to computational learning theory based on making different assumptions about the
inference Inferences are steps in reasoning, moving from premises to logical consequences; etymologically, the word '' infer'' means to "carry forward". Inference is theoretically traditionally divided into deduction and induction, a distinction that ...
principles used to generalise from limited data. This includes different definitions of
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an Event (probability theory), event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and ...
(see
frequency probability Frequentist probability or frequentism is an interpretation of probability; it defines an event's probability as the limit of its relative frequency in many trials (the long-run probability). Probabilities can be found (in principle) by a re ...
,
Bayesian probability Bayesian probability is an interpretation of the concept of probability, in which, instead of frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification o ...
) and different assumptions on the generation of samples. The different approaches include: * Exact learning, proposed by Dana Angluin; *
Probably approximately correct learning In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant.L. Valiant. A theory of the learnable.' Communications of the ...
(PAC learning), proposed by
Leslie Valiant Leslie Gabriel Valiant (born 28 March 1949) is a British American computer scientist and computational theorist. He was born to a chemical engineer father and a translator mother. He is currently the T. Jefferson Coolidge Professor of Compu ...
; * VC theory, proposed by Vladimir Vapnik and Alexey Chervonenkis; * Bayesian inference; * Algorithmic learning theory, from the work of
E. Mark Gold E. Mark Gold (often written "E Mark Gold" without a dot, born 1936 in Los Angeles) is an American physicist, mathematician, and computer scientist. He became well known for his article ''Language identification in the limit'' which pioneered a fo ...
; * Online machine learning, from the work of Nick Littlestone. While its primary goal is to understand learning abstractly, computational learning theory has led to the development of practical algorithms. For example, PAC theory inspired boosting, VC theory led to
support vector machine In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories ...
s, and Bayesian inference led to belief networks.


See also

*
Grammar induction Grammar induction (or grammatical inference) is the process in machine learning of learning a formal grammar (usually as a collection of ''re-write rules'' or '' productions'' or alternatively as a finite state machine or automaton of some kind) fr ...
*
Information theory Information theory is the scientific study of the quantification, storage, and communication of information. The field was originally established by the works of Harry Nyquist and Ralph Hartley, in the 1920s, and Claude Shannon in the 1940s. ...
* Stability (learning theory) * Error Tolerance (PAC learning)


References


Surveys

* Angluin, D. 1992. Computational learning theory: Survey and selected bibliography. In Proceedings of the Twenty-Fourth Annual ACM Symposium on Theory of Computing (May 1992), pages 351–369. http://portal.acm.org/citation.cfm?id=129712.129746 * D. Haussler. Probably approximately correct learning. In AAAI-90 Proceedings of the Eight National Conference on Artificial Intelligence, Boston, MA, pages 1101–1108. American Association for Artificial Intelligence, 1990. http://citeseer.ist.psu.edu/haussler90probably.html


VC dimension

* V. Vapnik and A. Chervonenkis
On the uniform convergence of relative frequencies of events to their probabilities
Theory of Probability and Its Applications, 16(2):264–280, 1971.


Feature selection

* A. Dhagat and L. Hellerstein, "PAC learning with irrelevant attributes", in 'Proceedings of the IEEE Symp. on Foundation of Computer Science', 1994. http://citeseer.ist.psu.edu/dhagat94pac.html


Inductive inference

*


Optimal O notation learning

* Oded Goldreich, Dana Ron.
On universal learning algorithms
'. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.2224


Negative results

* M. Kearns and
Leslie Valiant Leslie Gabriel Valiant (born 28 March 1949) is a British American computer scientist and computational theorist. He was born to a chemical engineer father and a translator mother. He is currently the T. Jefferson Coolidge Professor of Compu ...
. 1989. Cryptographic limitations on learning boolean formulae and finite automata. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pages 433–444, New York. ACM. http://citeseer.ist.psu.edu/kearns89cryptographic.html


Boosting (machine learning) In machine learning, boosting is an ensemble meta-algorithm for primarily reducing bias, and also variance in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones. Boosting is based on the q ...

* Robert E. Schapire. The strength of weak learnability. Machine Learning, 5(2):197–227, 1990 http://citeseer.ist.psu.edu/schapire90strength.html


Occam learning

* Blumer, A.; Ehrenfeucht, A.; Haussler, D.; Warmuth, M. K.br>Occam's razor
Inf.Proc.Lett. 24, 377–380, 1987. * Blumer, A.; Ehrenfeucht, A.; Haussler, D.; Warmuth, M. K
Learnability and the Vapnik-Chervonenkis dimension
Journal of the ACM, 36(4):929–865, 1989.


Probably approximately correct learning In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant.L. Valiant. A theory of the learnable.' Communications of the ...

* L. Valiant
A Theory of the Learnable
Communications of the ACM, 27(11):1134–1142, 1984.


Error tolerance

* Michael Kearns and Ming Li. Learning in the presence of malicious errors. SIAM Journal on Computing, 22(4):807–837, August 1993. http://citeseer.ist.psu.edu/kearns93learning.html * Kearns, M. (1993). Efficient noise-tolerant learning from statistical queries. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, pages 392–401. http://citeseer.ist.psu.edu/kearns93efficient.html


Equivalence

* D.Haussler, M.Kearns, N.Littlestone and M. Warmuth, Equivalence of models for polynomial learnability, Proc. 1st ACM Workshop on Computational Learning Theory, (1988) 42-55. * A description of some of these publications is given at important publications in machine learning.


Distribution learning theory


External links


Basics of Bayesian inference
{{Differentiable computing Theoretical computer science Computational fields of study de:Maschinelles Lernen