Statistical learning theory is a framework for
machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
drawing from the fields of
statistics
Statistics (from German: '' Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, indust ...
and
functional analysis
Functional analysis is a branch of mathematical analysis, the core of which is formed by the study of vector spaces endowed with some kind of limit-related structure (e.g. inner product, norm, topology, etc.) and the linear functions defi ...
. Statistical learning theory deals with the
statistical inference
Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical analysis infers properti ...
problem of finding a predictive function based on data. Statistical learning theory has led to successful applications in fields such as
computer vision
Computer vision is an interdisciplinary scientific field that deals with how computers can gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to understand and automate tasks that the human ...
,
speech recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the ...
, and
bioinformatics
Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
.
Introduction
The goals of learning are understanding and prediction. Learning falls into many categories, including
supervised learning,
unsupervised learning,
online learning, and
reinforcement learning. From the perspective of statistical learning theory, supervised learning is best understood. Supervised learning involves learning from a
training set
In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from ...
of data. Every point in the training is an input-output pair, where the input maps to an output. The learning problem consists of inferring the function that maps between the input and the output, such that the learned function can be used to predict the output from future input.
Depending on the type of output, supervised learning problems are either problems of
regression or problems of
classification Classification is a process related to categorization, the process in which ideas and objects are recognized, differentiated and understood.
Classification is the grouping of related facts into classes.
It may also refer to:
Business, organizat ...
. If the output takes a continuous range of values, it is a regression problem. Using
Ohm's Law
Ohm's law states that the current through a conductor between two points is directly proportional to the voltage across the two points. Introducing the constant of proportionality, the resistance, one arrives at the usual mathematical equa ...
as an example, a regression could be performed with voltage as input and current as an output. The regression would find the functional relationship between voltage and current to be , such that
:
Classification problems are those for which the output will be an element from a discrete set of labels. Classification is very common for machine learning applications. In
facial recognition, for instance, a picture of a person's face would be the input, and the output label would be that person's name. The input would be represented by a large multidimensional vector whose elements represent pixels in the picture.
After learning a function based on the training set data, that function is validated on a test set of data, data that did not appear in the training set.
Formal description
Take
to be the
vector space
In mathematics and physics, a vector space (also called a linear space) is a set whose elements, often called '' vectors'', may be added together and multiplied ("scaled") by numbers called ''scalars''. Scalars are often real numbers, but can ...
of all possible inputs, and
to be the vector space of all possible outputs. Statistical learning theory takes the perspective that there is some unknown
probability distribution
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon ...
over the product space
, i.e. there exists some unknown
. The training set is made up of
samples from this probability distribution, and is notated
:
Every
is an input vector from the training data, and
is the output that corresponds to it.
In this formalism, the inference problem consists of finding a function
such that
. Let
be a space of functions
called the hypothesis space. The hypothesis space is the space of functions the algorithm will search through. Let
be the
loss function
In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cos ...
, a metric for the difference between the predicted value
and the actual value
. The
expected risk is defined to be
:
The target function, the best possible function
that can be chosen, is given by the
that satisfies
: