
In
statistics
Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
and
numerical analysis
Numerical analysis is the study of algorithms that use numerical approximation (as opposed to symbolic computation, symbolic manipulations) for the problems of mathematical analysis (as distinguished from discrete mathematics). It is the study of ...
, isotonic regression or
monotonic
In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of ord ...
regression is the technique of fitting a free-form line to a sequence of observations such that the fitted line is
non-decreasing (or non-increasing) everywhere, and lies as close to the observations as possible.
Applications
Isotonic regression has applications in
statistical inference
Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical analysis infers properties of ...
. For example, one might use it to fit an isotonic curve to the means of some set of experimental results when an increase in those means according to some particular ordering is expected. A benefit of isotonic regression is that it is not constrained by any functional form, such as the linearity imposed by
linear regression
In statistics, linear regression is a statistical model, model that estimates the relationship between a Scalar (mathematics), scalar response (dependent variable) and one or more explanatory variables (regressor or independent variable). A mode ...
, as long as the function is monotonic increasing.
Another application is nonmetric
multidimensional scaling
Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a data set. MDS is used to translate distances between each pair of n objects in a set into a configuration of n points mapped into an ...
, where a low-dimensional
embedding
In mathematics, an embedding (or imbedding) is one instance of some mathematical structure contained within another instance, such as a group (mathematics), group that is a subgroup.
When some object X is said to be embedded in another object Y ...
for data points is sought such that order of distances between points in the embedding matches
order of dissimilarity between points. Isotonic regression is used iteratively to fit ideal distances to preserve relative dissimilarity order.
Isotonic regression is also used in
probabilistic classification
In machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation sh ...
to calibrate the predicted probabilities of
supervised machine learning
In machine learning, supervised learning (SL) is a paradigm where a model is trained using input objects (e.g. a vector of predictor variables) and desired output values (also known as a ''supervisory signal''), which are often human-made labels. ...
models.
Isotonic regression for the simply ordered case with univariate
has been applied to estimating continuous dose-response relationships in fields such as anesthesiology and toxicology. Narrowly speaking, isotonic regression only provides point estimates at observed values of
Estimation of the complete dose-response curve without any additional assumptions is usually done via linear interpolation between the point estimates.
Software for computing isotone (monotonic) regression has been developed for
R,
Stata
Stata (, , alternatively , occasionally stylized as STATA) is a general-purpose Statistics, statistical software package developed by StataCorp for data manipulation, visualization, statistics, and automated reporting. It is used by researchers ...
, and
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (prog ...
.
Problem statement and algorithms
Let
be a given set of observations, where the
and the
fall in some
partially ordered set
In mathematics, especially order theory, a partial order on a Set (mathematics), set is an arrangement such that, for certain pairs of elements, one precedes the other. The word ''partial'' is used to indicate that not every pair of elements need ...
. For generality, each observation
may be given a weight
, although commonly
for all
.
Isotonic regression seeks a weighted
least-squares
The method of least squares is a mathematical optimization technique that aims to determine the best fit function by minimizing the sum of the squares of the differences between the observed values and the predicted values of the model. The me ...
fit
for all
, subject to the constraint that
whenever
. This gives the following
quadratic program
Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize (minimize or maximize) a multivariate quadratic function subject to linear constrai ...
(QP) in the variables
:
:
subject to
where
specifies the partial ordering of the observed inputs
(and may be regarded as the set of edges of some
directed acyclic graph
In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ...
(dag) with vertices
). Problems of this form may be solved by generic quadratic programming techniques.
In the usual setting where the
values fall in a
totally ordered set
In mathematics, a total order or linear order is a partial order in which any two elements are comparable. That is, a total order is a binary relation \leq on some set X, which satisfies the following for all a, b and c in X:
# a \leq a ( ref ...
such as
, we may assume
WLOG
''Without loss of generality'' (often abbreviated to WOLOG, WLOG or w.l.o.g.; less commonly stated as ''without any loss of generality'' or ''with no loss of generality'') is a frequently used expression in mathematics. The term is used to indicat ...
that the observations have been sorted so that
, and take
. In this case, a simple
iterative algorithm
In computational mathematics, an iterative method is a mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the ''i''-th approximation (called an "iterate") ...
for solving the quadratic program is the
pool adjacent violators algorithm
Pool may refer to:
Bodies of water
* Swimming pool, usually an artificial structure containing a large body of water intended for swimming
* Reflecting pool, a shallow pool designed to reflect a structure and its surroundings
* Tide pool, a rocky ...
. Conversely, Best and Chakravarti
studied the problem as an
active set
In mathematical optimization, the active-set method is an algorithm used to identify the active constraints in a set of inequality constraints. The active constraints are then expressed as equality constraints, thereby transforming an inequalit ...
identification problem, and proposed a primal algorithm. These two algorithms can be seen as each other's dual, and both have a
computational complexity
In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time (generally measured by the number of needed elementary operations ...
of
on already sorted data.
To complete the isotonic regression task, we may then choose any non-decreasing function
such that
for all i. Any such function obviously solves
:
subject to
being nondecreasing
and can be used to predict the
values for new values of
. A common choice when
would be to interpolate linearly between the points
, as illustrated in the figure, yielding a continuous piecewise linear function:
:
Centered isotonic regression
As this article's first figure shows, in the presence of monotonicity violations the resulting interpolated curve will have flat (constant) intervals. In dose-response applications it is usually known that
is not only monotone but also
smooth. The flat intervals are incompatible with
's assumed shape, and can be shown to be biased. A simple improvement for such applications, named centered isotonic regression (CIR), was developed by Oron and Flournoy and shown to substantially reduce estimation error for both dose-response and dose-finding applications.
Both CIR and the standard isotonic regression for the univariate, simply ordered case, are implemented in the R package "cir".
[ This package also provides analytical confidence-interval estimates.
]
References
Further reading
*
*
*
*
{{DEFAULTSORT:Isotonic Regression
Nonparametric regression
Nonparametric Bayesian statistics
Numerical analysis