HOME

TheInfoList



OR:

In
statistics Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
and
numerical analysis Numerical analysis is the study of algorithms that use numerical approximation (as opposed to symbolic computation, symbolic manipulations) for the problems of mathematical analysis (as distinguished from discrete mathematics). It is the study of ...
, isotonic regression or
monotonic In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of ord ...
regression is the technique of fitting a free-form line to a sequence of observations such that the fitted line is non-decreasing (or non-increasing) everywhere, and lies as close to the observations as possible.


Applications

Isotonic regression has applications in
statistical inference Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical analysis infers properties of ...
. For example, one might use it to fit an isotonic curve to the means of some set of experimental results when an increase in those means according to some particular ordering is expected. A benefit of isotonic regression is that it is not constrained by any functional form, such as the linearity imposed by
linear regression In statistics, linear regression is a statistical model, model that estimates the relationship between a Scalar (mathematics), scalar response (dependent variable) and one or more explanatory variables (regressor or independent variable). A mode ...
, as long as the function is monotonic increasing. Another application is nonmetric
multidimensional scaling Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a data set. MDS is used to translate distances between each pair of n objects in a set into a configuration of n points mapped into an ...
, where a low-dimensional
embedding In mathematics, an embedding (or imbedding) is one instance of some mathematical structure contained within another instance, such as a group (mathematics), group that is a subgroup. When some object X is said to be embedded in another object Y ...
for data points is sought such that order of distances between points in the embedding matches order of dissimilarity between points. Isotonic regression is used iteratively to fit ideal distances to preserve relative dissimilarity order. Isotonic regression is also used in
probabilistic classification In machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation sh ...
to calibrate the predicted probabilities of
supervised machine learning In machine learning, supervised learning (SL) is a paradigm where a model is trained using input objects (e.g. a vector of predictor variables) and desired output values (also known as a ''supervisory signal''), which are often human-made labels. ...
models. Isotonic regression for the simply ordered case with univariate x, y has been applied to estimating continuous dose-response relationships in fields such as anesthesiology and toxicology. Narrowly speaking, isotonic regression only provides point estimates at observed values of x. Estimation of the complete dose-response curve without any additional assumptions is usually done via linear interpolation between the point estimates. Software for computing isotone (monotonic) regression has been developed for R,
Stata Stata (, , alternatively , occasionally stylized as STATA) is a general-purpose Statistics, statistical software package developed by StataCorp for data manipulation, visualization, statistics, and automated reporting. It is used by researchers ...
, and
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (prog ...
.


Problem statement and algorithms

Let (x_1,y_1),\ldots,(x_n,y_n) be a given set of observations, where the y_i \in \mathbb and the x_i fall in some
partially ordered set In mathematics, especially order theory, a partial order on a Set (mathematics), set is an arrangement such that, for certain pairs of elements, one precedes the other. The word ''partial'' is used to indicate that not every pair of elements need ...
. For generality, each observation (x_i,y_i) may be given a weight w_i \ge 0, although commonly w_i=1 for all i. Isotonic regression seeks a weighted
least-squares The method of least squares is a mathematical optimization technique that aims to determine the best fit function by minimizing the sum of the squares of the differences between the observed values and the predicted values of the model. The me ...
fit \hat_i \approx y_i for all i, subject to the constraint that \hat_i \le \hat_j whenever x_i \le x_j. This gives the following
quadratic program Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize (minimize or maximize) a multivariate quadratic function subject to linear constrai ...
(QP) in the variables \hat_1, \ldots, \hat_n: :\min \sum_^n w_i (\hat_i - y_i)^2 subject to \hat_i \le \hat_j \text (i,j) \in E where E = \ specifies the partial ordering of the observed inputs x_i (and may be regarded as the set of edges of some
directed acyclic graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ...
(dag) with vertices 1,2,\ldots n). Problems of this form may be solved by generic quadratic programming techniques. In the usual setting where the x_i values fall in a
totally ordered set In mathematics, a total order or linear order is a partial order in which any two elements are comparable. That is, a total order is a binary relation \leq on some set X, which satisfies the following for all a, b and c in X: # a \leq a ( ref ...
such as \mathbb, we may assume
WLOG ''Without loss of generality'' (often abbreviated to WOLOG, WLOG or w.l.o.g.; less commonly stated as ''without any loss of generality'' or ''with no loss of generality'') is a frequently used expression in mathematics. The term is used to indicat ...
that the observations have been sorted so that x_1 \le x_2 \le \cdots \le x_n, and take E = \. In this case, a simple
iterative algorithm In computational mathematics, an iterative method is a mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the ''i''-th approximation (called an "iterate") ...
for solving the quadratic program is the
pool adjacent violators algorithm Pool may refer to: Bodies of water * Swimming pool, usually an artificial structure containing a large body of water intended for swimming * Reflecting pool, a shallow pool designed to reflect a structure and its surroundings * Tide pool, a rocky ...
. Conversely, Best and Chakravarti studied the problem as an
active set In mathematical optimization, the active-set method is an algorithm used to identify the active constraints in a set of inequality constraints. The active constraints are then expressed as equality constraints, thereby transforming an inequalit ...
identification problem, and proposed a primal algorithm. These two algorithms can be seen as each other's dual, and both have a
computational complexity In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time (generally measured by the number of needed elementary operations ...
of O(n) on already sorted data. To complete the isotonic regression task, we may then choose any non-decreasing function f(x) such that f(x_i) = \hat_i for all i. Any such function obviously solves : \min_f \sum_^n w_i (f(x_i) - y_i)^2 subject to f being nondecreasing and can be used to predict the y values for new values of x. A common choice when x_i \in \mathbb would be to interpolate linearly between the points (x_i,\hat_i), as illustrated in the figure, yielding a continuous piecewise linear function: :f(x) = \begin \hat_1 & \text x \le x_1 \\ \hat_i + \frac(\hat_-\hat_i) & \text x_i \le x \le x_ \\ \hat_n & \text x \ge x_n \end


Centered isotonic regression

As this article's first figure shows, in the presence of monotonicity violations the resulting interpolated curve will have flat (constant) intervals. In dose-response applications it is usually known that f(x) is not only monotone but also smooth. The flat intervals are incompatible with f(x)'s assumed shape, and can be shown to be biased. A simple improvement for such applications, named centered isotonic regression (CIR), was developed by Oron and Flournoy and shown to substantially reduce estimation error for both dose-response and dose-finding applications. Both CIR and the standard isotonic regression for the univariate, simply ordered case, are implemented in the R package "cir". This package also provides analytical confidence-interval estimates.


References


Further reading

* * * * {{DEFAULTSORT:Isotonic Regression Nonparametric regression Nonparametric Bayesian statistics Numerical analysis