In robust statistics, repeated median regression, also known as the repeated median estimator, is a robust linear regression algorithm. The estimator has a breakdown point of 50%. Although it is equivariant under scaling, or under linear transformations of either its explanatory variable or its response variable, it is not equivariant under affine transformations that combine both variables (Peter J. Rousseeuw, Nathan S. Netanyahu, and David M. Mount, "New Statistical and Computational Results on the Repeated Median Regression Estimator", in ''New Directions in Statistical Data Analysis and Robustness'', edited by Stephan Morgenthaler, Elvezio Ronchetti, and Werner A. Stahel, Birkhäuser Verlag, Basel, 1993, pp. 177–194). It can be calculated in O(n^2) time by brute force, in O(n \log^2 n) time using more sophisticated techniques, or in O(n \log n) randomized expected time. It may also be calculated using an on-line algorithm with O(n) update time.


Method

The repeated median method estimates the slope of the regression line y = A + Bx for a set of points (X_i, Y_i) as
:\widehat B = \underset{i}{\operatorname{median}} \ \underset{j \ne i}{\operatorname{median}} \ \operatorname{slope}(i, j)
where \operatorname{slope}(i, j) is defined as (Y_j - Y_i) / (X_j - X_i). The estimated Y-axis intercept is defined as
:\widehat A = \underset{i}{\operatorname{median}} \ \underset{j \ne i}{\operatorname{median}} \ \operatorname{intercept}(i, j)
where \operatorname{intercept}(i, j) is defined as (X_j Y_i - X_i Y_j) / (X_j - X_i).
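
The two nested medians can be illustrated with a short brute-force sketch in Python. It follows the O(n^2) approach only, not the faster algorithms mentioned earlier. The function name `repeated_median` is hypothetical, and the sketch assumes that all X values are distinct so that no denominator is zero; how ties in X should be handled is not specified here.

```python
from statistics import median


def repeated_median(xs, ys):
    """Brute-force O(n^2) repeated median fit of y = A + B*x.

    Returns (A_hat, B_hat). Assumption of this sketch: at least two
    points, and all x values are distinct.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    if len(set(xs)) != n:
        raise ValueError("this sketch assumes distinct x values")

    slope_medians = []
    intercept_medians = []
    for i in range(n):
        slopes = []
        intercepts = []
        for j in range(n):
            if j == i:
                continue
            dx = xs[j] - xs[i]
            # slope(i, j) and intercept(i, j) of the line through points i and j
            slopes.append((ys[j] - ys[i]) / dx)
            intercepts.append((xs[j] * ys[i] - xs[i] * ys[j]) / dx)
        # Inner median: over all j != i, for this fixed i
        slope_medians.append(median(slopes))
        intercept_medians.append(median(intercepts))

    # Outer median: over all i
    return median(intercept_medians), median(slope_medians)


# Five points on y = 1 + 2x plus two gross outliers (the last two y values).
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 9, 100, -50]
a_hat, b_hat = repeated_median(xs, ys)
print(a_hat, b_hat)  # 1.0 2.0
```

In this example two of the seven points are corrupted, yet the fit recovers the underlying line, which is consistent with the high breakdown point of the estimator.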


See also

* Theil–Sen estimator

