Quantile regression is a type of regression analysis used in statistics and econometrics. Whereas the method of least squares estimates the conditional ''mean'' of the response variable across values of the predictor variables, quantile regression estimates the conditional ''median'' (or other ''quantiles'') of the response variable. [There is also a method for predicting the conditional geometric mean of the response variable; see Tofallis (2015), "A Better Measure of Relative Prediction Accuracy for Model Selection and Model Estimation", ''Journal of the Operational Research Society'', 66(8):1352-1362.] Quantile regression is an extension of linear regression used when the conditions of linear regression are not met.
Advantages and applications
One advantage of quantile regression relative to ordinary least squares regression is that the quantile regression estimates are more robust against outliers in the response measurements. However, the main attraction of quantile regression goes beyond this: it is advantageous whenever conditional quantile functions are of interest. Different measures of central tendency and statistical dispersion can be used to more comprehensively analyze the relationship between variables.
In ecology, quantile regression has been proposed and used as a way to discover more useful predictive relationships between variables in cases where there is no relationship or only a weak relationship between the means of such variables. The need for and success of quantile regression in ecology has been attributed to the complexity of interactions between different factors, leading to data with unequal variation of one variable for different ranges of another variable.
Another application of quantile regression is in growth charts, where percentile curves are commonly used to screen for abnormal growth.
History
The idea of estimating a median regression slope, a major theorem about minimizing the sum of absolute deviations, and a geometrical algorithm for constructing median regression were proposed in 1760 by Ruđer Josip Bošković, a Jesuit Catholic priest from Dubrovnik. He was interested in the ellipticity of the earth, building on Isaac Newton's suggestion that its rotation could cause it to bulge at the equator with a corresponding flattening at the poles. He finally produced the first geometric procedure for determining the equator of a rotating planet from three observations of a surface feature. More importantly for quantile regression, he developed the first evidence of the least absolute criterion, preceding the least squares method introduced by Legendre in 1805 by nearly fifty years.
Other thinkers began building upon Bošković's idea, among them Pierre-Simon Laplace, who developed the so-called "méthode de situation". This led to Francis Edgeworth's plural median, a geometric approach to median regression that is recognized as the precursor of the simplex method. The works of Bošković, Laplace, and Edgeworth were recognized as a prelude to Roger Koenker's contributions to quantile regression.
Median regression computations for larger data sets are quite tedious compared to the least squares method, which is why the approach remained unpopular among statisticians until the widespread adoption of computers in the latter part of the 20th century.
Background: quantiles
Quantile regression expresses the conditional quantiles of a dependent variable as a linear function of the explanatory variables. Crucial to the practicality of quantile regression is that the quantiles can be expressed as the solution of a minimization problem, as we will show in this section before discussing conditional quantiles in the next section.
Quantile of a random variable
Let $Y$ be a real-valued random variable with cumulative distribution function $F_Y(y) = P(Y \leq y)$. The $\tau$th quantile of ''Y'' is given by
: $q_Y(\tau) = F_Y^{-1}(\tau) = \inf\{y : F_Y(y) \geq \tau\}$
where $\tau \in (0, 1)$.
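As a minimal sketch of this definition, the snippet below evaluates the generalized inverse of the distribution function, that is, the smallest support point whose cumulative probability reaches the requested level, for a discrete distribution. The function name `quantile_from_cdf`, the tolerance argument, and the nine-point uniform example are illustrative assumptions, not part of the original text.

```python
from bisect import bisect_left
from itertools import accumulate


def quantile_from_cdf(values, probs, tau, tol=1e-12):
    """Return the smallest support point y with F(y) >= tau (discrete case)."""
    pairs = sorted(zip(values, probs))            # order the support
    support = [v for v, _ in pairs]
    cdf = list(accumulate(p for _, p in pairs))   # F evaluated at each support point
    # First index whose cumulative probability reaches tau; the small
    # tolerance guards against floating-point rounding in the running sum.
    idx = bisect_left(cdf, tau - tol)
    return support[min(idx, len(support) - 1)]


# Hypothetical example: Y uniform on {1, ..., 9}.
values = list(range(1, 10))
probs = [1 / 9] * 9
median = quantile_from_cdf(values, probs, 0.5)
lower_quartile = quantile_from_cdf(values, probs, 0.25)
print(median, lower_quartile)
```

Taking the first index at which the cumulative sum reaches the level is what implements the infimum in the definition.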
Define the loss function as $\rho_\tau(m) = m(\tau - \mathbb{I}_{(m<0)})$, where $\mathbb{I}$ is an indicator function.
A specific quantile can be found by minimizing the expected loss of $Y-u$ with respect to $u$:
: $q_Y(\tau) = \underset{u}{\operatorname{arg\,min}}\, E(\rho_\tau(Y-u)) = \underset{u}{\operatorname{arg\,min}} \left\{ (\tau-1)\int_{-\infty}^{u}(y-u)\,dF_Y(y) + \tau\int_{u}^{\infty}(y-u)\,dF_Y(y) \right\}.$
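The minimization can be illustrated numerically. The sketch below assumes a discrete distribution, for which the expected loss is piecewise linear with kinks at the support points, so it is enough to compare the loss at those points. The helper names are hypothetical, and ties are broken toward the smallest minimizer.

```python
def pinball_loss(m, tau):
    """Tilted absolute value: m * (tau - 1[m < 0])."""
    return m * (tau - (1.0 if m < 0 else 0.0))


def expected_loss(u, values, probs, tau):
    """Expected pinball loss of Y - u for a discrete Y."""
    return sum(p * pinball_loss(y - u, tau) for y, p in zip(values, probs))


def argmin_expected_loss(values, probs, tau):
    """Smallest support point that minimizes the expected loss."""
    # Rounding makes numerically equal losses compare equal, so the
    # second key component (u itself) picks the smallest minimizer.
    return min(
        sorted(values),
        key=lambda u: (round(expected_loss(u, values, probs, tau), 12), u),
    )


values = list(range(1, 10))       # hypothetical example: uniform on {1, ..., 9}
probs = [1 / 9] * 9
q_median = argmin_expected_loss(values, probs, 0.5)
q_quarter = argmin_expected_loss(values, probs, 0.25)
print(q_median, q_quarter)
```

Restricting the search to the support is a convenience of the discrete case; for a continuous distribution one would minimize over the real line instead.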
This can be shown by computing the derivative of the expected loss with respect to $u$ via an application of the Leibniz integral rule, setting it to 0, and letting $q_\tau$ be the solution of
: $0 = (1-\tau)\int_{-\infty}^{q_\tau} dF_Y(y) - \tau\int_{q_\tau}^{\infty} dF_Y(y).$
This equation reduces to
: $0 = F_Y(q_\tau) - \tau,$
and then to
: $F_Y(q_\tau) = \tau.$
If the solution $q_\tau$ is not unique, then we have to take the smallest such solution to obtain the $\tau$th quantile of the random variable ''Y''.
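The differentiation step can be written out term by term. This is a sketch under the added assumption that ''Y'' has a density $f_Y$, so that $dF_Y(y) = f_Y(y)\,dy$; the boundary terms produced by the Leibniz rule then vanish because the integrand $(y-u)$ is zero at $y = u$.

```latex
\begin{align*}
\frac{d}{du}\left[(\tau-1)\int_{-\infty}^{u}(y-u)\,f_Y(y)\,dy\right]
  &= (\tau-1)\left[(u-u)\,f_Y(u) - \int_{-\infty}^{u} f_Y(y)\,dy\right]
   = (1-\tau)\,F_Y(u), \\
\frac{d}{du}\left[\tau\int_{u}^{\infty}(y-u)\,f_Y(y)\,dy\right]
  &= \tau\left[-(u-u)\,f_Y(u) - \int_{u}^{\infty} f_Y(y)\,dy\right]
   = -\tau\,\bigl(1 - F_Y(u)\bigr), \\
\frac{d}{du}\,E\bigl(\rho_\tau(Y-u)\bigr)
  &= (1-\tau)\,F_Y(u) - \tau\,\bigl(1 - F_Y(u)\bigr)
   = F_Y(u) - \tau .
\end{align*}
```

Setting the last expression to zero gives the first-order condition $F_Y(q_\tau) = \tau$.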
Example
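Let $Y$ be a discrete random variable that takes the values $y_i = i$ for $i = 1, 2, \dots, 9$ with equal probabilities. The task is to find the median of ''Y'', and hence the value $\tau = 0.5$ is chosen. Then the expected loss $L(u)$ is
: $L(u) = E(\rho_\tau(Y-u)) = \frac{(\tau-1)}{9}\sum_{y_i<u}(y_i-u) + \frac{\tau}{9}\sum_{y_i\geq u}(y_i-u) = \frac{0.5}{9}\left(-\sum_{y_i<u}(y_i-u) + \sum_{y_i\geq u}(y_i-u)\right).$
Since $0.5/9$ is a constant, it can be taken out of the expected loss function (this is only true if $\tau = 0.5$). Then, at ''u'' = 3,
: $L(3) \propto \sum_{i=1}^{2} -(i-3) + \sum_{i=3}^{9}(i-3) = [(2+1) + (0+1+2+\dots+6)] = 24.$
Suppose that ''u'' is increased by 1 unit. Then the expected loss changes by $(3) - (6) = -3$ on changing ''u'' to 4: the three values below 4 each move one unit farther from ''u'', while the six values at or above 4 each move one unit closer. If ''u'' = 5, the expected loss is
: $L(5) \propto \sum_{i=1}^{4} i + \sum_{i=0}^{4} i = 20,$
and any change in ''u'' will increase the expected loss. Thus ''u'' = 5 is the median. The table below shows the expected loss (divided by $0.5/9$) for different values of ''u''.
: ''u'': 1, 2, 3, 4, 5, 6, 7, 8, 9
: Expected loss: 36, 29, 24, 21, 20, 21, 24, 29, 36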
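The figures in this example can be reproduced with a few lines of code. The sketch below assumes the setup described above (nine equally likely values 1 to 9 and the median level 0.5) and reports the expected loss divided by the constant 0.5/9; the variable names are illustrative.

```python
def pinball_loss(m, tau):
    """Tilted absolute value: m * (tau - 1[m < 0])."""
    return m * (tau - (1 if m < 0 else 0))


values = list(range(1, 10))   # support of Y, each value with probability 1/9
tau = 0.5

# Expected loss is (1/9) * sum of pinball losses; dividing by tau/9
# leaves sum(pinball) / tau, which equals sum(|y - u|) when tau = 0.5.
table = {u: sum(pinball_loss(y - u, tau) for y in values) / tau for u in values}
best_u = min(table, key=table.get)

for u, loss in table.items():
    print(u, loss)
print("minimizer:", best_u)
```

The scaled loss is symmetric around the minimizer because the distribution itself is symmetric around 5.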
Intuition
Consider $\tau = 0.5$ and let ''q'' be an initial guess for $q_\tau$. The expected loss evaluated at ''q'' is
: $L(q) = -0.5\int_{-\infty}^{q}(y-q)\,dF_Y(y) + 0.5\int_{q}^{\infty}(y-q)\,dF_Y(y).$
In order to minimize the expected loss, we move the value of ''q'' a little bit to see whether the expected loss will rise or fall.
Suppose we increase ''q'' by one small unit. Then the change in expected loss per unit, up to the constant factor 0.5, is
: $\int_{-\infty}^{q} 1\,dF_Y(y) - \int_{q}^{\infty} 1\,dF_Y(y).$
The first term of the equation is $F_Y(q)$ and the second term is $1 - F_Y(q)$. Therefore, the change in the expected loss is negative if and only if $F_Y(q) < 0.5$, that is, if and only if ''q'' is smaller than the median. Similarly, if we reduce ''q'' by one unit, the change in the expected loss is negative if and only if ''q'' is larger than the median.
In order to minimize the expected loss function, we would increase (decrease) ''q'' if ''q'' is smaller (larger) than the median, until ''q'' reaches the median. The idea behind the minimization is to count the number of points (weighted with the density) that are larger or smaller than ''q'' and then move ''q'' to a point where ''q'' is larger than $100\tau$% of the points.
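This move-and-check idea can be turned into a simple search. The sketch below is one illustrative way to do it, not a standard algorithm from the text: it moves ''q'' up while the distribution function at ''q'' is below the target level, moves it down while it is above, and halves the step each time the direction reverses. The function names and the choice of the standard logistic distribution as a test case are assumptions.

```python
import math


def find_quantile(cdf, tau, q=0.0, step=1.0, tol=1e-9):
    """Search for q with cdf(q) = tau by stepping against the loss gradient."""
    direction = 0
    while step > tol:
        gradient = cdf(q) - tau          # derivative of the expected loss at q
        if gradient == 0:
            break                        # q already satisfies F(q) = tau
        new_direction = -1 if gradient > 0 else 1
        if direction and new_direction != direction:
            step /= 2                    # overshot the quantile: refine the step
        direction = new_direction
        q += direction * step
    return q


def logistic_cdf(y):
    """Standard logistic distribution function (assumed test distribution)."""
    return 1.0 / (1.0 + math.exp(-y))


q90 = find_quantile(logistic_cdf, 0.9)
print(q90, math.log(9))   # closed form for the logistic: log(tau / (1 - tau))
```

Only the sign of the gradient is used, which mirrors the counting argument: the search needs to know only whether more or less than the target share of probability lies below ''q''.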
Sample quantile
The $\tau$ sample quantile of a sample $y_1, \dots, y_n$ can be obtained by using an importance sampling estimate and solving the following minimization problem
: $\hat{q}_\tau = \underset{q \in \mathbb{R}}{\operatorname{arg\,min}} \sum_{i=1}^{n} \rho_\tau(y_i - q),$
: $\quad = \underset{q \in \mathbb{R}}{\operatorname{arg\,min}} \left[ (\tau - 1)\sum_{y_i < q}(y_i - q) + \tau\sum_{y_i \geq q}(y_i - q) \right].$
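A minimal sketch of the sample version follows. It relies on the fact that the summed loss is piecewise linear in ''q'' with kinks at the observations, so a minimizer can be found among the data points. The function name `sample_quantile` and the data values are made up for illustration.

```python
def pinball_loss(m, tau):
    """Tilted absolute value: m * (tau - 1[m < 0])."""
    return m * (tau - (1.0 if m < 0 else 0.0))


def sample_quantile(sample, tau):
    """Smallest observation minimizing the summed pinball loss."""
    def total_loss(q):
        return sum(pinball_loss(y - q, tau) for y in sample)

    # Candidates are the distinct observations; rounding lets numerically
    # equal losses tie, and the second key component picks the smallest q.
    return min(sorted(set(sample)), key=lambda q: (round(total_loss(q), 12), q))


data = [2.1, 0.4, 3.3, 1.7, 5.0, 2.8, 0.9]   # hypothetical sample
q50 = sample_quantile(data, 0.5)
q25 = sample_quantile(data, 0.25)
print(q50, q25)
```

Evaluating the loss at every observation costs quadratic time in the sample size; sorting the data and picking the appropriate order statistic gives the same answer faster, but the brute-force form stays closer to the minimization problem as stated.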