The average absolute deviation (AAD) of a data set is the average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. In the general form, the central point can be a mean, median, mode, or the result of any other measure of central tendency or any reference value related to the given data set.
AAD includes the mean absolute deviation and the median absolute deviation (both abbreviated as MAD).
Measures of dispersion
Several measures of statistical dispersion are defined in terms of the absolute deviation.
The term "average absolute deviation" does not uniquely identify a measure of statistical dispersion, as there are several measures that can be used to measure absolute deviations, and there are several measures of central tendency that can be used as well. Thus, to uniquely identify the absolute deviation it is necessary to specify both the measure of deviation and the measure of central tendency. The statistical literature has not yet adopted a standard notation: both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by their initials "MAD", which may lead to confusion, since in general the two can have considerably different values.
Mean absolute deviation around a central point
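The mean absolute deviation of a set $\{x_1, x_2, \ldots, x_n\}$ is
$$\frac{1}{n} \sum_{i=1}^{n} |x_i - m(X)|.$$
The choice of measure of central tendency, $m(X)$, has a marked effect on the value of the mean deviation. For example, for the data set {2, 2, 3, 4, 14}, the mean absolute deviation is 3.6 around the arithmetic mean (5), 2.8 around the median (3), and 3.0 around the mode (2).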
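The dependence on the central point can be checked numerically. The following is a minimal sketch, assuming the example data set is {2, 2, 3, 4, 14} (an assumption consistent with the median of 3 and the outlier 14 mentioned later in the article); the helper name `mean_abs_deviation` is illustrative only.

```python
from statistics import mean, median, mode

def mean_abs_deviation(data, center):
    """Mean of |x - center| over the data set."""
    return sum(abs(x - center) for x in data) / len(data)

# Assumed example data set (consistent with median 3 and outlier 14).
data = [2, 2, 3, 4, 14]

# Evaluate the same formula around three different central points.
for name, m in [("mean", mean(data)), ("median", median(data)), ("mode", mode(data))]:
    print(f"{name:>6} = {m}: mean absolute deviation = {mean_abs_deviation(data, m)}")
```

The three results differ even though the data and the formula are the same, which is why the central point has to be stated alongside the value.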
Mean absolute deviation around the mean
The mean absolute deviation (MAD), also referred to as the "mean deviation" or sometimes "average absolute deviation", is the mean of the data's absolute deviations around the data's mean: the average (absolute) distance from the mean. "Average absolute deviation" can refer to either this usage, or to the general form with respect to a specified central point (see above).
MAD has been proposed as a replacement for the standard deviation on the grounds that it corresponds better to real life. Because the MAD is a simpler measure of variability than the standard deviation, it can be useful in school teaching.
As a measure of forecast accuracy, MAD is very closely related to the mean squared error (MSE), which is just the average squared error of the forecasts. Although the two measures are closely related, MAD is more commonly used because it is both easier to compute (avoiding the need for squaring) and easier to understand.
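To make the comparison concrete, here is a small sketch that computes both measures for the same forecast errors. The `actual` and `forecast` values are hypothetical and chosen only for illustration.

```python
# Hypothetical actual values and forecasts, for illustration only.
actual = [100, 110, 120, 130]
forecast = [98, 113, 119, 136]

# Forecast errors: [2, -3, 1, -6]
errors = [a - f for a, f in zip(actual, forecast)]

# Mean absolute deviation of the errors: average of |error|.
mad = sum(abs(e) for e in errors) / len(errors)

# Mean squared error: average of error squared.
mse = sum(e * e for e in errors) / len(errors)

print(mad, mse)
```

In this hypothetical data the largest error (6) accounts for half of the MAD but for 36 of the 50 units of total squared error, so the MSE weights large errors more heavily.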
For the normal distribution, the ratio of mean absolute deviation from the mean to standard deviation is $\sqrt{2/\pi} \approx 0.798$. Thus if ''X'' is a normally distributed random variable with expected value 0 then, see Geary (1935):
$$w = \frac{\operatorname{E}|X|}{\sqrt{\operatorname{E}(X^2)}} = \sqrt{\frac{2}{\pi}}.$$
In other words, for a normal distribution, mean absolute deviation is about 0.8 times the standard deviation.
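The ratio can be checked by simulation. This is a rough sketch, not a proof: the seed, the sample size, and the value of `sigma` are arbitrary choices made for the illustration, and the result only approximates the theoretical value $\sqrt{2/\pi}$.

```python
import math
import random

random.seed(0)      # fixed seed so the run is repeatable
sigma = 2.0         # assumed standard deviation for the illustration
n = 200_000         # number of simulated draws

# Draws from a normal distribution with expected value 0.
xs = [random.gauss(0.0, sigma) for _ in range(n)]

# Mean absolute deviation and root mean square, both taken around the true mean 0.
mean_abs = sum(abs(x) for x in xs) / n
std = math.sqrt(sum(x * x for x in xs) / n)

ratio = mean_abs / std
print(ratio, math.sqrt(2 / math.pi))
```

The simulated ratio does not depend on `sigma`, since scaling the data scales the numerator and the denominator by the same factor.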
However, in-sample measurements deliver values of the ratio of mean absolute deviation to standard deviation for a given Gaussian sample of size ''n'' with the following bounds: $w_n \in [0, 1]$, with a bias for small ''n''.
[See also Geary's 1936 and 1947 papers: Geary, R. C. (1936). Moments of the ratio of the mean deviation to the standard deviation for normal samples. Biometrika, 28(3/4), 295–307; and Geary, R. C. (1947). Testing for normality. Biometrika, 34(3/4), 209–242.]
The mean absolute deviation from the mean is less than or equal to the standard deviation; one way of proving this relies on Jensen's inequality.
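One way to write out the Jensen argument is sketched below. It applies Jensen's inequality to the convex function $\varphi(y) = y^2$ and the random variable $Y = |X - \mu|$, where $\mu$ and $\sigma$ denote the mean and standard deviation of ''X'' (notation introduced here for the sketch).

```latex
\left( \operatorname{E}|X - \mu| \right)^2
  \le \operatorname{E}\left( |X - \mu|^2 \right)
  = \operatorname{Var}(X)
  = \sigma^2
\quad\Longrightarrow\quad
\operatorname{E}|X - \mu| \le \sigma .
```

Taking square roots is valid because both sides are non-negative.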
Mean absolute deviation around the median
The median is the point about which the mean deviation is minimized. The MAD median offers a direct measure of the scale of a random variable around its median:
$$D_\text{med} = \operatorname{E}|X - \text{median}|.$$
This is the maximum likelihood estimator of the scale parameter $b$ of the Laplace distribution.
Since the median minimizes the average absolute distance, we have $D_\text{med} \le D_\text{mean}$.
The mean absolute deviation from the median is less than or equal to the mean absolute deviation from the mean. In fact, the mean absolute deviation from the median is always less than or equal to the mean absolute deviation from any other fixed number.
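By using the general dispersion function, Habib (2011) defined MAD about median as
$$D_\text{med} = \operatorname{E}|X - \text{median}| = 2 \operatorname{Cov}(X, I_O),$$
where the indicator function is
$$I_O := \begin{cases} 1 & \text{if } x > \text{median}, \\ 0 & \text{otherwise}. \end{cases}$$
This representation allows for obtaining MAD median correlation coefficients.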
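The covariance representation can be illustrated on a small sample. The sketch below assumes the representation reads $D_\text{med} = 2\operatorname{Cov}(X, I_O)$ with $I_O = 1$ when $x > \text{median}$, and uses a hypothetical sample with an even number of distinct values, so that exactly half of the observations lie above the median. The covariance is the population form (divided by ''n''), and exact fractions avoid rounding.

```python
from fractions import Fraction
from statistics import median

# Hypothetical sample with an even number of distinct values.
xs = [Fraction(v) for v in (1, 2, 4, 7, 11, 20)]
n = len(xs)
med = median(xs)  # midpoint of the two middle values

# Mean absolute deviation around the median.
d_med = sum(abs(x - med) for x in xs) / n

# Indicator of lying above the median, and its population covariance with x.
indicator = [1 if x > med else 0 for x in xs]
mean_x = sum(xs) / n
mean_i = Fraction(sum(indicator), n)
cov = sum((x - mean_x) * (i - mean_i) for x, i in zip(xs, indicator)) / n

print(d_med, 2 * cov)
```

Both printed values coincide for this sample; with ties at the median or an odd sample size, the sample version of the identity need not hold exactly.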
Median absolute deviation around a central point
While in principle the mean or any other central point could be taken as the central point for the median absolute deviation, most often the median value is taken instead.
Median absolute deviation around the median
The median absolute deviation (also MAD) is the ''median'' of the absolute deviation from the ''median''. It is a robust estimator of dispersion.
For the example {2, 2, 3, 4, 14}: 3 is the median, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}) with a median of 1, in this case unaffected by the value of the outlier 14, so the median absolute deviation is 1.
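The calculation, and its insensitivity to the outlier, can be sketched as follows. The data set {2, 2, 3, 4, 14} is assumed here (it is consistent with the median of 3, the outlier 14, and the result of 1 stated in the text), and `median_abs_deviation` is an illustrative helper name.

```python
from statistics import median

def median_abs_deviation(data):
    """Median of the absolute deviations from the data's median."""
    med = median(data)
    return median([abs(x - med) for x in data])

data = [2, 2, 3, 4, 14]            # assumed example data set
inflated = [2, 2, 3, 4, 1400]      # same data with a far more extreme outlier

print(median_abs_deviation(data))
print(median_abs_deviation(inflated))
```

Both calls give the same value, because moving the largest observation further out changes neither the median nor the middle of the sorted absolute deviations.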
For a symmetric distribution, the median absolute deviation is equal to half the interquartile range.
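For a concrete symmetric case, the relationship can be checked with the standard normal distribution. This is a sketch using Python's `statistics.NormalDist`; the choice of the standard normal is an assumption made only for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)

# Quartiles and interquartile range.
q1, q3 = dist.inv_cdf(0.25), dist.inv_cdf(0.75)
iqr = q3 - q1

# Candidate for the median absolute deviation: distance from the median to the upper quartile.
mad = q3 - dist.median

# For this to be the median of |X - median|, the interval median +/- mad must hold probability 1/2.
prob_within = dist.cdf(dist.median + mad) - dist.cdf(dist.median - mad)

print(mad, iqr / 2, prob_within)
```

Because the quartiles sit at equal distances on either side of the median, half of the probability lies within `mad` of the median, and `mad` equals `iqr / 2`.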
Maximum absolute deviation
The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point. While not strictly a measure of central tendency, the maximum absolute deviation can be found using the formula for the average absolute deviation as above with $m(X) = \max(X)$, where $\max(X)$ is the sample maximum.
Minimization
The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as ''minimizing'' dispersion. The median is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows:
* ''L''2 norm statistics: the mean minimizes the mean squared error,
* ''L''1 norm statistics: the median minimizes ''average'' absolute deviation,
* ''L''∞ norm statistics: the mid-range minimizes the ''maximum'' absolute deviation,
* trimmed ''L''∞ norm statistics: for example, the midhinge (average of first and third quartiles), which minimizes the ''median'' absolute deviation of the whole distribution, also minimizes the ''maximum'' absolute deviation of the distribution after the top and bottom 25% have been trimmed off.
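The first three characterizations in the list above can be illustrated with a brute-force search over candidate central points. This is only a sketch: the data set {2, 2, 3, 4, 14} and the search grid are assumptions chosen for the illustration, and a grid search is a check, not a proof.

```python
data = [2, 2, 3, 4, 14]                          # assumed example data set
candidates = [i / 100 for i in range(0, 1601)]   # central points c from 0.00 to 16.00

def mean_squared(c):
    """L2 criterion: mean squared deviation from c."""
    return sum((x - c) ** 2 for x in data) / len(data)

def mean_abs(c):
    """L1 criterion: mean absolute deviation from c."""
    return sum(abs(x - c) for x in data) / len(data)

def max_abs(c):
    """L-infinity criterion: maximum absolute deviation from c."""
    return max(abs(x - c) for x in data)

best_l2 = min(candidates, key=mean_squared)   # expected near the mean, 5
best_l1 = min(candidates, key=mean_abs)       # expected near the median, 3
best_linf = min(candidates, key=max_abs)      # expected near the mid-range, (2 + 14) / 2 = 8

print(best_l2, best_l1, best_linf)
```

For data sets with an even number of points, the L1 criterion is minimized by every point between the two middle values, so a grid search may return any of them.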
Estimation
The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population.
In order for the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations must equal the population absolute deviation. However, it does not. For the population 1, 2, 3, both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. Averaged over all samples of size 3 that can be drawn from the population (with replacement), the sample absolute deviation about the mean is 44/81, while the sample absolute deviation about the median is 4/9. Both averages are smaller than 2/3. Therefore, the absolute deviation is a biased estimator.
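These figures can be reproduced by enumerating every sample. The sketch below assumes ordered samples of size 3 drawn with replacement (27 in total), which is the reading that yields the stated values; exact fractions are used to avoid rounding, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product
from statistics import median

population = [1, 2, 3]

def abs_dev(values, center):
    """Mean absolute deviation of values around center, as an exact fraction."""
    return sum(abs(Fraction(v) - center) for v in values) / len(values)

def mean_of(values):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(values), len(values))

# Population values: both equal 2/3.
pop_about_mean = abs_dev(population, mean_of(population))
pop_about_median = abs_dev(population, median(population))

# All 27 ordered samples of size 3, drawn with replacement.
samples = list(product(population, repeat=3))

# Average of the sample statistic over all samples.
avg_about_mean = sum(abs_dev(s, mean_of(s)) for s in samples) / len(samples)
avg_about_median = sum(abs_dev(s, median(s)) for s in samples) / len(samples)

print(pop_about_mean, pop_about_median)   # 2/3 2/3
print(avg_about_mean, avg_about_median)   # 44/81 4/9
```

Both averages fall below the population value of 2/3, which is the mean-bias described in the text.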
However, this argument is based on the notion of mean-unbiasedness. Each measure of location has its own form of unbiasedness (see entry on biased estimator). The relevant form of unbiasedness here is median unbiasedness.
See also
* Deviation (statistics)
** Median absolute deviation
** Squared deviations
** Least absolute deviations
* Errors
** Mean absolute error
** Mean absolute percentage error
** Probable error
* Mean absolute difference
* Average rectified value