Mean absolute error

In statistics, mean absolute error (MAE) is a measure of errors between paired observations expressing the same phenomenon. Examples of ''Y'' versus ''X'' include comparisons of predicted versus observed, subsequent time versus initial time, and one technique of measurement versus an alternative technique of measurement. MAE is calculated as the sum of absolute errors divided by the sample size:
: \mathrm{MAE} = \frac{\sum_{i=1}^n |y_i - x_i|}{n} = \frac{\sum_{i=1}^n |e_i|}{n}.
It is thus an arithmetic average of the absolute errors |e_i| = |y_i - x_i|, where y_i is the prediction and x_i the true value. Note that alternative formulations may include relative frequencies as weight factors. The mean absolute error uses the same scale as the data being measured. It is therefore a scale-dependent accuracy measure and cannot be used to make comparisons between series using different scales. The mean absolute error is a common measure of forecast error in time series analysis, where the term is sometimes confused with the more standard definition of mean absolute deviation. The same confusion exists more generally.
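As an illustration of the definition, the following minimal Python sketch computes MAE for a small set of paired values. The function name `mean_absolute_error` and the sample data are hypothetical and chosen only for this example.

```python
def mean_absolute_error(y, x):
    """Return the arithmetic mean of |y_i - x_i| over paired observations.

    y: predictions, x: true values (plain sequences of numbers, same length).
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    if len(y) == 0:
        raise ValueError("at least one pair is required")
    return sum(abs(yi - xi) for yi, xi in zip(y, x)) / len(y)


# Made-up data: absolute errors are 0.5, 0.5, 0.0 and 1.0.
predicted = [2.5, 0.0, 2.0, 8.0]
observed = [3.0, -0.5, 2.0, 7.0]

mae = mean_absolute_error(predicted, observed)
print(mae)  # 0.5
```

The result is expressed in the same units as the data, which is why values from series on different scales are not directly comparable.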


Quantity disagreement and allocation disagreement

It is possible to express MAE as the sum of two components: Quantity Disagreement and Allocation Disagreement. Quantity Disagreement is the absolute value of the Mean Error (ME), given by:
: \mathrm{ME} = \frac{\sum_{i=1}^n (y_i - x_i)}{n}.
Allocation Disagreement is MAE minus Quantity Disagreement. It is also possible to identify the types of difference by looking at an (x,y) plot. Quantity difference exists when the average of the X values does not equal the average of the Y values. Allocation difference exists if and only if points reside on both sides of the identity line.
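The decomposition can be sketched in a few lines of Python. The helper name `disagreement_components` and the data are assumptions made for this example; the arithmetic follows the definitions given above.

```python
def disagreement_components(y, x):
    """Return (MAE, quantity disagreement, allocation disagreement)."""
    n = len(y)
    mae = sum(abs(yi - xi) for yi, xi in zip(y, x)) / n
    mean_error = sum(yi - xi for yi, xi in zip(y, x)) / n
    quantity = abs(mean_error)   # |ME|
    allocation = mae - quantity  # remainder of MAE
    return mae, quantity, allocation


# Made-up data: errors y_i - x_i are 1, -1, -2 and 1,
# so points lie on both sides of the identity line.
y = [3.0, 5.0, 2.0, 8.0]
x = [2.0, 6.0, 4.0, 7.0]

mae, quantity, allocation = disagreement_components(y, x)
print(mae, quantity, allocation)  # 1.25 0.25 1.0
```

Here the means of ''X'' and ''Y'' differ by 0.25, which accounts for the quantity component; the remaining 1.0 is allocation disagreement.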


Related measures

The mean absolute error is one of a number of ways of comparing forecasts with their eventual outcomes. Well-established alternatives are the mean absolute scaled error (MASE) and the mean squared error. These all summarize performance in ways that disregard the direction of over- or under-prediction; a measure that does place emphasis on this is the mean signed difference. Where a prediction model is to be fitted using a selected performance measure, in the sense that the least squares approach is related to the mean squared error, the equivalent for mean absolute error is least absolute deviations. MAE is not identical to root-mean-square error (RMSE), although some researchers report and interpret it that way. MAE is conceptually simpler and also easier to interpret than RMSE: it is simply the average absolute vertical or horizontal distance between each point in a scatter plot and the Y=X line. In other words, MAE is the average absolute difference between X and Y. Furthermore, each error contributes to MAE in proportion to the absolute value of the error. This is in contrast to RMSE, which involves squaring the differences, so that a few large differences will increase the RMSE to a greater degree than the MAE. See the example above for an illustration of these differences.
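The different sensitivity of MAE and RMSE to large errors can be seen in a small Python sketch. The two error patterns below are made up for illustration: both have the same total absolute error, but one concentrates it in a single observation.

```python
import math


def mae(y, x):
    """Mean of the absolute differences."""
    return sum(abs(yi - xi) for yi, xi in zip(y, x)) / len(y)


def rmse(y, x):
    """Square root of the mean of the squared differences."""
    return math.sqrt(sum((yi - xi) ** 2 for yi, xi in zip(y, x)) / len(y))


truth = [0.0, 0.0, 0.0, 0.0]
even = [2.0, 2.0, 2.0, 2.0]    # four errors of size 2
skewed = [0.0, 0.0, 0.0, 8.0]  # one error of size 8

print(mae(even, truth), rmse(even, truth))      # 2.0 2.0
print(mae(skewed, truth), rmse(skewed, truth))  # 2.0 4.0
```

MAE is identical for the two patterns, while RMSE doubles when the error is concentrated in one point.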


Optimality property

The ''mean absolute error'' of a real variable ''c'' with respect to the random variable ''X'' is
: E(|X - c|).
Provided that the probability distribution of ''X'' is such that the above expectation exists, ''m'' is a median of ''X'' if and only if ''m'' is a minimizer of the mean absolute error with respect to ''X''. In particular, ''m'' is a sample median if and only if ''m'' minimizes the arithmetic mean of the absolute deviations. More generally, a median is defined as a minimum of
: E(|X - c| - |X|),
as discussed at Multivariate median (and specifically at Spatial median). This optimization-based definition of the median is useful in statistical data-analysis, for example, in ''k''-medians clustering.
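The sample version of this property can be checked numerically. The sketch below, with made-up data and a hypothetical helper `mean_abs_dev`, searches a coarse grid of candidate values of ''c'' and compares the best candidate with the sample median. It is a brute-force illustration, not a proof.

```python
import statistics


def mean_abs_dev(data, c):
    """Arithmetic mean of |v - c| over the sample."""
    return sum(abs(v - c) for v in data) / len(data)


data = [1.0, 2.0, 4.0, 7.0, 20.0]
m = statistics.median(data)

# Candidate values 0.0, 0.1, ..., 20.0 (grid spacing is an arbitrary choice).
candidates = [i / 10 for i in range(0, 201)]
best = min(candidates, key=lambda c: mean_abs_dev(data, c))

print(m, best)  # 4.0 4.0
print(mean_abs_dev(data, m) < mean_abs_dev(data, statistics.mean(data)))  # True
```

The sample mean (6.8) is pulled towards the large value 20 and gives a larger mean absolute deviation than the median does.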


Proof of optimality

Statement: The classifier minimising \mathbb{E}|y-\hat{y}| is \hat{f}(x)=\text{Median}(y | X=x).

Proof: The loss function for classification is
: \begin{align} L &= \mathbb{E}[|y-a| \mid X=x] \\ &= \int_{-\infty}^{\infty} |y-a| f_{Y|X}(y)\, dy \\ &= \int_{-\infty}^a (a-y) f_{Y|X}(y)\, dy + \int_a^{\infty} (y-a) f_{Y|X}(y)\, dy \end{align}
Differentiating with respect to ''a'' gives
: \frac{\partial}{\partial a} L = \int_{-\infty}^a f_{Y|X}(y)\, dy + \int_a^{\infty} -f_{Y|X}(y)\, dy = 0
This means
: \int_{-\infty}^a f_{Y|X}(y)\, dy = \int_a^{\infty} f_{Y|X}(y)\, dy
Hence
: F_{Y|X}(a) = 0.5
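The differentiation step in the proof can be spelled out with the Leibniz integral rule. The following is a sketch under the assumption that the conditional density, written f_{Y|X} here, is continuous at ''a''; the boundary terms vanish because the integrands are zero at y = a.

```latex
\begin{align}
\frac{\partial}{\partial a} \int_{-\infty}^{a} (a-y)\, f_{Y|X}(y)\, dy
  &= (a-a)\, f_{Y|X}(a) + \int_{-\infty}^{a} f_{Y|X}(y)\, dy
   = F_{Y|X}(a), \\
\frac{\partial}{\partial a} \int_{a}^{\infty} (y-a)\, f_{Y|X}(y)\, dy
  &= -(a-a)\, f_{Y|X}(a) - \int_{a}^{\infty} f_{Y|X}(y)\, dy
   = -\bigl(1 - F_{Y|X}(a)\bigr).
\end{align}
```

Setting the sum of the two derivatives to zero gives F_{Y|X}(a) = 1 - F_{Y|X}(a), that is, F_{Y|X}(a) = 0.5, so ''a'' is a conditional median.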


See also

* Least absolute deviations
* Mean absolute percentage error
* Mean percentage error
* Symmetric mean absolute percentage error

