In statistics, the fraction of variance unexplained (FVU) in the context of a regression task is the fraction of variance of the regressand (dependent variable) ''Y'' which cannot be explained, i.e., which is not correctly predicted, by the explanatory variables ''X''.


Formal definition

Suppose we are given a regression function ''f'' yielding for each y_i an estimate \widehat{y}_i = f(x_i), where x_i is the vector of the ''i''th observations on all the explanatory variables. We define the fraction of variance unexplained (FVU) as:

:\begin{align}
\text{FVU} & = \frac{\text{VAR}_\text{err}}{\text{VAR}_\text{tot}} = \frac{\text{SS}_\text{err}/N}{\text{SS}_\text{tot}/N} = \frac{\text{SS}_\text{err}}{\text{SS}_\text{tot}} \left( = 1 - \frac{\text{SS}_\text{reg}}{\text{SS}_\text{tot}}, \text{ in the case of a linear regression with an intercept} \right) \\[6pt]
& = 1 - R^2
\end{align}

where ''R''2 is the coefficient of determination and ''VAR''err and ''VAR''tot are the variance of the residuals and the sample variance of the dependent variable. ''SS''''err'' (the sum of squared prediction errors, equivalently the residual sum of squares), ''SS''''tot'' (the total sum of squares), and ''SS''''reg'' (the sum of squares of the regression, equivalently the explained sum of squares) are given by

:\begin{align}
\text{SS}_\text{err} & = \sum_{i=1}^N (y_i - \widehat{y}_i)^2 \\
\text{SS}_\text{tot} & = \sum_{i=1}^N (y_i - \bar{y})^2 \\
\text{SS}_\text{reg} & = \sum_{i=1}^N (\widehat{y}_i - \bar{y})^2, \text{ where} \\
\bar{y} & = \frac{1}{N} \sum_{i=1}^N y_i.
\end{align}

Alternatively, the fraction of variance unexplained can be defined as follows:

: \text{FVU} = \frac{\text{MSE}(f)}{\text{VAR}[Y]}

where MSE(''f'') is the mean squared error of the regression function ''ƒ''.
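
As an illustrative sketch (not part of the original article), the definitions above can be computed directly from a set of observations and model predictions. The data, the simple least-squares fit, and the variable names (x, y, y_hat) below are assumptions chosen only for demonstration.

```python
import numpy as np

# Hypothetical data: a single explanatory variable x and a noisy response y
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(scale=0.5, size=100)

# Fit a simple least-squares line and obtain predictions y_hat = f(x)
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

ss_err = np.sum((y - y_hat) ** 2)     # residual sum of squares, SS_err
ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares, SS_tot
fvu = ss_err / ss_tot                 # fraction of variance unexplained
r_squared = 1.0 - fvu                 # coefficient of determination

print(f"FVU = {fvu:.3f}, R^2 = {r_squared:.3f}")
```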


Explanation

It is useful to consider the second definition to understand FVU. When trying to predict ''Y'', the most naive regression function that we can think of is the constant function predicting the mean of ''Y'', i.e., f(x_i)=\bar{y}. It follows that the MSE of this function equals the variance of ''Y''; that is, ''SS''err = ''SS''tot, and ''SS''reg = 0. In this case, no variation in ''Y'' can be accounted for, and the FVU then has its maximum value of 1. More generally, the FVU will be 1 if the explanatory variables ''X'' tell us nothing about ''Y'' in the sense that the predicted values of ''Y'' do not covary with ''Y''. But as prediction gets better and the MSE can be reduced, the FVU goes down. In the case of perfect prediction where \hat{y}_i = y_i for all ''i'', the MSE is 0, ''SS''err = 0, ''SS''reg = ''SS''tot, and the FVU is 0.
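
As a small sketch of the two limiting cases described above (the constant mean predictor giving FVU = 1 and a perfect predictor giving FVU = 0); the helper function fvu and the sample data are illustrative assumptions, not part of the original article.

```python
import numpy as np

def fvu(y, y_hat):
    """Fraction of variance unexplained: SS_err / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_err = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return ss_err / ss_tot

y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Naive constant predictor: always predict the mean of y -> FVU = 1
print(fvu(y, np.full_like(y, y.mean())))   # 1.0

# Perfect predictor: y_hat equals y exactly -> FVU = 0
print(fvu(y, y))                           # 0.0
```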


See also

* Coefficient of determination
* Correlation
* Explained sum of squares
* Lack-of-fit sum of squares
* Linear regression
* Regression analysis
* Mean absolute scaled error

