In
statistics
Statistics (from German language, German: ''wikt:Statistik#German, Statistik'', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of ...
, the fraction of variance unexplained (FVU) in the context of a
regression task is the fraction of variance of the
regressand
Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or demand ...
(dependent variable) ''Y'' which cannot be explained, i.e., which is not correctly predicted, by the
explanatory variables ''X''.
Formal definition
Suppose we are given a regression function
yielding for each
an estimate
where
is the vector of the ''i''
th observations on all the explanatory variables.
We define the fraction of variance unexplained (FVU) as:
:
where ''R''
2 is the
coefficient of determination
In statistics, the coefficient of determination, denoted ''R''2 or ''r''2 and pronounced "R squared", is the proportion of the variation in the dependent variable that is predictable from the independent variable(s).
It is a statistic used i ...
and ''VAR''
err and ''VAR''
tot are the variance of the residuals and the sample variance of the dependent variable. ''SS''
''err'' (the sum of squared predictions errors, equivalently the
residual sum of squares
In statistics, the residual sum of squares (RSS), also known as the sum of squared estimate of errors (SSE), is the sum of the squares of residuals (deviations predicted from actual empirical values of data). It is a measure of the discrepanc ...
), ''SS''
''tot'' (the
total sum of squares), and ''SS''
''reg'' (the sum of squares of the regression, equivalently the
explained sum of squares) are given by
:
Alternatively, the fraction of variance unexplained can be defined as follows:
:
where MSE(''f'') is the
mean squared error of the regression function ''ƒ''.
Explanation
It is useful to consider the second definition to understand FVU. When trying to predict ''Y'', the most naïve regression function that we can think of is the constant function predicting the mean of ''Y'', i.e.,
. It follows that the MSE of this function equals the variance of ''Y''; that is, ''SS''
err = ''SS''
tot, and ''SS''
reg = 0. In this case, no variation in ''Y'' can be accounted for, and the FVU then has its maximum value of 1.
More generally, the FVU will be 1 if the explanatory variables ''X'' tell us nothing about ''Y'' in the sense that the predicted values of ''Y'' do not
covary
In probability theory and statistics, covariance is a measure of the joint variability of two random variables. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the les ...
with ''Y''. But as prediction gets better and the MSE can be reduced, the FVU goes down. In the case of perfect prediction where
for all ''i'', the MSE is 0, ''SS''
err = 0, ''SS''
reg = ''SS''
tot, and the FVU is 0.
See also
*
Coefficient of determination
In statistics, the coefficient of determination, denoted ''R''2 or ''r''2 and pronounced "R squared", is the proportion of the variation in the dependent variable that is predictable from the independent variable(s).
It is a statistic used i ...
*
Correlation
In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense, "correlation" may indicate any type of association, in statistics ...
*
Explained sum of squares
*
Lack-of-fit sum of squares
*
Linear regression
In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is call ...
*
Regression analysis
*
Mean absolute scaled error
References
{{DEFAULTSORT:Fraction Of Variance Unexplained
Parametric statistics
Statistical ratios
Least squares