
In statistics, the predicted residual error sum of squares (PRESS) is a form of cross-validation used in
regression analysis
to provide a summary measure of how well a model fits a sample of observations that were not themselves used to estimate the model. It is calculated as the sum of squares of the prediction residuals for those observations. Once a model has been fitted, each observation is removed in turn and the model is refitted using the remaining observations. The out-of-sample predicted value is calculated for the omitted observation in each case, and the PRESS statistic is calculated as the sum of the squares of all the resulting prediction errors:
: \operatorname{PRESS} = \sum_{i=1}^n (y_i - \hat{y}_{i,-i})^2
where y_i is the observed value of the i-th observation and \hat{y}_{i,-i} is the value predicted for it by the model fitted without that observation. Given this procedure, the PRESS statistic can be calculated for a number of candidate model structures for the same dataset, with the lowest values of PRESS indicating the best structures. Models that are over-parameterised (over-fitted) tend to give small residuals for observations included in the model fitting but large residuals for observations that are excluded. The PRESS statistic has been extensively used in
lazy learning
and locally linear learning to speed up the assessment and selection of the neighbourhood size.
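
The leave-one-out procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes an ordinary least-squares model fitted with NumPy, and the names `press_statistic` and `polynomial_design`, the synthetic data, and the choice of polynomial candidate models are all hypothetical, introduced only for the example.

```python
import numpy as np


def press_statistic(X, y):
    """PRESS for an ordinary least-squares fit of y on the columns of X.

    Each observation is left out in turn, the model is refitted on the
    remaining rows, and the squared out-of-sample error is accumulated.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i              # mask that drops observation i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        y_pred = X[i] @ beta                  # prediction for the omitted point
        press += (y[i] - y_pred) ** 2
    return press


def polynomial_design(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree (illustrative helper)."""
    return np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)


# Synthetic data (an assumption for illustration): a quadratic trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.2, size=x.size)

# Compare candidate model structures on the same dataset; lower PRESS is better.
press_by_degree = {
    d: press_statistic(polynomial_design(x, d), y) for d in range(1, 9)
}
best_degree = min(press_by_degree, key=press_by_degree.get)

for d, value in press_by_degree.items():
    print(f"degree {d}: PRESS = {value:.4f}")
print("lowest PRESS at degree", best_degree)
```

Refitting the model once per observation keeps the sketch close to the definition. For large datasets that cost can become significant, which is one reason a practical implementation might compute the same quantity differently.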


See also

* Model selection

