Smoothing spline

Smoothing splines are function estimates, \hat f(x), obtained from a set of noisy observations y_i of the target f(x_i), chosen to balance a measure of goodness of fit of \hat f(x_i) to y_i against a derivative-based measure of the smoothness of \hat f(x). They provide a means for smoothing noisy (x_i, y_i) data. The most familiar example is the cubic smoothing spline, but there are many other possibilities, including the case where x is a vector quantity.
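
As a concrete illustration, here is a minimal sketch of fitting a cubic smoothing spline in Python, assuming SciPy 1.10 or later is available (the function make_smoothing_spline and its GCV-based default for the smoothing parameter are SciPy specifics, not part of the definition below):

    import numpy as np
    from scipy.interpolate import make_smoothing_spline

    # Noisy observations y_i = f(x_i) + eps_i of a smooth target f.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 2.0 * np.pi, 50)
    y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

    # Cubic smoothing spline; with lam=None the smoothing parameter
    # is chosen by generalized cross-validation (GCV).
    spl = make_smoothing_spline(x, y, lam=None)

    x_fine = np.linspace(0.0, 2.0 * np.pi, 400)
    f_hat = spl(x_fine)  # evaluate the estimate \hat f on a fine grid

The returned object is an ordinary B-spline, so the estimate can be evaluated, differentiated, or integrated like any other spline.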


Cubic spline definition

Let \{x_i, Y_i : i = 1, \dots, n\} be a set of observations, modeled by the relation Y_i = f(x_i) + \epsilon_i, where the \epsilon_i are independent, zero-mean random variables (usually assumed to have constant variance). The cubic smoothing spline estimate \hat f of the function f is defined to be the minimizer (over the class of twice-differentiable functions) of

\sum_{i=1}^n \{Y_i - \hat f(x_i)\}^2 + \lambda \int \hat f''(x)^2 \,dx.

Remarks:

* \lambda \ge 0 is a smoothing parameter, controlling the trade-off between fidelity to the data and roughness of the function estimate. It is often estimated by generalized cross-validation, or by restricted maximum likelihood (REML), which exploits the link between spline smoothing and Bayesian estimation (the smoothing penalty can be viewed as being induced by a prior on f).
* The integral is often evaluated over the whole real line, although it is also possible to restrict the range to that of the x_i.
* As \lambda \to 0 (no smoothing), the smoothing spline converges to the interpolating spline.
* As \lambda \to \infty (infinite smoothing), the roughness penalty becomes paramount and the estimate converges to a linear least squares estimate; both limits are checked numerically in the sketch after this list.
* The roughness penalty based on the second derivative
is the most common in the modern statistics literature, although the method can easily be adapted to penalties based on other derivatives.
* In early literature, with equally spaced ordered x_i, second- or third-order differences were used in the penalty, rather than derivatives.
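
The two limiting cases in the remarks above can be checked numerically. The following sketch again assumes SciPy's make_smoothing_spline and simply fixes \lambda at extreme values, comparing the resulting fit to the raw data and to an ordinary linear least squares line:

    import numpy as np
    from scipy.interpolate import make_smoothing_spline

    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 30)
    y = np.sin(4.0 * x) + rng.normal(scale=0.1, size=x.size)

    # lam -> 0: the penalty vanishes and the fit reproduces the data
    # (interpolating spline limit).
    spl_rough = make_smoothing_spline(x, y, lam=1e-10)
    print(np.max(np.abs(spl_rough(x) - y)))  # close to 0

    # lam -> infinity: the penalty forces f'' toward 0, i.e. a straight
    # line, so the fit approaches the linear least squares estimate.
    spl_smooth = make_smoothing_spline(x, y, lam=1e10)
    slope, intercept = np.polyfit(x, y, 1)
    print(np.max(np.abs(spl_smooth(x) - (slope * x + intercept))))  # close to 0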