Smoothing splines are function estimates, <math>\hat f(x)</math>, obtained from a set of noisy observations <math>y_i</math> of the target <math>f(x_i)</math>, in order to balance a measure of goodness of fit of <math>\hat f(x_i)</math> to <math>y_i</math> with a derivative-based measure of the smoothness of <math>\hat f(x)</math>. They provide a means for smoothing noisy <math>x_i, y_i</math> data. The most familiar example is the cubic smoothing spline, but there are many other possibilities, including for the case where <math>x</math> is a vector quantity.
Cubic spline definition
Let <math>\{x_i, Y_i : i = 1, \dots, n\}</math> be a set of observations, modeled by the relation <math>Y_i = f(x_i) + \epsilon_i</math>, where the <math>\epsilon_i</math> are independent, zero mean random variables. The cubic smoothing spline estimate <math>\hat f</math> of the function <math>f</math> is defined to be the unique minimizer, in the Sobolev space <math>W^2_2</math> on a compact interval, of
:<math>\sum_{i=1}^n \{Y_i - \hat f(x_i)\}^2 + \lambda \int \hat f''(x)^2 \,dx.</math>
Remarks:
* <math>\lambda \ge 0</math> is a smoothing parameter, controlling the trade-off between fidelity to the data and roughness of the function estimate. It is often estimated by generalized cross-validation, or by restricted marginal likelihood (REML), which exploits the link between spline smoothing and Bayesian estimation (the smoothing penalty can be viewed as being induced by a prior on <math>f</math>).
* The integral is often evaluated over the whole real line, although it is also possible to restrict the range to that of the <math>x_i</math>.
* As <math>\lambda \to 0</math> (no smoothing), the smoothing spline converges to the interpolating spline.
* As <math>\lambda \to \infty</math> (infinite smoothing), the roughness penalty becomes paramount and the estimate converges to a linear least squares estimate.
* The roughness penalty based on the second derivative is the most common in modern statistics literature, although the method can easily be adapted to penalties based on other derivatives.
* In early literature, with equally-spaced ordered <math>x_i</math>, second or third-order differences were used in the penalty, rather than derivatives. See also Whittaker–Henderson smoothing.
* The penalized sum of squares smoothing objective can be replaced by a ''penalized likelihood'' objective in which the sum of squares term is replaced by another log-likelihood based measure of fidelity to the data. The sum of squares term corresponds to penalized likelihood with a Gaussian assumption on the <math>\epsilon_i</math>.
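The difference-based penalty mentioned in the remarks can be sketched in a few lines. The following is a minimal illustration rather than a reference implementation: it assumes equally spaced ordered observations, unit weights and a second-order difference penalty, and the function name <code>whittaker_smooth</code> is hypothetical. It also illustrates the two limiting cases of the smoothing parameter.

```python
import numpy as np

def whittaker_smooth(y, lam, order=2):
    """Minimize ||y - m||^2 + lam * ||D m||^2 over m, with D a difference matrix.

    Assumes equally spaced, ordered observations (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Rows of D take differences of the given order, so D has shape (n - order, n).
    D = np.diff(np.eye(n), n=order, axis=0)
    # Setting the gradient to zero gives (I + lam * D^T D) m = y.
    return np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)

y = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
no_smoothing = whittaker_smooth(y, 0.0)   # lam = 0 reproduces the data
heavy = whittaker_smooth(y, 1e8)          # large lam approaches a straight line
t = np.arange(y.size)
line = np.polyval(np.polyfit(t, y, 1), t) # ordinary least squares line
print(np.max(np.abs(no_smoothing - y)), np.max(np.abs(heavy - line)))
```

Both printed discrepancies should be close to zero: with no penalty the fit returns the data, and with a very large penalty only sequences with vanishing second differences (straight lines) remain admissible.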
Derivation of the cubic smoothing spline
It is useful to think of fitting a smoothing spline in two steps:
# First, derive the values <math>\hat f(x_i);\ i = 1, \ldots, n</math>.
# From these values, derive <math>\hat f(x)</math> for all ''x''.
Now, treat the second step first.
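Given the vector <math>\hat m = (\hat f(x_1), \ldots, \hat f(x_n))^T</math> of fitted values, the sum-of-squares part of the spline criterion is fixed. It remains only to minimize <math>\int \hat f''(x)^2 \,dx</math>, and the minimizer is a natural cubic spline that interpolates the points <math>(x_i, \hat f(x_i))</math>. This interpolating spline depends linearly on the fitted values, and can be written in the form
:<math>\hat f(x) = \sum_{i=1}^n \hat f(x_i) f_i(x)</math>
where <math>f_i(x)</math> are a set of spline basis functions. As a result, the roughness penalty has the form
:<math>\int \hat f''(x)^2 \,dx = \hat m^T A \hat m,</math>
where the elements of ''A'' are <math>\int f_i''(x) f_j''(x) \,dx</math>. The basis functions, and hence the matrix ''A'', depend on the configuration of the predictor variables <math>x_i</math>, but not on the responses <math>Y_i</math> or <math>\hat m</math>.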
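''A'' is an ''n''×''n'' matrix given by <math>A = \Delta^T W^{-1} \Delta</math>.
''Δ'' is an ''(n-2)''×''n'' matrix of second differences with elements:
:<math>\Delta_{ii} = 1/h_i</math>, <math>\Delta_{i,i+1} = -1/h_i - 1/h_{i+1}</math>, <math>\Delta_{i,i+2} = 1/h_{i+1}</math>
''W'' is an ''(n-2)''×''(n-2)'' symmetric tri-diagonal matrix with elements:
:<math>W_{i-1,i} = W_{i,i-1} = h_i/6</math>, <math>W_{ii} = (h_i + h_{i+1})/3</math> and <math>h_i = \xi_{i+1} - \xi_i</math>, the distances between successive knots (or x values).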
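The construction of ''A'' can be checked numerically. The sketch below assumes strictly increasing knots with ''n'' ≥ 3 and uses dense matrices for clarity (a practical implementation would exploit the banded structure); the function name <code>penalty_matrix</code> is hypothetical.

```python
import numpy as np

def penalty_matrix(x):
    """Return A = Delta^T W^{-1} Delta for strictly increasing knots x (n >= 3)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = np.diff(x)                      # h[i] = x[i+1] - x[i], length n - 1
    delta = np.zeros((n - 2, n))        # second-difference matrix
    w = np.zeros((n - 2, n - 2))        # symmetric tridiagonal matrix
    for i in range(n - 2):              # 0-based row i is row i + 1 in the text
        delta[i, i] = 1.0 / h[i]
        delta[i, i + 1] = -1.0 / h[i] - 1.0 / h[i + 1]
        delta[i, i + 2] = 1.0 / h[i + 1]
        w[i, i] = (h[i] + h[i + 1]) / 3.0
        if i > 0:
            w[i - 1, i] = w[i, i - 1] = h[i] / 6.0
    # Solve W Z = Delta instead of forming W^{-1} explicitly.
    return delta.T @ np.linalg.solve(w, delta)

# Knots 0, 1, 2 with values 0, 1, 0: the natural cubic spline through these
# points has a piecewise linear second derivative running 0, -3, 0, so the
# integral of its square is 6, and the quadratic form should be about 6.
A = penalty_matrix([0.0, 1.0, 2.0])
m = np.array([0.0, 1.0, 0.0])
print(m @ A @ m)
```

Constant and linear value vectors lie in the null space of ''A'', which matches the fact that a straight line has zero roughness.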
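Now back to the first step. The penalized sum-of-squares can be written as
:<math>\{Y - \hat m\}^T \{Y - \hat m\} + \lambda \hat m^T A \hat m,</math>
where <math>Y = (Y_1, \ldots, Y_n)^T</math>.
Minimizing over <math>\hat m</math> by differentiating with respect to <math>\hat m</math> results in <math>-2\{Y - \hat m\} + 2\lambda A \hat m = 0</math> and <math>\hat m = (I + \lambda A)^{-1} Y.</math>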
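The first step can be sketched end to end as follows. This is an illustrative dense-matrix version under the assumption of strictly increasing, distinct ''x'' values with ''n'' ≥ 3; it restates the construction of ''Δ'' and ''W'' in vectorized form so that the block is self-contained, and the function name <code>smoothing_spline_values</code> is hypothetical.

```python
import numpy as np

def smoothing_spline_values(x, y, lam):
    """Fitted values m = (I + lam * A)^{-1} y with A = Delta^T W^{-1} Delta."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    h = np.diff(x)                                  # knot spacings
    i = np.arange(n - 2)
    delta = np.zeros((n - 2, n))
    delta[i, i] = 1.0 / h[:-1]
    delta[i, i + 1] = -1.0 / h[:-1] - 1.0 / h[1:]
    delta[i, i + 2] = 1.0 / h[1:]
    w = (np.diag((h[:-1] + h[1:]) / 3.0)
         + np.diag(h[1:-1] / 6.0, 1)
         + np.diag(h[1:-1] / 6.0, -1))
    a = delta.T @ np.linalg.solve(w, delta)
    # Stationarity condition: -2 (y - m) + 2 lam A m = 0.
    return np.linalg.solve(np.eye(n) + lam * a, y)

x = np.arange(10.0)
y = np.sin(x) + 0.1 * x                             # arbitrary illustrative data
interp = smoothing_spline_values(x, y, 0.0)         # lam = 0: fitted values equal y
stiff = smoothing_spline_values(x, y, 1e8)          # large lam: nearly a straight line
line = np.polyval(np.polyfit(x, y, 1), x)           # linear least squares fit
print(np.max(np.abs(interp - y)), np.max(np.abs(stiff - line)))
```

Both printed discrepancies should be close to zero, in line with the limiting behaviour described in the remarks. The fitted values at the knots then determine the whole curve through natural cubic spline interpolation, which is the second step.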
De Boor's approach
De Boor's approach exploits the same idea, of finding a balance between having a smooth curve and being close to the given data.
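:<math>p \sum_{i=1}^n \left( \frac{Y_i - \hat f(x_i)}{\delta_i} \right)^2 + (1 - p) \int \left( \hat f^{(m)}(x) \right)^2 \,dx</math>
where <math>p</math> is a parameter called the smooth factor and belongs to the interval <math>[0, 1]</math>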