
Nonparametric regression is a category of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data. That is, no parametric form is assumed for the relationship between predictors and dependent variable. Nonparametric regression requires larger sample sizes than regression based on parametric models because the data must supply the model structure as well as the model estimates.


Definition

In nonparametric regression, we have random variables X and Y and assume the following relationship:

: \mathbb{E}[Y \mid X = x] = m(x),

where m(x) is some deterministic function. Linear regression is a restricted case of nonparametric regression where m(x) is assumed to be affine. Some authors use a slightly stronger assumption of additive noise:

: Y = m(X) + U,

where the random variable U is the 'noise term', with mean 0. Without the assumption that m belongs to a specific parametric family of functions, it is impossible to get an unbiased estimate for m; however, most estimators are consistent under suitable conditions.
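As an illustration of the additive-noise model, the sketch below simulates data from Y = m(X) + U and estimates m by a simple k-nearest-neighbour average, without assuming any parametric form for m. The choice m(x) = sin(2πx), the noise level, and the value of k are illustrative assumptions only.

 import numpy as np
 
 # A minimal sketch, assuming m(x) = sin(2*pi*x), Gaussian noise U, and a
 # k-nearest-neighbour average as the estimator (all illustrative choices).
 rng = np.random.default_rng(0)
 n = 200
 x = rng.uniform(0.0, 1.0, size=n)            # predictor X
 
 def m(t):                                    # the unknown regression function
     return np.sin(2 * np.pi * t)
 
 y = m(x) + rng.normal(0.0, 0.1, size=n)      # Y = m(X) + U, with E[U] = 0
 
 def knn_estimate(x0, k=15):
     # Estimate m(x0) by averaging the responses of the k nearest design points.
     idx = np.argsort(np.abs(x - x0))[:k]
     return y[idx].mean()
 
 grid = np.linspace(0.0, 1.0, 11)
 m_hat = np.array([knn_estimate(x0) for x0 in grid])
 print(np.round(m_hat - m(grid), 2))          # estimation error at the grid points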


List of general-purpose nonparametric regression algorithms

This is a non-exhaustive list of non-parametric models for regression.

* nearest neighbors, see nearest-neighbor interpolation and k-nearest neighbors algorithm
* regression trees
* kernel regression
* local regression (see the sketch after this list)
* multivariate adaptive regression splines
* smoothing splines
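Several of these estimators can be written in a few lines. For example, local regression fits a weighted least-squares polynomial around each prediction point; the following is a minimal sketch of local linear regression with a Gaussian weight kernel and a hand-picked bandwidth h (both are illustrative choices, not prescribed by the method).

 import numpy as np
 
 # A minimal sketch of local linear regression (a LOESS-like smoother): at each
 # prediction point x0 a weighted least-squares line is fitted, with Gaussian
 # kernel weights; the bandwidth h is a hand-picked, illustrative value.
 rng = np.random.default_rng(1)
 x = rng.uniform(0.0, 1.0, size=200)
 y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, size=200)
 
 def local_linear(x0, h=0.1):
     w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # kernel weights
     X = np.column_stack([np.ones_like(x), x - x0])  # local design, centred at x0
     beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
     return beta[0]                                  # intercept = fitted value at x0
 
 grid = np.linspace(0.0, 1.0, 5)
 print(np.round([local_linear(x0) for x0 in grid], 2))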


Examples


Gaussian process regression or Kriging

In Gaussian process regression, also known as Kriging, a Gaussian prior is assumed for the regression curve. The errors are assumed to have a multivariate normal distribution and the regression curve is estimated by its posterior mode. The Gaussian prior may depend on unknown hyperparameters, which are usually estimated via empirical Bayes. The hyperparameters typically specify a prior covariance kernel. In case the kernel should also be inferred nonparametrically from the data, the critical filter can be used. Smoothing splines have an interpretation as the posterior mode of a Gaussian process regression.
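As a rough illustration, the sketch below computes the Gaussian process posterior mean (which, under Gaussian noise, coincides with the posterior mode) for a squared-exponential covariance kernel. The length scale, signal variance and noise variance are fixed by hand here; in practice such hyperparameters would typically be chosen by empirical Bayes.

 import numpy as np
 
 # A minimal sketch of the Gaussian process posterior mean with a squared-exponential
 # (RBF) kernel; length scale, signal variance and noise variance are fixed by hand.
 rng = np.random.default_rng(2)
 
 def rbf_kernel(a, b, length_scale=0.2, signal_var=1.0):
     d = a[:, None] - b[None, :]
     return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)
 
 x_train = rng.uniform(0.0, 1.0, size=30)
 y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.1, size=30)
 
 noise_var = 0.1 ** 2
 K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
 
 # Posterior mean at the test points: K_* (K + sigma^2 I)^{-1} y
 x_test = np.linspace(0.0, 1.0, 5)
 posterior_mean = rbf_kernel(x_test, x_train) @ np.linalg.solve(K, y_train)
 print(np.round(posterior_mean, 2))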


Kernel regression

Kernel regression estimates the continuous dependent variable from a limited set of data points by convolving the data points' locations with a kernel function. Approximately speaking, the kernel function specifies how to "blur" the influence of the data points so that their values can be used to predict the value for nearby locations.
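A standard instance is the Nadaraya–Watson estimator, which predicts at a point x0 by taking a weighted average of the observed responses, with weights given by a kernel centred at x0. A minimal sketch with a Gaussian kernel follows; the bandwidth h is an illustrative choice and would in practice be selected by cross-validation or a plug-in rule.

 import numpy as np
 
 # A minimal sketch of the Nadaraya-Watson estimator with a Gaussian kernel;
 # the bandwidth h is a hand-picked, illustrative value.
 rng = np.random.default_rng(3)
 x = rng.uniform(0.0, 1.0, size=200)
 y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, size=200)
 
 def nadaraya_watson(x0, h=0.05):
     w = np.exp(-0.5 * ((x - x0) / h) ** 2)   # kernel weights around x0
     return np.sum(w * y) / np.sum(w)         # weighted average of responses
 
 grid = np.linspace(0.0, 1.0, 5)
 print(np.round([nadaraya_watson(x0) for x0 in grid], 2))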


Regression trees

Decision tree learning algorithms can be applied to learn to predict a dependent variable from data. Although the original Classification And Regression Tree (CART) formulation applied only to predicting univariate data, the framework can be used to predict multivariate data, including time series.
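As a brief example, the following sketch uses scikit-learn's DecisionTreeRegressor, which implements a CART-style regression tree; the simulated data and the max_depth setting are illustrative choices only.

 import numpy as np
 from sklearn.tree import DecisionTreeRegressor
 
 # A minimal sketch: a CART-style regression tree fitted to simulated data;
 # the data-generating function and max_depth are illustrative choices only.
 rng = np.random.default_rng(4)
 X = rng.uniform(0.0, 1.0, size=(200, 1))
 y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0.0, 0.1, size=200)
 
 tree = DecisionTreeRegressor(max_depth=4).fit(X, y)
 X_test = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
 print(np.round(tree.predict(X_test), 2))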


See also

*
Lasso (statistics) In statistics and machine learning, lasso (least absolute shrinkage and selection operator; also Lasso or LASSO) is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy ...
*
Local regression Local regression or local polynomial regression, also known as moving regression, is a generalization of the moving average and polynomial regression. Its most common methods, initially developed for scatterplot smoothing, are LOESS (locally es ...
*
Non-parametric statistics Nonparametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions (common examples of parameters are the mean and variance). Nonparametric statistics is based on either being distrib ...
*
Semiparametric regression In statistics, semiparametric regression includes regression models that combine parametric and nonparametric models. They are often used in situations where the fully nonparametric model may not perform well or when the researcher wants to use ...
*
Isotonic regression In statistics and numerical analysis, isotonic regression or monotonic regression is the technique of fitting a free-form line to a sequence of observations such that the fitted line is non-decreasing (or non-increasing) everywhere, and lies as ...
*
Multivariate adaptive regression splines In statistics, multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automatica ...




External links


* HyperNiche, software for nonparametric multiplicative regression
* Scale-adaptive nonparametric regression (with Matlab software)