Functional Regression

Functional regression is a version of regression analysis when responses or covariates include functional data. Functional regression models can be classified into four types depending on whether the responses or covariates are functional or scalar: (i) scalar responses with functional covariates, (ii) functional responses with scalar covariates, (iii) functional responses with functional covariates, and (iv) scalar or functional responses with functional and scalar covariates. In addition, functional regression models can be linear, partially linear, or nonlinear. In particular, functional polynomial models, functional single and multiple index models, and functional additive models are three special cases of functional nonlinear models.


Functional linear models (FLMs)

Functional linear models (FLMs) are an extension of linear models (LMs). A linear model with scalar response Y\in\mathbb{R} and scalar covariates X\in\mathbb{R}^p can be written as
Y = \beta_0 + \langle X, \beta\rangle + \varepsilon,     (1)
where \langle\cdot,\cdot\rangle denotes the inner product in Euclidean space, \beta_0\in\mathbb{R} and \beta\in\mathbb{R}^p denote the regression coefficients, and \varepsilon is a random error with mean zero and finite variance. FLMs can be divided into two types based on the responses.


Functional linear models with scalar responses

Functional linear models with scalar responses can be obtained by replacing the scalar covariates X and the coefficient vector \beta in model (1) by a centered functional covariate X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot)) and a coefficient function \beta = \beta(\cdot) with domain \mathcal{T}, respectively, and replacing the inner product in Euclidean space by that in the Hilbert space L^2:
Y = \beta_0 + \langle X^c, \beta\rangle + \varepsilon = \beta_0 + \int_{\mathcal{T}} X^c(t)\beta(t)\,dt + \varepsilon,     (2)
where \langle \cdot, \cdot \rangle here denotes the inner product in L^2. One approach to estimating \beta_0 and \beta(\cdot) is to expand the centered covariate X^c(\cdot) and the coefficient function \beta(\cdot) in the same functional basis, for example, a B-spline basis or the eigenbasis used in the Karhunen–Loève expansion. Suppose \{\phi_k\}_{k=1}^\infty is an orthonormal basis of L^2. Expanding X^c and \beta in this basis, X^c(\cdot) = \sum_{k=1}^\infty x_k \phi_k(\cdot) and \beta(\cdot) = \sum_{k=1}^\infty \beta_k \phi_k(\cdot), model (2) becomes
Y = \beta_0 + \sum_{k=1}^\infty \beta_k x_k + \varepsilon.
For implementation, regularization is needed and can be done through truncation, L^2 penalization, or L^1 penalization. In addition, a reproducing kernel Hilbert space (RKHS) approach can also be used to estimate \beta_0 and \beta(\cdot) in model (2).
Adding multiple functional and scalar covariates, model (2) can be extended to
Y = \sum_{k=1}^q \alpha_k Z_k + \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t)\beta_j(t)\,dt + \varepsilon,     (3)
where Z_1,\ldots,Z_q are scalar covariates with Z_1=1, \alpha_1,\ldots,\alpha_q are regression coefficients for Z_1,\ldots,Z_q, respectively, X^c_j is a centered functional covariate given by X_j^c(\cdot) = X_j(\cdot) - \mathbb{E}(X_j(\cdot)), \beta_j is the regression coefficient function for X_j^c(\cdot), and \mathcal{T}_j is the domain of X_j and \beta_j, for j=1,\ldots,p. However, due to the parametric component \alpha, the estimation methods for model (2) cannot be used directly in this case, and alternative estimation methods for model (3) are available.
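As an illustration of the truncation approach, the following minimal sketch (in Python with NumPy; all function names are hypothetical, and densely observed curves on a common grid of [0, 1] are assumed) estimates \beta_0 and \beta(\cdot) in model (2) by projecting the centered curves onto the first K functions of an orthonormal Fourier basis and running ordinary least squares on the resulting scores; the truncation level K acts as the regularization parameter.

import numpy as np

def fourier_basis(t, K):
    """First K orthonormal Fourier basis functions on [0, 1], evaluated at the grid t."""
    B = np.empty((len(t), K))
    for k in range(K):
        if k == 0:
            B[:, k] = 1.0
        elif k % 2 == 1:
            B[:, k] = np.sqrt(2) * np.sin(2 * np.pi * ((k + 1) // 2) * t)
        else:
            B[:, k] = np.sqrt(2) * np.cos(2 * np.pi * (k // 2) * t)
    return B

def fit_flm_scalar(Y, X, t, K=5):
    """Truncated-basis least-squares fit of model (2): Y = beta_0 + int X^c(t) beta(t) dt + error.

    Y: (n,) scalar responses; X: (n, m) curves observed on the common grid t; K: truncation level.
    """
    Xc = X - X.mean(axis=0)                  # centered functional covariate X^c
    B = fourier_basis(t, K)                  # (m, K) basis functions on the grid
    dt = t[1] - t[0]
    scores = Xc @ B * dt                     # x_k = int X^c(t) phi_k(t) dt, by Riemann sum
    design = np.column_stack([np.ones(len(Y)), scores])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    beta0, beta_coefs = coef[0], coef[1:]
    beta_fun = B @ beta_coefs                # beta(t) = sum_k beta_k phi_k(t), on the grid
    return beta0, beta_fun

Replacing the Fourier basis by the estimated eigenbasis of X (functional principal components) or by B-splines, and adding an L^2 or L^1 penalty to the least-squares step, gives the other regularization schemes mentioned above.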


Functional linear models with functional responses

For a functional response Y(\cdot) with domain \mathcal{T} and a functional covariate X(\cdot) with domain \mathcal{S}, two FLMs regressing Y(\cdot) on X(\cdot) have been considered. One of these two models is of the form
Y(t) = \beta_0(t) + \int_{\mathcal{S}} \beta(s,t) X^c(s)\,ds + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},     (4)
where X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot)) is again the centered functional covariate, \beta_0(\cdot) and \beta(\cdot,\cdot) are coefficient functions, and \varepsilon(\cdot) is usually assumed to be a random process with mean zero and finite variance. In this case, at any given time t\in\mathcal{T}, the value of Y, i.e., Y(t), depends on the entire trajectory of X. Model (4), for any given time t, is an extension of multivariate linear regression with the inner product in Euclidean space replaced by that in L^2. An estimating equation motivated by multivariate linear regression is
r_{XY} = R_{XX}\beta,\ \text{for}\ \beta\in L^2(\mathcal{S}\times\mathcal{T}),
where r_{XY}(s,t) = \text{Cov}(X(s),Y(t)), and R_{XX}: L^2(\mathcal{S}\times\mathcal{T}) \rightarrow L^2(\mathcal{S}\times\mathcal{T}) is defined by (R_{XX}\beta)(s,t) = \int_{\mathcal{S}} r_{XX}(s,w)\beta(w,t)\,dw with r_{XX}(s,w) = \text{Cov}(X(s),X(w)) for s,w\in\mathcal{S}. Regularization is needed and can be done through truncation, L^2 penalization, or L^1 penalization. Various estimation methods for model (4) are available.
When X and Y are concurrently observed, i.e., \mathcal{S}=\mathcal{T}, it is reasonable to consider a historical functional linear model, in which the current value of Y depends only on the history of X, i.e., \beta(s,t)=0 for s>t in model (4). A simpler version of the historical functional linear model is the functional concurrent model (see below).
Adding multiple functional covariates, model (4) can be extended to
Y(t) = \beta_0(t) + \sum_{j=1}^p \int_{\mathcal{S}_j} \beta_j(s,t) X_j^c(s)\,ds + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},     (5)
where, for j=1,\ldots,p, X_j^c(\cdot)=X_j(\cdot) - \mathbb{E}(X_j(\cdot)) is a centered functional covariate with domain \mathcal{S}_j and \beta_j(\cdot,\cdot) is the corresponding coefficient function with domain \mathcal{S}_j\times\mathcal{T}. In particular, taking X_j(\cdot) as a constant function yields a special case of model (5),
Y(t) = \sum_{j=1}^p X_j \beta_j(t) + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},
which is a FLM with functional responses and scalar covariates.
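The estimating equation above can be turned into a concrete estimator by truncation. The sketch below (hypothetical names; curves assumed densely observed on common grids) replaces r_{XY} and R_{XX} by their sample versions on a grid, inverts R_{XX} on its K leading eigenfunctions only, and reads off \beta(s,t); \beta_0(t) is estimated by the mean response curve because X^c has mean zero.

import numpy as np

def fit_flm_functional(X, Y, s_grid, K=4):
    """Truncated solution of r_XY = R_XX beta for model (4), on dense common grids.

    X: (n, m_s) covariate curves on s_grid; Y: (n, m_t) response curves; K: truncation level.
    """
    n = X.shape[0]
    ds = s_grid[1] - s_grid[0]
    Xc = X - X.mean(axis=0)                      # centered covariate curves
    Yc = Y - Y.mean(axis=0)                      # centered response curves
    r_xx = Xc.T @ Xc / n                         # sample r_XX(s, w) on the grid
    r_xy = Xc.T @ Yc / n                         # sample r_XY(s, t) on the grid
    # discretized eigendecomposition of the covariance operator of X
    evals, evecs = np.linalg.eigh(r_xx * ds)
    order = np.argsort(evals)[::-1][:K]          # keep the K leading eigenpairs (assumed positive)
    evals, evecs = evals[order], evecs[:, order]
    phis = evecs / np.sqrt(ds)                   # rescale so that int phi_k(s)^2 ds = 1
    # beta(s, t) = sum_k phi_k(s) <phi_k, r_XY(., t)> / lambda_k  (truncation as regularization)
    inner = phis.T @ r_xy * ds                   # (K, m_t)
    beta = phis @ (inner / evals[:, None])       # (m_s, m_t)
    beta0 = Y.mean(axis=0)                       # intercept function, since X^c has mean zero
    return beta0, beta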


Functional concurrent models

Assuming that \mathcal{S} = \mathcal{T}, another model, known as the functional concurrent model, sometimes also referred to as the varying-coefficient model, is of the form
Y(t) = \alpha_0(t) + \alpha(t)X(t) + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},     (6)
where \alpha_0 and \alpha are coefficient functions. Note that model (6) assumes that the value of Y at time t, i.e., Y(t), depends only on the value of X at the same time, i.e., X(t). Various estimation methods can be applied to model (6).
Adding multiple functional covariates, model (6) can also be extended to
Y(t) = \alpha_0(t) + \sum_{j=1}^p \alpha_j(t)X_j(t) + \varepsilon(t),\ \text{for}\ t\in\mathcal{T},
where X_1,\ldots,X_p are multiple functional covariates with domain \mathcal{T} and \alpha_0,\alpha_1,\ldots,\alpha_p are the coefficient functions with the same domain.
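Because model (6) is an ordinary simple linear regression at each fixed time t, a basic estimator fits it pointwise along the grid; in practice the raw pointwise estimates are usually smoothed over t, a step omitted in this minimal sketch (hypothetical names; curves observed on a common dense grid).

import numpy as np

def fit_concurrent(X, Y):
    """Pointwise least-squares fit of model (6): Y(t) = alpha_0(t) + alpha(t) X(t) + error.

    X, Y: (n, m) curves observed on the same grid; returns alpha_0 and alpha on that grid.
    """
    n, m = X.shape
    alpha0 = np.empty(m)
    alpha = np.empty(m)
    for j in range(m):                      # a simple regression at each grid point t_j
        design = np.column_stack([np.ones(n), X[:, j]])
        coef, *_ = np.linalg.lstsq(design, Y[:, j], rcond=None)
        alpha0[j], alpha[j] = coef
    return alpha0, alpha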


Functional nonlinear models


Functional polynomial models

Functional polynomial models are an extension of FLMs with scalar responses, analogous to extending linear regression to polynomial regression. For a scalar response Y and a functional covariate X(\cdot) with domain \mathcal{T}, the simplest example of a functional polynomial model is functional quadratic regression (Yao and Müller, 2010):
Y = \alpha + \int_{\mathcal{T}}\beta(t)X^c(t)\,dt + \int_{\mathcal{T}} \int_{\mathcal{T}} \gamma(s,t) X^c(s)X^c(t) \,ds\,dt + \varepsilon,
where X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot)) is the centered functional covariate, \alpha is a scalar coefficient, \beta(\cdot) and \gamma(\cdot,\cdot) are coefficient functions with domains \mathcal{T} and \mathcal{T}\times\mathcal{T}, respectively, and \varepsilon is a random error with mean zero and finite variance. By analogy with FLMs with scalar responses, estimation of functional polynomial models can be obtained by expanding both the centered covariate X^c and the coefficient functions \beta and \gamma in an orthonormal basis.
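Following the basis-expansion strategy, a minimal sketch of functional quadratic regression (hypothetical names; dense common grid assumed) reduces each centered curve to K eigenbasis scores and regresses Y on the scores together with their pairwise products; \beta(\cdot) and \gamma(\cdot,\cdot) can then be rebuilt from the fitted coefficients and the basis functions.

import numpy as np

def fit_functional_quadratic(Y, X, t, K=4):
    """Basis-expansion least-squares fit of functional quadratic regression.

    Y: (n,) scalar responses; X: (n, m) curves on the common grid t; K: truncation level.
    """
    n = len(Y)
    dt = t[1] - t[0]
    Xc = X - X.mean(axis=0)                                   # centered curves X^c
    # leading K eigenfunctions of the sample covariance of X, used as the orthonormal basis
    evals, evecs = np.linalg.eigh(Xc.T @ Xc / n * dt)
    phis = evecs[:, np.argsort(evals)[::-1][:K]] / np.sqrt(dt)
    scores = Xc @ phis * dt                                   # linear scores x_1, ..., x_K
    quad = np.column_stack([scores[:, j] * scores[:, k]
                            for j in range(K) for k in range(j, K)])
    design = np.column_stack([np.ones(n), scores, quad])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    # coef holds alpha, the K coefficients of beta, and the K(K+1)/2 coefficients of gamma
    return coef, phis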


Functional single and multiple index models

A functional multiple index model is given by
Y = g\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt, \ldots, \int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right) + \varepsilon.
Taking p=1 yields a functional single index model. However, for p>1, this model is problematic due to the curse of dimensionality: with p>1 and relatively small sample sizes, the estimator given by this model often has large variance (Chen, Hall and Müller, 2011).
An alternative p-component functional multiple index model can be expressed as
Y = g_1\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt\right)+ \cdots+ g_p\left(\int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right) + \varepsilon.
Estimation methods for functional single and multiple index models are available.


Functional additive models (FAMs)

Given an expansion of a functional covariate X with domain \mathcal{T} in an orthonormal basis \{\phi_k\}_{k=1}^\infty, X(t) = \sum_{k=1}^\infty x_k \phi_k(t), the functional linear model with scalar responses shown in model (2) can be written as
\mathbb{E}(Y\mid X)=\mathbb{E}(Y) + \sum_{k=1}^\infty \beta_k x_k.
One form of FAMs is obtained by replacing the linear function of x_k, i.e., \beta_k x_k, by a general smooth function f_k:
\mathbb{E}(Y\mid X)=\mathbb{E}(Y) + \sum_{k=1}^\infty f_k(x_k),
where f_k satisfies \mathbb{E}(f_k(x_k))=0 for k\in\mathbb{N}. Another form of FAMs consists of a sequence of time-additive models:
\mathbb{E}(Y\mid X(t_1),\ldots,X(t_p))=\sum_{j=1}^p f_j(X(t_j)),
where \{t_1,\ldots,t_p\} is a dense grid on \mathcal{T} with increasing size p\in\mathbb{N}, and f_j(x) = g(t_j,x) with g a smooth function, for j=1,\ldots,p (Fan, James and Radchenko, 2015).
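For the first form of FAMs, a minimal componentwise sketch is given below (hypothetical names). It assumes that the scores x_k have already been computed, for example as functional principal component scores, and that they are independent, so that each f_k can be estimated by a one-dimensional smoother of the centered response against the k-th score alone; here a simple Nadaraya–Watson smoother with a rule-of-thumb bandwidth stands in for whatever smoother is preferred.

import numpy as np

def nw_smooth(x, y, x_eval, h):
    """One-dimensional Nadaraya-Watson smoother with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

def fit_fam(Y, scores, h=None):
    """Componentwise estimate of the additive components f_k in the first FAM form.

    Y: (n,) scalar responses; scores: (n, K) basis scores x_k of the centered curves.
    Returns the estimated E(Y) and an (n, K) matrix of fitted components f_k(x_k).
    """
    n, K = scores.shape
    Yc = Y - Y.mean()
    components = np.empty((n, K))
    for k in range(K):
        x = scores[:, k]
        bw = h if h is not None else 1.06 * x.std() * n ** (-0.2)   # rule-of-thumb bandwidth
        fk = nw_smooth(x, Yc, x, bw)
        components[:, k] = fk - fk.mean()        # centering enforces E(f_k(x_k)) = 0
    return Y.mean(), components

Fitted values at the sample points are then the estimated mean plus the row sums of the component matrix.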


Extensions

A direct extension of FLMs with scalar responses shown in model (2) is to add a link function, creating a generalized functional linear model (GFLM) by analogy with extending linear regression to the generalized linear model (GLM). The three components of the GFLM are (a minimal fitting sketch follows the list):
# Linear predictor \eta = \beta_0 + \int_{\mathcal{T}} X^c(t)\beta(t)\,dt;
# Variance function \text{Var}(Y\mid X) = V(\mu), where \mu = \mathbb{E}(Y\mid X) is the conditional mean;
# Link function g connecting the conditional mean and the linear predictor through \mu=g(\eta).
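As a concrete instance, the sketch below (hypothetical names; binary response, logit link, dense common grid) fits a GFLM by reducing the centered covariate curves to K eigenbasis scores, so that the linear predictor becomes \eta = \beta_0 + \sum_k \beta_k x_k, and then running iteratively reweighted least squares with V(\mu)=\mu(1-\mu) and \mu = g(\eta) = 1/(1+e^{-\eta}).

import numpy as np

def fit_gflm_logit(Y, X, t, K=5, n_iter=25):
    """GFLM fit with a logit link for a binary response, via truncated scores and IRLS.

    Y: (n,) responses in {0, 1}; X: (n, m) covariate curves on the common grid t.
    """
    dt = t[1] - t[0]
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(Xc.T @ Xc / len(Y) * dt)
    phis = evecs[:, np.argsort(evals)[::-1][:K]] / np.sqrt(dt)
    Z = np.column_stack([np.ones(len(Y)), Xc @ phis * dt])     # intercept plus scores
    coef = np.zeros(K + 1)
    for _ in range(n_iter):                                    # iteratively reweighted least squares
        eta = Z @ coef
        mu = 1.0 / (1.0 + np.exp(-eta))                        # mu = g(eta), logistic link
        W = mu * (1.0 - mu)                                    # variance function V(mu) = mu(1 - mu)
        z = eta + (Y - mu) / np.maximum(W, 1e-10)              # working response
        WZ = Z * W[:, None]
        coef = np.linalg.solve(Z.T @ WZ, Z.T @ (W * z))
    beta0, beta_fun = coef[0], phis @ coef[1:]                 # beta(t) recovered on the grid
    return beta0, beta_fun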


See also

* Functional data analysis
* Functional principal component analysis
* Karhunen–Loève theorem
* Generalized functional linear model
* Stochastic processes


References

* Chen, Hall and Müller (2011). "Single and multiple index functional regression models with nonparametric link". The Annals of Statistics. 39 (3): 1720–1747. doi:10.1214/11-AOS882
* Fan, James and Radchenko (2015). "Functional additive regression". The Annals of Statistics. 43 (5): 2296–2325. doi:10.1214/15-AOS1346
* Yao and Müller (2010). "Functional quadratic regression". Biometrika. 97 (1): 49–64. doi:10.1093/biomet/asp069