Functional data analysis (FDA) is a branch of statistics that analyses data providing information about curves, surfaces or anything else varying over a continuum. In its most general form, under an FDA framework, each sample element of functional data is considered to be a random function. The physical continuum over which these functions are defined is often time, but may also be spatial location, wavelength, probability, etc. Intrinsically, functional data are infinite dimensional. The high intrinsic dimensionality of these data brings challenges for theory as well as computation, where these challenges vary with how the functional data were sampled. However, the high or infinite dimensional structure of the data is a rich source of information and there are many interesting challenges for research and data analysis.


History

Functional data analysis has roots going back to work by Grenander and Karhunen in the 1940s and 1950s. They considered the decomposition of square-integrable continuous-time stochastic processes into eigencomponents, now known as the Karhunen-Loève decomposition. A rigorous analysis of functional principal components analysis was done in the 1970s by Kleffe, Dauxois and Pousse, including results about the asymptotic distribution of the eigenvalues. More recently, in the 1990s and 2000s, the field has focused more on applications and on understanding the effects of dense and sparse observation schemes. The term "Functional Data Analysis" was coined by James O. Ramsay.


Mathematical formalism

Random functions can be viewed as random elements taking values in a Hilbert space, or as a stochastic process. The former is mathematically convenient, whereas the latter is somewhat more suitable from an applied perspective. These two approaches coincide if the random functions are continuous and a condition called mean-squared continuity is satisfied.


Hilbertian random variables

In the Hilbert space viewpoint, one considers an H-valued random element X, where H is a separable Hilbert space such as the space of square-integrable functions L^2[0,1]. Under the integrability condition \mathbb{E}\|X\|_{L^2}^2 = \mathbb{E}\int_0^1 |X(t)|^2\,dt < \infty, one can define the mean of X as the unique element \mu \in H satisfying

    \mathbb{E}\langle X, h\rangle = \langle \mu, h\rangle, \qquad h \in H.

This formulation is the Pettis integral, but the mean can also be defined as the Bochner integral \mu = \mathbb{E}X. Under the same integrability condition, that \mathbb{E}\|X\|^2_{L^2} is finite, the covariance operator of X is a linear operator \mathcal{C}: H \to H that is uniquely defined by the relation

    \mathcal{C}h = \mathbb{E}[\langle h, X - \mu\rangle (X - \mu)], \qquad h \in H,

or, in tensor form, \mathcal{C} = \mathbb{E}[(X - \mu) \otimes (X - \mu)]. The spectral theorem allows one to decompose X as the Karhunen-Loève decomposition

    X = \mu + \sum_{i=1}^\infty \langle X, \varphi_i\rangle \varphi_i,

where the \varphi_i are eigenvectors of \mathcal{C}, corresponding to the nonnegative eigenvalues of \mathcal{C} in non-increasing order. Truncating this infinite series to a finite order underpins functional principal component analysis.
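On a discrete grid the covariance operator becomes a symmetric matrix, so the eigenpairs of the Karhunen-Loève decomposition can be approximated with an ordinary eigendecomposition. A minimal numpy sketch; the Brownian-motion covariance \Sigma(s,t) = \min(s,t), whose eigenvalues are known in closed form, is used purely as an illustrative example:

```python
import numpy as np

# Grid on [0, 1] and a simple quadrature weight for the integral operator
m = 200
t = np.linspace(0, 1, m)
w = 1.0 / m

# Covariance function of Brownian motion, Sigma(s, t) = min(s, t),
# chosen purely as an illustrative example with known eigenvalues
Sigma = np.minimum.outer(t, t)

# Discretized covariance operator: (C f)(t) = int_0^1 Sigma(s, t) f(s) ds
# becomes the matrix Sigma * w acting on the vector of function values
evals, evecs = np.linalg.eigh(Sigma * w)
evals, evecs = evals[::-1], evecs[:, ::-1]   # non-increasing order
phis = evecs / np.sqrt(w)                    # L^2-normalized eigenfunctions

# For Brownian motion the true eigenvalues are 1 / ((k - 1/2)^2 pi^2)
k = np.arange(1, 6)
true_evals = 1.0 / ((k - 0.5) ** 2 * np.pi ** 2)
```

The discrete eigenvalues approximate the operator eigenvalues up to quadrature error of order 1/m, and the rescaled eigenvectors are orthonormal in the weighted L^2 inner product.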


Stochastic processes

The Hilbertian point of view is mathematically convenient, but abstract; the above considerations do not necessarily even view X as a function at all, since common choices of H such as L^2[0,1] and Sobolev spaces consist of equivalence classes, not functions. The stochastic process perspective views X as a collection of random variables

    \{X(t)\}_{t \in [0,1]}

indexed by the unit interval (or, more generally, an interval \mathcal{T}). The mean and covariance functions are defined pointwise as

    \mu(t) = \mathbb{E}X(t), \qquad \Sigma(s,t) = \operatorname{Cov}(X(s), X(t)), \qquad s,t \in [0,1]

(provided \mathbb{E}X(t)^2 < \infty for all t \in [0,1]). Under mean-square continuity, \mu and \Sigma are continuous functions, and then the covariance function \Sigma defines a covariance operator \mathcal{C}: H \to H given by

    (\mathcal{C}f)(t) = \int_0^1 \Sigma(s,t) f(s)\,ds.

The spectral theorem applies to \mathcal{C}, yielding eigenpairs (\lambda_j, \varphi_j), so that in tensor product notation \mathcal{C} writes

    \mathcal{C} = \sum_{j=1}^\infty \lambda_j \varphi_j \otimes \varphi_j.

Moreover, since \mathcal{C}f is continuous for all f \in H, all the \varphi_j are continuous. Mercer's theorem then states that

    \sup_{s,t \in [0,1]} \left| \Sigma(s,t) - \sum_{j=1}^K \lambda_j \varphi_j(s)\varphi_j(t) \right| \to 0, \qquad K \to \infty.

Finally, under the extra assumption that X has continuous sample paths, namely that with probability one the random function X: [0,1] \to \mathbb{R} is continuous, the Karhunen-Loève expansion above holds for X and the Hilbert space machinery can be subsequently applied. Continuity of sample paths can be shown using the Kolmogorov continuity theorem.


Functional data designs

Functional data are considered as realizations of a stochastic process X(t), t \in [0,1], that is an L^2 process on a bounded and closed interval [0,1] with mean function \mu(t) = \mathbb{E}(X(t)) and covariance function \Sigma(s,t) = \operatorname{Cov}(X(s), X(t)). The realization of the process for the i-th subject is X_i(\cdot), and the sample is assumed to consist of n independent subjects. The sampling schedule may vary across subjects, denoted as T_{i1}, \ldots, T_{iN_i} for the i-th subject. The corresponding i-th observation is denoted as \mathbf{X}_i = (X_{i1}, \ldots, X_{iN_i}), where X_{ij} = X_i(T_{ij}). In addition, the measurement of X_{ij} is assumed to carry random noise \epsilon_{ij} with \mathbb{E}(\epsilon_{ij}) = 0 and \operatorname{Var}(\epsilon_{ij}) = \sigma^2_{ij}, which are independent across i and j.


1. Fully observed functions without noise at arbitrarily dense grid

Measurements Y_{it} = X_i(t) are available for all t \in \mathcal{T}, i = 1, \ldots, n. This is often unrealistic but mathematically convenient. Real-life example: Tecator spectral data.


2. Densely sampled functions with noisy measurements (dense design)

Measurements Y_{ij} = X_i(T_{ij}) + \varepsilon_{ij}, where the T_{ij} are recorded on a regular grid T_{i1}, \ldots, T_{iN_i}, and N_i \to \infty holds for typical functional data. Real-life examples: the Berkeley Growth Study data and stock data.


3. Sparsely sampled functions with noisy measurements (longitudinal data)

Measurements Y_{ij} = X_i(T_{ij}) + \varepsilon_{ij}, where the T_{ij} are random times and their number N_i per subject is random and finite. Real-life example: CD4 count data for AIDS patients.
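The dense and sparse designs above can be illustrated by simulation. The sketch below uses a hypothetical two-component toy process and an assumed noise level purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5          # number of subjects
sigma = 0.1    # measurement noise level (assumed for illustration)

def trajectory(t, a, b):
    # Toy underlying process X_i(t) = a sin(2 pi t) + b cos(2 pi t)
    return a * np.sin(2 * np.pi * t) + b * np.cos(2 * np.pi * t)

# Dense design: a common regular grid with many noisy observations per subject
t_dense = np.linspace(0, 1, 50)
dense = [trajectory(t_dense, *rng.normal(size=2)) + rng.normal(0, sigma, 50)
         for _ in range(n)]

# Sparse (longitudinal) design: a random, finite number of random
# observation times per subject, each observed with noise
sparse = []
for _ in range(n):
    N_i = rng.integers(2, 6)                # few observations per subject
    T_i = np.sort(rng.uniform(0, 1, N_i))   # random observation times
    Y_i = trajectory(T_i, *rng.normal(size=2)) + rng.normal(0, sigma, N_i)
    sparse.append((T_i, Y_i))
```

In the sparse case each subject contributes an irregular pair of time and measurement vectors, which is the data structure that longitudinal FDA methods are designed for.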


Functional principal component analysis

Functional principal component analysis (FPCA) is the most prevalent tool in FDA, partly because FPCA facilitates dimension reduction of the inherently infinite-dimensional functional data to a finite-dimensional random vector of scores. More specifically, dimension reduction is achieved by expanding the underlying observed random trajectories X_i(t) in a functional basis consisting of the eigenfunctions of the covariance operator of X. Consider the covariance operator \mathcal{C}: L^2[0,1] \to L^2[0,1] as in (), which is a compact operator on a Hilbert space. By Mercer's theorem, the kernel of \mathcal{C}, i.e., the covariance function \Sigma(\cdot,\cdot), has the spectral decomposition \Sigma(s,t) = \sum_{k=1}^\infty \lambda_k \varphi_k(s)\varphi_k(t), where the series converges absolutely and uniformly, and the \lambda_k are real-valued nonnegative eigenvalues in descending order with corresponding orthonormal eigenfunctions \varphi_k(t). By the Karhunen–Loève theorem, the FPCA expansion of an underlying random trajectory is X_i(t) = \mu(t) + \sum_{k=1}^\infty A_{ik}\varphi_k(t), where A_{ik} = \int_0^1 (X_i(t) - \mu(t))\varphi_k(t)\,dt are the functional principal components (FPCs), sometimes referred to as scores. The Karhunen–Loève expansion facilitates dimension reduction in the sense that the partial sum converges uniformly, i.e., \sup_{t \in [0,1]} \mathbb{E}[X_i(t) - \mu(t) - \sum_{k=1}^K A_{ik}\varphi_k(t)]^2 \to 0 as K \to \infty, and thus a partial sum with large enough K yields a good approximation to the infinite sum. Thereby, the information in X_i is reduced from infinite-dimensional to a K-dimensional vector A_i = (A_{i1}, \ldots, A_{iK}), with the approximated process

    X_i^{(K)}(t) = \mu(t) + \sum_{k=1}^K A_{ik}\varphi_k(t).

Other popular bases include spline, Fourier, and wavelet bases. Important applications of FPCA include the modes of variation and functional principal component regression.
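The empirical FPCA steps — centering, estimating the covariance surface, eigendecomposing it, and computing scores — can be sketched in numpy on densely observed curves. The toy process with two known components is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 300, 101
t = np.linspace(0, 1, m)
w = 1.0 / m

# Simulate curves with two known components (illustrative toy model)
phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)
A = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])   # scores with sd 2 and 1
X = 1.0 + A[:, :1] * phi1 + A[:, 1:] * phi2          # mean function mu(t) = 1

# Empirical FPCA: center, estimate the covariance surface, eigendecompose
mu_hat = X.mean(axis=0)
Xc = X - mu_hat
Sigma_hat = Xc.T @ Xc / n                 # estimated covariance Sigma(s, t)
lam, vecs = np.linalg.eigh(Sigma_hat * w)
lam, vecs = lam[::-1], vecs[:, ::-1]      # descending eigenvalues
phi_hat = vecs / np.sqrt(w)               # orthonormal eigenfunctions

# FPC scores A_ik = int (X_i - mu) phi_k dt and rank-K reconstruction
K = 2
scores = Xc @ phi_hat[:, :K] * w
X_K = mu_hat + scores @ phi_hat[:, :K].T
```

Here the rank-2 reconstruction is exact because the toy curves genuinely live in a two-dimensional subspace; for real data the truncation error shrinks as K grows, per the uniform-convergence property above.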


Functional linear regression models

Functional linear models can be viewed as an extension of the traditional multivariate linear models that associate vector responses with vector covariates. The traditional linear model with scalar response Y \in \mathbb{R} and vector covariate X \in \mathbb{R}^p can be expressed as

    Y = \beta_0 + \langle X, \beta\rangle + \varepsilon,

where \langle\cdot,\cdot\rangle denotes the inner product in Euclidean space, \beta_0 \in \mathbb{R} and \beta \in \mathbb{R}^p denote the regression coefficients, and \varepsilon is a zero-mean, finite-variance random error (noise). Functional linear models can be divided into two types based on the responses.


Functional regression models with scalar response

Replacing the vector covariate X and the coefficient vector \beta in model () by a centered functional covariate X^c(t) = X(t) - \mu(t) and a coefficient function \beta = \beta(t) for t \in [0,1], and replacing the inner product in Euclidean space by that in the Hilbert space L^2, one arrives at the functional linear model

    Y = \beta_0 + \int_0^1 X^c(t)\beta(t)\,dt + \varepsilon.

The simple functional linear model () can be extended to multiple functional covariates \{X_j\}_{j=1}^p, also including additional vector covariates Z = (Z_1, \ldots, Z_q) with Z_1 = 1, by

    Y = \langle Z, \theta\rangle + \sum_{j=1}^p \int_0^1 X_j^c(t)\beta_j(t)\,dt + \varepsilon,

where \theta \in \mathbb{R}^q is the regression coefficient for Z, the domain of X_j is [0,1], X_j^c is the centered functional covariate given by X_j^c(t) = X_j(t) - \mu_j(t), and \beta_j is the regression coefficient function for X_j^c, for j = 1, \ldots, p. Models () and () have been studied extensively.
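Estimation of the scalar-response model via FPCA reduces to regressing Y on the leading scores, since b_k = \operatorname{Cov}(A_k, Y)/\lambda_k recovers the basis coefficients of \beta(t). A sketch on simulated toy data; the data-generating model and all names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 500, 101
t = np.linspace(0, 1, m)
w = 1.0 / m

# Toy functional covariate and a known coefficient function (illustrative)
phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)
A = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])
X = A[:, :1] * phi1 + A[:, 1:] * phi2            # centered covariate X^c
beta_true = 1.5 * phi1 - 0.5 * phi2
Y = X @ beta_true * w + rng.normal(0, 0.1, n)    # Y = int X^c beta dt + noise

# FPCA of X, then beta(t) = sum_k [Cov(A_k, Y) / lambda_k] phi_k(t)
lam, vecs = np.linalg.eigh((X.T @ X / n) * w)
lam, vecs = lam[::-1], vecs[:, ::-1]
phi_hat = vecs / np.sqrt(w)
K = 2
scores = X @ phi_hat[:, :K] * w
b = (scores * Y[:, None]).mean(axis=0) / lam[:K]
beta_hat = phi_hat[:, :K] @ b
```

The estimate is invariant to the sign ambiguity of the estimated eigenfunctions, since a flipped \varphi_k flips the corresponding b_k as well.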


Functional regression models with functional response

Consider a functional response Y(s) on [0,1] and multiple functional covariates X_j(t), t \in [0,1], j = 1, \ldots, p. Two major models have been considered in this setup. One of these two models, generally referred to as the functional linear model (FLM), can be written as

    Y(s) = \alpha_0(s) + \sum_{j=1}^p \int_0^1 \alpha_j(s,t) X_j^c(t)\,dt + \varepsilon(s), \qquad s \in [0,1],

where \alpha_0(s) is the functional intercept; for j = 1, \ldots, p, X_j^c(t) = X_j(t) - \mu_j(t) is a centered functional covariate on [0,1] and \alpha_j(s,t) is the corresponding functional slope with the same domain; and \varepsilon(s) is usually a random process with mean zero and finite variance. In this case, at any given time s \in [0,1], the value of Y, i.e., Y(s), depends on the entire trajectories of \{X_j^c(t): t \in [0,1]\}_{j=1}^p. Model () has been studied extensively.


Function-on-scalar regression

In particular, taking X_j(\cdot) as a constant function yields a special case of model ():

    Y(s) = \alpha_0(s) + \sum_{j=1}^p X_j \alpha_j(s) + \varepsilon(s), \qquad s \in [0,1],

which is a functional linear model with functional responses and scalar covariates.


Concurrent regression models

This model is given by

    Y(s) = \beta_0(s) + \sum_{j=1}^p \beta_j(s) X_j(s) + \varepsilon(s), \qquad s \in [0,1],

where X_1, \ldots, X_p are functional covariates on [0,1], \beta_0, \beta_1, \ldots, \beta_p are coefficient functions defined on the same interval, and \varepsilon(s) is usually assumed to be a random process with mean zero and finite variance. This model assumes that the value of Y(s) depends only on the current value of \{X_j(s)\}_{j=1}^p and not on the history \{X_j(t): t \le s\}_{j=1}^p or on future values. Hence, it is a "concurrent regression model", also referred to as a "varying-coefficient" model. Various estimation methods have been proposed.
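Because the concurrent model couples Y(s) only to the covariate values at the same s, it can be estimated by a separate least-squares fit at each grid point. A minimal sketch on simulated toy data; the coefficient functions chosen are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 400, 51
s = np.linspace(0, 1, m)

# Toy data following the concurrent model with one functional covariate:
# Y(s) = beta0(s) + beta1(s) X(s) + eps(s)   (illustrative choice)
beta0 = np.sin(np.pi * s)
beta1 = 1.0 + s
X = rng.normal(size=(n, m))
Y = beta0 + beta1 * X + rng.normal(0, 0.2, size=(n, m))

# Concurrent regression: an independent least-squares fit at each point s
beta0_hat = np.empty(m)
beta1_hat = np.empty(m)
for j in range(m):
    design = np.column_stack([np.ones(n), X[:, j]])
    coef, *_ = np.linalg.lstsq(design, Y[:, j], rcond=None)
    beta0_hat[j], beta1_hat[j] = coef
```

In practice the pointwise estimates are typically smoothed across s, which this sketch omits.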


Functional nonlinear regression models

Direct nonlinear extensions of the classical functional linear regression models (FLMs) still involve a linear predictor, but combine it with a nonlinear link function, analogous to the idea of the generalized linear model from the conventional linear model. Developments towards fully nonparametric regression models for functional data encounter problems such as the curse of dimensionality. In order to bypass the "curse" and the metric selection problem, one is motivated to consider nonlinear functional regression models that are subject to some structural constraints but do not overly infringe on flexibility. One desires models that retain polynomial rates of convergence while being more flexible than, say, functional linear models. Such models are particularly useful when diagnostics for the functional linear model indicate lack of fit, which is often encountered in real-life situations. In particular, functional polynomial models, functional single and multiple index models, and functional additive models are three special cases of functional nonlinear regression models.


Functional polynomial regression models

Functional polynomial regression models may be viewed as a natural extension of the functional linear models (FLMs) with scalar responses, analogous to extending the linear regression model to the polynomial regression model. For a scalar response Y and a functional covariate X(\cdot) with domain [0,1] and corresponding centered predictor process X^c, the simplest and most prominent member of the family of functional polynomial regression models is the quadratic functional regression (Yao, F.; Müller, H.G. (2010). "Functional quadratic regression". Biometrika 97(1):49–64), given as follows:

    \mathbb{E}(Y|X) = \alpha + \int_0^1 \beta(t)X^c(t)\,dt + \int_0^1\int_0^1 \gamma(s,t)X^c(s)X^c(t)\,ds\,dt,

where X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot)) is the centered functional covariate, \alpha is a scalar coefficient, and \beta(\cdot) and \gamma(\cdot,\cdot) are coefficient functions with domains [0,1] and [0,1]\times[0,1], respectively. In addition to the parameter function \beta that the above functional quadratic regression model shares with the FLM, it also features a parameter surface \gamma. By analogy to FLMs with scalar responses, estimation of functional polynomial models can be obtained by expanding both the centered covariate X^c and the coefficient functions \beta and \gamma in an orthonormal basis.
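Expanding X^c in an orthonormal basis turns the quadratic functional regression into an ordinary linear regression on the basis scores and their pairwise products, which carry \beta and \gamma respectively. A sketch on toy data where the basis is known; in practice the estimated eigenfunctions from FPCA would be used instead:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 600, 101
t = np.linspace(0, 1, m)
w = 1.0 / m
phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)
A = rng.normal(size=(n, 2))
Xc = A[:, :1] * phi1 + A[:, 1:] * phi2

# Toy quadratic model in the basis coefficients (illustrative):
# E(Y|X) = 1 + 2 A1 + 0.5 A1^2 - A1 A2
Y = 1 + 2 * A[:, 0] + 0.5 * A[:, 0] ** 2 - A[:, 0] * A[:, 1] \
    + rng.normal(0, 0.1, n)

# Expand Xc in the (here, known) orthonormal basis to obtain scores
scores = Xc @ np.column_stack([phi1, phi2]) * w

# Quadratic functional regression = linear regression on the scores
# and their pairwise products (estimating beta and gamma coefficients)
Z = np.column_stack([np.ones(n), scores,
                     scores[:, 0] ** 2, scores[:, 0] * scores[:, 1],
                     scores[:, 1] ** 2])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
```

The fitted coefficients on the linear score columns estimate the basis coefficients of \beta, and those on the product columns estimate the basis coefficients of the surface \gamma.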


Functional single and multiple index models

A functional multiple index model is given below, with symbols having their usual meanings as formerly described:

    \mathbb{E}(Y|X) = g\left(\int_0^1 X^c(t)\beta_1(t)\,dt, \ldots, \int_0^1 X^c(t)\beta_p(t)\,dt\right).

Here g represents an (unknown) general smooth function defined on a p-dimensional domain. The case p = 1 yields a functional single index model, while multiple index models correspond to the case p > 1. However, for p > 1 this model is problematic due to the curse of dimensionality: with p > 1 and relatively small sample sizes, the estimator given by this model often has large variance (Chen, D.; Hall, P.; Müller, H.G. (2011). "Single and multiple index functional regression models with nonparametric link". The Annals of Statistics 39(3):1720–1747).


Functional additive models (FAMs)

For a given orthonormal basis \{\phi_k\}_{k=1}^\infty on L^2[0,1], we can expand X^c(t) = \sum_{k=1}^\infty x_k\phi_k(t) on the domain [0,1]. A functional linear model with scalar responses (see ()) can thus be written as

    \mathbb{E}(Y|X) = \mathbb{E}(Y) + \sum_{k=1}^\infty \beta_k x_k.

One form of FAM is obtained by replacing the linear function of x_k in the above expression (i.e., \beta_k x_k) by a general smooth function f_k, analogous to the extension of multiple linear regression models to additive models, and is expressed as

    \mathbb{E}(Y|X) = \mathbb{E}(Y) + \sum_{k=1}^\infty f_k(x_k),

where f_k satisfies \mathbb{E}(f_k(x_k)) = 0 for k \in \mathbb{N}. This constraint on the general smooth functions f_k ensures identifiability in the sense that the estimates of these additive component functions do not interfere with that of the intercept term \mathbb{E}(Y). Another form of FAM is the continuously additive model, expressed as

    \mathbb{E}(Y|X) = \mathbb{E}(Y) + \int_0^1 g(t, X(t))\,dt

for a bivariate smooth additive surface g: [0,1]\times\mathbb{R} \to \mathbb{R} which is required to satisfy \mathbb{E}(g(t, X(t))) = 0 for all t \in [0,1], in order to ensure identifiability.


Generalized functional linear model

An obvious and direct extension of FLMs with scalar responses (see ()) is to add a link function, leading to a generalized functional linear model (GFLM), in analogy to the generalized linear model (GLM). The three components of the GFLM are:

# Linear predictor \eta = \beta_0 + \int_0^1 X^c(t)\beta(t)\,dt [systematic component];
# Variance function \text{Var}(Y|X) = V(\mu), where \mu = \mathbb{E}(Y|X) is the conditional mean [random component];
# Link function g connecting the conditional mean \mu and the linear predictor \eta through \mu = g(\eta) [systematic component].
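For a binary response with the logistic link, the GFLM reduces, after projecting onto FPC scores, to ordinary logistic regression on the scores. The sketch below uses plain gradient ascent on the log-likelihood as a minimal stand-in for the usual IRLS fitting; the data-generating model is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 500, 101
t = np.linspace(0, 1, m)
w = 1.0 / m
phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
A1 = rng.normal(size=n)
Xc = A1[:, None] * phi1                      # toy centered functional covariate

# Binary response from a GFLM with logit link (illustrative choice):
# P(Y=1|X) = g(eta), eta = beta0 + int Xc beta dt, g = logistic function
eta_true = -0.5 + Xc @ (2.0 * phi1) * w      # beta(t) = 2 phi1(t)
Y = (rng.uniform(size=n) < 1 / (1 + np.exp(-eta_true))).astype(float)

# Reduce to the leading FPC score, then fit logistic regression on it
# by plain gradient ascent (a minimal stand-in for IRLS)
score = Xc @ phi1 * w
Z = np.column_stack([np.ones(n), score])
b = np.zeros(2)
for _ in range(2000):
    p = 1 / (1 + np.exp(-Z @ b))
    b += 0.1 * Z.T @ (Y - p) / n             # ascent on the log-likelihood
```

After convergence, b[0] estimates the intercept \beta_0 and b[1] the coefficient of \beta(t) on the leading eigenfunction.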


Clustering and classification of functional data

For vector-valued multivariate data, k-means partitioning methods and hierarchical clustering are the two main approaches. These classical clustering concepts have been extended to functional data. For clustering of functional data, k-means clustering methods are more popular than hierarchical clustering methods. For k-means clustering on functional data, mean functions are usually regarded as the cluster centers; covariance structures have also been taken into consideration. Besides k-means-type clustering, functional clustering based on mixture models, which is also widely used in clustering vector-valued multivariate data, has been extended to functional data clustering. Furthermore, Bayesian hierarchical clustering plays an important role in the development of model-based functional clustering.

Functional classification assigns a group membership to a new data object based either on functional regression or on functional discriminant analysis. Functional data classification methods based on functional regression models use class levels as responses and the observed functional data and other covariates as predictors. For regression-based functional classification models, functional generalized linear models, or more specifically functional binary regression such as functional logistic regression for binary responses, are commonly used classification approaches. More generally, the generalized functional linear regression model based on the FPCA approach is used. Functional linear discriminant analysis (FLDA) has also been considered as a classification method for functional data, and functional data classification involving density ratios has also been proposed. A study of the asymptotic behavior of the proposed classifiers in the large-sample limit shows that, under certain conditions, the misclassification rate converges to zero, a phenomenon that has been referred to as "perfect classification".
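A minimal functional k-means sketch, with mean curves as cluster centers and the discretized L^2 distance between curves; the two toy groups are an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(6)
m = 60
t = np.linspace(0, 1, m)

# Two toy groups of curves with different mean functions (illustrative)
g1 = np.sin(2 * np.pi * t) + rng.normal(0, 0.3, size=(30, m))
g2 = -np.sin(2 * np.pi * t) + rng.normal(0, 0.3, size=(30, m))
curves = np.vstack([g1, g2])

def kmeans_fd(X, k, iters=20, seed=0):
    """Lloyd's algorithm with mean curves as cluster centers and
    the (squared) discretized L^2 distance between curves."""
    init_rng = np.random.default_rng(seed)
    centers = X[init_rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties
        centers = np.stack([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels, centers

labels, centers = kmeans_fd(curves, 2)
```

In practice the curves are often first projected onto a few FPC scores, so that the clustering runs in a low-dimensional score space rather than on the full grid.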


Time warping


Motivations

In addition to amplitude variation, time variation may also be present in functional data. Time variation occurs when the subject-specific timing of certain events of interest varies among subjects. One classical example is the Berkeley Growth Study data, where the amplitude variation is the growth rate and the time variation explains the difference in children's biological age at which the pubertal and pre-pubertal growth spurts occurred. In the presence of time variation, the cross-sectional mean function may not be an efficient estimate, as peaks and troughs are located randomly and thus meaningful signals may be distorted or hidden. Time warping, also known as curve registration, curve alignment or time synchronization, aims to identify and separate amplitude variation and time variation. If both time and amplitude variation are present, then the observed functional data Y_i can be modeled as Y_i(t) = X_i[h_i^{-1}(t)], t \in [0,1], where X_i \overset{iid}{\sim} X is a latent amplitude function and h_i \overset{iid}{\sim} h is a latent time warping function that corresponds to a cumulative distribution function. The time warping functions h are assumed to be invertible and to satisfy \mathbb{E}(h^{-1}(t)) = t. The simplest family of warping functions to specify phase variation is the linear transformation h(t) = \delta + \gamma t, which warps the time of an underlying template function by a subject-specific shift and scale. A more general class of warping functions comprises diffeomorphisms of the domain to itself; that is, loosely speaking, a class of invertible functions that map the compact domain to itself such that both the function and its inverse are smooth. The set of linear transformations is contained in the set of diffeomorphisms. One challenge in time warping is the identifiability of amplitude and phase variation; specific assumptions are required to break this non-identifiability.


Methods

Earlier approaches include dynamic time warping (DTW), used for applications such as speech recognition. Another traditional method for time warping is landmark registration, which aligns special features such as peak locations to an average location. Other relevant warping methods include pairwise warping, registration using the \mathcal{L}^2 distance, and elastic warping.


Dynamic time warping

The template function is determined through an iterative process: starting from the cross-sectional mean, registration is performed and the cross-sectional mean of the warped curves is recalculated, with convergence expected after a few iterations. DTW minimizes a cost function through dynamic programming. Problems of non-smooth or non-differentiable warps, and of greedy computation in DTW, can be resolved by adding a regularization term to the cost function.
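The dynamic-programming recursion behind DTW can be sketched in a few lines. The example below is a minimal textbook version (absolute difference as the local cost, no regularization term, no step-size constraints), not the regularized variants discussed above:

```python
def dtw_distance(x, y):
    """Dynamic time warping cost between two sequences via dynamic programming.

    Recursion: cost[i][j] = |x[i-1] - y[j-1]|
                            + min(cost[i-1][j], cost[i][j-1], cost[i-1][j-1])
    """
    n, m = len(x), len(y)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])          # local mismatch cost
            cost[i][j] = d + min(cost[i - 1][j],   # stretch x
                                 cost[i][j - 1],   # stretch y
                                 cost[i - 1][j - 1])  # match both
    return cost[n][m]

# A time-shifted copy of a curve aligns at zero cost: warping absorbs the delay.
a = [0, 1, 3, 1, 0]
b = [0, 0, 1, 3, 1, 0]   # same shape, delayed by one step
print(dtw_distance(a, a))  # 0.0
print(dtw_distance(a, b))  # 0.0
```

The unconstrained minimization shown here is what can produce the greedy, non-smooth alignments mentioned above; a regularized DTW adds a penalty on the warp to the cost inside the recursion.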


Landmark registration

Landmark registration (or feature alignment) assumes that well-expressed features are present in all sample curves and uses the locations of such features as a gold standard. Special features such as peak or trough locations in the functions or their derivatives are aligned to their average locations on the template function. The warping function is then introduced through a smooth transformation from the average location to the subject-specific locations. A problem of landmark registration is that the features may be missing or hard to identify due to noise in the data.
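A minimal sketch of the idea, assuming a single landmark per curve (its peak location, taken as already identified) and using the simplest smooth transformation, a piecewise-linear warp of [0,1] that fixes the endpoints and sends each subject-specific peak to the average peak location:

```python
def piecewise_linear_warp(t, landmark, target):
    """Warp [0,1] onto itself: maps `landmark` to `target`, fixes 0 and 1.

    Linear on [0, landmark] and on [landmark, 1]; this is the simplest
    smooth-enough transformation used for single-landmark registration.
    """
    if t <= landmark:
        return t * target / landmark
    return target + (t - landmark) * (1 - target) / (1 - landmark)

def register_to_average(peaks):
    """One warping function per curve, sending its peak to the average peak."""
    target = sum(peaks) / len(peaks)
    # default argument p=p freezes each peak value inside its lambda
    return [lambda t, p=p: piecewise_linear_warp(t, p, target) for p in peaks]

peaks = [0.4, 0.5, 0.6]          # subject-specific peak locations (illustrative)
warps = register_to_average(peaks)
aligned = [w(p) for w, p in zip(warps, peaks)]
# every warped peak now lands at the average location 0.5
```

With several landmarks per curve the same construction uses one linear piece per inter-landmark interval; smoother choices (e.g. monotone splines) avoid kinks at the landmarks.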


Extensions

So far we have considered a scalar-valued stochastic process \{X(t)\}_{t \in [0,1]}, defined on a one-dimensional time domain.


Multidimensional domain of X(\cdot)

The domain of X(\cdot) can be in R^p; for example, the data could be a sample of random surfaces.


Multivariate stochastic process

The range set of the stochastic process may be extended from R to R^p and further to nonlinear manifolds, Hilbert spaces and eventually to metric spaces.


Python packages

There are Python packages for working with functional data that cover its representation, exploratory analysis and preprocessing, as well as tasks such as inference, classification, regression and clustering of functional data.
scikit-fda


R packages

Some packages can handle functional data under both dense and longitudinal designs.
fda
classiFunc








See also

* Functional principal component analysis
* Karhunen–Loève theorem
* Modes of variation
* Functional regression
* Generalized functional linear model
* Stochastic processes
* Lp space
* Variance function


Further reading

* Ramsay, J. O. and Silverman, B. W. (2005) ''Functional Data Analysis'', 2nd ed., New York: Springer.
* Horvath, L. and Kokoszka, P. (2012) ''Inference for Functional Data with Applications'', New York: Springer. ISBN 978-1-4614-3654-6.
* Hsing, T. and Eubank, R. (2015) ''Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators'', Wiley Series in Probability and Statistics, John Wiley & Sons, Ltd.
* Morris, J. (2015) Functional Regression, ''Annual Review of Statistics and Its Application'', Vol. 2, 321–359. https://doi.org/10.1146/annurev-statistics-010814-020413
* Wang et al. (2016) Functional Data Analysis, ''Annual Review of Statistics and Its Application'', Vol. 3, 257–295. https://doi.org/10.1146/annurev-statistics-041715-033624


References
