In signal processing, cross-correlation is a measure of similarity of two series as a function of the displacement of one relative to the other. This is also known as a ''sliding dot product'' or ''sliding inner-product''. It is commonly used for searching a long signal for a shorter, known feature. It has applications in pattern recognition, single particle analysis, electron tomography, averaging, cryptanalysis, and neurophysiology. The cross-correlation is similar in nature to the convolution of two functions. In an autocorrelation, which is the cross-correlation of a signal with itself, there will always be a peak at a lag of zero, and its size will be the signal energy.

In probability and statistics, the term ''cross-correlations'' refers to the correlations between the entries of two random vectors \mathbf{X} and \mathbf{Y}, while the ''correlations'' of a random vector \mathbf{X} are the correlations between the entries of \mathbf{X} itself, those forming the correlation matrix of \mathbf{X}. If each of \mathbf{X} and \mathbf{Y} is a scalar random variable which is realized repeatedly in a time series, then the correlations of the various temporal instances of \mathbf{X} are known as ''autocorrelations'' of \mathbf{X}, and the cross-correlations of \mathbf{X} with \mathbf{Y} across time are temporal cross-correlations. In probability and statistics, the definition of correlation always includes a standardising factor in such a way that correlations have values between −1 and +1.

If X and Y are two independent random variables with probability density functions f and g, respectively, then the probability density of the difference Y - X is formally given by the cross-correlation (in the signal-processing sense) f \star g; however, this terminology is not used in probability and statistics. In contrast, the convolution f * g (equivalent to the cross-correlation of f(t) and g(-t)) gives the probability density function of the sum X + Y.
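The relation between the distribution of Y - X and the cross-correlation can be checked numerically. The sketch below is an illustration only: it swaps the continuous densities for two made-up probability mass functions on {0, 1, 2}, and the names `pmf_difference`, `diff_pmf` and `sum_pmf` are hypothetical.

```python
import numpy as np

# Hypothetical pmfs of two independent integer-valued variables on {0, 1, 2}.
f = np.array([0.2, 0.5, 0.3])   # P(X = 0), P(X = 1), P(X = 2)
g = np.array([0.6, 0.3, 0.1])   # P(Y = 0), P(Y = 1), P(Y = 2)

def pmf_difference(f, g):
    """pmf of Y - X as the cross-correlation (f ⋆ g)[d] = sum_x f[x] * g[x + d].

    The returned array covers d = -(len(f) - 1), ..., len(g) - 1 in that order.
    """
    lags = range(-(len(f) - 1), len(g))
    return np.array([
        sum(f[x] * g[x + d] for x in range(len(f)) if 0 <= x + d < len(g))
        for d in lags
    ])

diff_pmf = pmf_difference(f, g)   # index 0 corresponds to Y - X = -2
sum_pmf = np.convolve(f, g)       # index 0 corresponds to X + Y = 0
```

Both results are valid probability mass functions (non-negative, summing to one); the first is indexed by the possible differences and the second by the possible sums.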


Cross-correlation of deterministic signals

For continuous functions f and g, the cross-correlation is defined as:

(f \star g)(\tau)\ \triangleq \int_{-\infty}^{\infty} \overline{f(t)} g(t+\tau)\,dt

which is equivalent to

(f \star g)(\tau)\ \triangleq \int_{-\infty}^{\infty} \overline{f(t-\tau)} g(t)\,dt

where \overline{f(t)} denotes the complex conjugate of f(t), and \tau is called ''displacement'' or ''lag''. For highly correlated f and g which have a maximum cross-correlation at a particular \tau, a feature in f at t also occurs later in g at t+\tau, hence g could be said to ''lag'' f by \tau.

If f and g are both continuous periodic functions of period T, the integration from -\infty to \infty is replaced by integration over any interval [t_0,t_0+T] of length T:

(f \star g)(\tau)\ \triangleq \int_{t_0}^{t_0+T} \overline{f(t)} g(t + \tau)\,dt

which is equivalent to

(f \star g)(\tau)\ \triangleq \int_{t_0}^{t_0+T} \overline{f(t-\tau)} g(t)\,dt

Similarly, for discrete functions, the cross-correlation is defined as:

(f \star g)[n] \triangleq \sum_{m=-\infty}^{\infty} \overline{f[m]} g[m+n]

which is equivalent to:

(f \star g)[n] \triangleq \sum_{m=-\infty}^{\infty} \overline{f[m-n]} g[m]

For finite discrete functions f,g\in\mathbb{C}^N, the (circular) cross-correlation is defined as:

(f \star g)[n] \triangleq \sum_{m=0}^{N-1} \overline{f[m]} g[(m+n)_{\text{mod}~N}]

which is equivalent to:

(f \star g)[n] \triangleq \sum_{m=0}^{N-1} \overline{f[(m-n)_{\text{mod}~N}]} g[m]

For finite discrete functions f\in\mathbb{C}^N, g\in\mathbb{C}^M, the kernel cross-correlation is defined as:

(f \star g)[n] \triangleq \sum_{m=0}^{N-1} \overline{f[m]} K_g[(m+n)_{\text{mod}~N}]

where K_g = [k(g, T_0(g)), k(g, T_1(g)), \dots, k(g, T_{N-1}(g))] is a vector of kernel functions k(\cdot, \cdot)\colon \mathbb{C}^M \times \mathbb{C}^M \to \mathbb{R} and T_i(\cdot)\colon \mathbb{C}^M \to \mathbb{C}^M is an affine transform. Specifically, T_i(\cdot) can be a circular translation transform, rotation transform, or scale transform, etc. The kernel cross-correlation extends cross-correlation from linear space to kernel space. Cross-correlation is equivariant to translation; kernel cross-correlation is equivariant to any affine transform, including translation, rotation, and scale.
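The circular definition for finite sequences translates almost directly into code. The following is a minimal sketch, assuming NumPy; the function names are hypothetical. It evaluates the sum term by term and, as a cross-check, through the discrete Fourier transform, using the fact that the DFT of the circular cross-correlation is the conjugated DFT of f multiplied by the DFT of g.

```python
import numpy as np

def circular_xcorr(f, g):
    """(f ⋆ g)[n] = sum_m conj(f[m]) * g[(m + n) mod N], for n = 0 .. N-1."""
    f = np.asarray(f, dtype=complex)
    g = np.asarray(g, dtype=complex)
    N = len(f)
    if len(g) != N:
        raise ValueError("f and g must have the same length")
    m = np.arange(N)
    return np.array([np.sum(np.conj(f) * g[(m + n) % N]) for n in range(N)])

def circular_xcorr_fft(f, g):
    """Same quantity through the DFT: IDFT(conj(DFT(f)) * DFT(g))."""
    return np.fft.ifft(np.conj(np.fft.fft(f)) * np.fft.fft(g))

f = np.array([1.0, 2.0, 0.0, -1.0])
g = np.roll(f, 1)            # g[m] = f[(m - 1) mod N]: f delayed by one sample
r = circular_xcorr(f, g)     # largest value sits at lag n = 1
```

The direct version costs O(N²) operations, while the FFT route costs O(N log N), which is why the latter is usually preferred for long sequences.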


Explanation

As an example, consider two real-valued functions f and g differing only by an unknown shift along the x-axis. One can use the cross-correlation to find how much g must be shifted along the x-axis to make it identical to f. The formula essentially slides the g function along the x-axis, calculating the integral of their product at each position. When the functions match, the value of (f\star g) is maximized. This is because when peaks (positive areas) are aligned, they make a large contribution to the integral. Similarly, when troughs (negative areas) align, they also make a positive contribution to the integral because the product of two negative numbers is positive.

With complex-valued functions f and g, taking the conjugate of f ensures that aligned peaks (or aligned troughs) with imaginary components will contribute positively to the integral.

In econometrics, lagged cross-correlation is sometimes referred to as cross-autocorrelation.
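The shift-finding idea described above can be sketched for sampled, real-valued signals as follows. This is an illustration under assumptions, not a reference implementation: `estimate_lag` is a hypothetical helper, and the test signal is made up. It relies on `np.correlate(a, v, mode="full")`, which computes sum_n a[n + k] * conj(v[n]), so passing `g` first and `f` second yields (f \star g) at every lag.

```python
import numpy as np

def estimate_lag(f, g):
    """Return the lag n that maximizes (f ⋆ g)[n] = sum_m f[m] * g[m + n]."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    # np.correlate(a, v)[k] = sum_n a[n + k] * conj(v[n]), hence a = g and v = f.
    xcorr = np.correlate(g, f, mode="full")
    # In "full" mode the output runs over lags -(len(f) - 1) .. len(g) - 1.
    lags = np.arange(-(len(f) - 1), len(g))
    return int(lags[np.argmax(xcorr)])

f = np.zeros(16)
f[4:7] = [1.0, 2.0, 1.0]     # a small bump
g = np.roll(f, 3)            # the same bump, three samples later
lag = estimate_lag(f, g)     # g lags f by this many samples
```

Shifting `g` back by the returned lag (`np.roll(g, -lag)`) recovers `f` in this noise-free case; with noisy data the maximum only indicates the most likely alignment.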


Properties


Cross-correlation of random vectors


Definition

For random vectors \mathbf{X} = (X_1,\ldots,X_m) and \mathbf{Y} = (Y_1,\ldots,Y_n), each containing random elements whose expected value and variance exist, the cross-correlation matrix of \mathbf{X} and \mathbf{Y} is defined by

\operatorname{R}_{\mathbf{X}\mathbf{Y}} \triangleq\ \operatorname{E}\left[\mathbf{X} \mathbf{Y}^{\rm T}\right]

and has dimensions m \times n. Written component-wise:

\operatorname{R}_{\mathbf{X}\mathbf{Y}} = \begin{bmatrix} \operatorname{E}[X_1 Y_1] & \operatorname{E}[X_1 Y_2] & \cdots & \operatorname{E}[X_1 Y_n] \\ \operatorname{E}[X_2 Y_1] & \operatorname{E}[X_2 Y_2] & \cdots & \operatorname{E}[X_2 Y_n] \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{E}[X_m Y_1] & \operatorname{E}[X_m Y_2] & \cdots & \operatorname{E}[X_m Y_n] \end{bmatrix}

where \operatorname{E} denotes the expected value. The random vectors \mathbf{X} and \mathbf{Y} need not have the same dimension, and either might be a scalar value.


Example

For example, if \mathbf{X} = \left( X_1,X_2,X_3 \right) and \mathbf{Y} = \left( Y_1,Y_2 \right) are random vectors, then \operatorname{R}_{\mathbf{X}\mathbf{Y}} is a 3 \times 2 matrix whose (i,j)-th entry is \operatorname{E}[X_i Y_j].
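In practice the expectation is rarely known and is replaced by an average over observed samples. The sketch below assumes NumPy and paired draws stored row-wise; `cross_correlation_matrix` is a hypothetical helper and the Gaussian data are made up purely to produce a 3 \times 2 result like the one in the example.

```python
import numpy as np

def cross_correlation_matrix(X, Y):
    """Estimate R_XY = E[X Y^T] from paired samples.

    X has shape (K, m) and Y has shape (K, n); row k holds the k-th joint draw.
    Entry (i, j) of the result is the sample mean of X_i * Y_j.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y need the same number of samples")
    return X.T @ Y / X.shape[0]

rng = np.random.default_rng(0)       # arbitrary seed, for repeatability only
X = rng.normal(size=(1000, 3))       # hypothetical draws of (X_1, X_2, X_3)
Y = rng.normal(size=(1000, 2))       # hypothetical draws of (Y_1, Y_2)
R = cross_correlation_matrix(X, Y)   # shape (3, 2)
```

Note that no mean is subtracted here: this estimates the cross-correlation matrix as defined above, not a covariance matrix.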


Definition for complex random vectors

If \mathbf{Z} = (Z_1,\ldots,Z_m) and \mathbf{W} = (W_1,\ldots,W_n) are complex random vectors, each containing random variables whose expected value and variance exist, the cross-correlation matrix of \mathbf{Z} and \mathbf{W} is defined by

\operatorname{R}_{\mathbf{Z}\mathbf{W}} \triangleq\ \operatorname{E}[\mathbf{Z} \mathbf{W}^{\rm H}]

where {}^{\rm H} denotes Hermitian transposition.
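For complex data the only change in a sample estimate is the conjugate implied by the Hermitian transpose. A minimal sketch, again assuming NumPy and row-wise samples, with a hypothetical function name and a tiny hand-made data set chosen so that the effect of the conjugate is visible:

```python
import numpy as np

def complex_cross_correlation_matrix(Z, W):
    """Estimate R_ZW = E[Z W^H]; entry (i, j) is the sample mean of Z_i * conj(W_j)."""
    Z = np.asarray(Z, dtype=complex)
    W = np.asarray(W, dtype=complex)
    if Z.shape[0] != W.shape[0]:
        raise ValueError("Z and W need the same number of samples")
    return Z.T @ W.conj() / Z.shape[0]

# Two identical samples of a 2-vector Z and a 1-vector W (illustrative values).
Z = np.array([[1j, 1.0], [1j, 1.0]])
W = np.array([[1j], [1j]])
R = complex_cross_correlation_matrix(Z, W)   # [[1], [-1j]]
```

Without the conjugate, the first entry would come out as 1j * 1j = -1 instead of 1, even though Z_1 and W_1 are identical.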


Cross-correlation of stochastic processes

In time series analysis and statistics, the cross-correlation of a pair of random processes is the correlation between values of the processes at different times, as a function of the two times. Let (X_t, Y_t) be a pair of random processes, and t be any point in time (t may be an integer for a discrete-time process or a real number for a continuous-time process). Then X_t is the value (or realization) produced by a given run of the process at time t.


Cross-correlation function

Suppose that the processes have means \mu_X(t) and \mu_Y(t) and variances \sigma_X^2(t) and \sigma_Y^2(t) at time t, for each t. Then the definition of the cross-correlation between times t_1 and t_2 is

\operatorname{R}_{XY}(t_1, t_2) \triangleq\ \operatorname{E}\left[X_{t_1} \overline{Y_{t_2}}\right]

where \operatorname{E} is the expected value operator. Note that this expression may not be defined.


Cross-covariance function

Subtracting the mean before multiplication yields the cross-covariance between times t_1 and t_2:

\operatorname{K}_{XY}(t_1, t_2) \triangleq\ \operatorname{E}\left[\left(X_{t_1} - \mu_X(t_1)\right)\overline{\left(Y_{t_2} - \mu_Y(t_2)\right)}\right]

Note that this expression is not well-defined for all time series or processes, because the mean or variance may not exist.


Definition for wide-sense stationary stochastic process

Let (X_t, Y_t) represent a pair of stochastic processes that are jointly wide-sense stationary. Then the cross-covariance function and the cross-correlation function are given as follows.


Cross-correlation function

\operatorname{R}_{XY}(\tau) \triangleq\ \operatorname{E}\left[X_t \overline{Y_{t+\tau}}\right]

or equivalently

\operatorname{R}_{XY}(\tau) = \operatorname{E}\left[X_{t-\tau} \overline{Y_{t}}\right]


Cross-covariance function

\operatorname{K}_{XY}(\tau) \triangleq\ \operatorname{E}\left[\left(X_t - \mu_X\right)\overline{\left(Y_{t+\tau} - \mu_Y\right)}\right]

or equivalently

\operatorname{K}_{XY}(\tau) = \operatorname{E}\left[\left(X_{t-\tau} - \mu_X\right)\overline{\left(Y_{t} - \mu_Y\right)}\right]

where \mu_X and \sigma_X are the mean and standard deviation of the process (X_t), which are constant over time due to stationarity, and similarly \mu_Y and \sigma_Y for (Y_t). \operatorname{E}[\ ] indicates the expected value. That the cross-covariance and cross-correlation are independent of t is precisely the additional information (beyond being individually wide-sense stationary) conveyed by the requirement that (X_t, Y_t) are ''jointly'' wide-sense stationary.

The cross-correlation of a pair of jointly wide-sense stationary stochastic processes can be estimated by averaging the product of samples measured from one process and samples measured from the other (and its time shifts). The samples included in the average can be an arbitrary subset of all the samples in the signal (e.g., samples within a finite time window or a sub-sampling of one of the signals). For a large number of samples, the average converges to the true cross-correlation.
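The averaging estimator can be sketched for a single pair of sampled realizations as below. This is one possible choice among many: it averages over every time index for which both samples exist, and the helper name, the seed and the synthetic delayed-noise data are all assumptions made for illustration.

```python
import numpy as np

def estimate_cross_correlation(x, y, tau):
    """Average x[t] * conj(y[t + tau]) over every t for which both samples exist."""
    x = np.asarray(x)
    y = np.asarray(y)
    n = min(len(x), len(y))
    if abs(tau) >= n:
        raise ValueError("lag is too large for the available samples")
    if tau >= 0:
        a, b = x[: n - tau], y[tau:n]      # pairs (x[t], y[t + tau]), t = 0 .. n-tau-1
    else:
        a, b = x[-tau:n], y[: n + tau]     # pairs (x[t], y[t + tau]), t = -tau .. n-1
    return np.mean(a * np.conj(b))

rng = np.random.default_rng(1)                 # arbitrary seed
x = rng.normal(size=5000)                      # hypothetical white-noise realization
y = np.concatenate([np.zeros(5), x[:-5]])      # y[t] = x[t - 5]
taus = range(-10, 11)
estimates = [estimate_cross_correlation(x, y, tau).real for tau in taus]
best_tau = list(taus)[int(np.argmax(estimates))]
```

For this synthetic pair the estimate is close to the noise variance at the true delay and close to zero elsewhere, so the largest estimate is expected at a lag of 5.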


Normalization

It is common practice in some disciplines (e.g. statistics and time series analysis) to normalize the cross-correlation function to get a time-dependent Pearson correlation coefficient. However, in other disciplines (e.g. engineering) the normalization is usually dropped and the terms "cross-correlation" and "cross-covariance" are used interchangeably.

The definition of the normalized cross-correlation of a stochastic process is

\rho_{XY}(t_1, t_2) = \frac{\operatorname{K}_{XY}(t_1, t_2)}{\sigma_X(t_1)\sigma_Y(t_2)} = \frac{\operatorname{E}\left[\left(X_{t_1} - \mu_X(t_1)\right)\overline{\left(Y_{t_2} - \mu_Y(t_2)\right)}\right]}{\sigma_X(t_1)\sigma_Y(t_2)}

If the function \rho_{XY} is well-defined, its value must lie in the range [-1,1], with 1 indicating perfect correlation and −1 indicating perfect anti-correlation.

For jointly wide-sense stationary stochastic processes, the definition is

\rho_{XY}(\tau) = \frac{\operatorname{K}_{XY}(\tau)}{\sigma_X \sigma_Y} = \frac{\operatorname{E}\left[\left(X_t - \mu_X\right)\overline{\left(Y_{t+\tau} - \mu_Y\right)}\right]}{\sigma_X \sigma_Y}

The normalization is important both because the interpretation of the autocorrelation as a correlation provides a scale-free measure of the strength of statistical dependence, and because the normalization has an effect on the statistical properties of the estimated autocorrelations.
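A sample version of the normalized cross-correlation at a single lag can be sketched as follows for real-valued series. As a design choice that is an assumption of this sketch rather than part of the definition, the means and standard deviations are taken over the overlapping segments only, which makes the result an ordinary Pearson coefficient and keeps it inside [-1, 1]; the function name and test signal are hypothetical.

```python
import numpy as np

def normalized_cross_correlation(x, y, tau):
    """Pearson coefficient of the pairs (x[t], y[t + tau]) for real-valued series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = min(len(x), len(y))
    if tau >= 0:
        a, b = x[: n - tau], y[tau:n]
    else:
        a, b = x[-tau:n], y[: n + tau]
    a = a - a.mean()                      # remove the segment means
    b = b - b.mean()
    # Undefined (division by zero) if either segment is constant.
    return float(np.sum(a * b) / np.sqrt(np.sum(a**2) * np.sum(b**2)))

t = np.arange(200)
x = np.sin(0.2 * t)
y = 3.0 * np.sin(0.2 * (t - 4)) + 10.0    # x delayed by 4, rescaled and offset
rho = normalized_cross_correlation(x, y, 4)
```

Because the coefficient is invariant to offset and positive scaling, `rho` is 1 up to rounding even though `y` has a different amplitude and mean than `x`.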


Properties


Symmetry property

The cross-correlation function of two stochastic processes has the following symmetry property (Kun Il Park, ''Fundamentals of Probability and Stochastic Processes with Applications to Communications'', Springer, 2018, ISBN 978-3-319-68074-3):

\operatorname{R}_{XY}(t_1, t_2) = \overline{\operatorname{R}_{YX}(t_2, t_1)}

Correspondingly, for jointly WSS processes:

\operatorname{R}_{XY}(\tau) = \overline{\operatorname{R}_{YX}(-\tau)}
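The symmetry follows in a few lines from the definition of the cross-correlation function and the fact that conjugation commutes with expectation. The derivation below is a sketch that assumes the second moments involved exist:

```latex
\begin{aligned}
\operatorname{R}_{XY}(t_1, t_2)
  &= \operatorname{E}\left[X_{t_1}\,\overline{Y_{t_2}}\right] \\
  &= \overline{\operatorname{E}\left[Y_{t_2}\,\overline{X_{t_1}}\right]} \\
  &= \overline{\operatorname{R}_{YX}(t_2, t_1)}
\end{aligned}
```

For jointly WSS processes, setting \tau = t_2 - t_1 turns the argument pair (t_2, t_1) into the lag -\tau, which gives the second form of the property.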


Time delay analysis

Cross-correlations are useful for determining the time delay between two signals, e.g., for determining time delays for the propagation of acoustic signals across a microphone array. After calculating the cross-correlation between the two signals, the maximum (or minimum if the signals are negatively correlated) of the cross-correlation function indicates the point in time where the signals are best aligned; i.e., the time delay between the two signals is determined by the argument of the maximum, or arg max, of the cross-correlation, as in

\tau_\mathrm{delay}=\underset{t \in \mathbb{R}}{\operatorname{arg\,max}}((f \star g)(t))


Terminology in image processing


Zero-normalized cross-correlation (ZNCC)

For image-processing applications in which the brightness of the image and template can vary due to lighting and exposure conditions, the images can be first normalized. This is typically done at every step by subtracting the mean and dividing by the standard deviation. That is, the cross-correlation of a template t(x,y) with a subimage f(x,y) is

\frac{1}{n\,\sigma_f \sigma_t} \sum_{x,y}\left(f(x,y) - \mu_f \right)\left(t(x,y) - \mu_t \right)

where n is the number of pixels in t(x,y) and f(x,y), \mu_f is the average of f and \sigma_f is the standard deviation of f; \mu_t and \sigma_t are defined in the same way for t.

In functional analysis terms, this can be thought of as the dot product of two normalized vectors. That is, if

F(x,y) = f(x,y) - \mu_f

and

T(x,y) = t(x,y) - \mu_t

then the above sum is equal to

\left\langle\frac{F}{\|F\|},\frac{T}{\|T\|}\right\rangle

where \langle\cdot,\cdot\rangle is the inner product and \|\cdot\| is the ''L''² norm. Cauchy–Schwarz then implies that ZNCC has a range of [-1, 1].

Thus, if f and t are real matrices, their normalized cross-correlation equals the cosine of the angle between the unit vectors F and T, being thus 1 if and only if F equals T multiplied by a positive scalar.

Normalized correlation is one of the methods used for template matching, a process used for finding instances of a pattern or object within an image. It is also the 2-dimensional version of the Pearson product-moment correlation coefficient.
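The inner-product form of ZNCC is straightforward to compute for a template and a subimage of the same size. The following is a minimal sketch assuming NumPy and real-valued arrays; `zncc` and the small test patch are hypothetical, and a full template-matching routine would additionally slide the template over every candidate position.

```python
import numpy as np

def zncc(f, t):
    """Zero-normalized cross-correlation of two equally sized real arrays."""
    f = np.asarray(f, dtype=float)
    t = np.asarray(t, dtype=float)
    if f.shape != t.shape:
        raise ValueError("subimage and template must have the same shape")
    F = f - f.mean()                      # F(x, y) = f(x, y) - mu_f
    T = t - t.mean()                      # T(x, y) = t(x, y) - mu_t
    denom = np.linalg.norm(F) * np.linalg.norm(T)
    if denom == 0.0:
        raise ValueError("ZNCC is undefined for a constant image or template")
    return float(np.sum(F * T) / denom)   # <F/||F||, T/||T||>

patch = np.array([[1.0, 2.0], [3.0, 5.0]])
brighter = 2.0 * patch + 7.0              # same pattern, different gain and offset
score = zncc(patch, brighter)
```

Here `score` is 1 up to rounding, which illustrates the invariance to brightness and contrast changes that motivates the normalization; an inverted patch (`-patch`) scores −1.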


Normalized cross-correlation (NCC)

NCC is similar to ZNCC, the only difference being that the local mean intensity is not subtracted:

\frac{1}{n\,\sigma_f \sigma_t} \sum_{x,y} f(x,y) t(x,y)


Nonlinear systems

Caution must be applied when using the cross-correlation function, which assumes Gaussian variance, for nonlinear systems. In certain circumstances, which depend on the properties of the input, the cross-correlation between the input and output of a system with nonlinear dynamics can be completely blind to certain nonlinear effects. This problem arises because some quadratic moments can equal zero, which can incorrectly suggest that there is little "correlation" (in the sense of statistical dependence) between two signals when in fact the two signals are strongly related by nonlinear dynamics.
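A small numerical illustration of this blindness, under assumptions chosen for simplicity: the "system" is a static squaring nonlinearity rather than one with dynamics, and the input is a short symmetric sequence. The output is completely determined by the input, yet the zero-lag correlation coefficient vanishes.

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # input, symmetric about zero
y = x**2                                    # output of a purely quadratic system

# Zero-lag Pearson correlation between input and output.
linear_corr = np.corrcoef(x, y)[0, 1]

# Correlating the output with the squared input reveals the dependence.
quadratic_corr = np.corrcoef(x**2, y)[0, 1]
```

`linear_corr` is zero because the positive and negative halves of the input cancel in the product with the centred output, while `quadratic_corr` is 1. Detecting such relationships generally needs higher-order or nonlinear dependence measures rather than the plain cross-correlation.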


See also

* Autocorrelation
* Autocovariance
* Coherence
* Convolution
* Correlation
* Correlation function
* Cross-correlation matrix
* Cross-covariance
* Cross-spectrum
* Digital image correlation
* Phase correlation
* Scaled correlation
* Spectral density
* Wiener–Khinchin theorem


External links


* Cross Correlation from Mathworld
* http://scribblethink.org/Work/nvisionInterface/nip.html
* http://www.staff.ncl.ac.uk/oliver.hinton/eee305/Chapter6.pdf