Autocorrelation, sometimes known as serial correlation in the
discrete time
In mathematical dynamics, discrete time and continuous time are two alternative frameworks within which variables that evolve over time are modeled.
Discrete time
Discrete time views values of variables as occurring at distinct, separate "po ...
case, is the
correlation of a
signal
In signal processing, a signal is a function that conveys information about a phenomenon. Any quantity that can vary over space or time can be used as a signal to share messages between observers. The '' IEEE Transactions on Signal Processing' ...
with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations of a
random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
as a function of the time lag between them. The analysis of autocorrelation is a mathematical tool for finding repeating patterns, such as the presence of a
periodic signal
A periodic function is a function that repeats its values at regular intervals. For example, the trigonometric functions, which repeat at intervals of 2\pi radians, are periodic functions. Periodic functions are used throughout science to de ...
obscured by
noise
Noise is unwanted sound considered unpleasant, loud or disruptive to hearing. From a physics standpoint, there is no distinction between noise and desired sound, as both are vibrations through a medium, such as air or water. The difference aris ...
, or identifying the
missing fundamental frequency
A harmonic sound is said to have a missing fundamental, suppressed fundamental, or phantom fundamental when its overtones suggest a fundamental frequency but the sound lacks a component at the fundamental frequency itself.
The brain perceives the ...
in a signal implied by its
harmonic
A harmonic is a wave with a frequency that is a positive integer multiple of the '' fundamental frequency'', the frequency of the original periodic signal, such as a sinusoidal wave. The original signal is also called the ''1st harmonic'', the ...
frequencies. It is often used in
signal processing
Signal processing is an electrical engineering subfield that focuses on analyzing, modifying and synthesizing ''signals'', such as sound, images, and scientific measurements. Signal processing techniques are used to optimize transmissions, d ...
for analyzing functions or series of values, such as
time domain
Time domain refers to the analysis of mathematical functions, physical signals or time series of economic or environmental data, with respect to time. In the time domain, the signal or function's value is known for all real numbers, for the cas ...
signals.
Different fields of study define autocorrelation differently, and not all of these definitions are equivalent. In some fields, the term is used interchangeably with
autocovariance
In probability theory and statistics, given a stochastic process, the autocovariance is a function that gives the covariance of the process with itself at pairs of time points. Autocovariance is closely related to the autocorrelation of the process ...
.
Unit root
In probability theory and statistics, a unit root is a feature of some stochastic processes (such as random walks) that can cause problems in statistical inference involving time series models. A linear stochastic process has a unit root if ...
processes,
trend-stationary process
In the statistical analysis of time series, a trend-stationary process is a stochastic process from which an underlying trend (function solely of time) can be removed, leaving a stationary process. The trend does not have to be linear.
Conversely ...
es,
autoregressive process
In statistics, econometrics and signal processing, an autoregressive (AR) model is a representation of a type of random process; as such, it is used to describe certain time-varying processes in nature, economics, etc. The autoregressive model spe ...
es, and
moving average processes are specific forms of processes with autocorrelation.
Auto-correlation of stochastic processes
In
statistics
Statistics (from German: ''Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industri ...
, the autocorrelation of a real or complex
random process
In probability theory and related fields, a stochastic () or random process is a mathematical object usually defined as a family of random variables. Stochastic processes are widely used as mathematical models of systems and phenomena that appe ...
is the
Pearson correlation
In statistics, the Pearson correlation coefficient (PCC, pronounced ) ― also known as Pearson's ''r'', the Pearson product-moment correlation coefficient (PPMCC), the bivariate correlation, or colloquially simply as the correlation coefficient ...
between values of the process at different times, as a function of the two times or of the time lag. Let
be a random process, and
be any point in time (
may be an
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language o ...
for a
discrete-time
In mathematical dynamics, discrete time and continuous time are two alternative frameworks within which variables that evolve over time are modeled.
Discrete time
Discrete time views values of variables as occurring at distinct, separate "po ...
process or a
real number
In mathematics, a real number is a number that can be used to measure a ''continuous'' one-dimensional quantity such as a distance, duration or temperature. Here, ''continuous'' means that values can have arbitrarily small variations. Every r ...
for a
continuous-time
In mathematical dynamics, discrete time and continuous time are two alternative frameworks within which variables that evolve over time are modeled.
Discrete time
Discrete time views values of variables as occurring at distinct, separate "po ...
process). Then
is the value (or
realization) produced by a given
run of the process at time
. Suppose that the process has
mean
There are several kinds of mean in mathematics, especially in statistics. Each mean serves to summarize a given group of data, often to better understand the overall value (magnitude and sign) of a given data set.
For a data set, the ''arithm ...
and
variance
In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers ...
at time
, for each
. Then the definition of the auto-correlation function between times
and
is
[Kun Il Park, Fundamentals of Probability and Stochastic Processes with Applications to Communications, Springer, 2018, ]
where
is the
expected value
In probability theory, the expected value (also called expectation, expectancy, mathematical expectation, mean, average, or first moment) is a generalization of the weighted average. Informally, the expected value is the arithmetic mean of a ...
operator and the bar represents
complex conjugation
In mathematics, the complex conjugate of a complex number is the number with an equal real part and an Imaginary number, imaginary part equal in magnitude but opposite in Sign (mathematics), sign. That is, (if a and b are real, then) the complex ...
. Note that the expectation may not be
well defined.
Subtracting the mean before multiplication yields the auto-covariance function between times
and
:
[
Note that this expression is not well defined for all time series or processes, because the mean may not exist, or the variance may be zero (for a constant process) or infinite (for processes with distribution lacking well-behaved moments, such as certain types of ]power law
In statistics, a power law is a functional relationship between two quantities, where a relative change in one quantity results in a proportional relative change in the other quantity, independent of the initial size of those quantities: one qu ...
).
Definition for wide-sense stationary stochastic process
If is a wide-sense stationary process then the mean and the variance are time-independent, and further the autocovariance function depends only on the lag between and : the autocovariance depends only on the time-distance between the pair of values but not on their position in time. This further implies that the autocovariance and auto-correlation can be expressed as a function of the time-lag, and that this would be an even function
In mathematics, even functions and odd functions are functions which satisfy particular symmetry relations, with respect to taking additive inverses. They are important in many areas of mathematical analysis, especially the theory of power seri ...
of the lag . This gives the more familiar forms for the auto-correlation function[
and the auto-covariance function:
In particular, note that
]
Normalization
It is common practice in some disciplines (e.g. statistics and time series analysis
In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Ex ...
) to normalize the autocovariance function to get a time-dependent Pearson correlation coefficient
In statistics, the Pearson correlation coefficient (PCC, pronounced ) ― also known as Pearson's ''r'', the Pearson product-moment correlation coefficient (PPMCC), the bivariate correlation, or colloquially simply as the correlation coefficient ...
. However, in other disciplines (e.g. engineering) the normalization is usually dropped and the terms "autocorrelation" and "autocovariance" are used interchangeably.
The definition of the auto-correlation coefficient of a stochastic process is[
If the function is well defined, its value must lie in the range ] .