Autocorrelation, sometimes known as serial correlation in the discrete time case, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations of a random variable as a function of the time lag between them. The analysis of autocorrelation is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time domain signals.

Different fields of study define autocorrelation differently, and not all of these definitions are equivalent. In some fields, the term is used interchangeably with autocovariance.

Unit root processes, trend-stationary processes, autoregressive processes, and moving average processes are specific forms of processes with autocorrelation.


Auto-correlation of stochastic processes

In statistics, the autocorrelation of a real or complex random process is the Pearson correlation between values of the process at different times, as a function of the two times or of the time lag. Let \{X_t\} be a random process, and t be any point in time (t may be an integer for a discrete-time process or a real number for a continuous-time process). Then X_t is the value (or realization) produced by a given run of the process at time t. Suppose that the process has mean \mu_t and variance \sigma_t^2 at time t, for each t. Then the definition of the auto-correlation function between times t_1 and t_2 is

\operatorname{R}_{XX}(t_1,t_2) = \operatorname{E}\left[X_{t_1} \overline{X_{t_2}}\right]

(Kun Il Park, Fundamentals of Probability and Stochastic Processes with Applications to Communications, Springer, 2018) where \operatorname{E} is the expected value operator and the bar represents complex conjugation. Note that the expectation may not be well defined.

Subtracting the mean before multiplication yields the auto-covariance function between times t_1 and t_2:

\operatorname{K}_{XX}(t_1,t_2) = \operatorname{E}\left[(X_{t_1} - \mu_{t_1})\overline{(X_{t_2} - \mu_{t_2})}\right] = \operatorname{E}\left[X_{t_1} \overline{X_{t_2}}\right] - \mu_{t_1}\overline{\mu_{t_2}}

Note that this expression is not well defined for all time series or processes, because the mean may not exist, or the variance may be zero (for a constant process) or infinite (for processes with distribution lacking well-behaved moments, such as certain types of power law).


Definition for wide-sense stationary stochastic process

If \{X_t\} is a wide-sense stationary process then the mean \mu and the variance \sigma^2 are time-independent, and further the autocovariance function depends only on the lag between t_1 and t_2: the autocovariance depends only on the time-distance between the pair of values but not on their position in time. This further implies that the autocovariance and auto-correlation can be expressed as a function of the time lag, and that this would be an even function of the lag \tau = t_2 - t_1. This gives the more familiar form for the auto-correlation function

\operatorname{R}_{XX}(\tau) = \operatorname{E}\left[X_{t+\tau} \overline{X_t}\right]

and the auto-covariance function:

\operatorname{K}_{XX}(\tau) = \operatorname{E}\left[(X_{t+\tau} - \mu)\overline{(X_t - \mu)}\right] = \operatorname{E}\left[X_{t+\tau} \overline{X_t}\right] - \mu\overline{\mu}

In particular, note that \operatorname{K}_{XX}(0) = \sigma^2.


Normalization

It is common practice in some disciplines (e.g. statistics and time series analysis) to normalize the autocovariance function to get a time-dependent Pearson correlation coefficient. However, in other disciplines (e.g. engineering) the normalization is usually dropped and the terms "autocorrelation" and "autocovariance" are used interchangeably.

The definition of the auto-correlation coefficient of a stochastic process is

\rho_{XX}(t_1,t_2) = \frac{\operatorname{K}_{XX}(t_1,t_2)}{\sigma_{t_1}\sigma_{t_2}} = \frac{\operatorname{E}\left[(X_{t_1} - \mu_{t_1})\overline{(X_{t_2} - \mu_{t_2})}\right]}{\sigma_{t_1}\sigma_{t_2}}.

If the function \rho_{XX} is well defined, its value must lie in the range [-1,1], with 1 indicating perfect correlation and −1 indicating perfect anti-correlation.

For a wide-sense stationary (WSS) process, the definition is

\rho_{XX}(\tau) = \frac{\operatorname{K}_{XX}(\tau)}{\sigma^2} = \frac{\operatorname{E}\left[(X_{t+\tau} - \mu)\overline{(X_t - \mu)}\right]}{\sigma^2}.

The normalization is important both because the interpretation of the autocorrelation as a correlation provides a scale-free measure of the strength of statistical dependence, and because the normalization has an effect on the statistical properties of the estimated autocorrelations.
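As a minimal sketch of this normalization for a finite sample (pure Python; the helper name `autocorr_coeff` is chosen here, not standard), dividing the sample autocovariance at each lag by the lag-0 autocovariance yields coefficients bounded by 1 in absolute value:

```python
from statistics import mean

def autocorr_coeff(x, lag):
    """Sample autocorrelation coefficient rho(lag): the sample autocovariance
    at `lag` divided by the sample variance (the lag-0 autocovariance).
    Both sums are divided by n, the periodogram-style convention."""
    n = len(x)
    m = mean(x)
    cov = sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag)) / n
    var = sum((v - m) ** 2 for v in x) / n
    return cov / var

# A period-4 toy series; rho(0) = 1 by construction and rho peaks again near lag 4.
x = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0]
print(autocorr_coeff(x, 0))  # exactly 1.0
print(all(abs(autocorr_coeff(x, k)) <= 1 for k in range(len(x))))
```

Dividing both sums by the same n (rather than by n − lag in the numerator) is what guarantees the bound \left|\rho\right| \le 1 here.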


Properties


Symmetry property

The fact that the auto-correlation function \operatorname{R}_{XX} is an even function can be stated as

\operatorname{R}_{XX}(t_1,t_2) = \overline{\operatorname{R}_{XX}(t_2,t_1)}

respectively for a WSS process:

\operatorname{R}_{XX}(\tau) = \overline{\operatorname{R}_{XX}(-\tau)}.


Maximum at zero

For a WSS process:

\left| \operatorname{R}_{XX}(\tau) \right| \leq \operatorname{R}_{XX}(0)

Notice that \operatorname{R}_{XX}(0) is always real.


Cauchy–Schwarz inequality

The Cauchy–Schwarz inequality for stochastic processes:

\left| \operatorname{R}_{XX}(t_1,t_2) \right|^2 \leq \operatorname{E}\left[|X_{t_1}|^2\right] \operatorname{E}\left[|X_{t_2}|^2\right]


Autocorrelation of white noise

The autocorrelation of a continuous-time white noise signal will have a strong peak (represented by a Dirac delta function) at \tau=0 and will be exactly 0 for all other \tau.


Wiener–Khinchin theorem

The Wiener–Khinchin theorem relates the autocorrelation function \operatorname{R}_{XX} to the power spectral density S_{XX} via the Fourier transform:

\operatorname{R}_{XX}(\tau) = \int_{-\infty}^\infty S_{XX}(f) e^{i 2 \pi f \tau} \, df

S_{XX}(f) = \int_{-\infty}^\infty \operatorname{R}_{XX}(\tau) e^{-i 2 \pi f \tau} \, d\tau.

For real-valued functions, the symmetric autocorrelation function has a real symmetric transform, so the Wiener–Khinchin theorem can be re-expressed in terms of real cosines only:

\operatorname{R}_{XX}(\tau) = \int_{-\infty}^\infty S_{XX}(f) \cos(2 \pi f \tau) \, df

S_{XX}(f) = \int_{-\infty}^\infty \operatorname{R}_{XX}(\tau) \cos(2 \pi f \tau) \, d\tau.


Auto-correlation of random vectors

The (potentially time-dependent) auto-correlation matrix (also called second moment) of a (potentially time-dependent) random vector \mathbf{X} = (X_1,\ldots,X_n)^{\rm T} is an n \times n matrix containing as elements the autocorrelations of all pairs of elements of the random vector \mathbf{X}. The autocorrelation matrix is used in various digital signal processing algorithms.

For a random vector \mathbf{X} = (X_1,\ldots,X_n)^{\rm T} containing random elements whose expected value and variance exist, the auto-correlation matrix is defined by

\operatorname{R}_{\mathbf{X}\mathbf{X}} \triangleq \operatorname{E}\left[\mathbf{X} \mathbf{X}^{\rm T}\right]

(Papoulis, Athanasios, ''Probability, Random Variables and Stochastic Processes'', McGraw-Hill, 1991) where {}^{\rm T} denotes transposition and the matrix has dimensions n \times n.

Written component-wise:

\operatorname{R}_{\mathbf{X}\mathbf{X}} =
\begin{bmatrix}
\operatorname{E}[X_1 X_1] & \operatorname{E}[X_1 X_2] & \cdots & \operatorname{E}[X_1 X_n] \\
\operatorname{E}[X_2 X_1] & \operatorname{E}[X_2 X_2] & \cdots & \operatorname{E}[X_2 X_n] \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{E}[X_n X_1] & \operatorname{E}[X_n X_2] & \cdots & \operatorname{E}[X_n X_n]
\end{bmatrix}

If \mathbf{Z} is a complex random vector, the autocorrelation matrix is instead defined by

\operatorname{R}_{\mathbf{Z}\mathbf{Z}} \triangleq \operatorname{E}\left[\mathbf{Z} \mathbf{Z}^{\rm H}\right].

Here {}^{\rm H} denotes Hermitian transposition.

For example, if \mathbf{X} = (X_1,X_2,X_3)^{\rm T} is a random vector, then \operatorname{R}_{\mathbf{X}\mathbf{X}} is a 3 \times 3 matrix whose (i,j)-th entry is \operatorname{E}[X_i X_j].


Properties of the autocorrelation matrix

* The autocorrelation matrix is a Hermitian matrix for complex random vectors and a symmetric matrix for real random vectors.
* The autocorrelation matrix is a positive semidefinite matrix, i.e. \mathbf{a}^{\rm T} \operatorname{R}_{\mathbf{X}\mathbf{X}} \mathbf{a} \ge 0 \quad \text{for all } \mathbf{a} \in \mathbb{R}^n for a real random vector, and respectively \mathbf{a}^{\rm H} \operatorname{R}_{\mathbf{Z}\mathbf{Z}} \mathbf{a} \ge 0 \quad \text{for all } \mathbf{a} \in \mathbb{C}^n in case of a complex random vector.
* All eigenvalues of the autocorrelation matrix are real and non-negative.
* The ''auto-covariance matrix'' is related to the autocorrelation matrix as follows: \operatorname{K}_{\mathbf{X}\mathbf{X}} = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{X} - \operatorname{E}[\mathbf{X}])^{\rm T}\right] = \operatorname{R}_{\mathbf{X}\mathbf{X}} - \operatorname{E}[\mathbf{X}] \operatorname{E}[\mathbf{X}]^{\rm T}. Respectively for complex random vectors: \operatorname{K}_{\mathbf{Z}\mathbf{Z}} = \operatorname{E}\left[(\mathbf{Z} - \operatorname{E}[\mathbf{Z}])(\mathbf{Z} - \operatorname{E}[\mathbf{Z}])^{\rm H}\right] = \operatorname{R}_{\mathbf{Z}\mathbf{Z}} - \operatorname{E}[\mathbf{Z}] \operatorname{E}[\mathbf{Z}]^{\rm H}.


Auto-correlation of deterministic signals

In signal processing, the above definition is often used without the normalization, that is, without subtracting the mean and dividing by the variance. When the autocorrelation function is normalized by mean and variance, it is sometimes referred to as the autocorrelation coefficient or autocovariance function.


Auto-correlation of continuous-time signal

Given a signal f(t), the continuous autocorrelation R_{ff}(\tau) is most often defined as the continuous cross-correlation integral of f(t) with itself, at lag \tau:

R_{ff}(\tau) = \int_{-\infty}^\infty f(t+\tau)\overline{f(t)} \, dt

where \overline{f(t)} represents the complex conjugate of f(t). Note that the parameter t in the integral is a dummy variable and is only necessary to calculate the integral. It has no specific meaning.


Auto-correlation of discrete-time signal

The discrete autocorrelation R at lag \ell for a discrete-time signal y(n) is

R_{yy}(\ell) = \sum_{n} y(n)\,\overline{y(n-\ell)}

The above definitions work for signals that are square integrable, or square summable, that is, of finite energy. Signals that "last forever" are treated instead as random processes, in which case different definitions are needed, based on expected values. For wide-sense-stationary random processes, the autocorrelations are defined as

\begin{aligned} R_{ff}(\tau) &= \operatorname{E}\left[f(t)\overline{f(t-\tau)}\right] \\ R_{yy}(\ell) &= \operatorname{E}\left[y(n)\,\overline{y(n-\ell)}\right]. \end{aligned}

For processes that are not stationary, these will also be functions of t, or n.

For processes that are also ergodic, the expectation can be replaced by the limit of a time average. The autocorrelation of an ergodic process is sometimes defined as or equated to

\begin{aligned} R_{ff}(\tau) &= \lim_{T \to \infty} \frac{1}{T} \int_0^T f(t+\tau)\overline{f(t)} \, dt \\ R_{yy}(\ell) &= \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} y(n)\,\overline{y(n-\ell)}. \end{aligned}

These definitions have the advantage that they give sensible well-defined single-parameter results for periodic functions, even when those functions are not the output of stationary ergodic processes.

Alternatively, signals that ''last forever'' can be treated by a short-time autocorrelation function analysis, using finite time integrals. (See short-time Fourier transform for a related process.)
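For a periodic sequence, the ergodic time-average definition reduces to an average over a single period, with indices wrapping around. A small sketch (pure Python; the function name is chosen here for illustration):

```python
import math

def time_avg_autocorr(y, lag):
    """Time-average autocorrelation R(lag) = (1/N) * sum_n y(n) * y(n - lag),
    treating the length-N sequence y as one full period (indices wrap)."""
    n = len(y)
    return sum(y[i] * y[(i - lag) % n] for i in range(n)) / n

# One period of y(n) = cos(2*pi*n/8):
y = [math.cos(2 * math.pi * n / 8) for n in range(8)]
print(round(time_avg_autocorr(y, 0), 6))  # 0.5  (the time average of cos^2)
print(round(time_avg_autocorr(y, 4), 6))  # -0.5 (half a period out of phase)
print(round(time_avg_autocorr(y, 8), 6))  # 0.5  (R has the same period as y)
```

This illustrates two properties stated elsewhere in the article: the autocorrelation of a periodic signal is periodic with the same period, and it peaks at zero lag.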


Definition for periodic signals

If f is a continuous periodic function of period T, the integration from -\infty to \infty is replaced by integration over any interval [t_0, t_0+T] of length T:

R_{ff}(\tau) \triangleq \int_{t_0}^{t_0+T} f(t+\tau) \overline{f(t)} \, dt

which is equivalent to

R_{ff}(\tau) \triangleq \int_{t_0}^{t_0+T} f(t) \overline{f(t-\tau)} \, dt


Properties

In the following, we will describe properties of one-dimensional autocorrelations only, since most properties are easily transferred from the one-dimensional case to the multi-dimensional cases. These properties hold for wide-sense stationary processes.

* A fundamental property of the autocorrelation is symmetry, R_{ff}(\tau) = R_{ff}(-\tau), which is easy to prove from the definition. In the continuous case,
** the autocorrelation is an even function R_{ff}(-\tau) = R_{ff}(\tau) when f is a real function, and
** the autocorrelation is a Hermitian function R_{ff}(-\tau) = R_{ff}^*(\tau) when f is a complex function.
* The continuous autocorrelation function reaches its peak at the origin, where it takes a real value, i.e. for any delay \tau, |R_{ff}(\tau)| \leq R_{ff}(0). This is a consequence of the rearrangement inequality. The same result holds in the discrete case.
* The autocorrelation of a periodic function is, itself, periodic with the same period.
* The autocorrelation of the sum of two completely uncorrelated functions (the cross-correlation is zero for all \tau) is the sum of the autocorrelations of each function separately.
* Since autocorrelation is a specific type of cross-correlation, it maintains all the properties of cross-correlation.
* By using the symbol * to represent convolution, and letting g_{-1} be the function which manipulates the function f and is defined as g_{-1}(f)(t) = f(-t), the definition for R_{ff}(\tau) may be written as: R_{ff}(\tau) = (f * g_{-1}(\overline{f}))(\tau)


Multi-dimensional autocorrelation

Multi-dimensional autocorrelation is defined similarly. For example, in three dimensions the autocorrelation of a square-summable discrete signal would be

R(j,k,\ell) = \sum_{n,q,r} x_{n,q,r}\,\overline{x_{n-j,q-k,r-\ell}}.

When mean values are subtracted from signals before computing an autocorrelation function, the resulting function is usually called an auto-covariance function.


Efficient computation

For data expressed as a discrete sequence, it is frequently necessary to compute the autocorrelation with high computational efficiency. A brute force method based on the signal processing definition R_{xx}(j) = \sum_n x_n\,\overline{x_{n-j}} can be used when the signal size is small. For example, to calculate the autocorrelation of the real signal sequence x = (2,3,-1) (i.e. x_0=2, x_1=3, x_2=-1, and x_i = 0 for all other values of i) by hand, we first recognize that the definition just given is the same as the "usual" multiplication, but with right shifts, where each vertical addition gives the autocorrelation for particular lag values:

\begin{array}{rrrrrr}
       & 2  & 3  & -1 \\
\times & 2  & 3  & -1 \\
\hline
       & -2 & -3 & 1  \\
       &    & 6  & 9  & -3 \\
     + &    &    & 4  & 6  & -2 \\
\hline
       & -2 & 3  & 14 & 3  & -2
\end{array}

Thus the required autocorrelation sequence is R_{xx} = (-2,3,14,3,-2), where R_{xx}(0)=14, R_{xx}(-1)=R_{xx}(1)=3, and R_{xx}(-2)=R_{xx}(2)=-2, the autocorrelation for other lag values being zero. In this calculation we do not perform the carry-over operation during addition as is usual in normal multiplication. Note that we can halve the number of operations required by exploiting the inherent symmetry of the autocorrelation. If the signal happens to be periodic, i.e. x=(\ldots,2,3,-1,2,3,-1,\ldots), then we get a circular autocorrelation (similar to circular convolution) where the left and right tails of the previous autocorrelation sequence will overlap and give R_{xx}=(\ldots,14,1,1,14,1,1,\ldots), which has the same period as the signal sequence x. The procedure can be regarded as an application of the convolution property of the Z-transform of a discrete signal.

While the brute force algorithm is order n^2, several efficient algorithms exist which can compute the autocorrelation in order n \log(n). For example, the Wiener–Khinchin theorem allows computing the autocorrelation from the raw data X(t) with two fast Fourier transforms (FFT):

\begin{aligned}
F_R(f) &= \operatorname{FFT}[X(t)] \\
S(f)   &= F_R(f) F_R^*(f) \\
R(\tau) &= \operatorname{IFFT}[S(f)]
\end{aligned}

where IFFT denotes the inverse fast Fourier transform. The asterisk denotes complex conjugate.

Alternatively, a multiple \tau correlation can be performed by using brute force calculation for low \tau values, and then progressively binning the X(t) data with a logarithmic density to compute higher \tau values, resulting in the same n \log(n) efficiency, but with lower memory requirements.
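The worked example above can be checked directly with a brute-force sketch (pure Python; the function names are chosen here, and the FFT route from the Wiener–Khinchin theorem would reproduce the circular result):

```python
def autocorr(x):
    """Brute-force linear autocorrelation of a finite real sequence x,
    returning R(j) for lags j = -(n-1), ..., 0, ..., n-1."""
    n = len(x)
    return [sum(x[i] * x[i - j] for i in range(max(j, 0), min(n, n + j)))
            for j in range(-(n - 1), n)]

def circular_autocorr(x):
    """Circular autocorrelation: indices wrap, as for a periodic signal."""
    n = len(x)
    return [sum(x[i] * x[(i - j) % n] for i in range(n)) for j in range(n)]

print(autocorr([2, 3, -1]))           # [-2, 3, 14, 3, -2]
print(circular_autocorr([2, 3, -1]))  # [14, 1, 1]
```

The linear result matches the hand computation R_{xx} = (-2,3,14,3,-2), and the circular result (14,1,1) is one period of the overlapped tails described above.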


Estimation

For a discrete process with known mean and variance for which we observe n observations \{X_1,\,X_2,\,\ldots,\,X_n\}, an estimate of the autocorrelation coefficient may be obtained as

\hat{R}(k) = \frac{1}{(n-k)\sigma^2} \sum_{t=1}^{n-k} (X_t - \mu)(X_{t+k} - \mu)

for any positive integer k < n. When the true mean \mu and variance \sigma^2 are known, this estimate is unbiased. If the true mean and variance of the process are not known there are several possibilities:

* If \mu and \sigma^2 are replaced by the standard formulae for sample mean and sample variance, then this is a biased estimate.
* A periodogram-based estimate replaces n-k in the above formula with n. This estimate is always biased; however, it usually has a smaller mean squared error.
* Other possibilities derive from treating the two portions of data \{X_1,\,X_2,\,\ldots,\,X_{n-k}\} and \{X_{k+1},\,X_{k+2},\,\ldots,\,X_n\} separately and calculating separate sample means and/or sample variances for use in defining the estimate.

The advantage of estimates of the last type is that the set of estimated autocorrelations, as a function of k, then forms a function which is a valid autocorrelation in the sense that it is possible to define a theoretical process having exactly that autocorrelation. Other estimates can suffer from the problem that, if they are used to calculate the variance of a linear combination of the X's, the variance calculated may turn out to be negative.
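As a sketch of the known-mean estimator above and its periodogram-style variant (pure Python; `acf_hat` is a name chosen here, not a standard API):

```python
def acf_hat(x, k, mu, sigma2, periodogram=False):
    """Estimate rho(k) for a series x with known mean mu and variance sigma2.
    With periodogram=True, divide by n instead of n-k (biased, often lower MSE)."""
    n = len(x)
    s = sum((x[t] - mu) * (x[t + k] - mu) for t in range(n - k))
    return s / ((n if periodogram else n - k) * sigma2)

# Alternating +1/-1 series: true mean 0, variance 1, true rho(1) = -1.
x = [1, -1] * 50
print(acf_hat(x, 1, mu=0.0, sigma2=1.0))                    # -1.0 (unbiased form)
print(acf_hat(x, 1, mu=0.0, sigma2=1.0, periodogram=True))  # -0.99 (shrunk toward 0)
```

The comparison shows the trade-off stated above: dividing by n rather than n − k shrinks the estimate toward zero, introducing bias.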


Regression analysis

In regression analysis using time series data, autocorrelation in a variable of interest is typically modeled either with an autoregressive model (AR), a moving average model (MA), their combination as an autoregressive-moving-average model (ARMA), or an extension of the latter called an autoregressive integrated moving average model (ARIMA). With multiple interrelated data series, vector autoregression (VAR) or its extensions are used.

In ordinary least squares (OLS), the adequacy of a model specification can be checked in part by establishing whether there is autocorrelation of the regression residuals. Problematic autocorrelation of the errors, which themselves are unobserved, can generally be detected because it produces autocorrelation in the observable residuals. (Errors are also known as "error terms" in econometrics.) Autocorrelation of the errors violates the ordinary least squares assumption that the error terms are uncorrelated, meaning that the Gauss–Markov theorem does not apply, and that OLS estimators are no longer the best linear unbiased estimators (BLUE). While it does not bias the OLS coefficient estimates, the standard errors tend to be underestimated (and the t-scores overestimated) when the autocorrelations of the errors at low lags are positive.

The traditional test for the presence of first-order autocorrelation is the Durbin–Watson statistic or, if the explanatory variables include a lagged dependent variable, Durbin's h statistic. The Durbin–Watson statistic can, however, be linearly mapped to the Pearson correlation between values and their lags. A more flexible test, covering autocorrelation of higher orders and applicable whether or not the regressors include lags of the dependent variable, is the Breusch–Godfrey test. This involves an auxiliary regression, wherein the residuals obtained from estimating the model of interest are regressed on (a) the original regressors and (b) ''k'' lags of the residuals, where ''k'' is the order of the test. The simplest version of the test statistic from this auxiliary regression is ''TR''2, where ''T'' is the sample size and ''R''2 is the coefficient of determination. Under the null hypothesis of no autocorrelation, this statistic is asymptotically distributed as \chi^2 with ''k'' degrees of freedom.

Responses to nonzero autocorrelation include generalized least squares and the Newey–West HAC estimator (heteroskedasticity and autocorrelation consistent).

In the estimation of a moving average model (MA), the autocorrelation function is used to determine the appropriate number of lagged error terms to be included. This is based on the fact that for an MA process of order ''q'', we have R(\tau) \neq 0 for \tau = 0,1,\ldots,q, and R(\tau) = 0 for \tau > q.
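As an illustration of the first-order test described above, the Durbin–Watson statistic d = \sum_{t=2}^{T}(e_t - e_{t-1})^2 / \sum_{t=1}^{T} e_t^2 can be computed directly from a residual series (a pure-Python sketch with toy residuals; values near 2 suggest no first-order autocorrelation, values near 0 positive autocorrelation, values near 4 negative autocorrelation):

```python
def durbin_watson(e):
    """Durbin-Watson statistic for a residual series e:
    d = sum of squared successive differences over the sum of squares."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v ** 2 for v in e)
    return num / den

# Strongly positively autocorrelated residuals -> d well below 2:
print(durbin_watson([1, 1, 1, 1, -1, -1, -1, -1]))  # 0.5
# Strongly negatively autocorrelated residuals -> d well above 2:
print(durbin_watson([1, -1, 1, -1, 1, -1, 1, -1]))  # 3.5
```

This is only the statistic itself; deciding significance requires the Durbin–Watson critical-value tables, and the Breusch–Godfrey auxiliary regression is the more flexible alternative described above.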


Applications

* Autocorrelation analysis is used heavily in fluorescence correlation spectroscopy to provide quantitative insight into molecular-level diffusion and chemical reactions.
* Another application of autocorrelation is the measurement of optical spectra and the measurement of very-short-duration light pulses produced by lasers, both using optical autocorrelators.
* Autocorrelation is used to analyze dynamic light scattering data, which notably enables determination of the particle size distributions of nanometer-sized particles or micelles suspended in a fluid. A laser shining into the mixture produces a speckle pattern that results from the motion of the particles. Autocorrelation of the signal can be analyzed in terms of the diffusion of the particles. From this, knowing the viscosity of the fluid, the sizes of the particles can be calculated.
* Utilized in the GPS system to correct for the propagation delay, or time shift, between the point of time at the transmission of the carrier signal at the satellites, and the point of time at the receiver on the ground. This is done by the receiver generating a replica signal of the 1,023-bit C/A (Coarse/Acquisition) code, and generating lines of code chips [-1,1] in packets of ten at a time, or 10,230 chips (1,023 × 10), shifting slightly as it goes along in order to accommodate for the doppler shift in the incoming satellite signal, until the receiver replica signal and the satellite signal codes match up.
* The small-angle X-ray scattering intensity of a nanostructured system is the Fourier transform of the spatial autocorrelation function of the electron density.
* In surface science and scanning probe microscopy, autocorrelation is used to establish a link between surface morphology and functional characteristics.
* In optics, normalized autocorrelations and cross-correlations give the degree of coherence of an electromagnetic field.
* In signal processing, autocorrelation can give information about repeating events like musical beats (for example, to determine tempo) or pulsar frequencies, though it cannot tell the position in time of the beat. It can also be used to estimate the pitch of a musical tone.
* In music recording, autocorrelation is used as a pitch detection algorithm prior to vocal processing, as a distortion effect or to eliminate undesired mistakes and inaccuracies.
* Autocorrelation in space rather than time, via the Patterson function, is used by X-ray diffractionists to help recover the "Fourier phase information" on atom positions not available through diffraction alone.
* In statistics, spatial autocorrelation between sample locations also helps one estimate mean value uncertainties when sampling a heterogeneous population.
* The SEQUEST algorithm for analyzing mass spectra makes use of autocorrelation in conjunction with cross-correlation to score the similarity of an observed spectrum to an idealized spectrum representing a peptide.
. *_In_
astrophysics Astrophysics is a science that employs the methods and principles of physics and chemistry in the study of astronomical objects and phenomena. As one of the founders of the discipline said, Astrophysics "seeks to ascertain the nature of the h ...
,_autocorrelation_is_used_to_study_and_characterize_the_spatial_distribution_of_ galaxies_in_the_universe_and_in_multi-wavelength_observations_of_low_mass_
X-ray_binaries X-ray binaries are a class of binary stars that are luminous in X-rays. The X-rays are produced by matter falling from one component, called the ''donor'' (usually a relatively normal star), to the other component, called the ''accretor'', which ...
. *_In_ panel_data,_spatial_autocorrelation_refers_to_correlation_of_a_variable_with_itself_through_space. *_In_analysis_of_
Markov_chain_Monte_Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
_data,_autocorrelation_must_be_taken_into_account_for_correct_error_determination. *_In_
geosciences Earth science or geoscience includes all fields of natural science related to the planet Earth. This is a branch of science dealing with the physical, chemical, and biological complex constitutions and synergistic linkages of Earth's four sphe ...
_(specifically_in_
geophysics Geophysics () is a subject of natural science concerned with the physical processes and physical properties of the Earth and its surrounding space environment, and the use of quantitative methods for their analysis. The term ''geophysics'' so ...
)_it_can_be_used_to_compute_an_autocorrelation_seismic_attribute,_out_of_a_3D_seismic_survey_of_the_underground. *_In_
medical_ultrasound Medical ultrasound includes diagnostic techniques (mainly imaging techniques) using ultrasound, as well as therapeutic applications of ultrasound. In diagnosis, it is used to create an image of internal body structures such as tendons, mu ...
_imaging,_autocorrelation_is_used_to_visualize_blood_flow. *_In_
intertemporal_portfolio_choice Intertemporal portfolio choice is the process of allocating one's investable wealth to various assets, especially financial assets, repeatedly over time, in such a way as to optimize some criterion. The set of asset proportions at any time defines ...
,_the_presence_or_absence_of_autocorrelation_in_an_asset's_ rate_of_return_can_affect_the_optimal_portion_of_the_portfolio_to_hold_in_that_asset. *_Autocorrelation_has_been_used_to_accurately_measure_power_system_frequency_in_ numerical_relays.


Serial dependence

Serial dependence is closely linked to the notion of autocorrelation, but represents a distinct concept (see Correlation and dependence). In particular, it is possible to have serial dependence but no (linear) correlation. In some fields, however, the two terms are used as synonyms.

A time series of a random variable has serial dependence if the value at some time t in the series is statistically dependent on the value at another time s. A series is serially independent if there is no dependence between any pair.

If a time series \left\{ X_t \right\} is stationary, then statistical dependence between the pair (X_t,X_s) would imply that there is statistical dependence between all pairs of values at the same lag \tau=s-t.


See also

* Autocorrelation matrix
* Autocorrelation technique
* Autocorrelation of a formal word
* Autocorrelator
* Correlation function
* Correlogram
* Cross-correlation
* Galton's problem
* Partial autocorrelation function
* Fluorescence correlation spectroscopy
* Optical autocorrelation
* Pitch detection algorithm
* Triple correlation
* CUSUM
* Cochrane–Orcutt estimation (transformation for autocorrelated error terms)
* Prais–Winsten transformation
* Scaled correlation
* Unbiased estimation of standard deviation


References


Further reading

* Soltanalian, Mojtaba, and Petre Stoica. "Computational design of sequences with good correlation properties." IEEE Transactions on Signal Processing 60.5 (2012): 2180–2193.
* Golomb, Solomon W., and Guang Gong. ''Signal Design for Good Correlation: For Wireless Communication, Cryptography, and Radar''. Cambridge University Press, 2005.
* Klapetek, Petr (2018). ''Quantitative Data Processing in Scanning Probe Microscopy: SPM Applications for Nanometrology'' (Second ed.). Elsevier. pp. 108–112.


Autocorrelation of white noise

The autocorrelation of a continuous-time
white noise
signal will have a strong peak (represented by a Dirac delta function) at \tau=0 and will be exactly 0 for all other \tau.
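The same behaviour can be illustrated in discrete time with a short numerical sketch (the sequence length, seed, and helper name below are illustrative choices, not from the source): the sample autocorrelation of a long white-noise sequence is large at lag 0 and near zero at every other lag.

```python
import numpy as np

# Discrete-time illustration of the white-noise property: the sample
# autocorrelation is approximately the variance at lag 0 and
# approximately zero at all other lags.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)

def sample_autocorr(x, lag):
    """Biased sample autocorrelation at a non-negative integer lag."""
    n = len(x)
    return float(np.dot(x[:n - lag], x[lag:]) / n)

r0 = sample_autocorr(x, 0)   # close to the variance, here 1
r5 = sample_autocorr(x, 5)   # close to 0 for white noise
```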


Wiener–Khinchin theorem

The Wiener–Khinchin theorem relates the autocorrelation function \operatorname{R}_{ff} to the power spectral density S_{ff} via the
Fourier transform
: \operatorname{R}_{ff}(\tau) = \int_{-\infty}^\infty S_{ff}(f) e^{i 2 \pi f \tau} \, df
S_{ff}(f) = \int_{-\infty}^\infty \operatorname{R}_{ff}(\tau) e^{- i 2 \pi f \tau} \, d\tau .
For real-valued functions, the symmetric autocorrelation function has a real symmetric transform, so the Wiener–Khinchin theorem can be re-expressed in terms of real cosines only:
\operatorname{R}_{ff}(\tau) = \int_{-\infty}^\infty S_{ff}(f) \cos(2 \pi f \tau) \, df
S_{ff}(f) = \int_{-\infty}^\infty \operatorname{R}_{ff}(\tau) \cos(2 \pi f \tau) \, d\tau .
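The discrete, circular analogue of the theorem is easy to check numerically (the array size and seed below are arbitrary): the inverse DFT of the power spectrum reproduces the circular autocorrelation.

```python
import numpy as np

# Discrete (circular) analogue of the Wiener-Khinchin theorem:
# the inverse DFT of the power spectrum |X(f)|^2 equals the
# circular autocorrelation of the sequence.
rng = np.random.default_rng(1)
x = rng.standard_normal(64)

X = np.fft.fft(x)
S = X * np.conj(X)                 # power spectrum
r_wk = np.fft.ifft(S).real         # autocorrelation via the theorem

# Direct circular autocorrelation for comparison:
# r[k] = sum_n x[n] * x[(n + k) mod N].
r_direct = np.array([np.dot(x, np.roll(x, -k)) for k in range(64)])
```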


Auto-correlation of random vectors

The (potentially time-dependent) auto-correlation matrix (also called second moment) of a (potentially time-dependent) random vector \mathbf{X} = (X_1,\ldots,X_n)^{\rm T} is an n \times n matrix containing as elements the autocorrelations of all pairs of elements of the random vector \mathbf{X}. The autocorrelation matrix is used in various
digital signal processing
algorithms. For a random vector \mathbf{X} = (X_1,\ldots,X_n)^{\rm T} containing
random element
s whose
expected value
and
variance
exist, the auto-correlation matrix is defined by
\operatorname{R}_{\mathbf{X}\mathbf{X}} \triangleq\ \operatorname{E}\left[\mathbf{X} \mathbf{X}^{\rm T}\right]
(Papoulis, Athanasios, ''Probability, Random Variables and Stochastic Processes'', McGraw-Hill, 1991), where {}^{\rm T} denotes transposition; the matrix has dimensions n \times n. Written component-wise:
\operatorname{R}_{\mathbf{X}\mathbf{X}} = \begin{bmatrix}
\operatorname{E}[X_1 X_1] & \operatorname{E}[X_1 X_2] & \cdots & \operatorname{E}[X_1 X_n] \\
\operatorname{E}[X_2 X_1] & \operatorname{E}[X_2 X_2] & \cdots & \operatorname{E}[X_2 X_n] \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{E}[X_n X_1] & \operatorname{E}[X_n X_2] & \cdots & \operatorname{E}[X_n X_n]
\end{bmatrix}
If \mathbf{Z} is a complex random vector, the autocorrelation matrix is instead defined by \operatorname{R}_{\mathbf{Z}\mathbf{Z}} \triangleq\ \operatorname{E}\left[\mathbf{Z} \mathbf{Z}^{\rm H}\right]. Here {}^{\rm H} denotes Hermitian transposition. For example, if \mathbf{X} = \left( X_1,X_2,X_3 \right)^{\rm T} is a random vector, then \operatorname{R}_{\mathbf{X}\mathbf{X}} is a 3 \times 3 matrix whose (i,j)-th entry is \operatorname{E}[X_i X_j].
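The definition E[X X^T] can be sketched as a Monte-Carlo estimate (the dimensions, seed, and sample count below are arbitrary illustrative choices): for zero-mean samples X = A w with w standard normal, the exact answer is A A^T.

```python
import numpy as np

# Estimate the autocorrelation matrix R_XX = E[X X^T] by averaging
# outer products over many samples of a zero-mean random vector.
rng = np.random.default_rng(2)
n, m = 3, 200_000
A = rng.standard_normal((n, n))
samples = A @ rng.standard_normal((n, m))   # columns are draws of X = A w

R = (samples @ samples.T) / m   # sample average of X X^T
expected = A @ A.T              # exact E[X X^T] for X = A w, w ~ N(0, I)
```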


Properties of the autocorrelation matrix

* The autocorrelation matrix is a Hermitian matrix for complex random vectors and a symmetric matrix for real random vectors.
* The autocorrelation matrix is a positive semidefinite matrix, i.e. \mathbf{a}^{\rm T} \operatorname{R}_{\mathbf{X}\mathbf{X}} \mathbf{a} \ge 0 \quad \text{for all } \mathbf{a} \in \mathbb{R}^n for a real random vector, and respectively \mathbf{a}^{\rm H} \operatorname{R}_{\mathbf{Z}\mathbf{Z}} \mathbf{a} \ge 0 \quad \text{for all } \mathbf{a} \in \mathbb{C}^n in the case of a complex random vector.
* All eigenvalues of the autocorrelation matrix are real and non-negative.
* The ''auto-covariance matrix'' is related to the autocorrelation matrix as follows:
\operatorname{K}_{\mathbf{X}\mathbf{X}} = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{X} - \operatorname{E}[\mathbf{X}])^{\rm T}\right] = \operatorname{R}_{\mathbf{X}\mathbf{X}} - \operatorname{E}[\mathbf{X}] \operatorname{E}[\mathbf{X}]^{\rm T}
Respectively, for complex random vectors:
\operatorname{K}_{\mathbf{Z}\mathbf{Z}} = \operatorname{E}\left[(\mathbf{Z} - \operatorname{E}[\mathbf{Z}])(\mathbf{Z} - \operatorname{E}[\mathbf{Z}])^{\rm H}\right] = \operatorname{R}_{\mathbf{Z}\mathbf{Z}} - \operatorname{E}[\mathbf{Z}] \operatorname{E}[\mathbf{Z}]^{\rm H}
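These properties can be confirmed numerically; the small simulation below (the mean vector, seed, and sizes are arbitrary) checks symmetry, the non-negative eigenvalues, and the covariance relation on a sample estimate.

```python
import numpy as np

# Check on sample estimates: R is symmetric, its eigenvalues are real
# and non-negative, and K = R - E[X]E[X]^T recovers the auto-covariance.
rng = np.random.default_rng(3)
m = 100_000
X = rng.standard_normal((2, m)) + np.array([[1.0], [2.0]])  # nonzero mean

R = (X @ X.T) / m                    # autocorrelation matrix estimate
mu = X.mean(axis=1, keepdims=True)   # estimate of E[X]
K = R - mu @ mu.T                    # auto-covariance via the relation

eigvals = np.linalg.eigvalsh(R)      # real, since R is symmetric
```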


Auto-correlation of deterministic signals

In
signal processing
, the above definition is often used without the normalization, that is, without subtracting the mean and dividing by the variance. When the autocorrelation function is normalized by mean and variance, it is sometimes referred to as the autocorrelation coefficient or autocovariance function.


Auto-correlation of continuous-time signal

Given a signal f(t), the continuous autocorrelation R_{ff}(\tau) is most often defined as the continuous cross-correlation integral of f(t) with itself, at lag \tau:
R_{ff}(\tau) = \int_{-\infty}^\infty f(t+\tau) \overline{f(t)}\, dt
where \overline{f(t)} represents the complex conjugate of f(t). Note that the parameter t in the integral is a dummy variable, needed only to evaluate the integral; it has no specific meaning.


Auto-correlation of discrete-time signal

The discrete autocorrelation R at lag \ell for a discrete-time signal y(n) is
R_{yy}(\ell) = \sum_{n \in \mathbb{Z}} y(n)\,\overline{y(n-\ell)} .
The above definitions work for signals that are square integrable, or square summable, that is, of finite energy. Signals that "last forever" are treated instead as random processes, in which case different definitions are needed, based on expected values. For wide-sense-stationary random processes, the autocorrelations are defined as
\begin{align} R_{ff}(\tau) &= \operatorname{E}\left[f(t+\tau)\overline{f(t)}\right] \\ R_{yy}(\ell) &= \operatorname{E}\left[y(n)\,\overline{y(n-\ell)}\right] . \end{align}
For processes that are not stationary, these will also be functions of t, or n. For processes that are also ergodic, the expectation can be replaced by the limit of a time average. The autocorrelation of an ergodic process is sometimes defined as or equated to
\begin{align} R_{ff}(\tau) &= \lim_{T \to \infty} \frac{1}{T} \int_0^T f(t+\tau)\overline{f(t)}\, dt \\ R_{yy}(\ell) &= \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} y(n)\,\overline{y(n-\ell)} . \end{align}
These definitions have the advantage that they give sensible well-defined single-parameter results for periodic functions, even when those functions are not the output of stationary ergodic processes. Alternatively, signals that ''last forever'' can be treated by a short-time autocorrelation function analysis, using finite time integrals. (See short-time Fourier transform for a related process.)
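The time-average definition can be checked against a process whose autocorrelation is known in closed form; the AR(1) coefficient, seed, and series length below are arbitrary illustrative choices.

```python
import numpy as np

# Time-average estimate of R_yy(l) for a stationary AR(1) process
# x[n] = phi * x[n-1] + w[n]; with unit-variance innovations the
# theoretical autocorrelation is R(l) = phi**abs(l) / (1 - phi**2).
rng = np.random.default_rng(4)
phi, N = 0.7, 200_000
w = rng.standard_normal(N)
x = np.empty(N)
x[0] = w[0]
for t in range(1, N):
    x[t] = phi * x[t - 1] + w[t]

def time_avg_autocorr(x, lag):
    """(1/N) * sum over n of x[n] * x[n - lag], for a real sequence."""
    return float(np.dot(x[lag:], x[:len(x) - lag]) / len(x))

sigma2 = 1.0 / (1.0 - phi ** 2)   # theoretical R(0)
r1 = time_avg_autocorr(x, 1)      # should approach phi * sigma2
```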


Definition for periodic signals

If f is a continuous periodic function of period T, the integration from -\infty to \infty is replaced by integration over any interval [t_0, t_0+T] of length T:
R_{ff}(\tau) \triangleq \int_{t_0}^{t_0+T} f(t+\tau) \overline{f(t)} \,dt
which is equivalent to
R_{ff}(\tau) \triangleq \int_{t_0}^{t_0+T} f(t) \overline{f(t-\tau)} \,dt
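A discrete sketch of the periodic case (the signal length and period below are arbitrary): the circular autocorrelation of a sampled sinusoid, computed over whole periods, repeats with the signal's own period.

```python
import numpy as np

# Circular autocorrelation of a periodic signal is itself periodic
# with the same period.
N, period = 100, 25
n = np.arange(N)
f = np.cos(2 * np.pi * n / period)

# r[k] = (1/N) * sum_n f[n] * f[(n + k) mod N], over whole periods.
R = np.array([np.dot(f, np.roll(f, -k)) / N for k in range(N)])
```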


Properties

In the following, we will describe properties of one-dimensional autocorrelations only, since most properties are easily transferred from the one-dimensional case to the multi-dimensional cases. These properties hold for wide-sense stationary processes. * A fundamental property of the autocorrelation is symmetry, R_{ff}(\tau) = R_{ff}(-\tau), which is easy to prove from the definition. In the continuous case, ** the autocorrelation is an
even function
R_{ff}(-\tau) = R_{ff}(\tau) when f is a real function, and ** the autocorrelation is a Hermitian function R_{ff}(-\tau) = R_{ff}^*(\tau) when f is a complex function. * The continuous autocorrelation function reaches its peak at the origin, where it takes a real value, i.e. for any delay \tau, |R_{ff}(\tau)| \leq R_{ff}(0). This is a consequence of the rearrangement inequality. The same result holds in the discrete case. * The autocorrelation of a
periodic function
is, itself, periodic with the same period. * The autocorrelation of the sum of two completely uncorrelated functions (the cross-correlation is zero for all \tau) is the sum of the autocorrelations of each function separately. * Since autocorrelation is a specific type of cross-correlation, it maintains all the properties of cross-correlation. * By using the symbol * to represent
convolution
and g_{-1} is a function which manipulates the function f and is defined as g_{-1}(f)(t)=f(-t), the definition for R_{ff}(\tau) may be written as: R_{ff}(\tau) = \left(f * g_{-1}(\overline{f})\right)(\tau)
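The convolution form maps directly onto finite sequences: convolving a real sequence with its time-reversed copy yields the autocorrelation at every lag. A minimal sketch (the example values are arbitrary):

```python
import numpy as np

# R_ff as f convolved with its time-reversed (conjugate) copy.
# For a real length-3 sequence this gives the autocorrelation at
# lags -2 .. +2.
f = np.array([2.0, 3.0, -1.0])
R = np.convolve(f, f[::-1])
```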


Multi-dimensional autocorrelation

Multi-
dimension
al autocorrelation is defined similarly. For example, in
three dimensions
the autocorrelation of a square-summable discrete signal would be R(j,k,\ell) = \sum_{n,q,r} x_{n,q,r}\,\overline{x}_{n-j,q-k,r-\ell} . When mean values are subtracted from signals before computing an autocorrelation function, the resulting function is usually called an auto-covariance function.
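A brute-force two-dimensional sketch of this definition (the tiny array and helper name are illustrative, with the signal treated as zero outside its support):

```python
import numpy as np

# Brute-force 2-D autocorrelation:
# R(j, k) = sum over n, q of x[n, q] * x[n - j, q - k],
# with x taken as zero outside its support.
x = np.array([[1.0, 2.0],
              [3.0, 4.0]])

def autocorr2d(x, j, k):
    n, q = x.shape
    total = 0.0
    for a in range(n):
        for b in range(q):
            if 0 <= a - j < n and 0 <= b - k < q:
                total += x[a, b] * x[a - j, b - k]
    return total
```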


Efficient computation

For data expressed as a discrete sequence, it is frequently necessary to compute the autocorrelation with high computational efficiency. A brute force method based on the signal processing definition R_{xx}(j) = \sum_n x_n\,\overline{x}_{n-j} can be used when the signal size is small. For example, to calculate the autocorrelation of the real signal sequence x = (2,3,-1) (i.e. x_0=2, x_1=3, x_2=-1, and x_i = 0 for all other values of i) by hand, we first recognize that the definition just given is the same as the "usual" multiplication, but with right shifts, where each vertical addition gives the autocorrelation for particular lag values:
\begin{array}{rrrrrr} & 2 & 3 & -1 & & \\ \times & 2 & 3 & -1 & & \\ \hline & -2 & -3 & 1 & & \\ & & 6 & 9 & -3 & \\ + & & & 4 & 6 & -2 \\ \hline & -2 & 3 & 14 & 3 & -2 \end{array}
Thus the required autocorrelation sequence is R_{xx}=(-2,3,14,3,-2), where R_{xx}(0)=14, R_{xx}(-1)=R_{xx}(1)=3, and R_{xx}(-2)=R_{xx}(2)=-2, the autocorrelation for other lag values being zero. In this calculation we do not perform the carry-over operation during addition, as is usual in normal multiplication. Note that we can halve the number of operations required by exploiting the inherent symmetry of the autocorrelation. If the signal happens to be periodic, i.e. x=(\ldots,2,3,-1,2,3,-1,\ldots), then we get a circular autocorrelation (similar to circular convolution) where the left and right tails of the previous autocorrelation sequence will overlap and give R_{xx}=(\ldots,14,1,1,14,1,1,\ldots) which has the same period as the signal sequence x. The procedure can be regarded as an application of the convolution property of the Z-transform of a discrete signal. While the brute force algorithm is
order n^2, several efficient algorithms exist which can compute the autocorrelation in order n \log(n). For example, the Wiener–Khinchin theorem allows computing the autocorrelation from the raw data with two
fast Fourier transform
s (FFT):
\begin{align} F_R(f) &= \operatorname{FFT}[X(t)] \\ S(f) &= F_R(f) F^*_R(f) \\ R(\tau) &= \operatorname{IFFT}[S(f)] \end{align}
where IFFT denotes the inverse
fast Fourier transform
. The asterisk denotes complex conjugate. Alternatively, a multiple correlation can be performed by using brute force calculation for low values, and then progressively binning the data with a
logarithm
ic density to compute higher values, resulting in the same efficiency, but with lower memory requirements.
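The FFT route above can be sketched directly on the worked example x = (2, 3, -1); zero-padding to length 2n - 1 makes the circular correlation equal the linear one (the padding choice and variable names below are illustrative).

```python
import numpy as np

# Autocorrelation via the Wiener-Khinchin theorem:
# R = IFFT( FFT(x) * conj(FFT(x)) ), zero-padded to 2n - 1 samples so
# the circular correlation equals the linear one.
x = np.array([2.0, 3.0, -1.0])
n = len(x)

X = np.fft.fft(x, 2 * n - 1)
S = X * np.conj(X)                  # sampled power spectrum
r = np.fft.ifft(S).real             # lags 0..n-1, then wrapped negative lags

R = np.concatenate([r[n:], r[:n]])  # reorder to lags -(n-1) .. (n-1)
```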


Estimation

For a discrete process with known mean and variance for which we observe n observations \{X_1,\,X_2,\,\ldots,\,X_n\}, an estimate of the autocorrelation coefficient may be obtained as
\hat{R}(k)=\frac{1}{(n-k) \sigma^2} \sum_{t=1}^{n-k} (X_t-\mu)(X_{t+k}-\mu)
for any positive integer k < n. When the true mean \mu and variance \sigma^2 are known, this estimate is unbiased. If the true mean and
variance
of the process are not known there are several possibilities: * If \mu and \sigma^2 are replaced by the standard formulae for sample mean and sample variance, then this is a biased estimate. * A
periodogram
-based estimate replaces n-k in the above formula with n. This estimate is always biased; however, it usually has a smaller mean squared error. * Other possibilities derive from treating the two portions of data \{X_1,\,X_2,\,\ldots,\,X_{n-k}\} and \{X_{k+1},\,X_{k+2},\,\ldots,\,X_n\} separately and calculating separate sample means and/or sample variances for use in defining the estimate. The advantage of estimates of the last type is that the set of estimated autocorrelations, as a function of k, then forms a function which is a valid autocorrelation in the sense that it is possible to define a theoretical process having exactly that autocorrelation. Other estimates can suffer from the problem that, if they are used to calculate the variance of a linear combination of the X's, the variance calculated may turn out to be negative.
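The unbiased and periodogram-style estimators can be sketched side by side (the white-noise input, seed, and helper name are illustrative; for white noise the true R(k) vanishes for every k > 0):

```python
import numpy as np

# Estimators of the autocorrelation coefficient with known mean and
# variance: the unbiased form divides by n - k, the periodogram-style
# biased form divides by n.
rng = np.random.default_rng(5)
mu, sigma2, n = 0.0, 1.0, 200_000
X = rng.standard_normal(n)

def autocorr_coeff(X, k, mu, sigma2, biased=False):
    n = len(X)
    s = np.dot(X[:n - k] - mu, X[k:] - mu)
    return float(s / ((n if biased else n - k) * sigma2))
```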


Regression analysis

In
regression analysis
using time series data, autocorrelation in a variable of interest is typically modeled either with an
autoregressive model
(AR), a moving average model (MA), their combination as an autoregressive-moving-average model (ARMA), or an extension of the latter called an autoregressive integrated moving average model (ARIMA). With multiple interrelated data series, vector autoregression (VAR) or its extensions are used. In
ordinary least squares
(OLS), the adequacy of a model specification can be checked in part by establishing whether there is autocorrelation of the regression residuals. Problematic autocorrelation of the errors, which themselves are unobserved, can generally be detected because it produces autocorrelation in the observable residuals. (Errors are also known as "error terms" in
econometrics
.) Autocorrelation of the errors violates the ordinary least squares assumption that the error terms are uncorrelated, meaning that the Gauss–Markov theorem does not apply, and that OLS estimators are no longer the best linear unbiased estimators (
BLUE
). While it does not bias the OLS coefficient estimates, the standard errors tend to be underestimated (and the t-scores overestimated) when the autocorrelations of the errors at low lags are positive. The traditional test for the presence of first-order autocorrelation is the
Durbin–Watson statistic
or, if the explanatory variables include a lagged dependent variable, Durbin's h statistic. The Durbin–Watson statistic can, however, be linearly mapped to the Pearson correlation between values and their lags. A more flexible test, covering autocorrelation of higher orders and applicable whether or not the regressors include lags of the dependent variable, is the
Breusch–Godfrey test
. This involves an auxiliary regression, wherein the residuals obtained from estimating the model of interest are regressed on (a) the original regressors and (b) ''k'' lags of the residuals, where ''k'' is the order of the test. The simplest version of the test statistic from this auxiliary regression is ''TR''^2, where ''T'' is the sample size and ''R''^2 is the coefficient of determination. Under the null hypothesis of no autocorrelation, this statistic is asymptotically distributed as \chi^2 with ''k'' degrees of freedom. Responses to nonzero autocorrelation include generalized least squares and the Newey–West HAC estimator (heteroskedasticity and autocorrelation consistent). In the estimation of a moving average model (MA), the autocorrelation function is used to determine the appropriate number of lagged error terms to be included. This is based on the fact that for an MA process of order ''q'', we have R(\tau) \neq 0 for \tau = 0,1, \ldots , q, and R(\tau) = 0 for \tau > q.
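A minimal sketch of the Durbin–Watson statistic on residuals (the AR(1) coefficient, seed, and series length below are arbitrary): values near 2 suggest no first-order autocorrelation, while positive serial correlation pushes the statistic toward 0.

```python
import numpy as np

# Durbin-Watson statistic: DW = sum((e_t - e_{t-1})^2) / sum(e_t^2),
# approximately 2 * (1 - rho), where rho is the lag-1 autocorrelation
# of the residuals e.
def durbin_watson(e):
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

rng = np.random.default_rng(6)
w = rng.standard_normal(100_000)

dw_white = durbin_watson(w)   # uncorrelated residuals: close to 2

# Positively autocorrelated residuals (AR(1) with phi = 0.8) give a
# statistic well below 2.
phi = 0.8
e = np.empty(len(w))
e[0] = w[0]
for t in range(1, len(w)):
    e[t] = phi * e[t - 1] + w[t]
dw_ar = durbin_watson(e)
```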


Applications

* Autocorrelation analysis is used heavily in fluorescence correlation spectroscopy to provide quantitative insight into molecular-level diffusion and chemical reactions. * Another application of autocorrelation is the measurement of optical spectra and the measurement of very-short-duration
light
pulses
produced by
laser
s, both using optical autocorrelators. * Autocorrelation is used to analyze dynamic light scattering data, which notably enables determination of the particle size distributions of nanometer-sized particles or
micelle
s suspended in a fluid. A laser shining into the mixture produces a speckle pattern that results from the motion of the particles. Autocorrelation of the signal can be analyzed in terms of the diffusion of the particles. From this, knowing the viscosity of the fluid, the sizes of the particles can be calculated.
* Utilized in the GPS system to correct for the propagation delay, or time shift, between the point of time at the transmission of the carrier signal at the satellites and the point of time at the receiver on the ground. This is done by the receiver generating a replica signal of the 1,023-bit C/A (Coarse/Acquisition) code, and generating lines of code chips (-1 and 1) in packets of ten at a time, or 10,230 chips (1,023 × 10), shifting slightly as it goes along in order to accommodate for the Doppler shift in the incoming satellite signal, until the receiver replica signal and the satellite signal codes match up.
* The small-angle X-ray scattering intensity of a nanostructured system is the Fourier transform of the spatial autocorrelation function of the electron density.
*In
surface science
and scanning probe microscopy, autocorrelation is used to establish a link between surface morphology and functional characteristics. * In optics, normalized autocorrelations and cross-correlations give the
degree of coherence In quantum optics, correlation functions are used to characterize the statistical and coherence properties of an electromagnetic field. The degree of coherence is the normalized correlation of electric fields; in its simplest form, termed g^. ...
of an electromagnetic field. * In
signal processing Signal processing is an electrical engineering subfield that focuses on analyzing, modifying and synthesizing '' signals'', such as sound, images, and scientific measurements. Signal processing techniques are used to optimize transmissions, ...
, autocorrelation can give information about repeating events like musical beats (for example, to determine
tempo In musical terminology, tempo ( Italian, 'time'; plural ''tempos'', or ''tempi'' from the Italian plural) is the speed or pace of a given piece. In classical music, tempo is typically indicated with an instruction at the start of a piece (ofte ...
) or pulsar frequencies, though it cannot tell the position in time of the beat. It can also be used to estimate the pitch of a musical tone. * In music recording, autocorrelation is used as a
pitch detection algorithm Pitch may refer to: Acoustic frequency * Pitch (music), the perceived frequency of sound including "definite pitch" and "indefinite pitch" ** Absolute pitch or "perfect pitch" ** Pitch class, a set of all pitches that are a whole number of octav ...
prior to vocal processing, as a distortion effect or to eliminate undesired mistakes and inaccuracies. * Autocorrelation in space rather than time, via the Patterson function, is used by X-ray diffractionists to help recover the "Fourier phase information" on atom positions not available through diffraction alone. * In statistics, spatial autocorrelation between sample locations also helps one estimate mean value uncertainties when sampling a heterogeneous population. * The SEQUEST algorithm for analyzing mass spectra makes use of autocorrelation in conjunction with cross-correlation to score the similarity of an observed spectrum to an idealized spectrum representing a
peptide Peptides (, ) are short chains of amino acids linked by peptide bonds. Long chains of amino acids are called proteins. Chains of fewer than twenty amino acids are called oligopeptides, and include dipeptides, tripeptides, and tetrapeptides. ...
. * In
astrophysics Astrophysics is a science that employs the methods and principles of physics and chemistry in the study of astronomical objects and phenomena. As one of the founders of the discipline said, Astrophysics "seeks to ascertain the nature of the h ...
, autocorrelation is used to study and characterize the spatial distribution of galaxies in the universe and in multi-wavelength observations of low mass
X-ray binaries X-ray binaries are a class of binary stars that are luminous in X-rays. The X-rays are produced by matter falling from one component, called the ''donor'' (usually a relatively normal star), to the other component, called the ''accretor'', which ...
. * In panel data, spatial autocorrelation refers to correlation of a variable with itself through space. * In analysis of
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
data, autocorrelation must be taken into account for correct error determination. * In
geosciences Earth science or geoscience includes all fields of natural science related to the planet Earth. This is a branch of science dealing with the physical, chemical, and biological complex constitutions and synergistic linkages of Earth's four sphe ...
(specifically in
geophysics Geophysics () is a subject of natural science concerned with the physical processes and physical properties of the Earth and its surrounding space environment, and the use of quantitative methods for their analysis. The term ''geophysics'' so ...
) it can be used to compute an autocorrelation seismic attribute, out of a 3D seismic survey of the underground. * In
medical ultrasound Medical ultrasound includes diagnostic techniques (mainly imaging techniques) using ultrasound, as well as therapeutic applications of ultrasound. In diagnosis, it is used to create an image of internal body structures such as tendons, mu ...
imaging, autocorrelation is used to visualize blood flow. * In
intertemporal portfolio choice Intertemporal portfolio choice is the process of allocating one's investable wealth to various assets, especially financial assets, repeatedly over time, in such a way as to optimize some criterion. The set of asset proportions at any time defines ...
, the presence or absence of autocorrelation in an asset's rate of return can affect the optimal portion of the portfolio to hold in that asset. * Autocorrelation has been used to accurately measure power system frequency in numerical relays.


Serial dependence

Serial dependence is closely linked to the notion of autocorrelation, but represents a distinct concept (see Correlation and dependence). In particular, it is possible to have serial dependence but no (linear) correlation. In some fields, however, the two terms are used as synonyms.

A time series of a random variable has serial dependence if the value at some time t in the series is statistically dependent on the value at another time s. A series is serially independent if there is no dependence between any pair.

If a time series \left\{ X_t \right\} is stationary, then statistical dependence between the pair (X_t,X_s) would imply that there is statistical dependence between all pairs of values at the same lag \tau=s-t.
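The distinction between serial dependence and serial correlation can be made concrete with a small simulation. The sketch below uses an ARCH(1)-style process, x_t = w_t \sqrt{1 + 0.3 x_{t-1}^2}, chosen here purely as an illustration (it is not taken from this article): successive values are linearly uncorrelated, yet their squares are clearly correlated, so the series is serially dependent.

```python
import random

# Illustrative ARCH(1)-style process: x_t = w_t * sqrt(1 + 0.3 * x_{t-1}^2).
# Each x_t is uncorrelated with x_{t-1} (the shock w_t has zero mean and is
# independent of the past), yet x_t^2 depends on x_{t-1}^2: serial dependence
# without (linear) serial correlation.

def lag1_corr(xs):
    """Sample correlation between the series and its lag-1 copy."""
    a, b = xs[1:], xs[:-1]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n
    va = sum((u - ma) ** 2 for u in a) / n
    vb = sum((v - mb) ** 2 for v in b) / n
    return cov / (va * vb) ** 0.5

random.seed(0)
x, prev = [], 0.0
for _ in range(20000):
    prev = random.gauss(0, 1) * (1 + 0.3 * prev * prev) ** 0.5
    x.append(prev)

r_linear = lag1_corr(x)                    # close to 0: no serial correlation
r_squared = lag1_corr([v * v for v in x])  # clearly positive: serial dependence
```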


See also

* Autocorrelation matrix
* Autocorrelation technique
* Autocorrelation of a formal word
* Autocorrelator
* Correlation function
* Correlogram
* Cross-correlation
* Galton's problem
* Partial autocorrelation function
* Fluorescence correlation spectroscopy
* Optical autocorrelation
* Pitch detection algorithm
* Triple correlation
* CUSUM
* Cochrane–Orcutt estimation (transformation for autocorrelated error terms)
* Prais–Winsten transformation
* Scaled correlation
* Unbiased estimation of standard deviation


References


Further reading

* Mojtaba Soltanalian, and Petre Stoica. "Computational design of sequences with good correlation properties." IEEE Transactions on Signal Processing, 60.5 (2012): 2180–2193.
* Solomon W. Golomb, and Guang Gong. ''Signal Design for Good Correlation: For Wireless Communication, Cryptography, and Radar''. Cambridge University Press, 2005.
* Klapetek, Petr (2018). ''Quantitative Data Processing in Scanning Probe Microscopy: SPM Applications for Nanometrology'' (Second ed.). Elsevier. pp. 108–112.


Autocorrelation of white noise

The autocorrelation of a continuous-time white noise signal will have a strong peak (represented by a Dirac delta function) at \tau=0 and will be exactly 0 for all other \tau.
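The discrete analogue is easy to check numerically: the sample autocorrelation of a long white-noise sequence is 1 at lag 0 and close to 0 everywhere else. A minimal sketch (the estimator form is the standard normalized sample autocorrelation, not a formula from this article):

```python
import random

# Sample autocorrelation of discrete white noise: 1 at lag 0 and close to 0
# at every other lag -- the discrete counterpart of the delta-shaped
# autocorrelation of continuous-time white noise.

def sample_autocorr(x, lag):
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n
    return cov / var

random.seed(1)
noise = [random.gauss(0, 1) for _ in range(20000)]

r0 = sample_autocorr(noise, 0)                              # exactly 1
others = [sample_autocorr(noise, k) for k in range(1, 6)]   # all near 0
```

With n = 20000 samples the nonzero lags fluctuate on the order of 1/sqrt(n), i.e. well under 0.05.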


Wiener–Khinchin theorem

The Wiener–Khinchin theorem relates the autocorrelation function \operatorname{R}_{ff} to the power spectral density S_{ff} via the Fourier transform:

\operatorname{R}_{ff}(\tau) = \int_{-\infty}^\infty S_{ff}(f) e^{i 2 \pi f \tau} \, df

S_{ff}(f) = \int_{-\infty}^\infty \operatorname{R}_{ff}(\tau) e^{-i 2 \pi f \tau} \, d\tau .

For real-valued functions, the symmetric autocorrelation function has a real symmetric transform, so the Wiener–Khinchin theorem can be re-expressed in terms of real cosines only:

\operatorname{R}_{ff}(\tau) = \int_{-\infty}^\infty S_{ff}(f) \cos(2 \pi f \tau) \, df

S_{ff}(f) = \int_{-\infty}^\infty \operatorname{R}_{ff}(\tau) \cos(2 \pi f \tau) \, d\tau .
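For finite discrete sequences the theorem has an exact counterpart: the inverse DFT of the power spectrum |X(f)|^2 equals the circular autocorrelation of the sequence. A small numerical sketch (the sample data is arbitrary, chosen only for illustration):

```python
import numpy as np

# Discrete Wiener-Khinchin relation: IDFT(|DFT(x)|^2) equals the circular
# autocorrelation r[l] = sum_n x[n] * x[(n - l) mod N].

def autocorr_via_spectrum(x):
    X = np.fft.fft(x)
    return np.fft.ifft(np.abs(X) ** 2).real  # real input -> real autocorrelation

def autocorr_direct(x):
    n = len(x)
    return np.array([sum(x[i] * x[(i - lag) % n] for i in range(n))
                     for lag in range(n)])

x = np.array([2.0, 3.0, -1.0, 5.0])
# Both routes agree to floating-point precision.
```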


Auto-correlation of random vectors

The (potentially time-dependent) auto-correlation matrix (also called second moment) of a (potentially time-dependent) random vector \mathbf{X} = (X_1,\ldots,X_n)^{\rm T} is an n \times n matrix containing as elements the autocorrelations of all pairs of elements of the random vector \mathbf{X}. The autocorrelation matrix is used in various digital signal processing algorithms.

For a random vector \mathbf{X} = (X_1,\ldots,X_n)^{\rm T} containing random elements whose expected value and variance exist, the auto-correlation matrix is defined byPapoulis, Athanasius, ''Probability, Random Variables and Stochastic Processes'', McGraw-Hill, 1991

\operatorname{R}_{\mathbf{X}\mathbf{X}} \triangleq\ \operatorname{E}\left[\mathbf{X} \mathbf{X}^{\rm T}\right]

where {}^{\rm T} denotes transposition and the matrix has dimensions n \times n. Written component-wise:

\operatorname{R}_{\mathbf{X}\mathbf{X}} = \begin{bmatrix} \operatorname{E}[X_1 X_1] & \operatorname{E}[X_1 X_2] & \cdots & \operatorname{E}[X_1 X_n] \\ \operatorname{E}[X_2 X_1] & \operatorname{E}[X_2 X_2] & \cdots & \operatorname{E}[X_2 X_n] \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{E}[X_n X_1] & \operatorname{E}[X_n X_2] & \cdots & \operatorname{E}[X_n X_n] \end{bmatrix}

If \mathbf{Z} is a complex random vector, the autocorrelation matrix is instead defined by

\operatorname{R}_{\mathbf{Z}\mathbf{Z}} \triangleq\ \operatorname{E}[\mathbf{Z} \mathbf{Z}^{\rm H}] .

Here {}^{\rm H} denotes Hermitian transposition. For example, if \mathbf{X} = \left( X_1,X_2,X_3 \right)^{\rm T} is a random vector, then \operatorname{R}_{\mathbf{X}\mathbf{X}} is a 3 \times 3 matrix whose (i,j)-th entry is \operatorname{E}[X_i X_j].


Properties of the autocorrelation matrix

* The autocorrelation matrix is a Hermitian matrix for complex random vectors and a symmetric matrix for real random vectors.
* The autocorrelation matrix is a positive semidefinite matrix, i.e. \mathbf{a}^{\rm T} \operatorname{R}_{\mathbf{X}\mathbf{X}} \mathbf{a} \ge 0 \quad \text{for all } \mathbf{a} \in \mathbb{R}^n for a real random vector, and respectively \mathbf{a}^{\rm H} \operatorname{R}_{\mathbf{Z}\mathbf{Z}} \mathbf{a} \ge 0 \quad \text{for all } \mathbf{a} \in \mathbb{C}^n in case of a complex random vector.
* All eigenvalues of the autocorrelation matrix are real and non-negative.
* The ''auto-covariance matrix'' is related to the autocorrelation matrix as follows: \operatorname{K}_{\mathbf{X}\mathbf{X}} = \operatorname{E}[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{X} - \operatorname{E}[\mathbf{X}])^{\rm T}] = \operatorname{R}_{\mathbf{X}\mathbf{X}} - \operatorname{E}[\mathbf{X}] \operatorname{E}[\mathbf{X}]^{\rm T} Respectively for complex random vectors: \operatorname{K}_{\mathbf{Z}\mathbf{Z}} = \operatorname{E}[(\mathbf{Z} - \operatorname{E}[\mathbf{Z}])(\mathbf{Z} - \operatorname{E}[\mathbf{Z}])^{\rm H}] = \operatorname{R}_{\mathbf{Z}\mathbf{Z}} - \operatorname{E}[\mathbf{Z}] \operatorname{E}[\mathbf{Z}]^{\rm H}


Auto-correlation of deterministic signals

In signal processing, the above definition is often used without the normalization, that is, without subtracting the mean and dividing by the variance. When the autocorrelation function is normalized by mean and variance, it is sometimes referred to as the autocorrelation coefficient or autocovariance function.


Auto-correlation of continuous-time signal

Given a signal f(t), the continuous autocorrelation R_{ff}(\tau) is most often defined as the continuous cross-correlation integral of f(t) with itself, at lag \tau:

R_{ff}(\tau) = \int_{-\infty}^\infty f(t+\tau)\overline{f(t)}\, dt

where \overline{f(t)} represents the complex conjugate of f(t). Note that the parameter t in the integral is a dummy variable and is only necessary to calculate the integral. It has no specific meaning.


Auto-correlation of discrete-time signal

The discrete autocorrelation R at lag \ell for a discrete-time signal y(n) is

R_{yy}(\ell) = \sum_{n \in \mathbb{Z}} y(n)\,\overline{y(n-\ell)} .

The above definitions work for signals that are square integrable, or square summable, that is, of finite energy. Signals that "last forever" are treated instead as random processes, in which case different definitions are needed, based on expected values. For wide-sense-stationary random processes, the autocorrelations are defined as

\begin{align} R_{ff}(\tau) &= \operatorname{E}\left[f(t)\overline{f(t-\tau)}\right] \\ R_{yy}(\ell) &= \operatorname{E}\left[y(n)\,\overline{y(n-\ell)}\right] . \end{align}

For processes that are not stationary, these will also be functions of t, or n. For processes that are also ergodic, the expectation can be replaced by the limit of a time average. The autocorrelation of an ergodic process is sometimes defined as or equated to

\begin{align} R_{ff}(\tau) &= \lim_{T \to \infty} \frac{1}{T} \int_0^T f(t+\tau)\overline{f(t)}\, dt \\ R_{yy}(\ell) &= \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} y(n)\,\overline{y(n-\ell)} . \end{align}

These definitions have the advantage that they give sensible well-defined single-parameter results for periodic functions, even when those functions are not the output of stationary ergodic processes. Alternatively, signals that ''last forever'' can be treated by a short-time autocorrelation function analysis, using finite time integrals. (See short-time Fourier transform for a related process.)
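The time-average form above gives a well-defined answer for a periodic signal. As a sketch, for y(n) = \cos(2\pi n/10) the time average converges to 0.5\cos(2\pi \ell/10) (the constants here are an arbitrary worked example, not values from the text); averaging over a whole number of periods recovers this limit essentially exactly:

```python
import math

# Time-average autocorrelation of the periodic signal y(n) = cos(2*pi*n/10).
# For a unit-amplitude cosine the limit is R_yy(l) = 0.5 * cos(2*pi*l/10).

OMEGA = 2 * math.pi / 10
N = 10000  # a whole number of periods, so the average converges cleanly

def time_avg_autocorr(lag):
    return sum(math.cos(OMEGA * (n + lag)) * math.cos(OMEGA * n)
               for n in range(N)) / N

estimates = [time_avg_autocorr(l) for l in range(5)]
expected = [0.5 * math.cos(OMEGA * l) for l in range(5)]
```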


Definition for periodic signals

If f is a continuous periodic function of period T, the integration from -\infty to \infty is replaced by integration over any interval [t_0, t_0+T] of length T:

R_{ff}(\tau) \triangleq \int_{t_0}^{t_0+T} f(t+\tau) \overline{f(t)} \,dt

which is equivalent to

R_{ff}(\tau) \triangleq \int_{t_0}^{t_0+T} f(t) \overline{f(t-\tau)} \,dt


Properties

In the following, we will describe properties of one-dimensional autocorrelations only, since most properties are easily transferred from the one-dimensional case to the multi-dimensional cases. These properties hold for wide-sense stationary processes.
* A fundamental property of the autocorrelation is symmetry, R_{ff}(\tau) = R_{ff}(-\tau), which is easy to prove from the definition. In the continuous case,
** the autocorrelation is an even function R_{ff}(-\tau) = R_{ff}(\tau) when f is a real function, and
** the autocorrelation is a Hermitian function R_{ff}(-\tau) = R_{ff}^*(\tau) when f is a complex function.
* The continuous autocorrelation function reaches its peak at the origin, where it takes a real value, i.e. for any delay \tau, |R_{ff}(\tau)| \leq R_{ff}(0). This is a consequence of the rearrangement inequality. The same result holds in the discrete case.
* The autocorrelation of a periodic function is, itself, periodic with the same period.
* The autocorrelation of the sum of two completely uncorrelated functions (the cross-correlation is zero for all \tau) is the sum of the autocorrelations of each function separately.
* Since autocorrelation is a specific type of cross-correlation, it maintains all the properties of cross-correlation.
* By using the symbol * to represent convolution and letting g_{-1} be the function which time-reverses f, defined as g_{-1}(f)(t)=f(-t), the definition for R_{ff}(\tau) may be written as: R_{ff}(\tau) = (f * g_{-1}(\overline{f}))(\tau)


Multi-dimensional autocorrelation

Multi-dimensional autocorrelation is defined similarly. For example, in three dimensions the autocorrelation of a square-summable discrete signal would be

R(j,k,\ell) = \sum_{n,q,r} x_{n,q,r}\,\overline{x}_{n-j,q-k,r-\ell} .

When mean values are subtracted from signals before computing an autocorrelation function, the resulting function is usually called an auto-covariance function.


Efficient computation

For data expressed as a discrete sequence, it is frequently necessary to compute the autocorrelation with high computational efficiency. A brute force method based on the signal processing definition R_{xx}(j) = \sum_n x_n\,\overline{x}_{n-j} can be used when the signal size is small. For example, to calculate the autocorrelation of the real signal sequence x = (2,3,-1) (i.e. x_0=2, x_1=3, x_2=-1, and x_i = 0 for all other values of i) by hand, we first recognize that the definition just given is the same as the "usual" multiplication, but with right shifts, where each vertical addition gives the autocorrelation for particular lag values:

\begin{array}{rrrrrr} & 2 & 3 & -1 & & \\ \times & 2 & 3 & -1 & & \\ \hline & -2 & -3 & 1 & & \\ & & 6 & 9 & -3 & \\ + & & & 4 & 6 & -2 \\ \hline & -2 & 3 & 14 & 3 & -2 \end{array}

Thus the required autocorrelation sequence is R_{xx}=(-2,3,14,3,-2), where R_{xx}(0)=14, R_{xx}(-1)=R_{xx}(1)=3, and R_{xx}(-2)=R_{xx}(2)=-2, the autocorrelation for other lag values being zero. In this calculation we do not perform the carry-over operation during addition as is usual in normal multiplication. Note that we can halve the number of operations required by exploiting the inherent symmetry of the autocorrelation. If the signal happens to be periodic, i.e. x=(\ldots,2,3,-1,2,3,-1,\ldots), then we get a circular autocorrelation (similar to circular convolution) where the left and right tails of the previous autocorrelation sequence will overlap and give R_{xx}=(\ldots,14,1,1,14,1,1,\ldots) which has the same period as the signal sequence x. The procedure can be regarded as an application of the convolution property of the Z-transform of a discrete signal.

While the brute force algorithm is order n^2, several efficient algorithms exist which can compute the autocorrelation in order n \log(n). For example, the Wiener–Khinchin theorem allows computing the autocorrelation from the raw data X(t) with two fast Fourier transforms (FFT):

\begin{align} F_R(f) &= \operatorname{FFT}[X(t)] \\ S(f) &= F_R(f) F^*_R(f) \\ R(\tau) &= \operatorname{IFFT}[S(f)] \end{align}

where IFFT denotes the inverse fast Fourier transform. The asterisk denotes complex conjugate.

Alternatively, a multiple \tau correlation can be performed by using brute force calculation for low \tau values, and then progressively binning the data with a logarithmic density to compute higher values, resulting in the same n \log(n) efficiency, but with lower memory requirements.
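The FFT route above can be sketched in a few lines. Zero-padding to length 2n-1 avoids the circular wrap-around, so the result reproduces the linear (hand-computed) autocorrelation of x = (2,3,-1):

```python
import numpy as np

def autocorr_fft(x):
    """Linear autocorrelation via the Wiener-Khinchin route: zero-pad to avoid
    circular wrap-around, FFT, multiply by the conjugate, inverse FFT."""
    n = len(x)
    nfft = 2 * n - 1                 # room for lags -(n-1) .. (n-1)
    X = np.fft.fft(x, nfft)          # FFT of zero-padded signal
    r = np.fft.ifft(X * np.conj(X)).real
    # Reorder so lags run from -(n-1) to (n-1):
    return np.concatenate((r[-(n - 1):], r[:n]))

x = np.array([2.0, 3.0, -1.0])
# autocorr_fft(x) matches the hand computation (-2, 3, 14, 3, -2).
```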


Estimation

For a discrete process with known mean \mu and variance \sigma^2 for which we observe n observations \{X_1,\,X_2,\,\ldots,\,X_n\}, an estimate of the autocorrelation coefficient may be obtained as

\hat{R}(k)=\frac{1}{(n-k) \sigma^2} \sum_{t=1}^{n-k} (X_t-\mu)(X_{t+k}-\mu)

for any positive integer k<n. When the true mean \mu and variance \sigma^2 are known, this estimate is unbiased. If the true mean and variance of the process are not known there are several possibilities:
* If \mu and \sigma^2 are replaced by the standard formulae for sample mean and sample variance, then this is a biased estimate.
* A periodogram-based estimate replaces n-k in the above formula with n. This estimate is always biased; however, it usually has a smaller mean squared error.
* Other possibilities derive from treating the two portions of data \{X_1,\,X_2,\,\ldots,\,X_{n-k}\} and \{X_{k+1},\,X_{k+2},\,\ldots,\,X_n\} separately and calculating separate sample means and/or sample variances for use in defining the estimate.

The advantage of estimates of the last type is that the set of estimated autocorrelations, as a function of k, then form a function which is a valid autocorrelation in the sense that it is possible to define a theoretical process having exactly that autocorrelation. Other estimates can suffer from the problem that, if they are used to calculate the variance of a linear combination of the X's, the variance calculated may turn out to be negative.


Regression analysis

In regression analysis using time series data, autocorrelation in a variable of interest is typically modeled either with an autoregressive model (AR), a moving average model (MA), their combination as an autoregressive-moving-average model (ARMA), or an extension of the latter called an autoregressive integrated moving average model (ARIMA). With multiple interrelated data series, vector autoregression (VAR) or its extensions are used.

In ordinary least squares (OLS), the adequacy of a model specification can be checked in part by establishing whether there is autocorrelation of the regression residuals. Problematic autocorrelation of the errors, which themselves are unobserved, can generally be detected because it produces autocorrelation in the observable residuals. (Errors are also known as "error terms" in econometrics.) Autocorrelation of the errors violates the ordinary least squares assumption that the error terms are uncorrelated, meaning that the Gauss–Markov theorem does not apply, and that OLS estimators are no longer the Best Linear Unbiased Estimators (BLUE). While it does not bias the OLS coefficient estimates, the standard errors tend to be underestimated (and the t-scores overestimated) when the autocorrelations of the errors at low lags are positive.

The traditional test for the presence of first-order autocorrelation is the Durbin–Watson statistic or, if the explanatory variables include a lagged dependent variable, Durbin's h statistic. The Durbin–Watson statistic can, however, be linearly mapped to the Pearson correlation between values and their lags. A more flexible test, covering autocorrelation of higher orders and applicable whether or not the regressors include lags of the dependent variable, is the Breusch–Godfrey test. This involves an auxiliary regression, wherein the residuals obtained from estimating the model of interest are regressed on (a) the original regressors and (b) ''k'' lags of the residuals, where ''k'' is the order of the test. The simplest version of the test statistic from this auxiliary regression is ''TR''^2, where ''T'' is the sample size and ''R''^2 is the coefficient of determination. Under the null hypothesis of no autocorrelation, this statistic is asymptotically distributed as \chi^2 with ''k'' degrees of freedom.

Responses to nonzero autocorrelation include generalized least squares and the Newey–West HAC estimator (heteroskedasticity and autocorrelation consistent).

In the estimation of a moving average model (MA), the autocorrelation function is used to determine the appropriate number of lagged error terms to be included. This is based on the fact that for an MA process of order ''q'', we have R(\tau) \neq 0 for \tau = 0,1,\ldots,q, and R(\tau) = 0 for \tau > q.
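The Durbin–Watson idea can be sketched end to end: fit a simple OLS regression whose errors are deliberately AR(1)-autocorrelated (the model y = 2 + 0.5x and \phi = 0.8 are arbitrary illustration values), then compute DW from the residuals. Since DW is approximately 2(1-r_1), strong positive error autocorrelation pushes it well below 2:

```python
import random

# Durbin-Watson statistic from OLS residuals of a simple regression whose
# errors follow an AR(1) process with phi = 0.8. DW ~ 2 * (1 - r1), so
# positive error autocorrelation pushes DW well below 2.

random.seed(3)
n = 2000
xs = [float(i) for i in range(n)]

errs, prev = [], 0.0
for _ in range(n):
    prev = 0.8 * prev + random.gauss(0, 1)
    errs.append(prev)
ys = [2.0 + 0.5 * u + e for u, e in zip(xs, errs)]

# Closed-form OLS fit of y = a + b*x
mx, my = sum(xs) / n, sum(ys) / n
b = (sum((u - mx) * (v - my) for u, v in zip(xs, ys))
     / sum((u - mx) ** 2 for u in xs))
a = my - b * mx
resid = [v - (a + b * u) for u, v in zip(xs, ys)]

dw = (sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n))
      / sum(e ** 2 for e in resid))
# dw lands far below 2 here, signalling positive first-order autocorrelation
```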


Applications

* Autocorrelation analysis is used heavily in fluorescence correlation spectroscopy to provide quantitative insight into molecular-level diffusion and chemical reactions.
* Another application of autocorrelation is the measurement of optical spectra and the measurement of very-short-duration light pulses produced by lasers, both using optical autocorrelators.
peptide Peptides (, ) are short chains of amino acids linked by peptide bonds. Long chains of amino acids are called proteins. Chains of fewer than twenty amino acids are called oligopeptides, and include dipeptides, tripeptides, and tetrapeptides. ...
. * In
astrophysics Astrophysics is a science that employs the methods and principles of physics and chemistry in the study of astronomical objects and phenomena. As one of the founders of the discipline said, Astrophysics "seeks to ascertain the nature of the h ...
, autocorrelation is used to study and characterize the spatial distribution of galaxies in the universe and in multi-wavelength observations of low mass
X-ray binaries X-ray binaries are a class of binary stars that are luminous in X-rays. The X-rays are produced by matter falling from one component, called the ''donor'' (usually a relatively normal star), to the other component, called the ''accretor'', which ...
. * In panel data, spatial autocorrelation refers to correlation of a variable with itself through space. * In analysis of
Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain ...
data, autocorrelation must be taken into account for correct error determination. * In
geosciences Earth science or geoscience includes all fields of natural science related to the planet Earth. This is a branch of science dealing with the physical, chemical, and biological complex constitutions and synergistic linkages of Earth's four sphe ...
(specifically in
geophysics Geophysics () is a subject of natural science concerned with the physical processes and physical properties of the Earth and its surrounding space environment, and the use of quantitative methods for their analysis. The term ''geophysics'' so ...
) it can be used to compute an autocorrelation seismic attribute, out of a 3D seismic survey of the underground. * In
medical ultrasound Medical ultrasound includes diagnostic techniques (mainly imaging techniques) using ultrasound, as well as therapeutic applications of ultrasound. In diagnosis, it is used to create an image of internal body structures such as tendons, mu ...
imaging, autocorrelation is used to visualize blood flow. * In
intertemporal portfolio choice Intertemporal portfolio choice is the process of allocating one's investable wealth to various assets, especially financial assets, repeatedly over time, in such a way as to optimize some criterion. The set of asset proportions at any time defines ...
, the presence or absence of autocorrelation in an asset's rate of return can affect the optimal portion of the portfolio to hold in that asset. * Autocorrelation has been used to accurately measure power system frequency in numerical relays.


Serial dependence

Serial dependence is closely linked to the notion of autocorrelation, but represents a distinct concept (see Correlation and dependence). In particular, it is possible to have serial dependence but no (linear) correlation. In some fields, however, the two terms are used as synonyms. A time series of a random variable has serial dependence if the value at some time t in the series is statistically dependent on the value at another time s. A series is serially independent if there is no dependence between any pair. If a time series \left\{X_t\right\} is stationary, then statistical dependence between the pair (X_t,X_s) would imply that there is statistical dependence between all pairs of values at the same lag \tau=s-t.
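The distinction can be made concrete with a standard example (not from the text): the series X_t = Z_t Z_{t-1}, with Z_t i.i.d. standard normal, has zero lag-1 autocorrelation in theory, yet consecutive values are dependent because both contain the factor Z_t. The dependence shows up in the squared series, whose theoretical lag-1 correlation is 0.25.

```python
import numpy as np

# Sketch: serial dependence without linear correlation.
# X_t = Z_t * Z_{t-1} with Z_t i.i.d. N(0, 1).
rng = np.random.default_rng(1)
z = rng.standard_normal(200_000)
x = z[1:] * z[:-1]

def lag1_corr(a):
    """Sample correlation between the series and itself shifted by one."""
    return np.corrcoef(a[:-1], a[1:])[0, 1]

print(f"lag-1 correlation of X:   {lag1_corr(x):+.3f}")    # ~ 0
print(f"lag-1 correlation of X^2: {lag1_corr(x**2):+.3f}") # ~ +0.25
```

A test for serial dependence therefore cannot stop at the autocorrelation function; it must also consider nonlinear transformations of the series, as in tests for ARCH effects.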


See also

* Autocorrelation matrix
* Autocorrelation technique
* Autocorrelation of a formal word
* Autocorrelator
* Correlation function
* Correlogram
* Cross-correlation
* Galton's problem
* Partial autocorrelation function
* Fluorescence correlation spectroscopy
* Optical autocorrelation
* Pitch detection algorithm
* Triple correlation
* CUSUM
* Cochrane–Orcutt estimation (transformation for autocorrelated error terms)
* Prais–Winsten transformation
* Scaled correlation
* Unbiased estimation of standard deviation


References


Further reading

* Mojtaba Soltanalian and Petre Stoica. "Computational design of sequences with good correlation properties." IEEE Transactions on Signal Processing 60.5 (2012): 2180–2193.
* Solomon W. Golomb and Guang Gong. Signal Design for Good Correlation: For Wireless Communication, Cryptography, and Radar. Cambridge University Press, 2005.
* Klapetek, Petr (2018). Quantitative Data Processing in Scanning Probe Microscopy: SPM Applications for Nanometrology (Second ed.). Elsevier. pp. 108–112.