Matched filter
In signal processing, a matched filter is obtained by correlating a known delayed signal, or ''template'', with an unknown signal to detect the presence of the template in the unknown signal. This is equivalent to convolving the unknown signal with a conjugated time-reversed version of the template. The matched filter is the optimal linear filter for maximizing the signal-to-noise ratio (SNR) in the presence of additive stochastic noise.

Matched filters are commonly used in radar, in which a known signal is sent out, and the reflected signal is examined for common elements of the outgoing signal. Pulse compression is an example of matched filtering. It is so called because the impulse response is matched to input pulse signals. Two-dimensional matched filters are commonly used in image processing, e.g., to improve the SNR of X-ray observations. Matched filtering is a demodulation technique with LTI (linear time-invariant) filters to maximize SNR. It was originally also known as a ''North filter''.


Derivation


Derivation via matrix algebra

The following section derives the matched filter for a discrete-time system. The derivation for a continuous-time system is similar, with summations replaced with integrals. The matched filter is the linear filter, h, that maximizes the output signal-to-noise ratio,

: y[n] = \sum_{k=-\infty}^{\infty} h[n-k]\, x[k],

where x[k] is the input as a function of the independent variable k, and y[n] is the filtered output. Though we most often express filters as the impulse response of convolution systems, as above (see LTI system theory), it is easiest to think of the matched filter in the context of the inner product, which we will see shortly.

We can derive the linear filter that maximizes output signal-to-noise ratio by invoking a geometric argument. The intuition behind the matched filter relies on correlating the received signal (a vector) with a filter (another vector) that is parallel with the signal, maximizing the inner product. This enhances the signal. When we consider the additive stochastic noise, we have the additional challenge of minimizing the output due to noise by choosing a filter that is orthogonal to the noise.

Let us formally define the problem. We seek a filter, h, such that we maximize the output signal-to-noise ratio, where the output is the inner product of the filter and the observed signal x. Our observed signal consists of the desirable signal s and additive noise v:

: x = s + v.

Let us define the covariance matrix of the noise, reminding ourselves that this matrix has Hermitian symmetry, a property that will become useful in the derivation:

: R_v = E\{ v v^\mathrm{H} \},

where v^\mathrm{H} denotes the conjugate transpose of v, and E denotes expectation. Let us call our output, y, the inner product of our filter and the observed signal such that

: y = \sum_{k=-\infty}^{\infty} h^*[k]\, x[k] = h^\mathrm{H} x = h^\mathrm{H} s + h^\mathrm{H} v = y_s + y_v.

We now define the signal-to-noise ratio, which is our objective function, to be the ratio of the power of the output due to the desired signal to the power of the output due to the noise:

: \mathrm{SNR} = \frac{|y_s|^2}{E\{|y_v|^2\}}.

We rewrite the above:

: \mathrm{SNR} = \frac{|h^\mathrm{H} s|^2}{E\{|h^\mathrm{H} v|^2\}}.

We wish to maximize this quantity by choosing h. Expanding the denominator of our objective function, we have

: E\{ |h^\mathrm{H} v|^2 \} = E\{ (h^\mathrm{H} v)(h^\mathrm{H} v)^\mathrm{H} \} = h^\mathrm{H} E\{ v v^\mathrm{H} \}\, h = h^\mathrm{H} R_v h.

Now, our \mathrm{SNR} becomes

: \mathrm{SNR} = \frac{|h^\mathrm{H} s|^2}{h^\mathrm{H} R_v h}.

We will rewrite this expression with some matrix manipulation. The reason for this seemingly counterproductive measure will become evident shortly. Exploiting the Hermitian symmetry of the covariance matrix R_v, we can write

: \mathrm{SNR} = \frac{ \left| (R_v^{1/2} h)^\mathrm{H} (R_v^{-1/2} s) \right|^2 }{ (R_v^{1/2} h)^\mathrm{H} (R_v^{1/2} h) }.

We would like to find an upper bound on this expression. To do so, we first recognize a form of the Cauchy–Schwarz inequality:

: |a^\mathrm{H} b|^2 \leq (a^\mathrm{H} a)(b^\mathrm{H} b),

which is to say that the square of the inner product of two vectors can only be as large as the product of the individual inner products of the vectors. This concept returns to the intuition behind the matched filter: this upper bound is achieved when the two vectors a and b are parallel. We resume our derivation by expressing the upper bound on our \mathrm{SNR} in light of the geometric inequality above:

: \mathrm{SNR} = \frac{ \left| (R_v^{1/2} h)^\mathrm{H} (R_v^{-1/2} s) \right|^2 }{ (R_v^{1/2} h)^\mathrm{H} (R_v^{1/2} h) } \leq \frac{ \left[ (R_v^{1/2} h)^\mathrm{H} (R_v^{1/2} h) \right] \left[ (R_v^{-1/2} s)^\mathrm{H} (R_v^{-1/2} s) \right] }{ (R_v^{1/2} h)^\mathrm{H} (R_v^{1/2} h) }.

Our valiant matrix manipulation has now paid off. We see that the expression for our upper bound can be greatly simplified:

: \mathrm{SNR} = \frac{|h^\mathrm{H} s|^2}{h^\mathrm{H} R_v h} \leq s^\mathrm{H} R_v^{-1} s.

We can achieve this upper bound if we choose

: R_v^{1/2} h = \alpha\, R_v^{-1/2} s,

where \alpha is an arbitrary real number.
To verify this, we plug into our expression for the output \mathrm{SNR}:

: \mathrm{SNR} = \frac{|h^\mathrm{H} s|^2}{h^\mathrm{H} R_v h} = \frac{ \alpha^2 \left| s^\mathrm{H} R_v^{-1} s \right|^2 }{ \alpha^2\, s^\mathrm{H} R_v^{-1} R_v R_v^{-1} s } = \frac{ \left| s^\mathrm{H} R_v^{-1} s \right|^2 }{ s^\mathrm{H} R_v^{-1} s } = s^\mathrm{H} R_v^{-1} s.

Thus, our optimal matched filter is

: h = \alpha\, R_v^{-1} s.

We often choose to normalize the expected value of the power of the filter output due to the noise to unity. That is, we constrain

: E\{ |y_v|^2 \} = 1.

This constraint implies a value of \alpha, for which we can solve:

: E\{ |y_v|^2 \} = \alpha^2\, s^\mathrm{H} R_v^{-1} s = 1,

yielding

: \alpha = \frac{1}{\sqrt{ s^\mathrm{H} R_v^{-1} s }},

giving us our normalized filter,

: h = \frac{1}{\sqrt{ s^\mathrm{H} R_v^{-1} s }}\, R_v^{-1} s.

If we care to write the impulse response h of the filter for the convolution system, it is simply the complex conjugate time reversal of the input s.

Though we have derived the matched filter in discrete time, we can extend the concept to continuous-time systems if we replace R_v with the continuous-time autocorrelation function of the noise, assuming a continuous signal s(t), continuous noise v(t), and a continuous filter h(t).
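The normalized filter h = R_v^{-1} s / \sqrt{s^\mathrm{H} R_v^{-1} s} derived above can be checked numerically. The following is a minimal sketch, where the template s and the noise covariance R_v are arbitrary stand-in values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Arbitrary stand-ins: a signal template s and an SPD noise covariance R_v.
s = np.sin(2 * np.pi * np.arange(n) / n)
A = rng.normal(size=(n, n))
R_v = A @ A.T + n * np.eye(n)

# Matched filter h = R_v^{-1} s / sqrt(s^H R_v^{-1} s).
w = np.linalg.solve(R_v, s)            # R_v^{-1} s
h = w / np.sqrt(s @ w)

# Output SNR achieved by h ...
snr = np.abs(h @ s) ** 2 / (h @ R_v @ h)

# ... attains the derived upper bound s^H R_v^{-1} s,
assert np.isclose(snr, s @ w)
# and the noise-output power is normalized to unity.
assert np.isclose(h @ R_v @ h, 1.0)
```

Solving the linear system R_v w = s avoids explicitly inverting the covariance matrix, which is the usual numerically preferable route.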


Derivation via Lagrangian

Alternatively, we may solve for the matched filter by solving our maximization problem with a Lagrangian. Again, the matched filter endeavors to maximize the output signal-to-noise ratio (\mathrm{SNR}) of a filtered deterministic signal in stochastic additive noise. The observed sequence, again, is

: x = s + v,

with the noise covariance matrix

: R_v = E\{ v v^\mathrm{H} \}.

The signal-to-noise ratio is

: \mathrm{SNR} = \frac{|y_s|^2}{E\{|y_v|^2\}},

where y_s = h^\mathrm{H} s and y_v = h^\mathrm{H} v. Evaluating the expression in the numerator, we have

: |y_s|^2 = y_s^\mathrm{H} y_s = h^\mathrm{H} s s^\mathrm{H} h,

and in the denominator,

: E\{ |y_v|^2 \} = E\{ y_v^\mathrm{H} y_v \} = E\{ h^\mathrm{H} v v^\mathrm{H} h \} = h^\mathrm{H} R_v h.

The signal-to-noise ratio becomes

: \mathrm{SNR} = \frac{ h^\mathrm{H} s s^\mathrm{H} h }{ h^\mathrm{H} R_v h }.

If we now constrain the denominator to be 1, the problem of maximizing \mathrm{SNR} is reduced to maximizing the numerator. We can then formulate the problem using a Lagrange multiplier:

: h^\mathrm{H} R_v h = 1,
: \mathcal{L} = h^\mathrm{H} s s^\mathrm{H} h + \lambda \left( 1 - h^\mathrm{H} R_v h \right),
: \nabla_{h^\mathrm{H}} \mathcal{L} = s s^\mathrm{H} h - \lambda R_v h = 0,
: (s s^\mathrm{H})\, h = \lambda R_v h,

which we recognize as a ''generalized eigenvalue problem''

: h^\mathrm{H} (s s^\mathrm{H})\, h = \lambda\, h^\mathrm{H} R_v h.

Since s s^\mathrm{H} is of unit rank, it has only one nonzero eigenvalue. It can be shown that this eigenvalue equals

: \lambda_{\max} = s^\mathrm{H} R_v^{-1} s,

yielding the following optimal matched filter

: h = \frac{1}{\sqrt{ s^\mathrm{H} R_v^{-1} s }}\, R_v^{-1} s.

This is the same result found in the previous subsection.
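The rank-one structure of the generalized eigenvalue problem can be verified numerically. A small sketch, using arbitrary stand-in values for s and R_v:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Arbitrary stand-ins for the signal s and an SPD noise covariance R_v.
s = rng.normal(size=n)
A = rng.normal(size=(n, n))
R_v = A @ A.T + n * np.eye(n)

# The generalized eigenproblem (s s^H) h = lambda R_v h is equivalent to
# the ordinary eigenproblem for R_v^{-1} (s s^H), a rank-one matrix.
M = np.linalg.solve(R_v, np.outer(s, s))
eigvals, eigvecs = np.linalg.eig(M)

# Its single nonzero eigenvalue equals s^H R_v^{-1} s ...
w = np.linalg.solve(R_v, s)                 # R_v^{-1} s
lam_max = eigvals.real.max()
assert np.isclose(lam_max, s @ w)

# ... and the corresponding eigenvector is proportional to R_v^{-1} s,
# i.e. the matched filter direction.
v = eigvecs[:, np.argmax(eigvals.real)].real
cosine = abs(v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))
assert np.isclose(cosine, 1.0)
```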


Interpretation as a least-squares estimator


Derivation

Matched filtering can also be interpreted as a least-squares estimator for the optimal location and scaling of a given model or template. Once again, let the observed sequence be defined as

: x_k = s_k + v_k,

where v_k is uncorrelated zero-mean noise. The signal s_k is assumed to be a scaled and shifted version of a known model sequence f_k:

: s_k = \mu_0\, f_{k - j_0}.

We want to find optimal estimates j^* and \mu^* for the unknown shift j_0 and scaling \mu_0 by minimizing the least-squares residual between the observed sequence x_k and a "probing sequence" h_{j-k}:

: j^*, \mu^* = \arg\min_{j,\mu} \sum_k \left( x_k - \mu\, h_{j-k} \right)^2.

The appropriate h_{j-k} will later turn out to be the matched filter, but is as yet unspecified. Expanding x_k and the square within the sum yields

: j^*, \mu^* = \arg\min_{j,\mu} \left[ \sum_k (s_k + v_k)^2 + \mu^2 \sum_k h_{j-k}^2 - 2\mu \sum_k s_k h_{j-k} - 2\mu \sum_k v_k h_{j-k} \right].

The first term in brackets is a constant (since the observed signal is given) and has no influence on the optimal solution. The last term has constant expected value because the noise is uncorrelated and has zero mean. We can therefore drop both terms from the optimization. After reversing the sign, we obtain the equivalent optimization problem

: j^*, \mu^* = \arg\max_{j,\mu} \left[ 2\mu \sum_k s_k h_{j-k} - \mu^2 \sum_k h_{j-k}^2 \right].

Setting the derivative w.r.t. \mu to zero gives an analytic solution for \mu^*:

: \mu^* = \frac{ \sum_k s_k h_{j-k} }{ \sum_k h_{j-k}^2 }.

Inserting this into our objective function yields a reduced maximization problem for just j^*:

: j^* = \arg\max_j \frac{ \left( \sum_k s_k h_{j-k} \right)^2 }{ \sum_k h_{j-k}^2 }.

The numerator can be upper-bounded by means of the Cauchy–Schwarz inequality:

: \frac{ \left( \sum_k s_k h_{j-k} \right)^2 }{ \sum_k h_{j-k}^2 } \le \frac{ \left( \sum_k s_k^2 \right) \left( \sum_k h_{j-k}^2 \right) }{ \sum_k h_{j-k}^2 } = \sum_k s_k^2 = \text{constant}.

The optimization problem assumes its maximum when equality holds in this expression. According to the properties of the Cauchy–Schwarz inequality, this is only possible when

: h_{j-k} = \nu\, s_k = \kappa\, f_{k - j_0}

for arbitrary non-zero constants \nu or \kappa, and the optimal solution is obtained at j^* = j_0 as desired.
Thus, our "probing sequence" h_{j-k} must be proportional to the signal model f_{k-j}, and the convenient choice \kappa = 1 yields the matched filter

: h_k = f_{-k}.

Note that the filter is the mirrored signal model. This ensures that the operation \sum_k x_k h_{j-k}, to be applied in order to find the optimum, is indeed the convolution between the observed sequence x_k and the matched filter h_k. The filtered sequence assumes its maximum at the position where the observed sequence x_k best matches (in a least-squares sense) the signal model f_k.
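The least-squares estimates of shift and scale can be illustrated with a short sketch. The template, shift, scale, and noise level below are arbitrary stand-in values:

```python
import numpy as np

rng = np.random.default_rng(2)

# Known model sequence f, hidden in noise at shift j0 with scale mu0
# (all numbers here are illustrative stand-ins).
f = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
j0, mu0 = 40, 1.5
x = 0.2 * rng.normal(size=100)
x[j0:j0 + f.size] += mu0 * f

# Matched filtering = cross-correlating x with f, i.e. convolving x with
# the mirrored model f_{-k}. np.correlate computes c[j] = sum_k x[j+k] f[k].
c = np.correlate(x, f, mode="valid")
energy = np.sum(f ** 2)

j_star = int(np.argmax(c))          # least-squares estimate of the shift
mu_star = c[j_star] / energy        # least-squares estimate of the scale

assert j_star == j0
assert abs(mu_star - mu0) < 0.2
```

The peak of the correlation gives the shift estimate, and dividing the peak value by the template energy recovers the scale, exactly as in the closed-form expression for \mu^*.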


Implications

The matched filter may be derived in a variety of ways, but as a special case of a least-squares procedure it may also be interpreted as a maximum-likelihood method in the context of a (coloured) Gaussian noise model and the associated Whittle likelihood. If the transmitted signal possessed ''no'' unknown parameters (like time-of-arrival, amplitude, ...), then the matched filter would, according to the Neyman–Pearson lemma, minimize the error probability. However, since the exact signal generally is determined by unknown parameters that effectively are estimated (or ''fitted'') in the filtering process, the matched filter constitutes a ''generalized maximum likelihood'' (test-)statistic. The filtered time series may then be interpreted as (proportional to) the profile likelihood, the maximized conditional likelihood as a function of the time parameter. This implies in particular that the error probability (in the sense of Neyman and Pearson, i.e., concerning maximization of the detection probability for a given false-alarm probability) is not necessarily optimal. What is commonly referred to as the ''signal-to-noise ratio (SNR)'', which is supposed to be maximized by a matched filter, in this context corresponds to \sqrt{2 \log \mathcal{L}}, where \mathcal{L} is the (conditionally) maximized likelihood ratio.

The construction of the matched filter is based on a ''known'' noise spectrum. In reality, however, the noise spectrum is usually estimated from data and hence only known up to a limited precision. For the case of an uncertain spectrum, the matched filter may be generalized to a more robust iterative procedure with favourable properties also in non-Gaussian noise.


Frequency-domain interpretation

When viewed in the frequency domain, it is evident that the matched filter applies the greatest weighting to spectral components exhibiting the greatest signal-to-noise ratio (i.e., large weight where noise is relatively low, and vice versa). In general this requires a non-flat frequency response, but the associated "distortion" is no cause for concern in situations such as radar and digital communications, where the original waveform is known and the objective is the detection of this signal against the background noise. On the technical side, the matched filter is a ''weighted least-squares'' method based on the (heteroscedastic) frequency-domain data (where the "weights" are determined via the noise spectrum; see also the previous section), or equivalently, a ''least-squares'' method applied to the whitened data.
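The frequency-domain view can be sketched numerically: weight the conjugate template spectrum by the inverse noise spectrum, which is equivalent to whitening followed by a white-noise matched filter. The template, noise spectrum, and shift below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 256
t = np.arange(N)

# Hypothetical template: a Gaussian-windowed tone (all values illustrative).
s = 5.0 * np.exp(-0.5 * ((t - N / 2) / 8.0) ** 2) * np.cos(2 * np.pi * 0.1 * t)

# Assumed noise power spectrum: white floor plus a strong low-frequency bump.
f = np.fft.fftfreq(N)
noise_psd = 1.0 + 10.0 / (1.0 + (f / 0.05) ** 2)

# Frequency-domain matched filter: conjugate template spectrum divided by
# the noise spectrum, i.e. heavy weight where the noise is weak.
S = np.fft.fft(s)
H = np.conj(S) / noise_psd

# Synthesize coloured noise with that spectrum, plus a shifted copy of s.
noise = np.real(np.fft.ifft(np.fft.fft(rng.normal(size=N)) * np.sqrt(noise_psd)))
x = np.roll(s, 30) + noise

# Applying H in the frequency domain = whitening, then white-noise matching.
y = np.real(np.fft.ifft(np.fft.fft(x) * H))
assert int(np.argmax(y)) == 30
```

Note that this sketch uses circular (FFT) correlation for brevity; a real implementation would zero-pad to avoid wrap-around effects.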


Examples


Radar and sonar

Matched filters are often used in signal detection. As an example, suppose that we wish to judge the distance of an object by reflecting a signal off it. We may choose to transmit a pure-tone sinusoid at 1 Hz. We assume that our received signal is an attenuated and phase-shifted form of the transmitted signal with added noise. To judge the distance of the object, we correlate the received signal with a matched filter, which, in the case of white (uncorrelated) noise, is another pure-tone 1-Hz sinusoid. When the output of the matched-filter system exceeds a certain threshold, we conclude with high probability that the received signal has been reflected off the object. Using the speed of propagation and the time that we first observe the reflected signal, we can estimate the distance of the object. If we change the shape of the pulse in a specially designed way, the signal-to-noise ratio and the distance resolution can be improved even further after matched filtering: this is a technique known as pulse compression.

Additionally, matched filters can be used in parameter estimation problems (see estimation theory). To return to our previous example, we may desire to estimate the speed of the object, in addition to its position. To exploit the Doppler effect, we would like to estimate the frequency of the received signal. To do so, we may correlate the received signal with several matched filters of sinusoids at varying frequencies. The matched filter with the highest output will reveal, with high probability, the frequency of the reflected signal and help us determine the speed of the object. This method is, in fact, a simple version of the discrete Fourier transform (DFT). The DFT takes an N-valued complex input and correlates it with N matched filters, corresponding to complex exponentials at N different frequencies, to yield N complex-valued numbers corresponding to the relative amplitudes and phases of the sinusoidal components (see Moving target indication).
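The bank-of-filters view of frequency estimation can be sketched directly: correlating against complex exponentials at all candidate frequencies at once is a DFT. The sample rate, tone frequency, amplitude, phase, and noise level below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 128
fs = 128.0                           # sample rate in Hz (assumed)
t = np.arange(N) / fs

# Received echo: attenuated, phase-shifted 10 Hz tone in white noise.
received = 0.8 * np.cos(2 * np.pi * 10.0 * t + 0.7) + 0.5 * rng.normal(size=N)

# Correlating with a bank of complex-exponential matched filters at all
# candidate frequencies at once is exactly a DFT (here via rfft).
spectrum = np.fft.rfft(received)
freqs = np.fft.rfftfreq(N, d=1.0 / fs)

# The filter with the largest output magnitude gives the frequency estimate.
f_hat = freqs[np.argmax(np.abs(spectrum))]
assert f_hat == 10.0
```

With the Doppler shift in hand, the target's radial speed follows from the usual Doppler relation.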


Digital communications

The matched filter is also used in communications. In the context of a communication system that sends binary messages from the transmitter to the receiver across a noisy channel, a matched filter can be used to detect the transmitted pulses in the noisy received signal.

Imagine we want to send the sequence "0101100100" coded in polar non-return-to-zero (NRZ) through a certain channel. Mathematically, a sequence in NRZ code can be described as a sequence of unit pulses or shifted rect functions, each pulse being weighted by +1 if the bit is "1" and by −1 if the bit is "0". Formally, the scaling factor for the k^\mathrm{th} bit is

: a_k = \begin{cases} +1, & \text{if bit } k \text{ is 1}, \\ -1, & \text{if bit } k \text{ is 0}. \end{cases}

We can represent our message, M(t), as the sum of shifted unit pulses:

: M(t) = \sum_{k=-\infty}^{\infty} a_k \, \Pi\!\left( \frac{t - kT}{T} \right),

where T is the time length of one bit and \Pi(x) is the rectangular function. This is the signal to be sent by the transmitter.

If we model our noisy channel as an AWGN channel, white Gaussian noise is added to the signal. At a signal-to-noise ratio of 3 dB, a first glance at the received waveform will not reveal the original transmitted sequence: there is a high power of noise relative to the power of the desired signal (i.e., there is a low signal-to-noise ratio). If the receiver were to sample this signal at the correct moments, the resulting binary message could be incorrect.

To increase our signal-to-noise ratio, we pass the received signal through a matched filter. In this case, the filter should be matched to an NRZ pulse (equivalent to a "1" coded in NRZ code). Precisely, the impulse response of the ideal matched filter, assuming white (uncorrelated) noise, should be a time-reversed complex-conjugated scaled version of the signal that we are seeking. We choose

: h(t) = \Pi\!\left( \frac{t}{T} \right).

In this case, due to symmetry, the time-reversed complex conjugate of h(t) is in fact h(t), allowing us to call h(t) the impulse response of our matched filter convolution system. After convolving with the correct matched filter, the resulting signal, M_\mathrm{filtered}(t), is

: M_\mathrm{filtered}(t) = (M * h)(t),

where * denotes convolution. The filtered signal can now be safely sampled by the receiver at the correct sampling instants and compared to an appropriate threshold, resulting in a correct interpretation of the binary message.
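The whole chain (NRZ encoding, AWGN channel, matched filtering, sampling, thresholding) can be sketched in a few lines. The samples-per-bit count and noise level are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

bits = np.array([0, 1, 0, 1, 1, 0, 0, 1, 0, 0])   # the sequence "0101100100"
a = 2 * bits - 1                                  # NRZ weights: +1 / -1
T = 16                                            # samples per bit (assumed)

# Transmitted waveform: each bit is a rectangular pulse of duration T.
tx = np.repeat(a, T).astype(float)

# AWGN channel (noise level chosen for illustration).
rx = tx + 0.8 * rng.normal(size=tx.size)

# Matched filter for an NRZ pulse: a rectangular pulse again, by symmetry.
h = np.ones(T)
y = np.convolve(rx, h)

# Sample the filter output at the end of each bit interval and threshold.
samples = y[T - 1 + T * np.arange(bits.size)]
decided = (samples > 0).astype(int)

assert np.array_equal(decided, bits)
```

Each sampled value is effectively the sum of T noisy samples of one bit, so the decision SNR grows with the bit duration, which is why the filtered signal can be thresholded reliably where the raw waveform cannot.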


Gravitational-wave astronomy

Matched filters play a central role in gravitational-wave astronomy. The first observation of gravitational waves was based on large-scale filtering of each detector's output for signals resembling the expected shape, followed by subsequent screening for coincident and coherent triggers between both instruments. False-alarm rates, and with that the statistical significance of the detection, were then assessed using resampling methods. Inference on the astrophysical source parameters was completed using Bayesian methods based on parameterized theoretical models for the signal waveform and (again) on the Whittle likelihood.


Biology

Animals living in relatively static environments would have relatively fixed features of the environment to perceive. This allows the evolution of filters that match the expected signal with the highest signal-to-noise ratio: the matched filter. Perceiving the world "through such a 'matched filter' severely limits the amount of information the brain can pick up from the outside world, but it frees the brain from the need to perform more intricate computations to extract the information finally needed for fulfilling a particular task."


See also

* Periodogram
* Filtered backprojection (Radon transform)
* Digital filter
* Statistical signal processing
* Whittle likelihood
* Profile likelihood
* Detection theory
* Multiple comparisons problem
* Channel capacity
* Noisy-channel coding theorem
* Spectral density estimation
* Least mean squares (LMS) filter
* Wiener filter
* MUltiple SIgnal Classification (MUSIC), a popular parametric superresolution method
* SAMV



Further reading

* Fish, A.; Gurevich, S.; Hadani, R.; Sayeed, A.; Schwartz, O. (December 2011). "Computing the matched filter in linear time". arXiv:1112.4883 [cs.IT].