Steered-Response Power Phase Transform (SRP-PHAT) is a popular algorithm for acoustic source localization, well known for its robust performance in adverse acoustic environments. The algorithm can be interpreted as a beamforming-based approach that searches for the candidate position that maximizes the output of a steered delay-and-sum beamformer.
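The search described above can be illustrated with a minimal sketch, not a reference implementation. It assumes integer sample delays, a speed of sound of 343 m/s, and a small explicit list of candidate positions. The function names `steered_power`, `propagation_delays` and `localize` are hypothetical, and the sign convention (each channel is advanced by the candidate's propagation delay) is a choice made for this sketch. No PHAT weighting is applied here; only the plain delay-and-sum power is computed.

```python
import numpy as np

def steered_power(signals, delays):
    """Output power of a delay-and-sum beamformer for integer sample delays.

    signals: array of shape (M, N), one row per microphone.
    delays:  M integers, the assumed propagation delay (in samples) from the
             candidate position to each microphone.
    """
    # Advance each channel by its delay so that a source at the candidate
    # position adds coherently. np.roll wraps around the array ends, which
    # is tolerable only for a toy example like this one.
    aligned = [np.roll(sig, -int(d)) for sig, d in zip(signals, delays)]
    beam = np.sum(aligned, axis=0)
    return float(np.sum(beam ** 2))

def propagation_delays(position, mics, fs, c=343.0):
    """Propagation delays from `position` to each microphone, rounded to samples."""
    dist = np.linalg.norm(mics - position, axis=1)
    return np.rint(fs * dist / c).astype(int)

def localize(signals, mics, candidates, fs, c=343.0):
    """Return the candidate with the largest steered power, plus all powers."""
    powers = np.array([
        steered_power(signals, propagation_delays(p, mics, fs, c))
        for p in candidates
    ])
    return candidates[int(np.argmax(powers))], powers

# Synthetic check: white noise emitted from one of the candidate positions.
rng = np.random.default_rng(0)
fs = 16000  # sampling rate in Hz (assumed)
mics = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
candidates = np.array([[2.0, 0, 0], [0, 2, 0], [0, 0, 2], [2, 2, 2]])
source = candidates[1]
emitted = rng.standard_normal(4096)
# Each microphone receives the emitted signal delayed by its propagation time.
signals = np.stack([np.roll(emitted, d)
                    for d in propagation_delays(source, mics, fs)])

best, powers = localize(signals, mics, candidates, fs)
print(best, powers)
```

In this noise-free setting the steered power at the true position equals M² times the energy of the emitted signal, because all M channels align exactly. At the other candidates the channels add mostly incoherently. A practical implementation would search a much denser grid and avoid the circular shift.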

Algorithm

Steered-Response Power

Consider a system of

M

microphones, where each microphone is denoted by a subindex

m \in \{1, \dots, M\}

. The discrete-time output signal from a microphone is

s_m (n)

. The (unweighted) steered-response power (SRP) at a spatial point

\mathbf{x} = [x, y, z]^T

can be expressed as

P_0(\mathbf{x}) \triangleq \sum_{n \in \mathbb{Z}} \left\vert \sum_{m=1}^{M} s_m(n-\tau_m(\mathbf{x})) \right\vert^{2},

where

\mathbb{Z}

denotes the set of integers and

\tau_m(\mathbf{x})

is the time lag due to propagation from a source located at

\mathbf{x}

to the

m

-th microphone. The (weighted) SRP can be rewritten as

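P(\mathbf{x}) = \frac{1}{2\pi} \sum_{m_1=1}^{M}\sum_{m_2=1}^{M} \int_{-\pi}^{\pi} \Phi_{m_1,m_2}(e^{j\omega}) S_{m_1}(e^{j\omega}) S_{m_2}^{*}(e^{j\omega}) e^{j\omega\tau_{m_1,m_2}(\mathbf{x})} d\omega,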

where

(\cdot)^{*}

denotes complex conjugation,

S_m(e^{j\omega})

represents the

discrete-time Fourier transform of

s_m(n)

and

\Phi_{m_1,m_2}(e^{j\omega})

is a weighting function in the frequency domain (discussed later). The term

\tau_{m_1,m_2}(\mathbf{x})

is the discrete time-difference of arrival (TDOA) of a signal emitted at position

\mathbf{x}

to microphones

m_1

and

m_2

, given by