HOME

TheInfoList



OR:

The short-time Fourier transform (STFT), is a Fourier-related transform used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time. In practice, the procedure for computing STFTs is to divide a longer time signal into shorter segments of equal length and then compute the Fourier transform separately on each shorter segment. This reveals the Fourier spectrum on each shorter segment. One then usually plots the changing spectra as a function of time, known as a
spectrogram A spectrogram is a visual representation of the spectrum of frequencies of a signal as it varies with time. When applied to an audio signal, spectrograms are sometimes called sonographs, voiceprints, or voicegrams. When the data are represen ...
or waterfall plot, such as commonly used in
software defined radio Software-defined radio (SDR) is a radio communication system where components that have been traditionally implemented in analog hardware (e.g. mixers, filters, amplifiers, modulators/demodulators, detectors, etc.) are instead implemented by ...
(SDR) based spectrum displays. Full bandwidth displays covering the whole range of an SDR commonly use fast Fourier transforms (FFTs) with 2^24 points on desktop computers.


Forward STFT


Continuous-time STFT

Simply, in the continuous-time case, the function to be transformed is multiplied by a
window function In signal processing and statistics, a window function (also known as an apodization function or tapering function) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the int ...
which is nonzero for only a short period of time. The
Fourier transform A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed ...
(a one-dimensional function) of the resulting signal is taken, then the window is slid along the time axis until the end resulting in a two-dimensional representation of the signal. Mathematically, this is written as: :\mathbf\(\tau,\omega) \equiv X(\tau, \omega) = \int_^ x(t) w(t-\tau) e^ \, d t where w(\tau) is the
window function In signal processing and statistics, a window function (also known as an apodization function or tapering function) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the int ...
, commonly a
Hann window The Hann function is named after the Austrian meteorologist Julius von Hann. It is a window function used to perform Hann smoothing. The function, with length L and amplitude 1/L, is given by: : w_0(x) \triangleq \left\.   For digital sign ...
or
Gaussian window In discrete-time signal processing, windowing is a preliminary signal shaping technique, usually applied to improve the appearance and usefulness of a subsequent Discrete Fourier Transform. Several '' window functions'' can be defined, based on ...
centered around zero, and x(t) is the signal to be transformed (note the difference between the window function w and the frequency \omega). X(\tau, \omega) is essentially the Fourier transform of x(t)w(t-\tau), a complex function representing the phase and magnitude of the signal over time and frequency. Often phase unwrapping is employed along either or both the time axis, \tau, and frequency axis, \omega, to suppress any
jump discontinuity Continuous functions are of utmost importance in mathematics, functions and applications. However, not all functions are continuous. If a function is not continuous at a point in its domain, one says that it has a discontinuity there. The set of a ...
of the phase result of the STFT. The time index \tau is normally considered to be "''slow''" time and usually not expressed in as high resolution as time t. Given that the STFT is essentially a Fourier transform times a window function, the STFT is also called windowed Fourier transform or time-dependent Fourier transform.


Discrete-time STFT

In the discrete time case, the data to be transformed could be broken up into chunks or frames (which usually overlap each other, to reduce artifacts at the boundary). Each chunk is
Fourier transform A Fourier transform (FT) is a mathematical transform that decomposes functions into frequency components, which are represented by the output of the transform as a function of frequency. Most commonly functions of time or space are transformed ...
ed, and the complex result is added to a matrix, which records magnitude and phase for each point in time and frequency. This can be expressed as: :\mathbf\(m,\omega)\equiv X(m,\omega) = \sum_^ x -m^ likewise, with signal ''x'' 'n''and window ''w'' 'n'' In this case, ''m'' is discrete and ω is continuous, but in most typical applications the STFT is performed on a computer using the
fast Fourier transform A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in ...
, so both variables are discrete and quantized. The magnitude squared of the STFT yields the
spectrogram A spectrogram is a visual representation of the spectrum of frequencies of a signal as it varies with time. When applied to an audio signal, spectrograms are sometimes called sonographs, voiceprints, or voicegrams. When the data are represen ...
representation of the power spectral density of the function: :\operatorname\(\tau, \omega) \equiv , X(\tau, \omega), ^2 See also the
modified discrete cosine transform The modified discrete cosine transform (MDCT) is a transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where ...
(MDCT), which is also a Fourier-related transform that uses overlapping windows.


Sliding DFT

If only a small number of ω are desired, or if the STFT is desired to be evaluated for every shift ''m'' of the window, then the STFT may be more efficiently evaluated using a sliding DFT algorithm.


Inverse STFT

The STFT is invertible, that is, the original signal can be recovered from the transform by the inverse STFT. The most widely accepted way of inverting the STFT is by using the overlap-add (OLA) method, which also allows for modifications to the STFT complex spectrum. This makes for a versatile signal processing method, referred to as the ''overlap and add with modifications'' method.


Continuous-time STFT

Given the width and definition of the window function ''w''(''t''), we initially require the area of the window function to be scaled so that : \int_^ w(\tau) \, d\tau = 1. It easily follows that : \int_^ w(t-\tau) \, d\tau = 1 \quad \forall \ t and : x(t) = x(t) \int_^ w(t-\tau) \, d\tau = \int_^ x(t) w(t-\tau) \, d\tau. The continuous Fourier transform is : X(\omega) = \int_^ x(t) e^ \, dt. Substituting ''x''(''t'') from above: : X(\omega) = \int_^ \left \int_^ x(t) w(t-\tau) \, d\tau \right\, e^ \, dt ::: = \int_^ \int_^ x(t) w(t-\tau) \, e^ \, d\tau \, dt. Swapping order of integration: : X(\omega) = \int_^ \int_^ x(t) w(t-\tau) \, e^ \, dt \, d\tau ::: = \int_^ \left \int_^ x(t) w(t-\tau) \, e^ \, dt \right\, d\tau ::: = \int_^ X(\tau, \omega) \, d\tau. So the Fourier transform can be seen as a sort of phase coherent sum of all of the STFTs of ''x''(''t''). Since the inverse Fourier transform is : x(t) = \frac \int_^ X(\omega) e^ \, d\omega, then ''x''(''t'') can be recovered from ''X''(τ,ω) as : x(t) = \frac \int_^ \int_^ X(\tau, \omega) e^ \, d\tau \, d\omega. or : x(t) = \int_^ \left \frac \int_^ X(\tau, \omega) e^ \, d\omega \right\, d\tau. It can be seen, comparing to above that windowed "grain" or "wavelet" of ''x''(''t'') is : x(t) w(t-\tau) = \frac \int_^ X(\tau, \omega) e^ \, d\omega. the inverse Fourier transform of ''X''(τ,ω) for τ fixed. An alternative definition that is valid only in the vicinity of τ, the inverse transform is: :x(t) = \frac\frac \int_^ X(\tau, \omega) e^ \, d\omega. In general, the window function w(t) has the following properties: :(a) even symmetry: w(t) = w(-t);
:(b) non-increasing (for positive time): w(t) \geq w(s) if , t, \leq , s, ;
:(c) compact support: w(t) is equal to zero when , t, is large.


Resolution issues

One of the pitfalls of the STFT is that it has a fixed resolution. The width of the windowing function relates to how the signal is represented—it determines whether there is good frequency resolution (frequency components close together can be separated) or good time resolution (the time at which frequencies change). A wide window gives better frequency resolution but poor time resolution. A narrower window gives good time resolution but poor frequency resolution. These are called narrowband and wideband transforms, respectively. This is one of the reasons for the creation of the
wavelet transform In mathematics, a wavelet series is a representation of a square-integrable ( real- or complex-valued) function by a certain orthonormal series generated by a wavelet. This article provides a formal, mathematical definition of an orthonormal ...
and multiresolution analysis, which can give good time resolution for high-frequency events and good frequency resolution for low-frequency events, the combination best suited for many real signals. This property is related to the Heisenberg
uncertainty principle In quantum mechanics, the uncertainty principle (also known as Heisenberg's uncertainty principle) is any of a variety of mathematical inequalities asserting a fundamental limit to the accuracy with which the values for certain pairs of physic ...
, but not directly – see Gabor limit for discussion. The product of the standard deviation in time and frequency is limited. The boundary of the uncertainty principle (best simultaneous resolution of both) is reached with a Gaussian window function (or mask function), as the Gaussian minimizes the Fourier uncertainty principle. This is called the
Gabor transform The Gabor transform, named after Dennis Gabor, is a special case of the short-time Fourier transform. It is used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time. The function to be tran ...
(and with modifications for multiresolution becomes the
Morlet wavelet In mathematics, the Morlet wavelet (or Gabor wavelet)0). The parameter \sigma in the Morlet wavelet allows trade between time and frequency resolutions. Conventionally, the restriction \sigma>5 is used to avoid problems with the Morlet wavelet a ...
transform). One can consider the STFT for varying window size as a two-dimensional domain (time and frequency), as illustrated in the example below, which can be calculated by varying the window size. However, this is no longer a strictly time-frequency representation – the kernel is not constant over the entire signal.


Examples

When the original function is: :X(t,f) = \int^\infty_w(t-\tau) x(\tau) e^ d\tau We can have a simple example: w(t) = 1 for , t, smaller than or equal B w(t) = 0 otherwise B = window Now the original function of the Short-time Fourier transform can be changed as :X(t,f) = \int^_x(\tau) e^ d\tau Another example: Using the following sample signal x(t) that is composed of a set of four sinusoidal waveforms joined together in sequence. Each waveform is only composed of one of four frequencies (10, 25, 50, 100 Hz). The definition of x(t) is: :x(t)=\begin \cos (2 \pi 10 t) & 0\,\mathrm \le t < 5 \,\mathrm \\ \cos (2 \pi 25 t) & 5\,\mathrm \le t < 10\,\mathrm \\ \cos (2 \pi 50 t) & 10\,\mathrm \le t < 15\,\mathrm \\ \cos (2 \pi 100 t) & 15\,\mathrm \le t < 20\,\mathrm \\ \end Then it is sampled at 400 Hz. The following spectrograms were produced: The 25 ms window allows us to identify a precise time at which the signals change but the precise frequencies are difficult to identify. At the other end of the scale, the 1000 ms window allows the frequencies to be precisely seen but the time between frequency changes is blurred. Other examples: :w(t) = exp(\sigma-t^) Normally we call exp(\sigma-t^) a Gaussian function or Gabor function. When we use it, the short-time Fourier transform is called the "Gabor transform".


Explanation

It can also be explained with reference to the sampling and
Nyquist frequency In signal processing, the Nyquist frequency (or folding frequency), named after Harry Nyquist, is a characteristic of a sampler, which converts a continuous function or signal into a discrete sequence. In units of cycles per second ( Hz), it ...
. Take a window of ''N'' samples from an arbitrary real-valued signal at sampling rate ''f''s . Taking the Fourier transform produces ''N'' complex coefficients. Of these coefficients only half are useful (the last ''N/2'' being the complex conjugate of the first ''N/2'' in reverse order, as this is a real valued signal). These ''N/2'' coefficients represent the frequencies 0 to ''f''s/2 (Nyquist) and two consecutive coefficients are spaced apart by ''f''s/''N'' Hz. To increase the frequency resolution of the window the frequency spacing of the coefficients needs to be reduced. There are only two variables, but decreasing ''f''s (and keeping ''N'' constant) will cause the window size to increase — since there are now fewer samples per unit time. The other alternative is to increase ''N'', but this again causes the window size to increase. So any attempt to increase the frequency resolution causes a larger window size and therefore a reduction in time resolution—and vice versa.


Rayleigh frequency

As the
Nyquist frequency In signal processing, the Nyquist frequency (or folding frequency), named after Harry Nyquist, is a characteristic of a sampler, which converts a continuous function or signal into a discrete sequence. In units of cycles per second ( Hz), it ...
is a limitation in the maximum frequency that can be meaningfully analysed, so is the Rayleigh frequency a limitation on the minimum frequency. The Rayleigh frequency is the minimum frequency that can be resolved by a finite duration time window. Given a time window that is Τ seconds long, the minimum frequency that can be resolved is 1/Τ Hz. The Rayleigh frequency is an important consideration in applications of the short-time Fourier transform (STFT), as well as any other method of harmonic analysis on a signal of finite record-length.


Application

STFTs as well as standard Fourier transforms and other tools are frequently used to analyze music. The
spectrogram A spectrogram is a visual representation of the spectrum of frequencies of a signal as it varies with time. When applied to an audio signal, spectrograms are sometimes called sonographs, voiceprints, or voicegrams. When the data are represen ...
can, for example, show frequency on the horizontal axis, with the lowest frequencies at left, and the highest at the right. The height of each bar (augmented by color) represents the
amplitude The amplitude of a periodic variable is a measure of its change in a single period (such as time or spatial period). The amplitude of a non-periodic signal is its magnitude compared with a reference value. There are various definitions of am ...
of the frequencies within that band. The depth dimension represents time, where each new bar was a separate distinct transform. Audio engineers use this kind of visual to gain information about an audio sample, for example, to locate the frequencies of specific noises (especially when used with greater frequency resolution) or to find frequencies which may be more or less resonant in the space where the signal was recorded. This information can be used for equalization or tuning other audio effects.


Implementation

Original function :X(t,f) = \int^\infty_w(t-\tau) x(\tau) e^ d\tau Converting into the discrete form: :t = n\Delta_t, f=m\Delta_f, \tau =p\Delta_t :X(n\Delta_t, m\Delta_f)=\sum^\infty_w((n-p)\Delta_t)x(p\Delta_t)e^\Delta_t Suppose that :w(t) \cong 0 \text , t, >B, \frac=Q Then we can write the original function into :X(n\Delta_t, m\Delta_f)= \sum^_w((n-p)\Delta_t)x(p\Delta_t)e^\Delta_t


Direct implementation


Constraints

a. Nyquist criterion (avoiding the aliasing effect): :\Delta_t < \frac, where \Omega is the bandwidth of x(\tau) w(t-\tau)


FFT-based method


Constraint

a. \Delta_t \Delta_f = \tfrac, where N is an integer b. N \geq 2Q+1 c. Nyquist criterion (avoiding the aliasing effect): :\Delta_t < \frac, \Omega is the bandwidth of x(\tau) w(t-\tau) :X(n\Delta_t, m\Delta_f) = \sum_^ w((n - p)\Delta_t)x(p\Delta_t) e^\Delta_t :\text q = p - (n - Q), \text p = (n - Q) + q : X(n\Delta_t, m\Delta_f) = \Delta_t e^ \sum_^ x_1(q)e^ :\text x_1(q) = \begin w((Q - q)\Delta_t)x((n - Q + q)\Delta_t) & 0 \leq q \leq 2Q\\ 0 & 2Q < q < N \end


Recursive method


Constraint

a. \Delta_t \Delta_f = \tfrac, where N is an integer b. N \geq 2Q+1 c. Nyquist criterion (avoiding the aliasing effect): :\Delta_t < \frac, \Omega is the bandwidth of x(\tau) w(t-\tau) d. Only for implementing the rectangular-STFT Rectangular window imposes the constraint :w((n - p)\Delta_t) = 1 Substitution gives: : \begin X(n\Delta_t, m\Delta_f) &= \sum_^ w((n - p)\Delta_t)&x(p\Delta_t) e^\Delta_t \\ &= \sum_^ &x(p\Delta_t) e^\Delta_t \\ \end Change of variable for : : X((n-1)\Delta_t, m\Delta_f) = \sum_^ x(p\Delta_t) e^\Delta_t Calculate X(\min\Delta_t, m\Delta_f) by the ''N''-point FFT: :X(n_0\Delta_t, m\Delta_f) = \Delta_t e^ \sum_^ x_1(q) e^, \qquad n_0=\min where : x_1(q) = \begin x((n - Q + q)\Delta_t) & q \leq 2Q\\ 0 & q >2Q \end Applying the recursive formula to calculate X(n\Delta_t, m\Delta_f) :X(n\Delta_t, m\Delta_f) = X((n-1)\Delta_t, m\Delta_f) - x((n - Q -1)\Delta_t) e^\Delta_t + x((n+Q)\Delta_t)e^\Delta_t


Chirp Z transform


Constraint

:\exp = \exp \cdot \exp\cdot \exp so :X(n\Delta_t, m\Delta_f) = \Delta_t \sum_^ w((n-p)\Delta_t)x(p\Delta_t)e^ :X(n\Delta_t, m\Delta_f) = \Delta_t e^ \sum_^ w((n-p)\Delta_t)x(p\Delta_t)e^ e^


Implementation comparison


See also

* Least-squares spectral analysis *
Spectral density estimation In statistical signal processing, the goal of spectral density estimation (SDE) or simply spectral estimation is to estimate the spectral density (also known as the power spectral density) of a signal from a sequence of time samples of the sig ...
* Time-frequency analysis * Time-frequency representation * Reassignment method Other time-frequency transforms: *
Cone-shape distribution function The cone-shape distribution function, also known as the Zhao–Atlas–Marks time-frequency distribution,Leon Cohen, Time Frequency Analysis: Theory and Applications, Prentice Hall, (1994) (acronymized as the ZAM distribution or ZAMD), is one of t ...
* Constant-Q transform *
Fractional Fourier transform In mathematics, in the area of harmonic analysis, the fractional Fourier transform (FRFT) is a family of linear transformations generalizing the Fourier transform. It can be thought of as the Fourier transform to the ''n''-th power, where ''n'' n ...
*
Gabor transform The Gabor transform, named after Dennis Gabor, is a special case of the short-time Fourier transform. It is used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time. The function to be tran ...
* Newland transform * S transform *
Wavelet transform In mathematics, a wavelet series is a representation of a square-integrable ( real- or complex-valued) function by a certain orthonormal series generated by a wavelet. This article provides a formal, mathematical definition of an orthonormal ...
* Chirplet transform


References


External links


DiscreteTFDs – software for computing the short-time Fourier transform and other time-frequency distributionsSingular Spectral Analysis – MultiTaper Method Toolkit
– a free software program to analyze short, noisy time series
kSpectra Toolkit for Mac OS X from SpectraWorksTime stretched short time Fourier transform for time frequency analysis of ultra wideband signalsA BSD-licensed Matlab class to perform STFT and inverse STFTLTFAT – A free (GPL) Matlab / Octave toolbox to work with short-time Fourier transforms and time-frequency analysisSonogram visible speech – A free (GPL)Freeware for short-time Fourier transforms and time-frequency analysisNational Taiwan University, Time-Frequency Analysis and Wavelet Transform 2021, Professor of Jian-Jiun Ding, Department of Electrical Engineering
{{DEFAULTSORT:Short-Time Fourier Transform Fourier analysis Time–frequency analysis Transforms Signal processing