HOME

TheInfoList



OR:

In
statistics Statistics (from German language, German: ''wikt:Statistik#German, Statistik'', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of ...
and
signal processing Signal processing is an electrical engineering subfield that focuses on analyzing, modifying and synthesizing ''signals'', such as audio signal processing, sound, image processing, images, and scientific measurements. Signal processing techniq ...
, step detection (also known as step smoothing, step filtering, shift detection, jump detection or
edge detection Edge detection includes a variety of mathematical methods that aim at identifying edges, curves in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. The same problem of finding discontinuitie ...
) is the process of finding abrupt changes (steps, jumps, shifts) in the mean level of a
time series In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Exa ...
or signal. It is usually considered as a special case of the statistical method known as
change detection In statistical analysis, change detection or change point detection tries to identify times when the probability distribution of a stochastic process or time series changes. In general the problem concerns both detecting whether or not a change ...
or change point detection. Often, the step is small and the time series is corrupted by some kind of
noise Noise is unwanted sound considered unpleasant, loud or disruptive to hearing. From a physics standpoint, there is no distinction between noise and desired sound, as both are vibrations through a medium, such as air or water. The difference arise ...
, and this makes the problem challenging because the step may be hidden by the noise. Therefore, statistical and/or signal processing algorithms are often required. The step detection problem occurs in multiple scientific and engineering contexts, for example in
statistical process control Statistical process control (SPC) or statistical quality control (SQC) is the application of statistical methods to monitor and control the quality of a production process. This helps to ensure that the process operates efficiently, producing m ...
(the
control chart Control charts is a graph used in production control to determine whether quality and manufacturing processes are being controlled under stable conditions. (ISO 7870-1) The hourly status is arranged on the graph, and the occurrence of abnormalit ...
being the most directly related method), in exploration
geophysics Geophysics () is a subject of natural science concerned with the physical processes and physical properties of the Earth and its surrounding space environment, and the use of quantitative methods for their analysis. The term ''geophysics'' som ...
(where the problem is to segment a well-log recording into stratigraphic zones), in
genetics Genetics is the study of genes, genetic variation, and heredity in organisms.Hartl D, Jones E (2005) It is an important branch in biology because heredity is vital to organisms' evolution. Gregor Mendel, a Moravian Augustinian friar wor ...
(the problem of separating microarray data into similar copy-number regimes), and in
biophysics Biophysics is an interdisciplinary science that applies approaches and methods traditionally used in physics to study biological phenomena. Biophysics covers all scales of biological organization, from molecular to organismic and populations. ...
(detecting state transitions in a
molecular machine A molecular machine, nanite, or nanomachine is a molecular component that produces quasi-mechanical movements (output) in response to specific stimuli (input). In cellular biology, macromolecular machines frequently perform tasks essential for l ...
as recorded in time-position traces). For 2D signals, the related problem of
edge detection Edge detection includes a variety of mathematical methods that aim at identifying edges, curves in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. The same problem of finding discontinuitie ...
has been studied intensively for
image processing An image is a visual representation of something. It can be two-dimensional, three-dimensional, or somehow otherwise feed into the visual system to convey information. An image can be an artifact, such as a photograph or other two-dimensiona ...
.


Algorithms

When the step detection must be performed as and when the data arrives, then
online algorithm In computer science, an online algorithm is one that can process its input piece-by-piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start. In contrast, an o ...
s are usually used, and it becomes a special case of
sequential analysis In statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead data are evaluated as they are collected, and further sampling is stopped in accordance with a pre- ...
. Such algorithms include the classical
CUSUM In statistical quality control, the CUsUM (or cumulative sum control chart) is a sequential analysis technique developed by E. S. Page of the University of Cambridge. It is typically used for monitoring change detection. CUSUM was announced in ...
method applied to changes in mean. By contrast, ''offline'' algorithms are applied to the data potentially long after it has been received. Most offline algorithms for step detection in digital data can be categorised as ''top-down'', ''bottom-up'', ''sliding window'', or ''global'' methods.


Top-down

These algorithms start with the assumption that there are no steps and introduce possible candidate steps one at a time, testing each candidate to find the one that minimizes some criteria (such as the least-squares fit of the estimated, underlying piecewise constant signal). An example is the ''stepwise jump placement'' algorithm, first studied in geophysical problems, that has found recent uses in modern biophysics.


Bottom-up

Bottom-up algorithms take the "opposite" approach to top-down methods, first assuming that there is a step in between every sample in the digital signal, and then successively merging steps based on some criteria tested for every candidate merge.


Sliding window

By considering a small "window" of the signal, these algorithms look for evidence of a step occurring within the window. The window "slides" across the time series, one time step at a time. The evidence for a step is tested by statistical procedures, for example, by use of the two-sample
Student's t-test A ''t''-test is any statistical hypothesis test in which the test statistic follows a Student's ''t''-distribution under the null hypothesis. It is most commonly applied when the test statistic would follow a normal distribution if the value of ...
. Alternatively, a
nonlinear filter In signal processing, a nonlinear (or non-linear) filter is a filter whose output is not a linear function of its input. That is, if the filter outputs signals ''R'' and ''S'' for two input signals ''r'' and ''s'' separately, but does not always o ...
such as the
median filter The median filter is a non-linear digital filtering technique, often used to remove noise from an image or signal. Such noise reduction is a typical pre-processing step to improve the results of later processing (for example, edge detection on an ...
is applied to the signal. Filters such as these attempt to remove the noise whilst preserving the abrupt steps.


Global

Global algorithms consider the entire signal in one go, and attempt to find the steps in the signal by some kind of optimization procedure. Algorithms include
wavelet A wavelet is a wave-like oscillation with an amplitude that begins at zero, increases or decreases, and then returns to zero one or more times. Wavelets are termed a "brief oscillation". A taxonomy of wavelets has been established, based on the num ...
methods, and
total variation denoising In signal processing, particularly image processing, total variation denoising, also known as total variation regularization or total variation filtering, is a noise removal process (filter). It is based on the principle that signals with excessi ...
which uses methods from
convex optimization Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization probl ...
. Where the steps can be modelled as a
Markov chain A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happe ...
, then
Hidden Markov Models A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ob ...
are also often used (a popular approach in the biophysics community). When there are only a few unique values of the mean, then k-means clustering can also be used.


Linear versus nonlinear signal processing methods for step detection

Because steps and (independent) noise have theoretically infinite
bandwidth Bandwidth commonly refers to: * Bandwidth (signal processing) or ''analog bandwidth'', ''frequency bandwidth'', or ''radio bandwidth'', a measure of the width of a frequency range * Bandwidth (computing), the rate of data transfer, bit rate or thr ...
and so overlap in the
Fourier basis In mathematics, a basis function is an element of a particular basis for a function space. Every function in the function space can be represented as a linear combination of basis functions, just as every vector in a vector space can be repres ...
,
signal processing Signal processing is an electrical engineering subfield that focuses on analyzing, modifying and synthesizing ''signals'', such as audio signal processing, sound, image processing, images, and scientific measurements. Signal processing techniq ...
approaches to step detection generally do not use classical smoothing techniques such as the
low pass filter A low-pass filter is a filter that passes signals with a frequency lower than a selected cutoff frequency and attenuates signals with frequencies higher than the cutoff frequency. The exact frequency response of the filter depends on the filter des ...
. Instead, most algorithms are explicitly nonlinear or time-varying.


Step detection and piecewise constant signals

Because the aim of step detection is to find a series of instantaneous jumps in the mean of a signal, the wanted, underlying, mean signal is
piecewise constant In mathematics, a function on the real numbers is called a step function if it can be written as a finite linear combination of indicator functions of intervals. Informally speaking, a step function is a piecewise constant function having on ...
. For this reason, step detection can be profitably viewed as the problem of recovering a piecewise constant signal corrupted by noise. There are two complementary models for piecewise constant signals: as 0-degree splines with a few knots, or as
level set In mathematics, a level set of a real-valued function of real variables is a set where the function takes on a given constant value , that is: : L_c(f) = \left\~, When the number of independent variables is two, a level set is calle ...
s with a few unique levels. Many algorithms for step detection are therefore best understood as either 0-degree spline fitting, or level set recovery, methods.


Step detection as level set recovery

When there are only a few unique values of the mean, clustering techniques such as k-means clustering or
mean-shift Mean shift is a non-parametric feature-space mathematical analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm. Application domains include cluster analysis in computer vision and image processing. ...
are appropriate. These techniques are best understood as methods for finding a level set description of the underlying piecewise constant signal.


Step detection as 0-degree spline fitting

Many algorithms explicitly fit 0-degree splines to the noisy signal in order to detect steps (including stepwise jump placement methods), but there are other popular algorithms that can also be seen to be spline fitting methods after some transformation, for example
total variation denoising In signal processing, particularly image processing, total variation denoising, also known as total variation regularization or total variation filtering, is a noise removal process (filter). It is based on the principle that signals with excessi ...
.


Generalized step detection by piecewise constant denoising

All the algorithms mentioned above have certain advantages and disadvantages in particular circumstances, yet, a surprisingly large number of these step detection algorithms are special cases of a more general algorithm. This algorithm involves the minimization of a global functional: Here, ''x''''i'' for ''i'' = 1, ...., ''N'' is the discrete-time input signal of length ''N'', and ''m''''i'' is the signal output from the algorithm. The goal is to minimize ''H'' 'm''with respect to the output signal ''m''. The form of the function \scriptstyle \Lambda determines the particular algorithm. For example, choosing: :\Lambda = \frac12 \left, x_i-m_j\^2 I(i-j=0) + \gamma\left, m_i-m_j\ I(i-j=1) where ''I''(''S'') = 0 if the condition ''S'' is false, and one otherwise, obtains the
total variation denoising In signal processing, particularly image processing, total variation denoising, also known as total variation regularization or total variation filtering, is a noise removal process (filter). It is based on the principle that signals with excessi ...
algorithm with regularization parameter \gamma. Similarly: :\Lambda=\min \left\ leads to the mean shift algorithm, when using an adaptive step size Euler integrator initialized with the input signal ''x''. Here ''W'' > 0 is a parameter that determines the support of the mean shift kernel. Another example is: :\Lambda = \frac \cdot I(, i-j, \le W) leading to the
bilateral filter A bilateral filter is a non-linear, edge-preserving, and noise-reducing smoothing filter for images. It replaces the intensity of each pixel with a weighted average of intensity values from nearby pixels. This weight can be based on a Gaussian d ...
, where \scriptstyle \beta>0 is the tonal kernel parameter, and ''W'' is the spatial kernel support. Yet another special case is: :\Lambda= \frac12 \left, x_i-m_j\^2 I(i-j=0) + \gamma \left, m_i-m_j\^0 I(i-j=1) specifying a group of algorithms that attempt to greedily fit 0-degree splines to the signal. Here, \scriptstyle \left, x\^0 is defined as zero if ''x'' = 0, and one otherwise. Many of the functionals in equation () defined by the particular choice of \scriptstyle \Lambda are
convex Convex or convexity may refer to: Science and technology * Convex lens, in optics Mathematics * Convex set, containing the whole line segment that joins points ** Convex polygon, a polygon which encloses a convex set of points ** Convex polytope ...
: they can be minimized using methods from
convex optimization Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization probl ...
. Still others are non-convex but a range of algorithms for minimizing these functionals have been devised.


Step detection using the Potts model

A classical variational method for step detection is the Potts model. It is given by the non-convex optimization problem : u^* = \arg\min_ \gamma \, \nabla u \, _0 + \, u-x\, _p^p The term \, \nabla u \, _0 = \# \ penalizes the number of jumps and the term \, u-x\, _p^p = \sum_^N , u_i - x_i, ^p measures fidelity to data ''x''. The parameter γ > 0 controls the tradeoff between regularity and
data fidelity Data integrity is the maintenance of, and the assurance of, data accuracy and consistency over its entire life-cycle and is a critical aspect to the design, implementation, and usage of any system that stores, processes, or retrieves data. The ter ...
. Since the minimizer u^* is piecewise constant the steps are given by the non-zero locations of the gradient \nabla u^*. For p = 2 and p= 1 there are fast algorithms which give an exact solution of the Potts problem in O(N^2). A. Weinmann, M. Storath, L. Demaret. "The L^1-Potts functional for robust jump-sparse reconstruction." SIAM Journal on Numerical Analysis, 53(1):644-673 (2015).


See also

* Time-series segmentation


References


External links


PWCTools: Flexible Matlab and Python software for step detection by piecewise constant denoisingPottslab: Matlab toolbox for piecewise constant estimation based on the Potts model
{{DEFAULTSORT:Step Detection Change detection Nonlinear filters Statistical signal processing Feature detection (computer vision)