Changepoint Detection

In statistical analysis, change detection or change point detection tries to identify times when the probability distribution of a stochastic process or time series changes. In general, the problem concerns both detecting whether or not a change has occurred, or whether several changes might have occurred, and identifying the times of any such changes.

Specific applications, like step detection and edge detection, may be concerned with changes in the mean, variance, correlation, or spectral density of the process. More generally, change detection also includes the detection of anomalous behavior: anomaly detection.

In ''offline'' change point detection it is assumed that a sequence of length T is available and the goal is to identify whether any change point(s) occurred in the series. This is an example of post hoc analysis and is often approached using hypothesis testing methods. By contrast, ''online'' change point detection is concerned with detecting change points in an incoming data stream.


Background

A time series measures the progression of one or more quantities over time. For instance, the figure above shows the level of water in the Nile river between 1870 and 1970. Change point detection is concerned with identifying whether, and if so ''when'', the behavior of the series changes significantly. In the Nile river example, the volume of water changes significantly after a dam was built in the river. Importantly, anomalous observations that differ from the ongoing behavior of the time series are not generally considered change points as long as the series returns to its previous behavior afterwards.

Mathematically, we can describe a time series as an ordered sequence of observations (x_1, x_2, \ldots). We can write the joint distribution of a subset x_{a:b} = (x_a, x_{a+1}, \ldots, x_b) of the time series as p(x_{a:b}). If the goal is to determine whether a change point occurred at a time \tau in a finite time series of length T, then we really ask whether p(x_{1:\tau}) equals p(x_{\tau+1:T}). This problem can be generalized to the case of more than one change point.
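As a rough illustration of the question "does p(x_{1:\tau}) equal p(x_{\tau+1:T})?", the sketch below compares only the means of the two segments with a Welch-style statistic and scans every candidate \tau. This is a simplification, not a general test of equality of distributions: it assumes independent observations and a change in mean only. The function names `split_statistic` and `most_likely_change`, the `min_size` parameter, and the toy series are all hypothetical choices made for this example.

```python
import math
from statistics import mean, variance


def split_statistic(x, tau):
    """Welch-style statistic comparing x[:tau] (x_{1:tau}) with x[tau:] (x_{tau+1:T})."""
    left, right = x[:tau], x[tau:]
    if len(left) < 2 or len(right) < 2:
        raise ValueError("each segment needs at least two observations")
    diff = abs(mean(left) - mean(right))
    # Standard error of the difference in means, allowing unequal variances.
    se = math.sqrt(variance(left) / len(left) + variance(right) / len(right))
    if se == 0.0:
        # Both segments are constant: any difference in level is unambiguous.
        return math.inf if diff > 0 else 0.0
    return diff / se


def most_likely_change(x, min_size=2):
    """Return (tau, statistic) for the candidate split with the largest statistic."""
    candidates = range(min_size, len(x) - min_size + 1)
    return max(((tau, split_statistic(x, tau)) for tau in candidates),
               key=lambda pair: pair[1])


# Toy series (made up): the level jumps from about 0.5 to about 10.5
# after the first 20 observations.
series = [0.0, 1.0] * 10 + [10.0, 11.0] * 10
tau, stat = most_likely_change(series)
print(tau, round(stat, 1))
```

On this toy series the scan selects \tau = 20. Note that the maximum of the statistic over all candidate splits does not follow the usual two-sample reference distribution, so turning it into a formal hypothesis test requires an adjusted threshold.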


Algorithms


Online change detection

Using the sequential analysis ("online") approach, any change test must make a trade-off between these common metrics:
* False alarm rate
* Misdetection rate
* Detection delay
In a Bayes change-detection problem, a prior distribution is available for the change time. Online change detection is also done using streaming algorithms.
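The trade-off between false alarms and detection delay can be made concrete with a one-sided CUSUM detector, a classical sequential scheme chosen here purely for illustration (the text above does not prescribe a specific method). The sketch assumes the pre-change mean is known; the parameter names `target_mean`, `drift`, and `threshold`, and the toy stream, are assumptions of this example.

```python
def cusum_alarm(stream, target_mean, drift, threshold):
    """Return the 0-based index of the first alarm, or None if none is raised.

    One-sided CUSUM for an upward shift in the mean:
    s_t = max(0, s_{t-1} + (x_t - target_mean - drift)); alarm when s_t > threshold.
    """
    s = 0.0
    for i, x in enumerate(stream):
        # Accumulate evidence of an upward shift; reset at zero otherwise.
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return i
    return None


# Toy stream (made up): the mean shifts from 0 to 2 at index 20.
stream = [0.0] * 20 + [2.0] * 20

print(cusum_alarm(stream, target_mean=0.0, drift=0.5, threshold=4.0))  # → 22
print(cusum_alarm(stream, target_mean=0.0, drift=0.5, threshold=1.0))  # → 20
```

Lowering `threshold` shortens the detection delay (index 20 instead of 22 here) but, on noisy data, would also raise the false alarm rate; raising it has the opposite effect and increases the chance of missing small changes.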


Offline change detection

Basseville (1993, Section 2.6) discusses offline change-in-mean detection with hypothesis testing based on the works of Page and Picard, and maximum-likelihood estimation of the change time, related to two-phase regression. Other approaches employ clustering based on maximum likelihood estimation, use optimization to infer the number and times of changes, or rely on spectral analysis or singular spectrum analysis.

Statistically speaking, change detection is often considered as a model selection problem. Models with more changepoints fit the data better but require more parameters. The best trade-off can be found by optimizing a model selection criterion such as the Akaike information criterion or the Bayesian information criterion. Bayesian model selection has also been used. Bayesian methods often quantify uncertainties of all sorts and answer questions that are hard to tackle with classical methods, such as the probability of a change at a given time or the probability that the data contain a certain number of changepoints.

"Offline" approaches cannot be used on streaming data because they need to compare against statistics of the complete time series, and they cannot react to changes in real time. However, they often provide a more accurate estimate of the change time and magnitude.
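The model selection view can be sketched for the simplest case: a model with no change versus a model with one change in mean, compared with the Bayesian information criterion. The sketch assumes Gaussian noise with a common variance, so that BIC reduces (up to an additive constant) to n ln(RSS/n) + k ln(n). Counting the change location as one parameter (k = 2 without a change, k = 4 with one) is a convention adopted here, not something fixed by the text; the function names and toy data are likewise hypothetical.

```python
import math


def rss(segment):
    """Residual sum of squares around the segment mean."""
    m = sum(segment) / len(segment)
    return sum((v - m) ** 2 for v in segment)


def bic(total_rss, n, n_params):
    """Gaussian BIC up to an additive constant: n*ln(RSS/n) + k*ln(n)."""
    # Guard against log(0) on perfectly constant data.
    return n * math.log(max(total_rss, 1e-12) / n) + n_params * math.log(n)


def select_model(x, min_size=2):
    """Return (tau or None, bic_no_change, bic_one_change); lower BIC is preferred."""
    n = len(x)
    bic_none = bic(rss(x), n, 2)  # parameters: mean, variance
    # Best single split: the tau minimising the total within-segment RSS.
    best_tau, best_rss = min(
        ((tau, rss(x[:tau]) + rss(x[tau:]))
         for tau in range(min_size, n - min_size + 1)),
        key=lambda pair: pair[1],
    )
    bic_one = bic(best_rss, n, 4)  # parameters: two means, variance, tau
    return (best_tau if bic_one < bic_none else None), bic_none, bic_one


shifted = [0.0, 1.0] * 10 + [10.0, 11.0] * 10  # toy data with a level shift
stable = [0.0, 1.0] * 20                       # toy data without a shift

print(select_model(shifted)[0])  # → 20
print(select_model(stable)[0])   # → None
```

For the stable series, the best split lowers the residual sum of squares only slightly, so the penalty for the two extra parameters outweighs the gain and the no-change model is kept. Extending the same idea to several changepoints requires a search over segmentations, which this sketch does not attempt.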


Applications

Change detection tests are often used in manufacturing for quality control, as well as in intrusion detection, spam filtering, website tracking, and medical diagnostics.


Linguistic change detection

Linguistic change detection refers to the ability to detect word-level changes across multiple presentations of the same sentence. Researchers have found that the amount of semantic overlap (i.e., relatedness) between the changed word and the new word influences the ease with which such a detection is made (Sturt, Sanford, Stewart, & Dawydiak, 2004). Additional research has found that focusing one's attention on the word that will be changed during the initial reading of the original sentence can improve detection. This was shown using italicized text to focus attention, whereby the word that will change is italicized in the original sentence (Sanford, Sanford, Molle, & Emmott, 2006), as well as using clefting constructions such as "''It was the'' tree that needed water." (Kennette, Wurm, & Van Havermaet, 2010). These change-detection phenomena appear to be robust, even occurring cross-linguistically when bilinguals read the original sentence in their native language and the changed sentence in their second language (Kennette, Wurm, & Van Havermaet, 2010).

Recently, researchers have used change point detection to detect word-level changes in semantics across time by computationally analyzing temporal corpora (for example, the word "gay" has acquired a new meaning over time). Change detection also applies to reading non-linguistic notation such as music. Even though music is not a language, it is still written, and people comprehend its meaning through perception and attention, which allows change detection to occur.


Visual change detection

Visual change detection is one's ability to detect differences between two or more images or scenes. This is essential in many everyday tasks. One example is detecting changes on the road in order to drive safely: operating a motor vehicle requires noticing other vehicles, traffic control signals, pedestrians, and more. Another example is facial recognition. Change detection is vital when recognizing a person's appearance, as faces are "dynamic" and can change in appearance due to different factors such as "lighting conditions, facial expressions, aging, and occlusion". Change detection algorithms use various techniques, such as "feature tracking, alignment, and normalization," to capture and compare different facial features and patterns across individuals in order to correctly identify people.

Visual change detection involves the integration of "multiple sensors inputs, cognitive processes, and attentional mechanisms," often focusing on multiple stimuli at once. The brain processes visual information from the eyes, compares it with previous knowledge stored in memory, and identifies differences between the two stimuli. This process occurs rapidly and unconsciously, allowing individuals to respond to changing environments and make necessary adjustments to their behavior.


Cognitive change detection

Several studies have analyzed the cognitive functions involved in change detection. Researchers have found that most people overestimate their ability to detect changes, when in reality they are more susceptible to change blindness than they think. Cognitive change detection has many complexities based on external factors, and sensory pathways play a key role in determining one's success in detecting changes.

One study proposes a multi-sensory pathway network, consisting of three sensory pathways, and reports that it significantly increases the effectiveness of change detection. Sensory pathway one fuses the stimuli together, sensory pathway two uses the middle concatenation strategy to learn the changed behavior, and sensory pathway three uses the middle difference strategy to learn the changed behavior. With all three working together, change detection has a significantly increased success rate.

It was previously believed that the posterior parietal cortex (PPC) played a role in enhancing change detection due to its focus on "sensory and task-related activity". However, studies have since indicated that the PPC is not necessary for change detection: although PPC activity and change detection show a high functional correlation, the PPC's mechanistic involvement in change detection is insignificant.

Moreover, top-down processing plays an important role in change detection because it enables people to draw on background knowledge, which then influences perception; this is also common in children. Researchers have conducted a longitudinal study of how change detection develops from infancy to adulthood. It found that change detection is stronger in young infants than in older children, with top-down processing being a main contributor to this outcome.


See also

* Structural break (change in model structure)
* Detection theory
* Hypothesis testing
* Recall rate
* Receiver operating characteristic
* Change blindness

