Time-series Segmentation

Time-series segmentation is a method of time-series analysis in which an input time-series is divided into a sequence of discrete segments in order to reveal the underlying properties of its source. A typical application of time-series segmentation is in speaker diarization, in which an audio signal is partitioned into several pieces according to who is speaking at what times. Algorithms based on change-point detection include sliding windows, bottom-up, and top-down methods. Probabilistic methods based on hidden Markov models have also proved useful in solving this problem.
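
As an illustration of the bottom-up idea, the following minimal sketch starts from one segment per sample and repeatedly merges the adjacent pair whose merged segment is cheapest to approximate by a constant. The function names, the squared-error cost, and the `max_cost` stopping rule are assumptions chosen for this example; they are not prescribed by the text above, and other cost functions (for instance, linear fits) are equally possible.

```python
def sse(values):
    """Sum of squared deviations from the mean (cost of a constant fit)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def bottom_up_segments(series, max_cost):
    """Greedily merge adjacent segments while the cheapest merge stays within max_cost.

    Returns a list of (start, end) index pairs with end exclusive.
    """
    # Start from the finest segmentation: one segment per sample.
    bounds = [(i, i + 1) for i in range(len(series))]
    while len(bounds) > 1:
        # Cost of the segment that would result from merging each adjacent pair.
        costs = [sse(series[a[0]:b[1]]) for a, b in zip(bounds, bounds[1:])]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > max_cost:
            break  # every remaining merge would exceed the allowed cost
        bounds[best:best + 2] = [(bounds[best][0], bounds[best + 1][1])]
    return bounds


# Hypothetical signal with three roughly constant levels.
signal = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 4.9, 5.1, 2.0, 2.1, 1.9]
print(bottom_up_segments(signal, max_cost=0.5))  # [(0, 4), (4, 8), (8, 11)]
```

A top-down method would work in the opposite direction, starting from a single segment and recursively splitting it where the split reduces the cost most.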


Overview of the segmentation problem

It is often the case that a time-series can be represented as a sequence of discrete segments of finite length. For example, the trajectory of a stock market could be partitioned into regions that lie in between important world events, the input to a handwriting recognition application could be segmented into the various words or letters that it was believed to consist of, or the audio recording of a conference could be divided according to who was speaking when. In the latter two cases, one may take advantage of the fact that the label assignments of individual segments may repeat themselves (for example, if a person speaks on several separate occasions during a conference) by attempting to cluster the segments according to their distinguishing properties (such as the spectral content of each speaker's voice).

There are two general approaches to this problem. The first involves looking for change points in the time-series: for example, one may assign a segment boundary whenever there is a large jump in the average value of the signal. The second approach involves assuming that each segment in the time-series is generated by a system with distinct parameters, and then inferring the most probable segment locations and the system parameters that describe them. While the first approach tends to look for changes only within a short window of time, the second approach generally takes into account the entire time-series when deciding which label to assign to a given point.
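
The first approach can be sketched as follows. This is a minimal example, assuming a fixed window length and a hand-picked threshold (both hypothetical parameters): it compares the mean of the samples just before each index with the mean of the samples just after it, and reports a boundary where the difference is large. Consecutive candidates are collapsed to the single index with the largest jump.

```python
def mean(values):
    return sum(values) / len(values)


def mean_jump_boundaries(series, window, threshold):
    """Return indices t where the mean of the `window` samples starting at t
    differs from the mean of the `window` samples before t by more than
    `threshold`. Within a run of consecutive candidates only the index with
    the largest jump is kept.
    """
    boundaries = []
    best_t, best_jump = None, 0.0
    for t in range(window, len(series) - window + 1):
        jump = abs(mean(series[t:t + window]) - mean(series[t - window:t]))
        if jump > threshold:
            if jump > best_jump:
                best_t, best_jump = t, jump
        elif best_t is not None:
            # The run of candidates has ended; keep its strongest index.
            boundaries.append(best_t)
            best_t, best_jump = None, 0.0
    if best_t is not None:
        boundaries.append(best_t)
    return boundaries


# Hypothetical signal whose level jumps from about 0 to about 3 at index 5.
signal = [0.0, 0.1, -0.1, 0.0, 0.1, 3.0, 3.1, 2.9, 3.0, 3.1, 3.0]
print(mean_jump_boundaries(signal, window=3, threshold=1.5))  # [5]
```

Because each decision uses only `2 * window` samples, the detector is local in exactly the sense described above: it cannot use evidence from elsewhere in the series to confirm or reject a boundary.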


Segmentation algorithms


Hidden Markov Models

Under the hidden Markov model, the time-series $\boldsymbol{y}_{1:T} = (\boldsymbol{y}_1, \ldots, \boldsymbol{y}_T)$ is assumed to have been generated as the system transitions among a set of discrete, hidden states $z \in \{1, \ldots, n\}$. At each time $t$, a sample $\boldsymbol{y}_t$ is drawn from an observation (or emission) distribution indexed by the current hidden state, i.e., $\boldsymbol{y}_t \sim P_{z_t}(\boldsymbol{y}_t)$. The goal of the segmentation problem is to infer the hidden state at each time, as well as the parameters describing the emission distribution associated with each hidden state. The transition and emission parameters can be learned using the Baum-Welch algorithm, which is a variant of expectation maximization applied to HMMs; the hidden state sequence can then be inferred from the fitted model, for example with the Viterbi algorithm. Typically in the segmentation problem, self-transition probabilities among states are assumed to be high, so that the system remains in each state for a non-negligible time. More robust parameter-learning methods involve placing hierarchical Dirichlet process priors over the HMM transition matrix (Teh, Yee Whye, et al. "Hierarchical Dirichlet processes." Journal of the American Statistical Association 101.476 (2006)).

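The state-inference step can be illustrated with a small Viterbi decoder. The sketch below is not a full treatment: it assumes the emission parameters are already known (scalar Gaussian emissions with hypothetical means and a shared standard deviation), a uniform initial state distribution, and a "sticky" transition matrix in which every state keeps its value with probability `stay_prob`. In practice the parameters would be learned first, for example with Baum-Welch.

```python
import math


def gaussian_logpdf(x, mean, std):
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi_segment(series, means, std=1.0, stay_prob=0.95):
    """Most probable hidden-state path for a Gaussian-emission HMM.

    Requires at least two states. All states share `std`; each state keeps
    its value with probability `stay_prob` and otherwise moves to one of
    the other states with equal probability.
    """
    n = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1 - stay_prob) / (n - 1))
    # Uniform initial distribution over states.
    score = [-math.log(n) + gaussian_logpdf(series[0], m, std) for m in means]
    backpointers = []
    for y in series[1:]:
        new_score, pointers = [], []
        for j in range(n):
            # Best predecessor state i for landing in state j.
            candidates = [score[i] + (log_stay if i == j else log_move)
                          for i in range(n)]
            best_i = max(range(n), key=candidates.__getitem__)
            pointers.append(best_i)
            new_score.append(candidates[best_i] + gaussian_logpdf(y, means[j], std))
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=score.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical signal: low level, then high level, then low again.
signal = [0.1, -0.2, 0.3, 2.5, 3.1, 2.9, 3.2, 0.2, -0.1]
print(viterbi_segment(signal, means=[0.0, 3.0]))  # [0, 0, 0, 1, 1, 1, 1, 0, 0]
```

Segment boundaries are the positions where the decoded label changes. The high self-transition probability penalizes every switch, which discourages short, spurious segments, and the same label (here state 0) can be reused for separate segments.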

See also

* Step detection


References

Silva, R. P., Zarpelão, B. B., Cano, A., & Barbon Junior, S. (2021). Time series segmentation based on stationarity analysis to improve new samples prediction. Sensors, 21(21), 7333.