Time Warp Edit Distance (TWED) is a
measure of similarity
In statistics and related fields, a similarity measure or similarity function or similarity metric is a real-valued function that quantifies the similarity between two objects. Although no single definition of a similarity exists, usually such meas ...
(or dissimilarity) for discrete
time series
In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Ex ...
matching with
time
Time is the continued sequence of existence and events that occurs in an apparently irreversible succession from the past, through the present, into the future. It is a component quantity of various measurements used to sequence events, ...
'
elasticity'. In comparison to other distance measures, (e.g. DTW (
dynamic time warping
In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed. For instance, similarities in walking could be detected using DTW, even if one person was walk ...
) or LCS (
longest common subsequence problem
The longest common subsequence (LCS) problem is the problem of finding the longest subsequence common to all sequences in a set of sequences (often just two sequences). It differs from the longest common substring problem: unlike substrings, sub ...
)), TWED is a
metric
Metric or metrical may refer to:
* Metric system, an internationally adopted decimal system of measurement
* An adjective indicating relation to measurement in general, or a noun describing a specific type of measurement
Mathematics
In mathem ...
. Its
computational time complexity is
, but can be drastically reduced in some specific situations by using a corridor to reduce the search space. Its
memory
Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered ...
space complexity
The space complexity of an algorithm or a computer program is the amount of memory space required to solve an instance of the computational problem as a function of characteristics of the input. It is the memory required by an algorithm until it ex ...
can be reduced to
. It was first proposed in 2009 by P.-F. Marteau.
Definition
whereas
Whereas the
recursion
Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathemati ...
is initialized as:
with
Implementations
An implementation of the TWED
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
in
C with a
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
wrapper is available at
TWED is also implemented into the Time Series Subsequence Search Python package (TSSEARCH for short) available a
An
R (programming language), R implementation of TWED has been integrated into the TraMineR, a
R package for
mining
Mining is the extraction of valuable minerals or other geological materials from the Earth, usually from an ore body, lode, vein, seam, reef, or placer deposit. The exploitation of these deposits for raw material is based on the economic ...
, describing and visualizing sequences of states or events, and more generally discrete sequence
data
In the pursuit of knowledge, data (; ) is a collection of discrete Value_(semiotics), values that convey information, describing quantity, qualitative property, quality, fact, statistics, other basic units of meaning, or simply sequences of sy ...
.
Additionally
cuTWEDis a
CUDA
CUDA (or Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing, an approach ...
- accelerated implementation of TWED which uses an improved algorithm due to G. Wright (2020). This method is linear in memory and massively parallelized. cuTWED is written in CUDA C/
C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
, comes with Python bindings, and also includes Python bindings for Marteau's reference C implementation.
Python
import numpy as np
def dlp(A, B, p=2):
cost = np.sum(np.power(np.abs(A - B), p))
return np.power(cost, 1 / p)
def twed(A, timeSA, B, timeSB, nu, _lambda):
# istance, DP= TWED( A, timeSA, B, timeSB, lambda, nu )
# Compute Time Warp Edit Distance (TWED) for given time series A and B
#
# A := Time series A (e.g. 10 2 30 4
# timeSA := Time stamp of time series A (e.g. 1:4)
# B := Time series B
# timeSB := Time stamp of time series B
# lambda := Penalty for deletion operation
# nu := Elasticity parameter - nu >=0 needed for distance measure
# Reference :
# Marteau, P.; F. (2009). "Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching".
# IEEE Transactions on Pattern Analysis and Machine Intelligence. 31 (2): 306–318. arXiv:cs/0703033
# http://people.irisa.fr/Pierre-Francois.Marteau/
# Check if input arguments
if len(A) != len(timeSA):
print("The length of A is not equal length of timeSA")
return None, None
if len(B) != len(timeSB):
print("The length of B is not equal length of timeSB")
return None, None
if nu < 0:
print("nu is negative")
return None, None
# Add padding
A = np.array( + list(A))
timeSA = np.array( + list(timeSA))
B = np.array( + list(B))
timeSB = np.array( + list(timeSB))
n = len(A)
m = len(B)
# Dynamical programming
DP = np.zeros((n, m))
# Initialize DP Matrix and set first row and column to infinity
DP , := np.inf
DP, 0
The comma is a punctuation mark that appears in several variants in different languages. It has the same shape as an apostrophe or single closing quotation mark () in many typefaces, but it differs from them in being placed on the baseline ...
= np.inf
DP, 0
The comma is a punctuation mark that appears in several variants in different languages. It has the same shape as an apostrophe or single closing quotation mark () in many typefaces, but it differs from them in being placed on the baseline ...
= 0
# Compute minimal cost
for i in range(1, n):
for j in range(1, m):
# Calculate and save cost of various operations
C = np.ones((3, 1)) * np.inf
# Deletion in A
C = (
DP - 1, j + dlp(A - 1 A
+ nu * (timeSA - timeSA - 1
+ _lambda
)
# Deletion in B
C = (
DP , j - 1 + dlp(B - 1 B
+ nu * (timeSB - timeSB - 1
+ _lambda
)
# Keep data points in both time series
C = (
DP - 1, j - 1 + dlp(A B
+ dlp(A - 1 B - 1
+ nu * (abs(timeSA - timeSB + abs(timeSA - 1- timeSB - 1)
)
# Choose the operation with the minimal cost and update DP Matrix
DP, j
The comma is a punctuation mark that appears in several variants in different languages. It has the same shape as an apostrophe or single closing quotation mark () in many typefaces, but it differs from them in being placed on the baseline (t ...
= np.min(C)
distance = DP - 1, m - 1 return distance, DP
Backtracking, to find the most
cost-efficient path:
def backtracking(DP):
# best_path = BACKTRACKING ( DP )
# Compute the most cost-efficient path
# DP := DP matrix of the TWED function
x = np.shape(DP)
i = x - 1
j = x - 1
# The indices of the paths are save in opposite direction
# path = np.ones((i + j, 2 )) * np.inf;
best_path = []
steps = 0
while i != 0 or j != 0:
best_path.append((i - 1, j - 1))
C = np.ones((3, 1)) * np.inf
# Keep data points in both time series
C = DP - 1, j - 1 # Deletion in A
C = DP - 1, j # Deletion in B
C = DP , j - 1
# Find the index for the lowest cost
idx = np.argmin(C)
if idx 0:
# Keep data points in both time series
i = i - 1
j = j - 1
elif idx 1:
# Deletion in A
i = i - 1
j = j
else:
# Deletion in B
i = i
j = j - 1
steps = steps + 1
best_path.append((i - 1, j - 1))
best_path.reverse()
return best_path :
MATLAB
function istance, DP= twed(A, timeSA, B, timeSB, lambda, nu)
% istance, DP= TWED( A, timeSA, B, timeSB, lambda, nu )
% Compute Time Warp Edit Distance (TWED) for given time series A and B
%
% A := Time series A (e.g. 10 2 30 4
% timeSA := Time stamp of time series A (e.g. 1:4)
% B := Time series B
% timeSB := Time stamp of time series B
% lambda := Penalty for deletion operation
% nu := Elasticity parameter - nu >=0 needed for distance measure
%
% Code by: P.-F. Marteau - http://people.irisa.fr/Pierre-Francois.Marteau/
% Check if input arguments
if length(A) ~= length(timeSA)
warning('The length of A is not equal length of timeSA')
return
end
if length(B) ~= length(timeSB)
warning('The length of B is not equal length of timeSB')
return
end
if nu < 0
warning('nu is negative')
return
end
% Add padding
A = A
timeSA = timeSA
B = B
timeSB = timeSB
% Dynamical programming
DP = zeros(length(A), length(B));
% Initialize DP Matrix and set first row and column to infinity
DP(1, :) = inf;
DP(:, 1) = inf;
DP(1, 1) = 0;
n = length(timeSA);
m = length(timeSB);
% Compute minimal cost
for i = 2:n
for j = 2:m
cost = Dlp(A(i), B(j));
% Calculate and save cost of various operations
C = ones(3, 1) * inf;
% Deletion in A
C(1) = DP(i - 1, j) + Dlp(A(i - 1), A(i)) + nu * (timeSA(i) - timeSA(i - 1)) + lambda;
% Deletion in B
C(2) = DP(i, j - 1) + Dlp(B(j - 1), B(j)) + nu * (timeSB(j) - timeSB(j - 1)) + lambda;
% Keep data points in both time series
C(3) = DP(i - 1, j - 1) + Dlp(A(i), B(j)) + Dlp(A(i - 1), B(j - 1)) + ...
nu * (abs(timeSA(i) - timeSB(j)) + abs(timeSA(i - 1) - timeSB(j - 1)));
% Choose the operation with the minimal cost and update DP Matrix
DP(i, j) = min(C);
end
end
distance = DP(n, m);
% Function to calculate euclidean distance
function ost= Dlp(A, B)
cost = sqrt(sum((A - B) .^ 2, 2));
end
end
Backtracking, to find the most cost-efficient path:
function ath
Ath (; nl, Aat, ; pcd, Ât; wa, Ate) is a city and municipality of Wallonia located in the province of Hainaut, Belgium.
The municipality consists of the following districts: Arbre, Ath, Bouvignies, Ghislenghien, Gibecq, Houtaing, ...
= backtracking(DP)
% path = BACKTRACKING ( DP )
% Compute the most cost-efficient path
% DP := DP matrix of the TWED function
x = size(DP);
i = x(1);
j = x(2);
% The indices of the paths are save in opposite direction
path = ones(i + j, 2) * Inf;
steps = 1;
while (i ~= 1 , , j ~= 1)
path(steps, :) = ; j
C = ones(3, 1) * inf;
% Keep data points in both time series
C(1) = DP(i - 1, j - 1);
% Deletion in A
C(2) = DP(i - 1, j);
% Deletion in B
C(3) = DP(i, j - 1);
% Find the index for the lowest cost
, idx= min(C);
switch idx
case 1
% Keep data points in both time series
i = i - 1;
j = j - 1;
case 2
% Deletion in A
i = i - 1;
j = j;
case 3
% Deletion in B
i = i;
j = j - 1;
end
steps = steps + 1;
end
path(steps, :) = j
% Path was calculated in reversed direction.
path = path(1:steps, :);
path = path(end: - 1:1, :);
end
References
*{{cite journal , last1=Marteau , first1=P. , last2=F. , year=2009 , title=Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching , journal=IEEE Transactions on Pattern Analysis and Machine Intelligence , volume=31 , issue=2 , pages=306–318 , doi=10.1109/TPAMI.2008.76 , pmid=19110495 , arxiv=cs/0703033, s2cid=10049446
Time series
Algorithms