The Tversky index, named after Amos Tversky, is an asymmetric
similarity measure on sets that compares a variant to a prototype. The Tversky index can be seen as a generalization of the Sørensen–Dice coefficient and the Jaccard index.
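For sets ''X'' and ''Y'' the Tversky index is a number between 0 and 1 given by
<math>S(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha |X \setminus Y| + \beta |Y \setminus X|}</math>
Here, <math>X \setminus Y</math> denotes the relative complement of ''Y'' in ''X''.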
Further, <math>\alpha, \beta \ge 0</math> are parameters of the Tversky index. Setting <math>\alpha = \beta = 1</math> produces the Jaccard index; setting <math>\alpha = \beta = 0.5</math> produces the Sørensen–Dice coefficient.
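The definition translates directly into a few set operations. The following is a minimal sketch rather than a reference implementation: the function name `tversky_index` and the example sets are hypothetical. Raising an error when the denominator is zero is also an assumption, since the text does not define the index for that case.

```python
def tversky_index(x, y, alpha, beta):
    """Tversky index of set x against set y with weights alpha and beta."""
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    x, y = set(x), set(y)
    common = len(x & y)   # |X ∩ Y|
    only_x = len(x - y)   # |X \ Y|, relative complement of Y in X
    only_y = len(y - x)   # |Y \ X|, relative complement of X in Y
    denominator = common + alpha * only_x + beta * only_y
    if denominator == 0:
        # Assumption: the source does not cover this case (for example,
        # two empty sets), so this sketch refuses to return a value.
        raise ValueError("Tversky index is undefined for these inputs")
    return common / denominator


# Illustrative sets, chosen only for this example.
X = {"a", "b", "c", "d"}
Y = {"c", "d", "e"}

jaccard = tversky_index(X, Y, alpha=1, beta=1)    # |X ∩ Y| / |X ∪ Y|
dice = tversky_index(X, Y, alpha=0.5, beta=0.5)   # 2|X ∩ Y| / (|X| + |Y|)
print(jaccard, dice)
```

With these weights the function reproduces the two special cases directly: the first call equals the Jaccard index of the two sets, and the second equals their Sørensen–Dice coefficient.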
If we consider ''X'' to be the prototype and ''Y'' to be the variant, then <math>\alpha</math> corresponds to the weight of the prototype and <math>\beta</math> corresponds to the weight of the variant. Tversky measures with <math>\alpha + \beta = 1</math> are of special interest.
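A small worked example may help show the asymmetry. The sets and weights below are chosen purely for illustration and do not come from the source: take the prototype X = {a, b, c, d}, the variant Y = {c, d, e}, and weights α = 0.8 and β = 0.2, so that the elements unique to the first argument are weighted by α and those unique to the second by β. Swapping the roles of the two sets changes the result:

```latex
\begin{align*}
|X \cap Y| &= 2, \quad |X \setminus Y| = 2, \quad |Y \setminus X| = 1 \\
S(X, Y) &= \frac{2}{2 + 0.8 \cdot 2 + 0.2 \cdot 1} = \frac{2}{3.8} \approx 0.526 \\
S(Y, X) &= \frac{2}{2 + 0.8 \cdot 1 + 0.2 \cdot 2} = \frac{2}{3.2} = 0.625
\end{align*}
```

The two values coincide only when α = β or when the two relative complements have the same size.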
Because of this inherent asymmetry, the Tversky index does not meet the criteria for a similarity metric. However, if symmetry is needed, a variant of the original formulation has been proposed that uses max and min functions:<ref>Jimenez, S.; Becerra, C.; Gelbukh, A. "SOFTCARDINALITY-CORE: Improving Text Overlap with Distributional Measures for Semantic Textual Similarity". Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity, pp. 194–201, June 7–8, 2013, Atlanta, Georgia, USA.</ref>
<math>S(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \beta \left(\alpha a + (1 - \alpha) b\right)}</math>,
<math>a = \min\left(|X \setminus Y|, |Y \setminus X|\right)</math>,
<math>b = \max\left(|X \setminus Y|, |Y \setminus X|\right)</math>.
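This formulation also rearranges the parameters <math>\alpha</math> and <math>\beta</math>. Thus, <math>\alpha</math> controls the balance between <math>|X \setminus Y|</math> and <math>|Y \setminus X|</math> in the denominator. Similarly, <math>\beta</math> controls the effect of the symmetric difference <math>|X \,\triangle\, Y|</math> versus <math>|X \cap Y|</math> in the denominator.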
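The symmetric variant can be sketched in the same way, with denominator |X ∩ Y| + β(α·a + (1 − α)·b), where a and b are the smaller and larger of |X \ Y| and |Y \ X|. This is an illustrative sketch and not the cited authors' code: the function name `symmetric_tversky` is hypothetical, and both the range check 0 ≤ α ≤ 1 and the error on a zero denominator are assumptions made here.

```python
def symmetric_tversky(x, y, alpha, beta):
    """Symmetric Tversky variant built from min and max of the set differences."""
    # Assumption: alpha acts as a balance in [0, 1] and beta is non-negative.
    if not 0 <= alpha <= 1 or beta < 0:
        raise ValueError("expected 0 <= alpha <= 1 and beta >= 0")
    x, y = set(x), set(y)
    common = len(x & y)                   # |X ∩ Y|
    a = min(len(x - y), len(y - x))       # smaller relative complement
    b = max(len(x - y), len(y - x))       # larger relative complement
    denominator = common + beta * (alpha * a + (1 - alpha) * b)
    if denominator == 0:
        # Assumption: undefined case (for example, two empty sets).
        raise ValueError("index is undefined for these inputs")
    return common / denominator


# Illustrative sets, chosen only for this example.
X = {"a", "b", "c", "d"}
Y = {"c", "d", "e"}

forward = symmetric_tversky(X, Y, alpha=0.3, beta=1.0)
backward = symmetric_tversky(Y, X, alpha=0.3, beta=1.0)
print(forward == backward)
```

Because a and b depend only on the unordered pair of difference sizes, swapping the arguments leaves the result unchanged. Since a + b is the size of the symmetric difference, α = 0.5 with β = 2 gives the Jaccard index, and α = 0.5 with β = 1 gives the Sørensen–Dice coefficient.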
Notes
{{reflist}}