In
information theory
Information theory is the scientific study of the quantification (science), quantification, computer data storage, storage, and telecommunication, communication of information. The field was originally established by the works of Harry Nyquist a ...
,
linguistics
Linguistics is the scientific study of human language. It is called a scientific study because it entails a comprehensive, systematic, objective, and precise analysis of all aspects of language, particularly its nature and structure. Linguis ...
, and
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, the Levenshtein distance is a
string metric
In mathematics and computer science, a string metric (also known as a string similarity metric or string distance function) is a metric that measures distance ("inverse similarity") between two text strings for approximate string matching or comp ...
for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician
Vladimir Levenshtein
Vladimir Iosifovich Levenshtein ( rus, Влади́мир Ио́сифович Левенште́йн, p=vlɐˈdʲimʲɪr ɨˈosʲɪfəvʲɪtɕ lʲɪvʲɪnˈʂtʲejn, a=Ru-Vladimir Iosifovich Levenstein.oga; 20 May 1935 – 6 September 2017) was a ...
, who considered this distance in 1965.
Levenshtein distance may also be referred to as ''edit distance'', although that term may also denote a larger family of distance metrics known collectively as
edit distance
In computational linguistics and computer science, edit distance is a string metric, i.e. a way of quantifying how dissimilar two strings (e.g., words) are to one another, that is measured by counting the minimum number of operations required to ...
.
It is closely related to
pairwise string alignments.
Definition
The Levenshtein distance between two strings
(of length
and
respectively) is given by
where
:
where the
of some string
is a string of all but the first character of
, and