In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.
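To make the definition concrete, the following is a minimal sketch of the underlying idea, not a real suffix tree. It builds an uncompressed suffix trie by inserting every suffix one character at a time, which takes quadratic time and space in the worst case, whereas an actual suffix tree compresses chains of single-child nodes into one edge. The names `Node`, `build_suffix_trie`, and `find_all` are illustrative assumptions and do not come from any library.

```python
class Node:
    """One trie node; hypothetical helper for this sketch."""

    def __init__(self):
        self.children = {}   # maps a character to the child Node
        self.positions = []  # start positions of all suffixes passing through here


def build_suffix_trie(text):
    """Insert every suffix of text into an uncompressed trie (naive, quadratic)."""
    root = Node()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
            # Record which suffix reaches this node, so a lookup can
            # report positions without walking the subtree.
            node.positions.append(start)
    return root


def find_all(root, pattern):
    """Return the sorted start positions of a non-empty pattern in the text."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []  # pattern is not a prefix of any suffix
    return sorted(node.positions)


trie = build_suffix_trie("banana")
print(find_all(trie, "ana"))  # [1, 3]
```

A pattern occurs in the text exactly when it is a prefix of some suffix, which is why a single walk from the root answers the query in time proportional to the pattern length, independent of the text length.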
The construction of such a tree for the string S takes time and space linear in the length of S. Once constructed, several operations can be performed quickly, such as locating a substring in S, locating a substring if a certain number of mistakes are allowed, and locating matches for a regular expression pattern. Suffix trees also provided one of the first linear-time solutions for the longest common substring problem. These speedups come at a cost: storing a string's suffix tree typically requires significantly more space than storing the string itself.
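As a rough illustration of how a suffix index helps with the longest common substring problem, the sketch below indexes all suffixes of one string in an uncompressed trie of nested dictionaries and then walks each suffix of the other string as far as the trie allows. This is a naive quadratic stand-in, not the linear-time method based on a generalized suffix tree; the function names are assumptions made for the example.

```python
def build_suffix_trie(text):
    """Naive uncompressed suffix trie as nested dicts (quadratic in len(text))."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def longest_common_substring(a, b):
    """Return one longest string that occurs in both a and b."""
    trie = build_suffix_trie(a)
    best = ""
    for start in range(len(b)):
        node = trie
        length = 0
        # Follow b[start:] through the trie of a's suffixes until a mismatch.
        for ch in b[start:]:
            if ch not in node:
                break
            node = node[ch]
            length += 1
        if length > len(best):
            best = b[start:start + length]
    return best


print(longest_common_substring("banana", "ananas"))  # anana
```

The nested dictionaries also make the space cost visible: the trie for a string of length n can hold on the order of n² nodes, and even a properly compressed suffix tree, while linear, carries a sizeable constant factor per character.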
History
The concept was first introduced by .
Rather than the suffix