Factor Oracle
   HOME
*





Factor Oracle
A factor oracle is a finite state automaton that can efficiently search for factors (substrings) in a body of text. Older techniques, such as suffix trees, were time-efficient but required significant amounts of memory. Factor oracles, by contrast, can be constructed in linear time and space in an incremental fashion. Overview Older techniques for matching strings include: suffix arrays, suffix trees, suffix automata or directed acyclic word graphs, and factor automata (Allauzen, Crochemore, Raffinot, 1999). In 1999, Allauzen, Crochemore, and Raffinot, presented the factor oracle algorithm as a memory efficient improvement upon these older techniques for string matching and compression. Starting in the mid-2000s, factor oracles have found application in computer music, as well.Assayag G., Dubnov S., ''Using Factor Oracles for Machine Improvisation.'' Soft Computing - A Fusion of Foundations, Methodologies and Applications. 2004-09-01. Springer Berlin / Heidelberg Implem ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Finite State Automaton
A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of '' states'' at any given time. The FSM can change from one state to another in response to some inputs; the change from one state to another is called a ''transition''. An FSM is defined by a list of its states, its initial state, and the inputs that trigger each transition. Finite-state machines are of two types— deterministic finite-state machines and non-deterministic finite-state machines. A deterministic finite-state machine can be constructed equivalent to any non-deterministic one. The behavior of state machines can be observed in many devices in modern society that perform a predetermined sequence of actions depending on a sequence of events with which they are presented. Simple examples are vending machines, which dispense p ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Substring
In formal language theory and computer science, a substring is a contiguous sequence of characters within a string. For instance, "''the best of''" is a substring of "''It was the best of times''". In contrast, "''Itwastimes''" is a subsequence of "''It was the best of times''", but not a substring. Prefixes and suffixes are special cases of substrings. A prefix of a string S is a substring of S that occurs at the beginning of S; likewise, a suffix of a string S is a substring that occurs at the end of S. The substrings of the string "''apple''" would be: "''a''", "''ap''", "''app''", "''appl''", "''apple''", "''p''", "''pp''", "''ppl''", "''pple''", "''pl''", "''ple''", "''l''", "''le''" "''e''", "" (note the empty string at the end). Substring A string u is a substring (or factor) of a string t if there exists two strings p and s such that t = pus. In particular, the empty string is a substring of every string. Example: The string u=ana is equal to substrings (and subse ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Suffix Tree
In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations. The construction of such a tree for the string S takes time and space linear in the length of S. Once constructed, several operations can be performed quickly, for instance locating a substring in S, locating a substring if a certain number of mistakes are allowed, locating matches for a regular expression pattern etc. Suffix trees also provide one of the first linear-time solutions for the longest common substring problem. These speedups come at a cost: storing a string's suffix tree typically requires significantly more space than storing the string itself. History The concept was first introduced by . Rather than the suffix S ..n/math>, Weiner stored in his trie the ''pref ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Suffix Array
In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used in, among others, full-text indices, data-compression algorithms, and the field of bibliometrics. Suffix arrays were introduced by as a simple, space efficient alternative to suffix trees. They had independently been discovered by Gaston Gonnet in 1987 under the name ''PAT array'' . gave the first in-place \mathcal(n) time suffix array construction algorithm that is optimal both in time and space, where ''in-place'' means that the algorithm only needs \mathcal(1) additional space beyond the input string and the output suffix array. Enhanced suffix arrays (ESAs) are suffix arrays with additional tables that reproduce the full functionality of suffix trees preserving the same time and memory complexity. The suffix array for a subset of all suffixes of a string is called sparse suffix array. Multiple probabilistic algorithms have been developed to minimize the additiona ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Suffix Automaton
In computer science, a suffix automaton is an efficient data structure for representing the substring index of a given string which allows the storage, processing, and retrieval of compressed information about all its substrings. The suffix automaton of a string S is the smallest directed acyclic graph with a dedicated initial vertex and a set of "final" vertices, such that paths from the initial vertex to final vertices represent the suffixes of the string. In terms of automata theory, a suffix automaton is the minimal partial deterministic finite automaton that recognizes the set of suffixes of a given string S=s_1 s_2 \dots s_n. The state graph of a suffix automaton is called a directed acyclic word graph (DAWG), a term that is also sometimes used for any deterministic acyclic finite state automaton. Suffix automata were introduced in 1983 by a group of scientists from the University of Denver and the University of Colorado Boulder. They suggested a linear time online algo ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Deterministic Acyclic Finite State Automaton
In computer science, a deterministic acyclic finite state automaton (DAFSA),Jan Daciuk, Stoyan Mihov, Bruce Watson and Richard Watson (2000). Incremental construction of minimal acyclic finite state automata. Computational Linguistics 26(1):3-16. also called a directed acyclic word graph (DAWG; though that name also refers to a related data structure that functions as a suffix index) is a data structure that represents a set of strings, and allows for a query operation that tests whether a given string belongs to the set in time proportional to its length. Algorithms exist to construct and maintain such automata, while keeping them minimal. A DAFSA is a special case of a finite state recognizer that takes the form of a directed acyclic graph with a single source vertex (a vertex with no incoming edges), in which each edge of the graph is labeled by a letter or symbol, and in which each vertex has at most one outgoing edge for each possible letter or symbol. The strings represe ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Suffix Array
In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used in, among others, full-text indices, data-compression algorithms, and the field of bibliometrics. Suffix arrays were introduced by as a simple, space efficient alternative to suffix trees. They had independently been discovered by Gaston Gonnet in 1987 under the name ''PAT array'' . gave the first in-place \mathcal(n) time suffix array construction algorithm that is optimal both in time and space, where ''in-place'' means that the algorithm only needs \mathcal(1) additional space beyond the input string and the output suffix array. Enhanced suffix arrays (ESAs) are suffix arrays with additional tables that reproduce the full functionality of suffix trees preserving the same time and memory complexity. The suffix array for a subset of all suffixes of a string is called sparse suffix array. Multiple probabilistic algorithms have been developed to minimize the additiona ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Generalised Suffix Tree
In computer science, a generalized suffix tree is a suffix tree for a set of strings. Given the set of strings D=S_1,S_2,\dots,S_d of total length n, it is a Patricia tree containing all n suffixes of the strings. It is mostly used in bioinformatics. Functionality It can be built in \Theta(n) time and space, and can be used to find all z occurrences of a string P of length m in O(m + z) time, which is asymptotically optimal (assuming the size of the alphabet is constant). When constructing such a tree, each string should be padded with a unique out-of-alphabet marker symbol (or string) to ensure no suffix is a substring of another, guaranteeing each suffix is represented by a unique leaf node. Algorithms for constructing a GST include Ukkonen's algorithm (1995) and McCreight's algorithm (1976). Example A suffix tree for the strings ABAB and BABA is shown in a figure above. They are padded with the unique terminator strings $0 and $1. The numbers in the leaf nodes are string ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Automata (computation)
An automaton (; plural: automata or automatons) is a relatively self-operating machine, or control mechanism designed to automatically follow a sequence of operations, or respond to predetermined instructions.Automaton – Definition and More from the Free Merriam-Webster Dictionary http://www.merriam-webster.com/dictionary/automaton Some automata, such as bellstrikers in mechanical clocks, are designed to give the illusion to the casual observer that they are operating under their own power. Since long ago, the term is commonly associated with automated puppets that resemble moving humans or animals, built to impress and/or to entertain people. Animatronics are a modern type of automata with electronics, often used for the portrayal of characters in films and in theme park attractions. Etymology The word "automaton" is the latinization of the Ancient Greek , , (neuter) "acting of one's own will". This word was first used by Homer to describe an automatic door opening, or au ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]