In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
, a term index is a data structure to facilitate fast lookup of terms and
clauses
In language, a clause is a constituent that comprises a semantic predicand (expressed or not) and a semantic predicate. A typical clause consists of a subject and a syntactic predicate, the latter typically a verb phrase composed of a verb with ...
in a
logic program,
deductive database
A deductive database is a database system that can make deductions (i.e. conclude additional facts) based on rules and facts stored in the (deductive) database. Datalog is the language typically used to specify facts, rules and queries in deductiv ...
, or
automated theorem prover.
Overview
Many operations in automatic theorem provers require search in huge collections of terms and clauses. Such operations typically fall into the following scheme. Given a collection
of terms (clauses) and a query term (clause)
, find in
some/all terms
related to
according to a certain retrieval condition. Most interesting retrieval conditions are formulated as existence of a substitution that relates in a special way the query and the retrieved objects
. Here is a list of retrieval conditions frequently used in provers:
* term
is unifiable with term
, i.e., there exists a substitution
, such that
=
* term
is an instance of
, i.e., there exists a substitution
, such that
=
* term
is a generalisation of
, i.e., there exists a substitution
, such that
=
* clause
subsumes clause
, i.e., there exists a substitution
, such that
is a subset/submultiset of
* clause
is subsumed by
, i.e., there exists a substitution
, such that
is a subset/submultiset of
More often than not, we are actually interested in finding the appropriate
substitutions explicitly, together with the retrieved terms
,
rather than just in establishing existence of such substitutions.
Very often the sizes of term sets to be searched are large,
the retrieval calls are frequent and the retrieval condition test
is rather complex. In such situations linear search in
, when the retrieval
condition is tested on every term from
, becomes prohibitively costly.
To overcome this problem, special data structures, called ''indexes'', are
designed in order to support fast retrieval. Such data structures,
together with the accompanying algorithms for index maintenance
and retrieval, are called ''term indexing techniques''.
Classic indexing techniques
*
discrimination trees
*
substitution trees
*
path indexing
Substitution trees outperform path indexing, discrimination tree indexing, and abstraction trees.
A discrimination tree term index stores its information in a
trie
In computer science, a trie, also called digital tree or prefix tree, is a type of ''k''-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes ...
data structure.
Indexing techniques used in logic programming
First-argument indexing is the most common strategy where the first argument is used as index. It distinguishes atomic values and the principal functor of compound terms.
Nonfirst argument indexing is a variation of first-argument indexing that uses the same or similar techniques as first-argument indexing on one or more alternative arguments. For instance, if a predicate call uses variables for the first argument, the system may choose to use the second argument as the index instead.
Multiargument indexing creates a combined index over multiple instantiated arguments if there is not a sufficiently selective single argument index.
Deep indexing is used when multiple clauses use the same principal functor for some argument. It recursively uses the same or similar indexing techniques on the arguments of the compound terms.
Trie
In computer science, a trie, also called digital tree or prefix tree, is a type of ''k''-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes ...
indexing uses a prefix tree to find applicable clauses.
References
{{reflist
Further reading
* P. Graf, Term Indexing, Lecture Notes in Computer Science 1053, 1996 (slightly outdated overview)
* R. Sekar and I.V. Ramakrishnan and A. Voronkov, Term Indexing, in A. Robinson and A. Voronkov, editors,
Handbook of Automated Reasoning
The ''Handbook of Automated Reasoning'' (, 2128 pages) is a collection of survey articles on the field of automated reasoning. Published in June 2001 by MIT Press, it is edited by John Alan Robinson and Andrei Voronkov. Volume 1 describes methods ...
, volume 2, 2001 (recent overview)
* W. W. McCune, Experiments with Discrimination-Tree Indexing and Path Indexing for Term Retrieval, Journal of Automated Reasoning, 9(2), 1992
* P. Graf, Substitution Tree Indexing, Proc. of RTA, Lecture Notes in Computer Science 914, 1995
* M. Stickel, The Path Indexing Method for Indexing Terms, Tech. Rep. 473,
Artificial Intelligence Center
The Artificial Intelligence Center is a laboratory in the Information and Computing Sciences Division of SRI International. It was founded in 1966 by Charles Rosen and studies artificial intelligence. One of their early projects was Shakey the Robo ...
,
SRI International
SRI International (SRI) is an American nonprofit scientific research institute and organization headquartered in Menlo Park, California. The trustees of Stanford University established SRI in 1946 as a center of innovation to support economic ...
, 1989
* S. Schulz, Simple and Efficient Clause Subsumption with Feature Vector Indexing, Proc. of IJCAR-2004 workshop ESFOR, 2004
* A. Riazanov and A. Voronkov, Partially Adaptive Code Trees, Proc. JELIA, Lecture Notes in Artificial Intelligence 1919, 2000
* H. Ganzinger and R. Nieuwenhuis and P. Nivela, Fast Term Indexing with Coded Context Trees, Journal of Automated Reasoning, 32(2), 2004
* A. Riazanov and A. Voronkov, Efficient Instance Retrieval with Standard and Relational Path Indexing, Information and Computation, 199(1–2), 2005
Data structures
Logic programming
Theorem proving software systems