Query rewriting is a typically automatic transformation that takes a set of
database
In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spa ...
tables,
views, and/or
queries, usually
indices, often gathered data and query statistics, and other
metadata, and yields a set of different queries, which produce the same results but execute with better performance (for example, faster, or with lower memory use). Query rewriting can be based on
relational algebra
In database theory, relational algebra is a theory that uses algebraic structures with a well-founded semantics for modeling data, and defining queries on it. The theory was introduced by Edgar F. Codd.
The main application of relational algebra ...
or an extension thereof (e.g. multiset relational algebra with sorting, aggregation and three-valued predicates i.e. NULLs as in the case of
SQL). The
equivalence rules of relational algebra are exploited, in other words, different
query structure
Query languages, data query languages or database query languages (DQL) are computer languages used to make queries in databases and information systems. A well known example is the Structured Query Language (SQL).
Types
Broadly, query langu ...
s and orderings can be mathematically proven to yield the same result. For example, filtering on fields A and B, or cross joining R and S can be done in any order, but there can be a performance difference. Multiple operations may be combined, and operation orders may be altered.
The result of query rewriting may not be at the same
abstraction level or
application programming interface (API) as the original set of queries (though often is). For example, the input queries may be in relational algebra or SQL, and the rewritten queries may be closer to the physical representation of the data, e.g. array operations. Query rewriting can also involve
materialization of views and other subqueries; operations that may or may not be available to the API user. The query rewriting transformation can b
aided bycreating indices from which the optimizer can choose (some database systems create their own indexes if deemed useful), mandating the use of specific indices, creating materialized and/or denormalized views, or helping a database system gather statistics on the data and query use, as the optimality depends on patterns in data and typical query usage.
Query rewriting may be rule based or optimizer based. Some sources discuss query rewriting as a distinct step prior to optimization, operating at the level of the user accessible algebra API (e.g. SQL).
There are other, largely unrelated concepts also named similarly, for example, query rewriting by
search engines
A search engine is a software system designed to carry out web searches. They search the World Wide Web in a systematic way for particular information specified in a textual web search query. The search results are generally presented in a l ...
.
See also
*
Query optimization
References
{{DEFAULTSORT:Query Rewriting
Data management