Codd's theorem
   HOME

TheInfoList



OR:

Codd's theorem states that
relational algebra In database theory, relational algebra is a theory that uses algebraic structures with a well-founded semantics for modeling data, and defining queries on it. The theory was introduced by Edgar F. Codd. The main application of relational algebr ...
and the domain-independent
relational calculus The relational calculus consists of two calculi, the tuple relational calculus and the domain relational calculus, that are part of the relational model for databases and provide a declarative way to specify database queries. The raison d'ĂȘtre ...
queries, two well-known foundational query languages for the relational model, are precisely equivalent in expressive power. That is, a database query can be formulated in one language if and only if it can be expressed in the other. The theorem is named after Edgar F. Codd, the father of the
relational model The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data is represented in terms of t ...
for database management. The domain independent
relational calculus The relational calculus consists of two calculi, the tuple relational calculus and the domain relational calculus, that are part of the relational model for databases and provide a declarative way to specify database queries. The raison d'ĂȘtre ...
queries are precisely those relational calculus queries that are invariant under choosing domains of values beyond those appearing in the database itself. That is, queries that may return different results for different domains are excluded. An example of such a forbidden query is the query "select all tuples other than those occurring in relation R", where R is a relation in the database. Assuming different domains, i.e., sets of atomic data items from which tuples can be constructed, this query returns different results and thus is clearly not domain independent. Codd's Theorem is notable since it establishes the equivalence of two syntactically quite dissimilar languages:
relational algebra In database theory, relational algebra is a theory that uses algebraic structures with a well-founded semantics for modeling data, and defining queries on it. The theory was introduced by Edgar F. Codd. The main application of relational algebr ...
is a variable-free language, while relational calculus is a logical language with variables and quantification. Relational calculus is essentially equivalent to first-order logic, and indeed, Codd's Theorem had been known to logicians since the late 1940s. Query languages that are equivalent in expressive power to relational algebra were called relationally complete by Codd. By Codd's Theorem, this includes relational calculus. Relational completeness clearly does not imply that any interesting database query can be expressed in relationally complete languages. Well-known examples of inexpressible queries include simple aggregations (counting tuples, or summing up values occurring in tuples, which are operations expressible in SQL but not in relational algebra) and computing the
transitive closure In mathematics, the transitive closure of a binary relation on a set is the smallest relation on that contains and is transitive. For finite sets, "smallest" can be taken in its usual sense, of having the fewest related pairs; for infinite ...
of a graph given by its binary edge relation (see also expressive power). Codd's theorem also doesn't consider SQL nulls and the
three-valued logic In logic, a three-valued logic (also trinary logic, trivalent, ternary, or trilean, sometimes abbreviated 3VL) is any of several many-valued logic systems in which there are three truth values indicating ''true'', ''false'' and some indetermina ...
they entail; the logical treatment of nulls remains mired in controversy.For recent work extending Codd's theorem in this direction see Additionally, SQL has
multiset In mathematics, a multiset (or bag, or mset) is a modification of the concept of a set that, unlike a set, allows for multiple instances for each of its elements. The number of instances given for each element is called the multiplicity of that e ...
semantics and allows duplicate rows. Nevertheless, relational completeness constitutes an important yardstick by which the expressive power of query languages can be compared.


Notes


References

* *


External links

*{{cite web , title=Database Theory: 3. Codd's Theorem , first=Reinhard , last=Pichler , publisher=Institute of Logic and Computation, DBAI Group, TU Wien , date=March 20, 2018 , url=https://www.dbai.tuwien.ac.at/staff/pichler/dbt/slides/dbt03.pdf , access-date=August 8, 2019 , archive-date=August 22, 2020 , archive-url=https://web.archive.org/web/20200822204112/https://www.dbai.tuwien.ac.at/staff/pichler/dbt/slides/dbt03.pdf , url-status=dead Relational model Theorems in the foundations of mathematics