
The chase is a simple fixed-point algorithm for testing and enforcing the implication of data dependencies in database systems. It plays important roles in database theory as well as in practice. It is used, directly or indirectly, on an everyday basis by people who design databases, and it is used in commercial systems to reason about the consistency and correctness of a data design. New applications of the chase in meta-data management and data exchange are still being discovered.

The chase has its origins in two seminal papers of 1979, one by Alfred V. Aho, Catriel Beeri, and Jeffrey D. Ullman and the other by David Maier, Alberto O. Mendelzon, and Yehoshua Sagiv.

In its simplest application the chase is used for testing whether the projection of a relation schema constrained by some functional dependencies onto a given decomposition can be recovered by rejoining the projections. Let ''t'' be a tuple in \pi_{S_1}(R) \bowtie \pi_{S_2}(R) \bowtie ... \bowtie \pi_{S_k}(R), where ''R'' is a relation decomposed into the schemas S1, ..., S''k'' and ''F'' is a set of functional dependencies (FDs) that ''R'' obeys. Because ''t'' is in the join, for each ''i'' = 1, 2, ..., ''k'' there is a tuple ''ti'' of ''R'' that agrees with ''t'' on \pi_{S_i}(R), that is, on the attributes of S''i''. The values of ''ti'' on attributes outside S''i'' are unknown.

The chase can be done by drawing a tableau (the same formalism used in tableau queries). Suppose ''R'' has attributes ''A, B, ...'' and the components of ''t'' are ''a, b, ...''. For ''ti'', use the same letter as ''t'' in the components whose attributes are in S''i'', and subscript the letter with ''i'' in the components whose attributes are not in S''i''. Then ''ti'' agrees with ''t'' on the attributes of S''i'' and has a unique value everywhere else.

The chase process is confluent: the final result does not depend on the order in which the dependencies are applied. Implementations of the chase algorithm exist, and some of them are open-source.
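The tableau construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `initial_tableau`, the use of single-letter attribute names, and the two-schema decomposition in the usage lines are all hypothetical and not taken from any particular chase implementation.

```python
def initial_tableau(attributes, decomposition):
    """Build one tableau row per schema S_i of the decomposition.

    Assumption: attributes are single uppercase letters. A component is the
    plain lowercase letter (e.g. 'a') when its attribute belongs to S_i, and
    the letter subscripted with i (e.g. 'a1') when it does not.
    """
    rows = []
    for i, schema in enumerate(decomposition, start=1):
        row = {}
        for attr in attributes:
            letter = attr.lower()
            row[attr] = letter if attr in schema else f"{letter}{i}"
        rows.append(row)
    return rows


# Hypothetical decomposition of R(A, B, C, D) into {A, B} and {B, C, D}.
tableau = initial_tableau("ABCD", [{"A", "B"}, {"B", "C", "D"}])
for row in tableau:
    print(" ".join(row[attr] for attr in "ABCD"))
# prints:
# a b c1 d1
# a2 b c d
```

Each row is stored as a dictionary keyed by attribute, which keeps later symbol replacements within a single column straightforward.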


Example

Let ''R''(''A'', ''B'', ''C'', ''D'') be a relation schema known to obey the set of functional dependencies ''F'' = {''A''→''B'', ''B''→''C'', ''CD''→''A''}. Suppose ''R'' is decomposed into three relation schemas S1 = {''A'', ''D''}, S2 = {''A'', ''C''} and S3 = {''B'', ''C'', ''D''}. Whether this decomposition is lossless can be determined by performing a chase. The initial tableau for this decomposition is:

    A    B    C    D
    a    b1   c1   d
    a    b2   c    d2
    a3   b    c    d

The first row represents S1. The components for attributes ''A'' and ''D'' are unsubscripted and those for attributes ''B'' and ''C'' are subscripted with ''i'' = 1. The second and third rows are filled in the same manner for S2 and S3 respectively. The goal of the test is to use the given ''F'' to prove that ''t'' = (''a'', ''b'', ''c'', ''d'') is really in ''R''. To do so, the tableau is chased by applying the FDs in ''F'' to equate symbols in the tableau. A final tableau containing a row identical to ''t'' implies that any tuple ''t'' in the join of the projections is actually a tuple of ''R''.
To perform the chase test, first decompose all FDs in ''F'' so that each FD has a single attribute on the right-hand side of the arrow. (In this example, ''F'' remains unchanged because all of its FDs already have a single attribute on the right-hand side: ''F'' = {''A''→''B'', ''B''→''C'', ''CD''→''A''}.) When equating two symbols, if one of them is unsubscripted, replace the other with it, so that the final tableau can contain a row that is exactly ''t'' = (''a'', ''b'', ''c'', ''d''). If both symbols are subscripted, either one may be replaced by the other. In both cases, every occurrence of the replaced symbol in the tableau must be changed, not only the occurrence in the row being compared.
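The two preparatory rules above (single-attribute right-hand sides, and preferring the unsubscripted symbol when equating) might look as follows in Python. The names `split_rhs`, `is_subscripted` and `equate` are hypothetical helpers introduced for illustration, and the sketch assumes single-letter attributes, so that a symbol longer than one character is a subscripted one.

```python
def split_rhs(fds):
    """Split each FD (lhs, rhs) into FDs with one attribute on the right."""
    return [(lhs, attr) for lhs, rhs in fds for attr in rhs]


def is_subscripted(symbol):
    """Assumption: 'b2' is subscripted, a bare letter such as 'b' is not."""
    return len(symbol) > 1


def equate(rows, x, y):
    """Replace one of the symbols x, y by the other in every row, in place.

    The unsubscripted symbol survives if there is one; if both are
    subscripted, x is kept. Returns True if the tableau was changed.
    """
    if x == y:
        return False
    if is_subscripted(x) and not is_subscripted(y):
        keep, drop = y, x
    else:
        keep, drop = x, y
    for row in rows:
        for attr, value in row.items():
            if value == drop:
                row[attr] = keep  # change every occurrence, not just one
    return True


# Two rows that agree on B, so B -> C forces their C symbols to be equal.
rows = [{"B": "b1", "C": "c1"}, {"B": "b1", "C": "c"}]
equate(rows, "c1", "c")
print(rows[0]["C"])                # prints: c
print(split_rhs([("A", "BC")]))    # prints: [('A', 'B'), ('A', 'C')]
```

Keeping the unsubscripted symbol as the survivor is what allows a row to converge to the target tuple; choosing between two subscripted symbols is arbitrary.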
First, apply ''A''→''B'' to the tableau. The first row is (''a'', ''b1'', ''c1'', ''d''), where ''a'' is unsubscripted and ''b1'' is subscripted with 1. The second row also has ''a'' in column ''A'', so the two rows must agree on ''B'': change ''b2'' to ''b1''. The third row has ''a3'' rather than ''a'', so it does not agree with the other rows on ''A'' and its ''b'' stays the same. The resulting tableau is:

    A    B    C    D
    a    b1   c1   d
    a    b1   c    d2
    a3   b    c    d

Then consider ''B''→''C''. The first and second rows both have ''b1'', and the second row has an unsubscripted ''c''. Therefore, the first row changes to (''a'', ''b1'', ''c'', ''d''). The resulting tableau is:

    A    B    C    D
    a    b1   c    d
    a    b1   c    d2
    a3   b    c    d

Now consider ''CD''→''A''. The first row has an unsubscripted ''c'' and an unsubscripted ''d'', the same as the third row. This means that the ''A'' values of rows one and three must be the same as well. Hence, change ''a3'' in the third row to ''a''. The resulting tableau is:

    A    B    C    D
    a    b1   c    d
    a    b1   c    d2
    a    b    c    d

At this point, the third row is (''a'', ''b'', ''c'', ''d''), which is the same as ''t''. Therefore, this is the final tableau for the chase test with the given ''R'' and ''F''. Hence, whenever ''R'' is projected onto S1, S2 and S3 and the projections are rejoined, the result is in ''R''. In particular, the resulting tuple is the same as the tuple of ''R'' that is projected onto {''B'', ''C'', ''D''}.
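The whole test can be reproduced with a short, self-contained Python sketch. It is a minimal illustration, not a reference implementation: the function name `chase_lossless` and the encoding of FDs as pairs of attribute strings are assumptions, attributes are assumed to be single letters, and the decomposition used is S1 = {A, D}, S2 = {A, C}, S3 = {B, C, D}. Unlike the walkthrough, which stops as soon as a row equals ''t'', the sketch keeps applying FDs until nothing changes, so its final tableau may have more symbols equated.

```python
def chase_lossless(attributes, decomposition, fds):
    """Chase test for a lossless join; returns (is_lossless, final_rows).

    fds is a list of (lhs, rhs) pairs of attribute strings, e.g. ("CD", "A").
    """
    # Initial tableau: plain letter inside S_i, letter + i outside it.
    rows = [
        {a: a.lower() if a in schema else f"{a.lower()}{i}" for a in attributes}
        for i, schema in enumerate(decomposition, start=1)
    ]
    # Split right-hand sides so that each FD determines a single attribute.
    single = [(lhs, a) for lhs, rhs in fds for a in rhs]
    target = {a: a.lower() for a in attributes}

    changed = True
    while changed:  # fixed point: stop when no FD equates anything new
        changed = False
        for lhs, attr in single:
            for r1 in rows:
                for r2 in rows:
                    if r1[attr] == r2[attr]:
                        continue  # already equal on the right-hand side
                    if any(r1[a] != r2[a] for a in lhs):
                        continue  # rows disagree on the left-hand side
                    # Prefer the unsubscripted symbol as the survivor.
                    keep, drop = r1[attr], r2[attr]
                    if len(keep) > 1 and len(drop) == 1:
                        keep, drop = drop, keep
                    for row in rows:
                        if row[attr] == drop:
                            row[attr] = keep
                    changed = True
    return any(row == target for row in rows), rows


lossless, final_rows = chase_lossless(
    "ABCD",
    [{"A", "D"}, {"A", "C"}, {"B", "C", "D"}],
    [("A", "B"), ("B", "C"), ("CD", "A")],
)
print(lossless)  # prints: True
```

The loop terminates because every replacement removes one distinct symbol from the tableau, and the tableau contains only finitely many symbols.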


References

* Serge Abiteboul, Richard B. Hull, Victor Vianu: ''Foundations of Databases''. Addison-Wesley, 1995.
* A. V. Aho, C. Beeri, and J. D. Ullman: ''The Theory of Joins in Relational Databases''. ACM Transactions on Database Systems 4(3): 297-314, 1979.
* J. D. Ullman: ''Principles of Database and Knowledge-Base Systems, Volume I''. Computer Science Press, New York, 1988.
* J. D. Ullman, J. Widom: ''A First Course in Database Systems'' (3rd ed.). pp. 96–99. Pearson Prentice Hall, 2008.
* Michael Benedikt, George Konstantinidis, Giansalvatore Mecca, Boris Motik, Paolo Papotti, Donatello Santoro, Efthymia Tsamoura: ''Benchmarking the Chase''. In Proc. of PODS, 2017.

