Boyce–Codd normal form
   HOME

TheInfoList



OR:

Boyce–Codd normal form (or BCNF or 3.5NF) is a normal form used in
database normalization Database normalization or database normalisation (see spelling differences) is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. ...
. It is a slightly stronger version of the
third normal form Third normal form (3NF) is a database schema design approach for relational databases which uses normalizing principles to reduce the duplication of data, avoid data anomalies, ensure referential integrity, and simplify data management. It was de ...
(3NF). BCNF was developed in 1974 by Raymond F. Boyce and
Edgar F. Codd Edgar Frank "Ted" Codd (19 August 1923 – 18 April 2003) was an English computer scientist who, while working for IBM, invented the relational model for database management, the theoretical basis for relational databases and relational databa ...
to address certain types of anomalies not dealt with by 3NF as originally defined.Codd, E. F. "Recent Investigations into Relational Data Base" in ''Proc. 1974 Congress'' (Stockholm, Sweden, 1974). New York, N.Y.: North-Holland (1974). If a relational schema is in BCNF then all redundancy based on
functional dependency In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, a functional dependency is a constraint between two attributes in a relation. Given a relation '' ...
has been removed, although other types of redundancy may still exist. A relational schema ''R'' is in Boyce–Codd normal form
if and only if In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false. The connective is bicondi ...
for every one of its dependencies ''X → Y'', at least one of the following conditions hold: * ''X'' → ''Y'' is a trivial functional dependency (Y ⊆ X), * ''X'' is a
superkey In the relational data model a superkey is a set of attributes that uniquely identifies each tuple of a relation. Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values. That is, ...
for schema ''R''.


3NF table always meeting BCNF (Boyce–Codd normal form)

Only in rare cases does a 3NF table not meet the requirements of BCNF. A 3NF table that does not have multiple overlapping candidate keys is guaranteed to be in BCNF.Vincent, M. W. and B. Srinivasan. "A Note on Relational Schemes Which Are in 3NF But Not in BCNF". ''Information Processing Letters'' 48(6), 1993, pp. 281–283. Depending on what its functional dependencies are, a 3NF table with two or more overlapping candidate keys may or may not be in BCNF. An example of a 3NF table that does not meet BCNF is: * Each row in the table represents a court booking at a tennis club. That club has one hard court (Court 1) and one grass court (Court 2) * A booking is defined by its Court and the period for which the Court is reserved * Additionally, each booking has a Rate Type associated with it. There are four distinct rate types: ** SAVER, for Court 1 bookings made by members ** STANDARD, for Court 1 bookings made by non-members ** PREMIUM-A, for Court 2 bookings made by members ** PREMIUM-B, for Court 2 bookings made by non-members The table's
superkey In the relational data model a superkey is a set of attributes that uniquely identifies each tuple of a relation. Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values. That is, ...
s are: * S1 = * S2 = * S3 = * S4 = * S5 = * S6 = * S7 = * S8 = * ST = , the trivial superkey Note that even though in the above table ''Start time'' and ''End time'' attributes have no duplicate values for each of them, we still have to admit that in some other days two different bookings on court 1 and court 2 could ''start at the same time'' or ''end at the same time''. This is the reason why and cannot be considered as the table's superkeys. However, only S1, S2, S3 and S4 are
candidate key A candidate key, or simply a key, of a relational database is a minimal superkey. In other words, it is any set of columns that have a unique combination of values in each row (which makes it a superkey), with the additional constraint that removi ...
s (that is, minimal superkeys for that relation) because e.g. S1 ⊂ S5, so S5 cannot be a candidate key. Recall that 2NF prohibits partial functional dependencies of non-prime attributes (i.e., an attribute that does not occur in ''any'' candidate key. See "
candidate key A candidate key, or simply a key, of a relational database is a minimal superkey. In other words, it is any set of columns that have a unique combination of values in each row (which makes it a superkey), with the additional constraint that removi ...
s") and that 3NF prohibits transitive functional dependencies of non-prime attributes on candidate keys. In Today's court bookings table, there are no non-prime attributes: that is, all attributes belong to some candidate key. Therefore the table adheres to both 2NF and 3NF. The table does not adhere to BCNF. This is because of the dependency Rate type → Court in which the determining attribute Rate type – on which Court depends – (1) is neither a candidate key nor a superset of a candidate key and (2) Court is no subset of Rate type. Dependency Rate type → Court is respected, since a Rate type should only ever apply to a single Court. The design can be amended so that it meets BCNF: The candidate keys for the Rate types table are and ; the candidate keys for the Today's bookings table are and . Both tables are in BCNF. When is a key in the Rate types table, having one Rate type associated with two different Courts is impossible, so by using as a key in the Rate types table, the anomaly affecting the original table has been eliminated.


Achievability of BCNF

In some cases, a non-BCNF table cannot be decomposed into tables that satisfy BCNF and preserve the dependencies that held in the original table. Beeri and Bernstein showed in 1979 that, for example, a set of functional dependencies cannot be represented by a BCNF schema.Beeri, Catriel and Bernstein, Philip A. "Computational problems related to the design of normal form relational schemas". ''ACM Transactions on Database Systems'' 4(1), March 1979, p. 50. Consider the following non-BCNF table whose functional dependencies follow the pattern: For each Person / Shop type combination, the table tells us which shop of this type is geographically nearest to the person's home. We assume for simplicity that a single shop cannot be of more than one type. The candidate keys of the table are: * , * . Because all three attributes are prime attributes (i.e. belong to candidate keys), the table is in 3NF. The table is not in BCNF, however, as the Shop type attribute is functionally dependent on a non-superkey: Nearest shop. The violation of BCNF means that the table is subject to anomalies. For example, Eagle Eye might have its Shop type changed to "Optometrist" on its "Fuller" record while retaining the Shop type "Optician" on its "Davidson" record. This would imply contradictory answers to the question: "What is Eagle Eye's Shop Type?" Holding each shop's Shop type only once would seem preferable, as doing so would prevent such anomalies from occurring: In this revised design, the "Shop near person" table has a candidate key of , and the "Shop" table has a candidate key of . Unfortunately, although this design adheres to BCNF, it is unacceptable on different grounds: it allows us to record multiple shops of the same type against the same person. In other words, its candidate keys do not guarantee that the functional dependency → will be respected. A design that eliminates all of these anomalies (but does not conform to BCNF) is possible. This design introduces a new normal form, known as
Elementary Key Normal Form Elementary key normal form (EKNF) is a subtle enhancement on third normal form, thus EKNF tables are in 3NF by definition. This happens when there is more than one unique compound key {{Unreferenced, date=October 2020 In database design, a composi ...
.Zaniolo, Carlo. "A New Normal Form for the Design of Relational Database Schemata". ''ACM Transactions on Database Systems'' 7(3), September 1982, p. 493. This design consists of the original "Nearest shops" table supplemented by the "Shop" table described above. The table structure generated by Bernstein's schema generation algorithmBernstein, P. A. "Synthesizing Third Normal Form relations from functional dependencies". ''ACM Transactions on Database Systems'' 1(4), December 1976 pp. 277–298. is actually EKNF, although that enhancement to 3NF had not been recognized at the time the algorithm was designed: If a referential integrity constraint is defined to the effect that from the first table must refer to a from the second table, then the data anomalies described previously are prevented.


Intractability

It is
NP-complete In computational complexity theory, a problem is NP-complete when: # it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by tryi ...
, given a database schema in
third normal form Third normal form (3NF) is a database schema design approach for relational databases which uses normalizing principles to reduce the duplication of data, avoid data anomalies, ensure referential integrity, and simplify data management. It was de ...
, to determine whether it violates Boyce–Codd normal form., Corollary 3.


History

Chris Date Chris Date (born 1941) is an independent author, lecturer, researcher, and consultant, specializing in relational database theory. Biography Chris Date attended High Wycombe Royal Grammar School (U.K.) from 1951 to 1958 and received his BA i ...
has pointed out that a definition of what we now know as BCNF appeared in a paper by Ian Heath in 1971.Heath, I. "Unacceptable File Operations in a Relational Database". ''Proc. 1971 ACM SIGFIDET Workshop on Data Description, Access, and Control'', San Diego, Calif. (November 11th–12th, 1971). Date writes:Date, C. J. ''Database in Depth: Relational Theory for Practitioners''. O'Reilly (2005), p. 142.
Since that definition predated Boyce and Codd's own definition by some three years, it seems to me that BCNF ought by rights to be called ''Heath'' normal form. But it isn't.
Edgar F. Codd released his original article "A Relational Model of Data for Large Shared Databanks" in June 1970. This was the first time the notion of a relational database was published. All work after this, including the Boyce–Codd normal form method was based on this relational model.


References


Bibliography

* Date, C. J. (1999). ''An Introduction to Database Systems'' (8th ed.). Addison-Wesley Longman. .


External links


Rules Of Data Normalization


by ITS, University of Texas. {{DEFAULTSORT:Boyce-Codd normal form BCNF de:Normalisierung (Datenbank)#Boyce-Codd-Normalform (BCNF)