Fourth Normal Form
   HOME

TheInfoList



OR:

Fourth normal form (4NF) is a normal form used in
database normalization Database normalization or database normalisation (see spelling differences) is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. ...
. Introduced by
Ronald Fagin Ronald Fagin (born 1945) is an American mathematician and computer scientist, and IBM Fellow at the IBM Almaden Research Center. He is known for his work in database theory, finite model theory, and reasoning about knowledge. Biography Ron ...
in 1977, 4NF is the next level of normalization after
Boyce–Codd normal form Boyce–Codd normal form (or BCNF or 3.5NF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). BCNF was developed in 1974 by Raymond F. Boyce and Edgar F. Codd to address certain ty ...
(BCNF). Whereas the
second The second (symbol: s) is the unit of time in the International System of Units (SI), historically defined as of a day – this factor derived from the division of the day first into 24 hours, then to 60 minutes and finally to 60 seconds ...
,
third Third or 3rd may refer to: Numbers * 3rd, the ordinal form of the cardinal number 3 * , a fraction of one third * Second#Sexagesimal divisions of calendar time and day, 1⁄60 of a ''second'', or 1⁄3600 of a ''minute'' Places * 3rd Street (d ...
, and Boyce–Codd normal forms are concerned with
functional dependencies In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, a functional dependency is a constraint between two attributes in a relation. Given a relation ' ...
, 4NF is concerned with a more general type of dependency known as a
multivalued dependency In database theory, a multivalued dependency is a full constraint between two sets of attributes in a relation. In contrast to the functional dependency, the multivalued dependency requires that certain tuples be present in a relation. Therefore, ...
. A
table Table may refer to: * Table (furniture), a piece of furniture with a flat surface and one or more legs * Table (landform), a flat area of land * Table (information), a data arrangement with rows and columns * Table (database), how the table data ...
is in 4NF
if and only if In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false. The connective is bicondi ...
, for every one of its non-trivial multivalued dependencies ''X'' \twoheadrightarrow ''Y'', ''X'' is a
superkey In the relational data model a superkey is a set of attributes that uniquely identifies each tuple of a relation. Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values. That is, ...
—that is, ''X'' is either a
candidate key A candidate key, or simply a key, of a relational database is a minimal superkey. In other words, it is any set of columns that have a unique combination of values in each row (which makes it a superkey), with the additional constraint that removi ...
or a superset thereof."A relation schema R* is in fourth normal form (4NF) if, whenever a nontrivial multivalued dependency X \twoheadrightarrow Y holds for R*, then so does the functional dependency X → A for every column name A of R*. Intuitively all dependencies are the result of keys."


Multivalued dependencies

If the column headings in a relational database table are divided into three disjoint groupings ''X'', ''Y'', and ''Z'', then, in the context of a particular row, we can refer to the data beneath each group of headings as ''x'', ''y'', and ''z'' respectively. A
multivalued dependency In database theory, a multivalued dependency is a full constraint between two sets of attributes in a relation. In contrast to the functional dependency, the multivalued dependency requires that certain tuples be present in a relation. Therefore, ...
''X'' \twoheadrightarrow ''Y'' signifies that if we choose any ''x'' actually occurring in the table (call this choice ''xc''), and compile a list of all the ''xcyz'' combinations that occur in the table, we will find that ''xc'' is associated with the same ''y'' entries regardless of ''z''. So essentially the presence of z provides no useful information to constrain the possible values of ''y''. A trivial multivalued dependency ''X'' \twoheadrightarrow ''Y'' is one where either ''Y'' is a subset of ''X'', or ''X'' and ''Y'' together form the whole set of attributes of the relation. A
functional dependency In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, a functional dependency is a constraint between two attributes in a relation. Given a relation ' ...
is a special case of multivalued dependency. In a functional dependency ''X'' → ''Y'', every ''x'' determines ''exactly one y'', never more than one.


Example

Consider the following example: Each row indicates that a given restaurant can deliver a given variety of pizza to a given area. The table has no non-key attributes because its only candidate key is . Therefore, it meets all normal forms up to BCNF. If we assume, however, that pizza varieties offered by a restaurant are not affected by delivery area (i.e. a restaurant offers all pizza varieties it makes to all areas it supplies), then it does not meet 4NF. The problem is that the table features two non-trivial multivalued dependencies on the attribute (which is not a superkey). The dependencies are: * \twoheadrightarrow * \twoheadrightarrow These non-trivial multivalued dependencies on a non-superkey reflect the fact that the varieties of pizza a restaurant offers are independent from the areas to which the restaurant delivers. This state of affairs leads to redundancy in the table: for example, we are told three times that A1 Pizza offers Stuffed Crust, and if A1 Pizza starts producing Cheese Crust pizzas then we will need to add multiple rows, one for each of A1 Pizza's delivery areas. There is, moreover, nothing to prevent us from doing this incorrectly: we might add Cheese Crust rows for all but one of A1 Pizza's delivery areas, thereby failing to respect the multivalued dependency \twoheadrightarrow . To eliminate the possibility of these anomalies, we must place the facts about varieties offered into a different table from the facts about delivery areas, yielding two tables that are both in 4NF: In contrast, if the pizza varieties offered by a restaurant sometimes did legitimately vary from one delivery area to another, the original three-column table would satisfy 4NF. Ronald Fagin demonstrated that it is always possible to achieve 4NF.Fagin, p. 268 Rissanen's theorem is also applicable on
multivalued dependencies In database theory, a multivalued dependency is a full constraint between two sets of attributes in a relation. In contrast to the functional dependency, the multivalued dependency requires that certain tuples be present in a relation. Therefore, ...
.


4NF in practice

A 1992 paper by Margaret S. Wu notes that the teaching of database normalization typically stops short of 4NF, perhaps because of a belief that tables violating 4NF (but meeting all lower normal forms) are rarely encountered in business applications. This belief may not be accurate, however. Wu reports that in a study of forty organizational databases, over 20% contained one or more tables that violated 4NF while meeting all lower normal forms.


Normalization beyond 4NF

Only in rare situations does a 4NF table not conform to the higher normal form 5NF. These are situations in which a complex real-world constraint governing the valid combinations of attribute values in the 4NF table is not implicit in the structure of that table.


See also

* Attribute-value system *
Injective function In mathematics, an injective function (also known as injection, or one-to-one function) is a function that maps distinct elements of its domain to distinct elements; that is, implies . (Equivalently, implies in the equivalent contrapositiv ...
*
Bijection In mathematics, a bijection, also known as a bijective function, one-to-one correspondence, or invertible function, is a function between the elements of two sets, where each element of one set is paired with exactly one element of the other s ...


References


Further reading

*Date, C. J. (1999),
An Introduction to Database Systems
' (8th ed.). Addison-Wesley Longman. . *Kent, W. (1983)

', Communications of the ACM, vol. 26, pp. 120–125 {{Database normalization 4NF de:Normalisierung (Datenbank)#Vierte Normalform (4NF)