Cardinality (SQL Statements)

In SQL (Structured Query Language), the term cardinality refers to the uniqueness of data values contained in a particular column (attribute) of a database table. The lower the cardinality, the more duplicated elements in a column. Thus, a column with the lowest possible cardinality would have the same value for every row. SQL databases use cardinality to help determine the optimal query plan for a given query.
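
As a rough illustration of the definition above, the cardinality of a column can be measured by counting its distinct values with the standard SQL aggregate COUNT(DISTINCT ...). The sketch below uses Python's built-in sqlite3 module; the `orders` table, its columns, and the helper `column_cardinality` are hypothetical names invented for this example and do not come from any particular database.

```python
import sqlite3

def column_cardinality(conn, table, column):
    """Return the number of distinct non-NULL values in table.column."""
    # Identifiers cannot be bound as query parameters, so they are quoted
    # inline here; only pass trusted table and column names.
    query = f'SELECT COUNT(DISTINCT "{column}") FROM "{table}"'
    return conn.execute(query).fetchone()[0]

# Hypothetical sample data: four orders, three countries, one currency.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, country TEXT, currency TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "DE", "EUR"), (2, "FR", "EUR"), (3, "DE", "EUR"), (4, "IT", "EUR")],
)

id_card = column_cardinality(conn, "orders", "order_id")        # every row differs
country_card = column_cardinality(conn, "orders", "country")    # some duplicates
currency_card = column_cardinality(conn, "orders", "currency")  # same value in every row

print(id_card, country_card, currency_card)
```

Here `currency` is the lowest-cardinality case described above: every row holds the same value, so the distinct count is 1.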


Values of cardinality

When dealing with columnar value sets, there are three types of cardinality: high-cardinality, normal-cardinality, and low-cardinality.

High-cardinality refers to columns with values that are very uncommon or unique. High-cardinality column values are typically identification numbers, email addresses, or user names. An example of a data table column with high-cardinality would be a USERS table with a column named USER_ID. This column would contain unique values from 1 to n. Each time a new user is created in the USERS table, a new number would be created in the USER_ID column to identify them uniquely. Since the values held in the USER_ID column are unique, this column's cardinality type would be referred to as high-cardinality.

Normal-cardinality refers to columns with values that are somewhat uncommon. Normal-cardinality column values are typically names, street addresses, or vehicle types. An example of a data table column with normal-cardinality would be a CUSTOMER table with a column named LAST_NAME, containing the last names of customers. While some people have common last names, such as Smith, others have uncommon last names. Therefore, an examination of all of the values held in the LAST_NAME column would show "clumps" of names in some places (e.g. a lot of Smiths) surrounded on both sides by a long series of unique values. Since there is a variety of possible values held in this column, its cardinality type would be referred to as normal-cardinality.

Low-cardinality refers to columns with few unique values. Low-cardinality column values are typically status flags, Boolean values, or major classifications such as gender. An example of a data table column with low-cardinality would be a CUSTOMER table with a column named NEW_CUSTOMER. This column would contain only two distinct values: Y or N, denoting whether the customer was new or not. Since there are only two possible values held in this column, its cardinality type would be referred to as low-cardinality.
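
The three categories above can be sketched as a ratio of distinct values to total rows. The example below is only an illustration under stated assumptions: the thresholds (0.9 and 0.2), the function `classify_cardinality`, the sample names, and the CUSTOMER_ID column (standing in for the USER_ID example) are all hypothetical, since the categories have no fixed numeric boundaries.

```python
import sqlite3

def classify_cardinality(distinct_count, row_count, high=0.9, low=0.2):
    """Label a column by its distinct-to-row ratio (thresholds are assumed)."""
    if row_count == 0:
        return "unknown"
    ratio = distinct_count / row_count
    if ratio >= high:
        return "high"
    if ratio <= low:
        return "low"
    return "normal"

# Hypothetical data: clumps of common last names plus a run of unique ones.
unique_names = ["Abara", "Brandt", "Castillo", "Dvorak",
                "Eklund", "Fontaine", "Grieve", "Haddad"]
last_names = ["Smith"] * 8 + ["Jones"] * 4 + unique_names  # 20 rows, 10 distinct

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER ("
    "CUSTOMER_ID INTEGER PRIMARY KEY, LAST_NAME TEXT, NEW_CUSTOMER TEXT)"
)
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?, ?)",
    [(i + 1, name, "Y" if i % 5 == 0 else "N") for i, name in enumerate(last_names)],
)

row_count = conn.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()[0]

labels = {}
for column in ("CUSTOMER_ID", "LAST_NAME", "NEW_CUSTOMER"):
    distinct = conn.execute(
        f'SELECT COUNT(DISTINCT "{column}") FROM CUSTOMER'
    ).fetchone()[0]
    labels[column] = classify_cardinality(distinct, row_count)

print(labels)
```

With this sample data, CUSTOMER_ID has one distinct value per row, LAST_NAME has 10 distinct values across 20 rows, and NEW_CUSTOMER has only two, so the columns fall into the high, normal, and low categories respectively. Different thresholds would move the boundaries.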


See also

* Cardinality (mathematics)

