relational model
The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data are represented in terms of t ...
of
database
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
s, a primary key is a designated attribute (
column
A column or pillar in architecture and structural engineering is a structural element that transmits, through compression, the weight of the structure above to other structural elements below. In other words, a column is a compression member ...
) that can reliably identify and distinguish between each individual record in a
table
Table may refer to:
* Table (database), how the table data arrangement is used within the databases
* Table (furniture), a piece of furniture with a flat surface and one or more legs
* Table (information), a data arrangement with rows and column ...
. The database creator can choose an existing unique attribute or combination of attributes from the table (a natural key) to act as its primary key, or create a new attribute containing a unique ID that exists solely for this purpose (a
surrogate key
A surrogate key (or synthetic key, pseudokey, entity identifier, factless key, or technical key) in a database is a unique identifier for either an ''entity'' in the modeled world or an ''object'' in the database. The surrogate key is ''not'' deri ...
).
Examples of natural keys that could be suitable primary keys include data that is already by definition unique to all items in the table such as a
national identification number
A national identification number or national identity number is used by the governments of many countries as a means of uniquely identifying their citizens or residents for the purposes of work, taxation, government benefits, health care, bank ...
attribute for person records, or the combination of a very precise
timestamp
A timestamp is a sequence of characters or encoded information identifying when a certain event occurred, usually giving date and time of day, sometimes accurate to a small fraction of a second. Timestamps do not have to be based on some absolu ...
attribute with a very precise location attribute for event records.
More formally, a primary key is a specific choice of a minimal set of attributes that uniquely specify a tuple ( row) in a
relation
Relation or relations may refer to:
General uses
* International relations, the study of interconnection of politics, economics, and law on a global level
* Interpersonal relationship, association or acquaintance between two or more people
* ...
(table). A primary key is a choice of a
candidate key
A candidate key, or simply a key, of a relational database is any set of columns that have a unique combination of values in each row, with the additional constraint that removing any column could produce duplicate combinations of values.
A candi ...
(a minimal superkey); any other candidate key is an alternate key.
Design
In relational database terms, a primary key does not differ in form or function from a key that isn't primary. In practice, various motivations may determine the choice of any one key as primary over another. The designation of a primary key may indicate the "preferred" identifier for data in the table, or that the primary key is to be used for
foreign key
A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples ...
references from other tables or it may indicate some other technical rather than semantic feature of the table. Some languages and software have special syntax features that can be used to identify a primary key as such (e.g. the PRIMARY KEY constraint in SQL).
The relational model, as expressed through relational calculus and relational algebra, does not distinguish between primary keys and other kinds of keys. Primary keys were added to the
SQL
Structured Query Language (SQL) (pronounced ''S-Q-L''; or alternatively as "sequel")
is a domain-specific language used to manage data, especially in a relational database management system (RDBMS). It is particularly useful in handling s ...
standard mainly as a convenience to the application programmer.
Primary keys can be an integer that is incremented, a
universally unique identifier
A Universally Unique Identifier (UUID) is a 128-bit label used to uniquely identify objects in computer systems. The term Globally Unique Identifier (GUID) is also used, mostly in Microsoft systems.
When generated according to the standard methods ...
Primary keys are defined in the ISO SQL Standard, through the PRIMARY KEY constraint. The syntax to add such a constraint to an existing table is defined in SQL:2003 like this:
ALTER TABLE
ADD ">CONSTRAINT
PRIMARY KEY ( ... )
The primary key can also be specified directly during table creation. In the SQL Standard, primary keys may consist of one or multiple columns. Each column participating in the primary key is implicitly defined as NOT NULL. Note that some
RDBMS
A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970.
A Relational Database Management System (RDBMS) is a type of database management system that stores data in a structured forma ...
require explicitly marking primary key columns as NOT NULL.
CREATE TABLE table_name (
...
)
If the primary key consists only of a single column, the column can be marked as such using the following syntax:
CREATE TABLE table_name (
id_col INT PRIMARY KEY,
col2 CHARACTER VARYING(20),
...
)
Surrogate keys
In some circumstances the natural key that uniquely identifies a tuple in a relation may be cumbersome to use for software development. For example, it may involve multiple columns or large text fields. In such cases, a
surrogate key
A surrogate key (or synthetic key, pseudokey, entity identifier, factless key, or technical key) in a database is a unique identifier for either an ''entity'' in the modeled world or an ''object'' in the database. The surrogate key is ''not'' deri ...
can be used instead as the primary key. In other situations there may be more than one
candidate key
A candidate key, or simply a key, of a relational database is any set of columns that have a unique combination of values in each row, with the additional constraint that removing any column could produce duplicate combinations of values.
A candi ...
for a relation, and no candidate key is obviously preferred. A surrogate key may be used as the primary key to avoid giving one candidate key artificial primacy over the others.
Since primary keys exist primarily as a convenience to the programmer, surrogate primary keys are often used, in many cases exclusively, in database application design.
Due to the popularity of surrogate primary keys, many developers and in some cases even theoreticians have come to regard surrogate primary keys as an inalienable part of the relational data model. This is largely due to a migration of principles from the object-oriented programming model to the relational model, creating the hybrid object–relational model. In the ORM like
active record pattern
In software engineering, the active record pattern is an architectural pattern. It is found in software that stores in-memory object data in relational databases. It was named by Martin Fowler in his 2003 book ''Patterns of Enterprise Application ...
, these additional restrictions are placed on primary keys:
* Primary keys should be immutable, that is, never changed or re-used; they should be deleted along with the associated record.
* Primary keys should be anonymous integer or numeric identifiers.
However, neither of these restrictions is part of the relational model or any SQL standard.
Due diligence
Due diligence is the investigation or exercise of care that a reasonable business or person is normally expected to take before entering into an agreement or contract with another party or an act with a certain standard of care.
Due diligence ...
should be applied when deciding on the immutability of primary key values during database and application design. Some database systems even imply that values in primary key columns cannot be changed using the UPDATE SQL statement.
Alternate key
Typically, one candidate key is chosen as the primary key. Other candidate keys become alternate keys, each of which may have a UNIQUE constraint assigned to it in order to prevent duplicates (a duplicate entry is not valid in a unique column).Alternate key – Oracle FAQ /ref>
Alternate keys may be used like the primary key when doing a single-table select or when filtering in a ''where'' clause, but are not typically used to join multiple tables.
Candidate key
A candidate key, or simply a key, of a relational database is any set of columns that have a unique combination of values in each row, with the additional constraint that removing any column could produce duplicate combinations of values.
A candi ...
Relational database
A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970.
A Relational Database Management System (RDBMS) is a type of database management system that stores data in a structured for ...
*
Entity relationship diagram
An entity is something that Existence, exists as itself. It does not need to be of material existence. In particular, abstractions and legal fictions are usually regarded as entities. In general, there is also no presumption that an entity is Lif ...
*
Strongly typed identifier
A strongly typed identifier is user-defined data type which serves as an identifier or key that is Strong and weak typing, strongly typed. This is a solution to the "primitive obsession" code smell as mentioned by Martin Fowler (software engineer), ...