SQL: Primary key

A primary key constraint is a table constraint that allows to identify records based on a unique value, or a unique combination of values. Thus, a primary key constraint is a unique constraint with the additional stipulation that null values are not allowed.

A (primary key identified) record in a table can be referred to by another record (in the same or another table). Thus the primary key foreign key relationship forms the basis for 1:n relations. The R in such relations gave raise to the R in RDMBS.

Creating primary key values

Because the crutial property of primary key values is that their values are unique, it is of important to guarantee that the assigned primary key values are unique across time and space. For example, the values first name and last name (usually) don't satisfy such a requirement because there are many examples where a combination of a first and last name does not uniquely identify a person.

Therefore, a way to create unique values for a primary key is to use what is called surrogate key. The value of a surrogate key is not inherently connected to the entity that it identifies. Rather, it is an artificial value. Often, such a value is an ever increasing number (as is produced, for example, in Oracle with sequences).

Another commonly seen technique to create surrogate keys is to use GUIDs. GUIDs, Globally Unique Identifiers, are also guaranteed to be unique across time and space.

Problems of surrogate keys

Although surrogate keys are commonly viewed as desirable, it should be noted that they might lead to unintended consequences. They're especially problematic when data should be prepared for data mining or data analysis. With surrogate keys, it is often quite time consuming, if possible, to match data in different data stores that belongs to the same entity.

SQL: Primary key

Creating primary key values

Problems of surrogate keys

See also