Natural key and surrogate key

1. Natural Key

A natural key is a type of unique key in a database which has an equivalent in the external world outside the database. Database tables are used to represent entities in the real world. A natural key is some unique attribute or group of attributes of the real world entity which uniquely identifies the entity. A natural key is also known as business key or domain key.

A natural key uniquely identifies each row in a database table.

1.1 Example of natural key

Suppose every citizen of the country is provided a unique identification number. This unique identification number can be used to uniquely identify the citizens. This identification number can be used as a primary key in a table used to store citizens information. Since this identification number belongs to the real world entity, it can be used as a natural key.

1.2 Advantages of using natural key

  • Sometimes, natural key is a combination of two or more columns. More number of columns as a primary key are not easy to manage from application side.
  • The natural key is related to the real world. Since this attribute is part of entity, need to be saved in the database. So using a natural key as the primary key saves disk space.
  • A natural key makes it easy to understand the data and hence its design and management. If you use some custom generated sequence as a primary key, it will be difficult to have a look at the sequence and tell what the data is all about. A natural key makes it easy to design the database system.
  • A natural key relates to the real world, so it can be used in the search.
  • Since the natural key column has meaning, less number of join is required and hence less I/O operation.

1.3 Disadvantages of using natural key

  • A natural key is an attribute from the real world. If the structure of the natural key changes, it will require the change in database table as well. Changing the primary key in a database table is very difficult, especially after the database is designed and holds a lot of production data.
  • A natural key can not be good in terms of performance if it is a combination of two or more columns.
  • You can’t save the data unless you know the natural key value.

2. Surrogate Key

A surrogate key is a type of unique key in a database which is not derived from application data, unlike a natural key. A surrogate key does not represent an attribute of a real world entity. A surrogate key is also known as synthetic key, pseudo key, fact-less key or technical key.

The surrogate key is internally generated and are independent of the current row data. A surrogate key is usually not visible to the user of the application. Usually the database takes care of surrogate key generation and usually it is a numeric type incremented whenever there is a need of new key.

2.1 Example of surrogate key

Consider the index of the book. The index is usually a number which represents the unique sections of the book. However, the index is not related to the content of the book. Index here can be used as a surrogate key.

2.2 Advantages of using surrogate key

  • Surrogate key is easy to manage as there is only single column to manage. No extra columns are required to uniquely identify the row.
  • Surrogate key does not relate to the row of the data. So whenever the row data changes, there is no need to change the surrogate key.
  • Unlike the natural key, if the attribute to uniquely identify the row changes, surrogate key in unaffected.
  • You can define your own logic for surrogate key. Surrogate key is usually numeric or alphanumeric. Surrogate is easy to manage. Since surrogate key is system generated with predefined logic, it can be easily used in automation tasks.
  • Surrogate keys can be easily used to verify the data. If there is any deviation from the pattern of surrogate key, it can be easily identified.

2.3 Disadvantages of using surrogate key

  • A surrogate key does not relate to real world, so it is does not help in understanding the data. Suppose there is a table which uses numeric type as a surrogate key. You can’t tell just by looking at the number about the data. A surrogate key does not help in data modeling.
  • For a surrogate key, you have to create another column. This requires additional disk space.
  • Surrogate keys can result in duplicate values in any natural keys.
  • Surrogate keys can not be used in search.
  • Since surrogate key has no meaning, more join is required and hence more I/O.