ok, another database design task and I found the reference article I had talked about

A few posts ago I mentioned why Primary Keys in a table shouldn’t have business meaning. I didn’t quote the article I usually re-read for reference. I ran into this article while looking for another subject in www.agiledata.org today.
The section of this article I like the most is:

There are two strategies for assigning keys to tables. The first is to simply use a natural key, one or more existing data attributes that are unique to the business concept. For the Customer table there was two candidate keys, in this case CustomerNumber and SocialSecurityNumber. The second strategy is to introduce a new column to be used as a key. This new column is called a surrogate key, a key that has no business meaning, an example of which is the AddressID column of the Address table… Addresses don’t have an “easy” natural key because you would need to use all of the columns of the Address table to form a key for itself, therefore introducing a surrogate key is a much better option in this case.

The primary advantage of natural keys is that they exist already, you don’t need to introduce a new “unnatural” value to your data schema. However, the primary disadvantage of natural keys is that because they have business meaning it is possible that they may need to change if your business requirement change. For example, if your users decide to make CustomerNumber alphanumeric instead of numeric then in addition to updating the schema for the Customer table (which is unavoidable) you would have to change every single table where CustomerNumber is used as a foreign key.

If the Customer table instead used a surrogate key then the change would have been localized to just the Customer table itself (CustomerNumber in this case would just be a non-key column of the table). Naturally, if you needed to make a similar change to your surrogate key strategy, perhaps adding a couple of extra digits to your key values because you’ve run out of values, then you would have the exact same problem.

The fundamental problem is that keys are a significant source of coupling within a relational schema, and as a result they are difficult to change. The implication is that you want to avoid keys with business meaning because business meaning changes.

And the article can be found at Data Modeling 101

Enjoy!

Leave a Reply

Your email address will not be published. Required fields are marked *


This site uses Akismet to reduce spam. Learn how your comment data is processed.