Landing : Athabascau University

COMP 602 - Week 3 – Reflection

The week is an extension of week 2’s topic of data modeling. By the end of week 2 I have come up with an ERD for the proposed Apartment Rental system I intended to model.

Going from 2nf to 3nf is straight forward. However, from what I can see the differences between 2nf and 3nf is very subtle as much as they both share the same resolution (create a separate entity to hold the attribute that belongs together). Researching on the topic I have come across a phase that helps to remember the differences in the 3 normalization forms.

... must provide a fact about the key [1nf], the whole key [2nf], and nothing but the key [3nf]…

The phase helps to explain the varying degree of attributes reliance on the key. So through 3 levels of normalization, the resulting entities should only contain attributes that is part and parcel to the key of an entity. As a result relationships can be established between each entities and an ERD is formed.

I also came across this discussion on the differences between the forms which I found the explanation was very clearly articulated. http://stackoverflow.com/questions/723998/can-someone-please-give-an-example-of-1nf-2nf-and-3nf-in-plain-english

A database built on 1nf is unusable as it results in too many repeating data that is filled with data modification anomalies. A create, update, or delete operation on a specific piece of data would touch many records that are not specific to only the entity to be created, updated or deleted, hence, created insertion, update and deletion anomaly. 2nf is not free from data modification anomalies but the occurrences are alleviated significantly due to the distinct entities separated out in the process. In 3nf should be free from such anomalies all together.

A mistake that I made was to do with surrogate key, I had the tendency to introduce surrogate key for every single entity identified as something I have been doing all along in my career in database design. It’s not mandatory but it is beneficial from a technical perspective. However, as professor Dron pointed out, surrogate key should be avoided during normalization as we are creating new attribute out of thin air and therefore would overlook a natural key for an entity that naturally serves as the candidate key position. I admit that I came up with the ResidentID (int) during normalization and finally added SIN and Driver License Number after the fact. I was lured away from existing attributes associated with a resident and created one myself.

Not saying I would not be adding surrogate key in my entities from this point onward, I will think about the reasons for adding them, and only do so for the right (technical) reasons and not as convenience or habit.

Dickson

Comments

  • Jon Dron May 22, 2012 - 10:33am

    Yes - surrogate keys are a great idea once you have a clear and accurate model, but they really get in the way of finding that model!

    I like the simple sentence - I've often used the sentence myself when explaining what it is about but have not come across that explicit mapping to normal forms before. Nice! Do share that bookmark via the group bookmarks if you have not done so already - it looks useful.