Intro

I have been collecting Q&A topics since about 2010. I am adding them to this blog gradually, which explains why they are dated 2017 and 2018. Most are responses to questions from my students; some are my responses to posts on the LinkedIn forums. You are invited to comment on any post. People cannot post new topics on Google Blogspot unless they are listed as an author, so to create a new topic post or ask me a question, please send an email to geverest@umn.edu. Let me know if you would like me to list you as an author.

2017-12-20

Why number the rules of normalization? Is it helpful?

Chris Date/Ted Codd originally numbered the rules of normalization, so we have 1NF, 2NF, etc. They were defined in a cascading fashion. For example, for a data structure to be in 3NF, it must also satisfy the 2NF rule, and to be in 2NF it must in turn satisfy the 1NF rule. This definition has persisted in virtually all introductory textbooks on data management and data modeling, evidently without the authors applying any critical thinking to the question. They simply accept what has been written before.


So is it possible for a relation (entity record or table) to have no transitive dependencies (thus satisfying the condition for 3NF) but still have a partial dependency (thus violating 2NF)? Would that situation mean that the record/relation did not satisfy 3NF? Is it helpful to say that? Even more, nested relations and object models (as in UML, for example) explicitly violate 1NF by allowing nested repeating groups of data items or attributes in a record/relation. Does that automatically mean that such structures do not satisfy 2NF or 3NF, even if they have no partial dependencies or transitive dependencies?

Instead, each rule of normalization should be expressed, or named, by the condition that causes a violation:
1NF = NO multivalued dependency (or multivalued attribute, thus requiring a many-to-many relationship).
2NF = NO partial dependency, i.e., no data item (attribute) dependent on only part of a composite key.
3NF = NO transitive dependency, i.e., no data item dependent on the key only through an intermediate data item.
...
Then the rules of normalization are independent and can be applied in any order.
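
To make the independence of the rules concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: each relation has a single key, the functional dependencies are listed by hand, and a transitive dependency is taken to be one whose determinant consists entirely of non-key data items. The names Relation, ORDER_LINE and EMPLOYEE, and the three check functions, are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """A relation described by its key and hand-listed dependencies."""
    name: str
    key: frozenset                      # attributes forming the (single) key
    fds: list                           # (determinant frozenset, dependent attribute)
    multivalued: set = field(default_factory=set)  # multivalued attributes


def has_multivalued(rel):
    """Violation named '1NF': some attribute holds repeating values."""
    return bool(rel.multivalued)


def has_partial(rel):
    """Violation named '2NF': a non-key attribute depends on a proper,
    non-empty part of a composite key."""
    return any(lhs and lhs < rel.key and rhs not in rel.key
               for lhs, rhs in rel.fds)


def has_transitive(rel):
    """Violation named '3NF' (simplified): a non-key attribute depends
    on a determinant made up only of non-key attributes."""
    return any(lhs and lhs.isdisjoint(rel.key) and rhs not in rel.key
               for lhs, rhs in rel.fds)


def report(rel):
    """Run each check on its own; no check presupposes another."""
    return {
        "multivalued": has_multivalued(rel),
        "partial": has_partial(rel),
        "transitive": has_transitive(rel),
    }


# Partial dependency (product_name depends on product_id alone),
# but no transitive dependency and no multivalued attribute.
order_line = Relation(
    name="ORDER_LINE",
    key=frozenset({"order_id", "product_id"}),
    fds=[
        (frozenset({"order_id", "product_id"}), "quantity"),
        (frozenset({"product_id"}), "product_name"),
    ],
)

# Nested repeating group (skills), but no partial or transitive dependency.
employee = Relation(
    name="EMPLOYEE",
    key=frozenset({"emp_id"}),
    fds=[(frozenset({"emp_id"}), "name")],
    multivalued={"skills"},
)

print(report(order_line))  # {'multivalued': False, 'partial': True, 'transitive': False}
print(report(employee))    # {'multivalued': True, 'partial': False, 'transitive': False}
```

Under the cascading definition, ORDER_LINE would be said to fail 3NF and EMPLOYEE to fail both 2NF and 3NF, even though neither contains the condition those names are meant to describe. Reporting each condition separately says exactly what is wrong with each structure.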

Comments on any post are always welcome. I thrive on challenges, and the discussion will be more interesting for you.