Intro

I have collected Q&A topics since about 2010. I am adding them to this blog gradually, which explains why they are dated 2017 and 2018. Most are responses to questions from my students; some are my responses to posts on the LinkedIn forums. You are invited to comment on any post. To create a new topic post or ask me a question, please send an email to geverest@umn.edu, since people cannot post new topics on Google Blogspot unless they are listed as an author. Let me know if you would like me to add you as an author.

2020-09-13

Concept Modeling

Thomas Frisendal posted a nice piece on concept(ual) modeling: (2020 April 13)
www.dataversity.net/the-conceptual-model-strikes-back/#
In it he said that there were common ingredients to all conceptual modeling schemes:
  • Concepts
  • Properties
  • Relationships
  • The Triple (subject – predicate – object)
  • Occasional elements of semantic sugar such as cardinalities, class systems and various other abstractions.

I would like to suggest a few clarifications or modifications to this list as it might pertain to business "data" modeling, recognizing that we are modeling some aspect of a business/user domain (which may be represented in a stored database).

1. Note that "Concept" is a noun, and each named concept represents a population of instances of some class of things.  It is not just an abstract notion of some idea. Conceptual is an adjective.  The key here for modeling is that Concepts need to be clearly defined so we know which instances are included in the population and which are excluded.

2. The notion of Property (often called an "attribute" in data modeling) is derived.  Here is my definition:  An Attribute is a Concept (or Object) playing a Role in a Relationship with some (other) Concept.  As an example, we can have the Concept of a Date, define the population of dates, and perhaps the format of its lexical representation. Now with the Concept of Employee, we could have the Concept of Birthdate (or HireDate or...).  That would necessitate a Relationship between the two Concepts.  In that Relationship, Birthdate is a Role name for Date.  We could also have a predicate phrase to name the relationship, such as "is born on" (we could also have an inverse reading).  The Relationship between Concepts comes first, before we can speak of Properties.  What we call a Property can exist all by itself, without having a relationship with another Concept, precisely because it is, first, a Concept.  In fact-modeling circles, some people speak of an "attributeless" modeling scheme.
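The definition in point 2 can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class names (Concept, Role, Relationship), the helper attributes_of, the concept definitions, and the "was hired on" and inverse readings are hypothetical and are not taken from ORM or any modeling tool. The helper also ignores relationships in which the same Concept plays both roles.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Concept:
    name: str
    definition: str  # tells us which instances belong to the population


@dataclass(frozen=True)
class Role:
    name: str        # role name, e.g. "Birthdate"
    player: Concept  # the Concept that plays this role


@dataclass(frozen=True)
class Relationship:
    reading: str               # predicate phrase, e.g. "is born on"
    roles: Tuple[Role, ...]
    inverse_reading: str = ""


# Concepts exist on their own, before any relationship is declared.
date = Concept("Date", "A calendar day")
employee = Concept("Employee", "A person employed by the business")

# Birthdate and HireDate are not new Concepts; they are role names for Date.
born_on = Relationship(
    reading="is born on",
    roles=(Role("Employee", employee), Role("Birthdate", date)),
    inverse_reading="is the birthdate of",
)
hired_on = Relationship(
    reading="was hired on",
    roles=(Role("Employee", employee), Role("HireDate", date)),
)


def attributes_of(concept: Concept, relationships: List[Relationship]) -> List[str]:
    """Derive 'attributes': roles played by other Concepts in relationships with `concept`."""
    names = []
    for rel in relationships:
        if any(role.player == concept for role in rel.roles):
            names.extend(role.name for role in rel.roles if role.player != concept)
    return names


print(attributes_of(employee, [born_on, hired_on]))  # ['Birthdate', 'HireDate']
```

Nothing in the sketch stores an "attribute" anywhere. The list of attributes is computed from the relationships, which is the sense in which the notion is derived.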

3. Recognize that Relationships can be more than binary. Nijssen's first paper (1976) was on Binary Modeling.  It later became Fact modeling, recognizing that Facts could be unary, ternary, or more.

4. Saying that a common ingredient is the Triple restricts us to binary relationships.  In Halpin's ORM a predicate can be of any arity.  Higher order relationships are represented by objectifying a predicate.  This is equivalent to having a separate table in a relational data model to represent even a many-to-many binary relationship.  This stems from the first normal form restriction, which says that an "attribute" can be at most single-valued.  That means it is possible to directly represent at most a 1:Many relationship.   Since this restriction is already a step toward implementation in a relational database, I argue that it has no place in a business concept "data" model.  People don't have difficulty comprehending an M:N or even a ternary relationship, even if our relational data management systems do!
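To make the limitation of the Triple concrete, here is a small Python sketch. The enrollment fact, the Fact type, the to_triples helper, and the identifier "enrollment-1" are all hypothetical examples of mine, and the workaround shown (giving the fact itself an identity and hanging binary links off it) is one common way to squeeze an n-ary fact into triples; it is not ORM notation.

```python
from typing import List, NamedTuple, Tuple


class Fact(NamedTuple):
    reading: str
    objects: Tuple[str, ...]  # one entry per role; any arity is allowed


# A ternary fact stated directly, the way a person would say it.
enrollment = Fact("Student enrolled in Course earning Grade", ("Ann", "DB101", "A"))


def to_triples(fact: Fact, fact_id: str, role_names: List[str]) -> List[Tuple[str, str, str]]:
    """Re-express an n-ary fact as binary triples by treating the fact as a thing (fact_id)."""
    return [(fact_id, role, obj) for role, obj in zip(role_names, fact.objects)]


triples = to_triples(enrollment, "enrollment-1", ["student", "course", "grade"])
for triple in triples:
    print(triple)
# ('enrollment-1', 'student', 'Ann')
# ('enrollment-1', 'course', 'DB101')
# ('enrollment-1', 'grade', 'A')
```

The single ternary fact needs three triples plus an invented identifier once we are limited to subject, predicate, object. That extra machinery serves the representation, not the business user.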

5. I call all of the occasional elements constraints, or more generally business rules.  This includes mandatory/optional and exclusive/multiple (what you call cardinality), and so much more, particularly conditional constraints (which are not possible even in the more advanced DBMSs).  In fact, our overarching goal in concept modeling is to capture and formally express as rich a set of semantics as possible about the user domain.  Lacking expressible semantics in our models, we are doomed to data quality issues.  One of the best examples of rich data semantics capture is in Halpin's ORM flavor of fact modeling.
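As a rough illustration of the difference between a mandatory constraint and a conditional one, consider the Python sketch below. Everything in it is assumed for the example: the sample employees, the two rules, and the helper names mandatory, conditional, and violations. The records are flattened into dictionaries only for brevity.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical population, flattened into records for brevity.
employees: List[Record] = [
    {"name": "Ann", "birthdate": "1980-02-01", "status": "terminated", "termination_date": "2019-06-30"},
    {"name": "Bob", "birthdate": None, "status": "active", "termination_date": None},
    {"name": "Cy", "birthdate": "1975-11-20", "status": "terminated", "termination_date": None},
]


def mandatory(role: str) -> Rule:
    """The role must be filled for every instance."""
    return lambda record: record.get(role) is not None


def conditional(condition: Rule, requirement: Rule) -> Rule:
    """The requirement applies only to instances that satisfy the condition."""
    return lambda record: (not condition(record)) or requirement(record)


rules: Dict[str, Rule] = {
    "every Employee has a birthdate": mandatory("birthdate"),
    "a terminated Employee has a termination date": conditional(
        lambda record: record["status"] == "terminated",
        mandatory("termination_date"),
    ),
}


def violations(population: List[Record], rules: Dict[str, Rule]) -> List[Tuple[object, str]]:
    """List every (instance, rule) pair where the rule does not hold."""
    return [
        (record["name"], label)
        for record in population
        for label, rule in rules.items()
        if not rule(record)
    ]


for name, label in violations(employees, rules):
    print(f"{name}: violates '{label}'")
# Bob: violates 'every Employee has a birthdate'
# Cy: violates 'a terminated Employee has a termination date'
```

A rule that is stated in the model can be checked like this. A rule that was never captured cannot be checked at all, and that gap is where data quality problems begin.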

Comments on any post are always welcome. I thrive on challenges, and it will be more interesting for you.