Intro

I have collected Q&A topics since about 2010. These are being put onto this blog gradually, which explains why they are dated 2017 and 2018. Most are responses to questions from my students; some are my responses to posts on the LinkedIn forums. You are invited to comment on any post. To create a new topic post or ask me a question, please send an email to geverest@umn.edu, since people cannot post new topics on Google Blogspot unless they are listed as an author. Let me know if you would like me to do that.
Showing posts with label attribute. Show all posts

2020-09-13

Concept Modeling

Thomas Frisendal posted a nice piece on concept(ual) modeling: (2020 April 13)
www.dataversity.net/the-conceptual-model-strikes-back/#
In it he said that there were common ingredients to all conceptual modeling schemes:
  • Concepts
  • Properties
  • Relationships
  • The Triple (subject – predicate – object)
  • Occasional elements of semantic sugar such as cardinalities, class systems and various other abstractions.

I would like to suggest a few clarifications or modifications to this list as it might pertain to business "data" modeling, recognizing that we are modeling some aspect of a business/user domain (which may be represented in a stored database).

1. Note that "Concept" is a noun and each named concept represents a population of instances of some class of things.  It is not just an abstract notion of some idea. Conceptual is an adjective.  The key here for modeling is that Concepts need to be clearly defined so we know which instances are included in, and which are excluded from, the population.

2. The notion of Property (often called an "attribute" in data modeling) is derived.  Here is my definition:  An Attribute is a Concept (or Object) playing a Role in a Relationship with some (other) Concept.  As an example, we can have the Concept of a Date, define the population of dates, and perhaps the format of its lexical representation. Now with the Concept of Employee, we could have the Concept of Birthdate (or HireDate or...).  That would necessitate a Relationship between the two Concepts.  In that Relationship, Birthdate is a Role name for Date.  We could also have a predicate phrase to name the relationship, such as "is born on" (we could also have an inverse reading).  The Relationship between Concepts comes first, before we can speak of Properties.  A Property can exist all by itself, without having a relationship with another Concept, precisely because it is, first, a Concept.  In fact modeling circles, some people speak of an "attributeless" modeling scheme.

3. Recognize that Relationships can be more than binary. Nijssen's first paper (1976) was on Binary Modeling.  It later became Fact modeling, recognizing that Facts could be unary, ternary, or more.

4. Saying that a common ingredient is the Triple restricts us to binary relationships.  In Halpin's ORM a predicate can be of any arity.  Higher-order relationships are represented by objectifying a predicate.  This is equivalent to having a separate table in a relational data model to represent even a many-to-many binary relationship.  This stems from the first normal form restriction, which says that an "attribute" can be at most single valued.  That means it is possible to directly represent at most a 1:Many relationship.  Since this restriction is already a step toward implementation in a relational database, I argue that it has no place in a business concept "data" model.  People don't have difficulty comprehending an M:N or even a ternary relationship, even if our relational data management systems do!

5. I call all of the occasional elements constraints, or more generally business rules.  This includes mandatory/optional and exclusive/multiple (what you call cardinality), and so much more, particularly conditional constraints (which are not possible even in the more advanced DBMSs).  In fact, our overarching goal in concept modeling is to capture and formally express as rich a set of semantics as possible about the user domain.  Lacking expressible semantics in our models, we are doomed to data quality issues.  One of the best examples of rich data semantics capture is in Halpin's ORM flavor of fact modeling.
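
As a rough illustration of points 2, 4, and 5, here is a minimal Python sketch. It is not ORM notation or any tool's API. The names Fact, FACTS, attributes_of, and uniqueness_violations, and the sample employees and dates, are all made up for the example. It shows facts of any arity, "attributes" derived from the roles that other objects play in facts about an employee, and one simple constraint (at most one Birthdate per Employee) checked against the facts.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One fact instance of any arity: a predicate reading plus role players."""
    predicate: str
    players: tuple  # ((role_name, value), ...), one entry per role


# Hypothetical sample data, invented for this sketch.
FACTS = [
    # Binary facts: a Date plays the role Birthdate or HireDate.
    Fact("Employee is born on Date",
         (("Employee", "Ann"), ("Birthdate", "1980-02-03"))),
    Fact("Employee was hired on Date",
         (("Employee", "Ann"), ("HireDate", "2005-06-01"))),
    Fact("Employee is born on Date",
         (("Employee", "Bob"), ("Birthdate", "1991-07-09"))),
    Fact("Employee is born on Date",
         (("Employee", "Bob"), ("Birthdate", "1991-07-10"))),
    # A ternary fact: it does not fit a single subject-predicate-object triple.
    Fact("Employee has Skill at Level",
         (("Employee", "Ann"), ("Skill", "SQL"), ("Level", "Expert"))),
]


def attributes_of(employee):
    """Derive 'attributes' from the roles other objects play in facts about employee."""
    found = []
    for fact in FACTS:
        if ("Employee", employee) in fact.players:
            found.extend(p for p in fact.players if p[0] != "Employee")
    return found


def uniqueness_violations(role_name):
    """Constraint check: each Employee has at most one value in role_name."""
    seen = {}
    for fact in FACTS:
        players = dict(fact.players)
        if role_name in players and "Employee" in players:
            seen.setdefault(players["Employee"], set()).add(players[role_name])
    return sorted(emp for emp, values in seen.items() if len(values) > 1)


print(attributes_of("Ann"))
print(uniqueness_violations("Birthdate"))
```

Nothing in the sketch declares an attribute. Birthdate, HireDate, Skill, and Level appear only as role names inside facts, and the constraint is stated over the facts rather than over a table column.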

2020-04-01

Thinking about Attributes or Properties

Kevin Feeney (LinkedIn, Data Modeling, 2020/03/24), in his presentation on data modeling (https://lnkd.in/dnwYTEY), says that an accurate data model defines Things, Properties of things, how things are Identified, and Relationships.

Everest says:
A caution when thinking about Properties.  You cannot define an Attribute until you first have (or presume) a Relationship.  An Attribute is a thing with a population (a domain of values).  So ORM does not distinguish between the two; it calls them both Objects.  For example, I could have a thing called a Skill Code, and Employees have Skills.  That means there is a relationship between Employee and Skill.  We often depict an Attribute as being tucked away in a box for the Employee.  This naturally leads to (thinking about) putting it in a column in a table for the Employee entity.  That can lead to problems.  In ORM we defer thinking about tables since that is really a step toward implementation (in a Relational DBMS).  Better to think in terms of two objects, Employee and Skill, with a relationship between them.  So here is the definition:  An ATTRIBUTE (or Property) is an OBJECT which plays a ROLE in a RELATIONSHIP with another OBJECT.  Now we can add cardinality to the relationship.  In fact, in this example, if an Employee can possess multiple Skills, there is an M:N relationship and Skill cannot be stored in an Employee table (it would violate First Normal Form).  But Skill is no less an attribute of Employee, even if it is not stored in the Employee table.  That further reinforces the fact that an OBJECT has ATTRIBUTES by virtue of having RELATIONSHIPS with other OBJECTS.  Hence, there is no need for an Attribute artifact in a data model.
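
The Employee and Skill example can be sketched in a few lines of Python. This is only an illustration under assumed names: HAS_SKILL, skills_of, employees_with, and the sample employees and skills are hypothetical. The M:N relationship is held on its own as a set of pairs rather than as a column in an Employee record, and Skill still reads as an attribute of Employee.

```python
# The relationship "Employee has Skill", kept apart from both objects.
# Sample pairs are invented for the sketch.
HAS_SKILL = {
    ("Ann", "SQL"),
    ("Ann", "Python"),
    ("Bob", "SQL"),
}


def skills_of(employee):
    """Skill as an 'attribute' of Employee: the Skills related to that employee."""
    return sorted(skill for emp, skill in HAS_SKILL if emp == employee)


def employees_with(skill):
    """The same relationship read in the other direction."""
    return sorted(emp for emp, s in HAS_SKILL if s == skill)


print(skills_of("Ann"))       # many Skills for one Employee
print(employees_with("SQL"))  # many Employees for one Skill
```

Because both directions are many-valued, neither answer fits in a single-valued column of the other object's table. Yet each one is derived directly from the relationship.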

2018-01-02

Definitions of concepts should start with their supertype. Why not?

Andries van Renssen posts on LinkedIn:
A good definition of a concept should build on the definition of more generalized concepts. This can be done by two parts of a definition:
1. Specifying that the concept is a subtype of some other more generalized concept.
2. Specifying in what respect the concept deviates from other subtypes of the same supertype concept.

Such definitions have great advantages, such as:
‑ The definition of (and the knowledge about) the supertype concept does not need to be repeated, as it is also applicable ("by inheritance") for the defined (subtype) concept.
‑ The definition implies a taxonomy structure that helps to find related concepts that are also subtypes of the same supertype (the 'sister‑concepts'). Comparison with the 'sister‑concepts' greatly helps to refine the definitions. Furthermore, the definitions prepare for creating an explicit taxonomy.
‑ Finding the criteria that distinguish the concept from its 'sister‑concepts' prepares for explicit modeling of the definitions, which prepares for computerized interpretation and the growth towards a common Formal Language.
--Thus, why not?
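
To make the two-part definition concrete, here is a small Python sketch, offered as an assumption-laden illustration and not as van Renssen's formalism. The Definition class, the functions full_definition and sister_concepts, and the Vehicle/Car/Bicycle concepts are all hypothetical. Each definition names a supertype and states only how the concept differs from it. The full definition is then assembled by inheritance, and the sister-concepts fall out of the taxonomy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Definition:
    name: str
    supertype: Optional[str]  # None for the root concept
    differentia: str          # what sets it apart from its sister-concepts


# Hypothetical concepts, invented for the sketch.
DEFINITIONS = {d.name: d for d in [
    Definition("Vehicle", None, "a thing used to transport people or goods"),
    Definition("Car", "Vehicle", "has four wheels and an enclosed cabin"),
    Definition("Bicycle", "Vehicle", "has two wheels and is pedal-driven"),
]}


def full_definition(name):
    """Collect the differentiae from the root supertype down to name."""
    chain = []
    current = name
    while current is not None:
        definition = DEFINITIONS[current]
        chain.append(definition.differentia)
        current = definition.supertype
    return list(reversed(chain))


def sister_concepts(name):
    """Other subtypes of the same supertype."""
    parent = DEFINITIONS[name].supertype
    if parent is None:
        return []
    return sorted(d.name for d in DEFINITIONS.values()
                  if d.supertype == parent and d.name != name)


print(full_definition("Car"))
print(sister_concepts("Car"))
```

The definition of Car never repeats what a Vehicle is. That part is inherited, and comparing Car with Bicycle is what forces the distinguishing criteria to be stated.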