Semantic Data Modeling with Xplain
By Mark Hissink Muller
In relational database design, data modeling is typically done using Entity-Relationship (ER) diagrams. This approach may lead to suboptimal models that are difficult to understand, due to a lack of semantic clarity and management issues for larger diagrams. This article explores a more elegant solution: Semantic Data Modeling with Xplain.
The Limitations of ER Diagrams
Data modeling for relational databases is typically done with ER diagrams, which represent the data structure that is then implemented in a database system using the structured query language (SQL) or one of its product-specific dialects.
In the ER approach, relationships between entities are not always named explicitly, such as between Book and Warehouse or between Book and Shoppingbasket in the following example.
Such practices may result in suboptimal data models due to the lack of semantic clarity. Moreover, ER diagrams tend to become complex quickly, making it difficult to maintain an overview, especially when the number of tables exceeds 15, 25, or even 50. This complexity hinders communication within the team and with less-technical stakeholders.
Semantic Data Modeling
A more elegant way to model databases, and to think about data structures in general, is Semantic Data Modeling, which was developed by Johan ter Bekke (1946-2004). During my time at Delft University of Technology, I attended Ter Bekke’s class on Database Design (NL: Database ontwerp), where I learned the advantages of this approach.
The following diagram from Ter Bekke’s tribute site shows a data model of a bank. The concepts should be read from bottom to top.
The first concept we encounter is a bank office, which can be one of the branches or the head office. Concepts that are placed lower (e.g. office) are unaware of the concepts above them that are related to them through an n-1 (many-to-one) relationship (e.g. holder). In this example, a holder has (knows) the office that he or she is a customer of, and an office may have zero or more account holders. Each holder can have one or more accounts, which can be of different types (mortgage, saving, business and current). And so forth.
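The bottom-to-top reading of the bank model can be illustrated with a small Python sketch. This is not Xplain syntax, only a rough analogy: each class refers only to the concepts below it, so an office knows nothing about its holders, and the "zero or more holders" of an office are derived by looking upward. The attribute names (name, number, is_head_office) and the helper holders_of are assumptions made for illustration, and the account types are simplified to an enumeration.

```python
from dataclasses import dataclass
from enum import Enum


class AccountType(Enum):
    # The four account types named in the bank example.
    MORTGAGE = "mortgage"
    SAVING = "saving"
    BUSINESS = "business"
    CURRENT = "current"


@dataclass(frozen=True)
class Office:
    # Lowest concept: holds no reference to holders or accounts.
    name: str
    is_head_office: bool = False  # assumption: otherwise a branch


@dataclass(frozen=True)
class Holder:
    name: str
    office: Office  # many-to-one: each holder knows exactly one office


@dataclass(frozen=True)
class Account:
    number: str
    holder: Holder  # many-to-one: each account knows exactly one holder
    type: AccountType


def holders_of(office, holders):
    """Derive the holders of an office from the holders' own references."""
    return [h for h in holders if h.office == office]


head = Office("Head office", is_head_office=True)
branch = Office("Delft branch")
alice = Holder("Alice", branch)
bob = Holder("Bob", head)
savings = Account("NL01", alice, AccountType.SAVING)
```

Because the references only point downward, adding a new concept on top of the model (for example a transaction on an account) does not require changing any of the existing classes.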
Xplain modeling language
Xplain is the language used to describe and query Johan ter Bekke’s Semantic Data Models. On his website about Xplain, Berend de Boer writes the following about Xplain:
Xplain is a beautifully orthogonal database manipulation and query language. Orthogonal means that with Xplain there is usually only one solution, not dozens like in SQL. Xplain straight-forwardly supports aggregation and generalization. Or in other words, it supports has-a and is-a relations, the only possible relations. Because it supports is-a relations, it is a very natural component in an object-oriented environment.
Xplain also has a graphical side. Data models created in Xplain are far more readable than data models drawn with ER based tools. I use Xplain data models a lot in my consultancy businesses. Clients usually don’t even have an ER model, and even if they have, are you able to understand it in a short time? ER models have to be studied very, very hard to understand them. I always draw the data model in Xplain (or a more of less equivalent if the database doesn’t make sense) and are able to understand it very quickly. Or even better, to propose a better design.
These statements echo my experiences, both in consulting assignments and in personal development projects. Compared to ER diagrams, Xplain models have a substantially higher expressive power, which results in diagrams that are more concise and easier to read. The fact that Xplain diagrams use direction helps teams communicate and work towards clear and elegant models.
Translating Xplain models to SQL
An added benefit of the Xplain language is that its models can be translated directly to multiple SQL dialects via Berend de Boer’s Xplain2sql.
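To give an impression of what such a translation amounts to, the sketch below writes out by hand a possible SQL schema for part of the bank example and loads it into an in-memory SQLite database. It is not the output of Xplain2sql, whose generated SQL will differ. The table names, column names and the CHECK constraint are assumptions chosen for illustration. The point is only that each many-to-one reference in the semantic model becomes a foreign key in the referring table.

```python
import sqlite3

# Hand-written DDL, assumed for illustration (not generated by Xplain2sql).
DDL = """
CREATE TABLE office (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE holder (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    office_id INTEGER NOT NULL REFERENCES office(id)
);
CREATE TABLE account (
    id        INTEGER PRIMARY KEY,
    holder_id INTEGER NOT NULL REFERENCES holder(id),
    type      TEXT NOT NULL
              CHECK (type IN ('mortgage', 'saving', 'business', 'current'))
);
"""


def build_bank_schema():
    """Create the sketch schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


conn = build_bank_schema()
conn.execute("INSERT INTO office (id, name) VALUES (1, 'Head office')")
conn.execute("INSERT INTO holder (id, name, office_id) VALUES (1, 'Alice', 1)")
```

With foreign keys enabled, inserting a holder that refers to a non-existent office is rejected by the database, which mirrors the rule that a holder always knows its office.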
Xplain model pharmacy (Dutch)
The following diagram depicts the data model of a pharmacy (NL: apotheek) that we made as an exercise for the course Database Design.
I would be happy to discuss the benefits of the semantic data modeling approach for your organisation.