Semantic Data Modeling with Xplain

By Mark Hissink Muller

July 3, 2024

In relational database design, data modeling is typically done using Entity-Relationship (ER) diagrams. This approach may lead to suboptimal models that are difficult to understand, due to a lack of semantic clarity and management issues for larger diagrams. This article explores a more elegant solution: Semantic Data Modeling with Xplain.

The Limitations of ER Diagrams

Data modeling for relational databases is typically done with ER diagrams, which represent the data structure that is then implemented in a database system using the structured query language (SQL) or one of its product-specific dialects.

In the ER approach, relationships between entities are not always named explicitly, such as those between Book and Warehouse or between Book and Shoppingbasket in the following example.

Figure 1: Example ER diagram, courtesy Visual Paradigm

Such practices may result in suboptimal data models due to the lack of semantic clarity. Moreover, ER diagrams tend to become complex quickly, making it difficult to maintain an overview, especially when the number of tables exceeds 15, 25, or even 50. This complexity hinders communication within the team and with less-technical stakeholders.

Semantic Data Modeling

A more elegant way to model databases, and to think about data structures in general, is Semantic Data Modeling, which was developed by Johan ter Bekke (1946-2004). During my time at Delft University of Technology, I attended Ter Bekke’s class on Database Design (NL: Database ontwerp), where I learned the advantages of this approach.

The following diagram from Ter Bekke’s tribute site shows a data model of a bank. The concepts should be read from bottom to top.

Figure 2: Example semantic data model

The first concept we encounter is a bank office, which can be one of the branches or the head office. Concepts that are placed lower (e.g. office) are unaware of the concepts above them that are related to them through an n-1 relationship (e.g. holder). In this example, a holder knows the office that he or she is a customer of, and an office may have zero or more account holders. Each holder can have one or more accounts, which can be of different types (mortgage, saving, business and current). And so forth.
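To make the direction of these references concrete, here is a minimal Python sketch of the three concepts described above. It is an illustration only, not Xplain notation: the class names, the attributes (name, number, kind), the sample data, and the helper holders_of are my own assumptions rather than part of Ter Bekke’s diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Office:
    name: str  # an office holds no reference to its holders


@dataclass(frozen=True)
class Holder:
    name: str
    office: Office  # n-1: each holder refers to exactly one office


@dataclass(frozen=True)
class Account:
    number: str
    kind: str  # e.g. "mortgage", "saving", "business" or "current"
    holder: Holder  # n-1: each account refers to exactly one holder


def holders_of(office: Office, holders: list[Holder]) -> list[Holder]:
    """Answer the reverse question by searching the holders, not the office."""
    return [h for h in holders if h.office == office]


# Hypothetical sample data.
head_office = Office("Head office")
branch = Office("Branch A")
alice = Holder("Alice", head_office)
bob = Holder("Bob", head_office)
savings = Account("NL-001", "saving", alice)

# Following references from a higher concept down to a lower one is direct.
print(savings.holder.office.name)
# The opposite direction requires a query over the referring concept.
print([h.name for h in holders_of(head_office, [alice, bob])])
print(holders_of(branch, [alice, bob]))
```

Note that Office has no list of holders: the "zero or more account holders" of an office follow from the holders that refer to it, which is why a branch without customers needs no special treatment.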

Xplain modeling language

Xplain is the language used to describe and query Johan ter Bekke’s Semantic Data Models. On his website about Xplain, Berend de Boer writes the following:

Xplain is a beautifully orthogonal database manipulation and query language. Orthogonal means that with Xplain there is usually only one solution, not dozens like in SQL. Xplain straight-forwardly supports aggregation and generalization. Or in other words, it supports has-a and is-a relations, the only possible relations. Because it supports is-a relations, it is a very natural component in an object-oriented environment.

Xplain also has a graphical side. Data models created in Xplain are far more readable than data models drawn with ER based tools. I use Xplain data models a lot in my consultancy businesses. Clients usually don’t even have an ER model, and even if they have, are you able to understand it in a short time? ER models have to be studied very, very hard to understand them. I always draw the data model in Xplain (or a more of less equivalent if the database doesn’t make sense) and are able to understand it very quickly. Or even better, to propose a better design.

These statements echo my experiences in both consulting assignments and personal development projects. Compared to ER diagrams, Xplain models have substantially higher expressive power, which results in diagrams that are more concise and easier to read. Because Xplain diagrams have a direction, with lower concepts unaware of the concepts above them, they help teams communicate and work towards clear and elegant models.

Translating Xplain models to SQL

An added benefit of the Xplain language is that its models can be translated directly to multiple SQL dialects via Berend de Boer’s Xplain2sql.
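To give a feel for what such a translation involves, the sketch below turns a small dictionary-based description of the bank concepts into CREATE TABLE statements and loads them into an in-memory SQLite database. This is my own simplified illustration: it does not use Xplain syntax and does not reproduce the actual output of Xplain2sql. The MODEL structure, the to_ddl function, and the column naming scheme are all hypothetical.

```python
import sqlite3

# Hypothetical model description: concept -> {attribute: SQL type or referenced concept}.
MODEL = {
    "office": {"name": "TEXT"},
    "holder": {"name": "TEXT", "office": "office"},
    "account": {"kind": "TEXT", "holder": "holder"},
}


def to_ddl(model):
    """Generate one CREATE TABLE statement per concept (illustrative mapping only)."""
    statements = []
    for concept, attributes in model.items():
        columns = [f"{concept}_id INTEGER PRIMARY KEY"]
        constraints = []
        for attribute, target in attributes.items():
            if target in model:
                # A reference to another concept becomes a mandatory foreign key.
                columns.append(f"{attribute}_id INTEGER NOT NULL")
                constraints.append(
                    f"FOREIGN KEY ({attribute}_id) REFERENCES {target} ({target}_id)"
                )
            else:
                # Anything else is treated as a plain column of the given SQL type.
                columns.append(f"{attribute} {target} NOT NULL")
        body = ",\n  ".join(columns + constraints)
        statements.append(f"CREATE TABLE {concept} (\n  {body}\n);")
    return statements


connection = sqlite3.connect(":memory:")
connection.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
for statement in to_ddl(MODEL):
    connection.execute(statement)

connection.execute("INSERT INTO office (name) VALUES ('Head office')")
connection.execute("INSERT INTO holder (name, office_id) VALUES ('Alice', 1)")
print("\n\n".join(to_ddl(MODEL)))
```

The point of the sketch is that every reference in the model maps mechanically to a foreign key on the referring table, which is what makes an automated translation feasible in the first place.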

Xplain model pharmacy (Dutch)

The following diagram depicts the data model of a pharmacy (NL: apotheek) that we made as an exercise for the Database Design course.

Figure 3: Semantic data model of a pharmacy

I would be happy to discuss the benefits of the semantic data modeling approach for your organisation.