UML Modeler - Software Analysis and Design Blog

Everything about software analysis and design - done right or wrong

WHAT EVERY MANAGER SHOULD KNOW ABOUT SOFTWARE DATA MODELING - 1

Nothing stands alone. Or, as they say, "everything is connected".Customers and products, services an
Nothing stands alone. "Everything is connected". 

When it comes to building software for your business, it is surely the case: today's business software is built on models, and when it comes to models - data models are rather important.

And when it comes to data almost all software is built on a "relational database" model in which things that describe your business are connected (did we mention "relational"?). 
It is not that technical as it sounds - and what is behind it is essential to delivering a better software for your business.

So firstly, let's talk about something with a strange name: entity-relationship models. Note: it is not about having an affair, though. :)

An entity-relationship model describes exactly that: "relationships" - what is connected to what, and how.  Again, unfortunately, we will talk about pieces of data and their relationships (and not about men and women).

SO WHY IS IT THAT IMPORTANT FOR YOU?

Using the Entity-Relationship model 
  • can give you insight into your business, 
  • help to define the right requirements for your shiny new software, and 
  • even help you to improve your processes with providing a common basis to think both for business and tech people
With an ER model you will see your business from another angle - together with the tech guys.

AS YOU LIKE IT: TOP-DOWN


An ER-model can be very detailed when you need it, but it can be extremely high level at the start.

An Entity-Relationship model has three levels, defined as:
  • Conceptual model
  • Logical model
  • Physical model
Each of these levels can give you different information while at the same time move you along the road from idea to "real stuff" (=working software).

We will cover all three levels and how / where they can help you in your everyday business.

CONCEPTUAL MODEL - FLYING HIGH


The Conceptual model is a simplistic model. All it describes is the set of Entities and the relationship between these Entities.

  • Customer owns zero or more Credit Card(s)
  • Credit Card belongs to one Customer
  • Order includes one or more Product
  • Product is part of zero or more Order(s)
  • Invoice covers* one or more Order
  • Order is covered by zero or more Invoices
Or with a graphical representation:

In this chart: the Customer owns (multiple, not only one, hence the "chicken leg") Credit card(s). But may own zero Credit Cards, as well (hence the small circle next to the chicken leg).
One Invoice can cover multiple Orders - and the other way around. Just like with Orders and Products: an Order can include multiple Products and one Product can belong to multiple Orders.

A side-note: As you can see, you definitely want to use a diagram. It is simpler to understand than the textual descriptions.

And why is the Conceptual model, with such a limited detail, valuable?
It gives you the chance to discover exactly how your business works (or in technical lingo: "what entities and relationships your business works with") without having to worry about anything else beyond that, at this stage.

You can focus simply on the "what and who are involved, and how are they related?" question. And why is it important? It ensures / provides:
  • A common language
  • That nothing has been left out
  • A common model - it ensures that you have the same base understanding with your "software guys"
  • It helps you to clarify what is "in" and what is "out" of the software to be delivered.
Since the software (and the underlying database) will need to support the business, thus having a good Conceptual model will give you a solid foundation when you go deeper into details, and start to work on your Logical model.

It also gives you the chance to go really wide and explore seemingly tenuous connections without wasting too much time.


Your Conceptual model can end up having more entities and relationships than you would ever put into a database or a software, and that is quite all right.

You can always remove the ones that you definitely don't need, and nothing says that your Logical and Physical models need to include everything from the Conceptual model.

Look at the Conceptual model as the perfect place to explore widely without any commitment. 
Include every entity you think can have even minimal impact. 
You never know what insight you will gain after looking at all the new entities and relations. 
This is the model that will give you a high level view of influence over your business.


To give you an example:
---------------------------------------------------------------------------------
When looking at your business, say a shoe store set up you definitely have an Entity called Customer.
Not in any connection to you business, the customer is related to another Entity called Traffic.
Now, as it happens there are two locations near me, the Arsenal football stadium and Finsbury Park.
They both host events - an Entity with a very concrete connection to Traffic.
Usually a negative one.
Are these events connected to your business? Not at all, but they still can have an impact on your business. Maybe getting hold of the event calendars for the upcoming year would be a good idea.
---------------------------------------------------------------------------------

So here comes a business decision to be made: what to include and what to exclude?
Should you include these Entities and Relations in Logical or Physical models? 
No, you don't - unless you really, really see the business use of these at this stage, of course.
Let us highlight that these are not technical decisions. These should be made all by you.

LOGICAL MODEL - DRILLING DOWN


With the Logical model we take a big step towards implementation. From the Conceptual model we start to figure out how to build up the data model.

This is where we start to do a little bit more serious modelling.  In technical lingo: what happens here is defining attributes for the entities, primary and foreign keys, resolving many-to-many relationships, normalising.

The Logical model can deliver value for you in several ways:
  • It is good for more in depth investigation and exploring issues.
  • It is detailed enough to run query tests to spot possible performance bottlenecks by simulating query paths.
  • It is technology independent, which means you can find the best data structure and then pick the right database technology to implement it.
  • And yes, you can use the Logical model even if you are in the NoSQL database camp.
  • Fundamental facts about your data don't depend on the technical implementation.
For a business related problem a Logical model allows you to create a filtered view of the entities, their attributes and relationships. This will contain only what you need in your solution, and if it doesn't contain everything you need, then you know what to change. And you won't have to navigate the technical constraints of the Physical model which should be irrelevant when still thinking about the business.

The Logical model does not (and need not) reflect the Conceptual model perfectly: entities can be removed, split or merged, or even converted into a simple attribute of another Entity.

---------------------------------------------------------------------------------
For example, while Colour can be an Entity in the Conceptual model, we can change it to a simple attribute in the Logical model. Maybe it's not that important. Maybe we just want to store a simple, unrestricted name for a colour. Like 'red' or 'blue' or 'asdfasdf' (wait: this is my keyboard stuck, it is not a color!).
---------------------------------------------------------------------------------

Create Logical models based on only part of a Conceptual model, especially if the goal is to explore some business problem and find a solution to it you don't need to go and include everything.

---------------------------------------------------------------------------------
Let's say you need to work out how to measure and track customer retention.

You will not need to include Product, Invoice, Order, Supplier, etc. in your logical model.
You will not need to define the Primary Keys, all the Attribute and normalise everything.

But you will need to define the attribute for the Entities involved in the Customer visit.
There is no need for the Phone Number field, but definitely there is a need to include the Date of Last Visit.
---------------------------------------------------------------------------------

Then some additional technical stuff follows: normalising, resolving many-to-many relationships, defining Primary Keys, and so on. We leave the explanation of these more technical steps out, but the essence of all these is to make sure that your model makes sense from a technology perspective, as well (e.g. it is "clean enough" with no redundancies).


PHYSICAL MODEL: NONE OF YOUR BUSINESS - WELL, IT IS YOUR BUSINESS, TOO


The Physical model is the most implementation specific level of the three. This is mainly for the hard core developer.

Taking a Logical model and turning it into a Physical model involves:
  • Converting Entities into database tables
  • Defining physical characteristics for the attributes
  • Denormalising where needed - it may sound crazy but it is the opposite of what tech people did in the previous step
  • Defining views - these are pre-prepared stuff tech people create for things you often have a look at (e.g. if you ask often the question how much is 1+1 then they will store the answer "2" in a separate table for you)

The same Logical model can be represented by several Physical models based on:
  • technology used - quite different for a relational database you may be using such as MySQL 4, MySQL 5 or an Oracle 11g etc. or even non-relational databases (they are in fashion nowadays, sounds brilliantly exciting, right!?)
  • different requirements - e.g. which queries need to provide answers fast
  • systems consuming the data - your legacy systems may impose some additional constraints

There are two big challenges, issues with Physical models in our experience:

1. When the technical constraints and limitations start to creep up and impose themselves on the Logical model and the business.

While it is true that sometimes technical limitations mean that the users need to find a 'work around', this should not impact the models of higher abstraction levels.

So in the majority of the cases you should stick to your own ideas and do not let developers change them.

In other words: the Conceptual model influences the Logical model and the Logical model influences the Physical model but do not let it work the other way back.

Well - there are some exceptions, but do not let in easily.


2. Keeping your Physical model up to date.

You have to live with the fact that (almost) every change to your business will modify the Physical model.
There are tools that will make it quite simple and easy on the model's level.

The challenge is how to modify the model in a way that it does not cause software errors (in existing software) and that these changes will surely affect all other layers of your software (and if you have multiple systems in place, other software in your business).

CONCLUSION


I hope that you are convinced that the Entity-Relationship model is more useful than just describing your database tables and relationships.

It can be used throughout the whole project cycle and also outside of the scope of a software project.

It can give you new insights into your business, and it can help you and your team develop the right solution both in the software world and the business.