The role of the data modeller

Originally posted on LinkedIn.

I have developed “data models” on a few occasions, but I have always found it difficult to define exactly what I was doing as a “data modeller”. So this post is an attempt to explain to myself, and perhaps to discuss with others, what being a data modeller is about, at least in my experience.

Perhaps this can be reduced to a simple question: what does doing a data model look like?

Is it like designing a database? Not really. A database can ultimately be an end product of a data model (to clarify, a “conceptual” data model), but I usually talk about entities and relations, not about data types or specific technologies (I often find myself quoting: “premature optimisation is the root of all evil”).

Then perhaps this is solved: “data modelling” is designing something like an ER (entity-relationship) model. Tempting, but really I spend a lot of time thinking in terms of entities, processes, roles, qualities, temporal relations… and these are not usually represented in an ER model.
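To make that gap a bit more concrete, here is a minimal sketch in Python (my choice of language for illustration only). It assumes a made-up domain: `Person`, `RoleAssignment`, the role names and the dates are all hypothetical. The point is only that a role is something an entity plays during a period of time, rather than a fixed attribute of the entity.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Person:
    """An entity: something that keeps its identity over time."""
    name: str


@dataclass(frozen=True)
class RoleAssignment:
    """A role played by an entity, valid only within a time interval."""
    person: Person
    role: str                    # hypothetical role name, e.g. "curator"
    start: date
    end: Optional[date] = None   # None means the role is still held

    def holds_on(self, day: date) -> bool:
        """True if the role was held on the given day (bounds inclusive)."""
        return self.start <= day and (self.end is None or day <= self.end)


# Hypothetical example data.
alice = Person("Alice")
assignments = [
    RoleAssignment(alice, "curator", date(2015, 1, 1), date(2017, 12, 31)),
    RoleAssignment(alice, "reviewer", date(2018, 1, 1)),
]


def roles_on(person: Person, day: date) -> List[str]:
    """List the roles a person held on a given day."""
    return [a.role for a in assignments
            if a.person == person and a.holds_on(day)]
```

Asking `roles_on(alice, date(2016, 6, 1))` gives a different answer from asking the same question for 2020, which is exactly the kind of temporal question I end up discussing with people long before any table or data type is chosen.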

It could be like designing an ontology then (say, adding a layer of logical definitions to entities and relations). This is probably a good part of what data modelling is about (perhaps implicitly). However, at least with respect to some ontology design approaches, I have found myself thinking not only about what I was modelling, but also about how the model would be used, or misused.

So let’s take another perspective: what does one do when acting as a “data modeller”?

In my experience: take some data and examples from the domain to be modelled; talk to people about it; record the entities and relations they talk about; compare the points of view of different people; start from an upper ontology to model the domain (an extra point of view!); often go back to people and re-discuss their understanding of their domain; and in the end come up with a set of definitions and relations that hits a sweet spot among different points of view, use cases, and… reality (noise is always there).

The best definition I have come up with so far for a data modeller: it’s like being a data psychologist.
