As discussed in previous posts, data refers to a collection of information. Whatever the purpose of the collection, gaining insight from it requires an adequate way of displaying and comparing the data. Some thought therefore needs to go into how a system for modelling data is designed, which is the first step in both database design and object-oriented programming. Data modelling is generally understood as having three stages of design: Conceptual, Logical and Physical (“Data Modeling – Conceptual, Logical, And Physical Data Models”). Complexity increases with each stage of design. It should also be highlighted that the structure that holds the data is often purpose-built. “The biggest challenge is correctly capturing the requirements on the data model. Often when the project starts, there are only vague requirements (if requirements at all), and the data model must represent these requirements completely and precisely. Therefore it is a very challenging task to go from ambiguity or vagueness to precision.” (Hoberman)
Data modelling assumes the following in its design:
i) There can be numerous links between different pieces of data.
ii) Categorisation, separation and encapsulation of data are necessary for searchability, and a well-built ontology allows you to get the most out of your data.
iii) Unique keys are used to identify pieces of information and act as access points linking data.
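To make the third assumption concrete, here is a minimal Python sketch of unique keys acting as access points between pieces of data. The `Student` and `Enrolment` classes, the names and the course titles are all hypothetical and only there for illustration; they do not come from Hoberman or from our class material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    student_id: int  # unique key identifying one student (hypothetical)
    name: str


@dataclass(frozen=True)
class Enrolment:
    student_id: int  # key linking this record back to a Student
    course: str


# Index students by their unique key so each one has a single access point.
students = {s.student_id: s for s in (Student(1, "Ada"), Student(2, "Grace"))}

# One student can appear in many enrolments: numerous links between data.
enrolments = [
    Enrolment(1, "Databases"),
    Enrolment(1, "Programming"),
    Enrolment(2, "Databases"),
]


def courses_for(student_id: int) -> list[str]:
    """Follow the key from a student to every enrolment that references it."""
    return [e.course for e in enrolments if e.student_id == student_id]


def students_on(course: str) -> list[str]:
    """Follow the link the other way: from a course back to student names."""
    return [students[e.student_id].name for e in enrolments if e.course == course]
```

Because every enrolment stores only the key rather than a copy of the student, the same data can be searched from either direction without duplicating it.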
The Conceptual model highlights how the different pieces of data relate to one another, specifying Entity Names and Relationships. The Logical model is more specific and detailed, adding Attributes, Primary Keys and Foreign Keys. The Physical model must be implementable in the database of choice, specifying table names, column names, data types, and Primary and Foreign Keys. (“Data Modeling – Conceptual, Logical, And Physical Data Models”)
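The three stages can be sketched for one small example. This is only an illustration under stated assumptions: the student/course entities are hypothetical, the conceptual and logical models are written as plain Python dictionaries rather than in any standard notation, and the physical model uses SQLite through Python's built-in `sqlite3` module instead of MySQL so that it runs without a server.

```python
import sqlite3

# Conceptual model: entity names and relationships only (hypothetical example).
conceptual = {
    "entities": ["Student", "Course"],
    "relationships": [("Student", "enrols in", "Course")],
}

# Logical model: adds attributes, primary keys and foreign keys,
# but still says nothing about data types or a specific database.
logical = {
    "Student": {"attributes": ["student_id", "name"],
                "primary_key": ["student_id"], "foreign_keys": {}},
    "Course": {"attributes": ["course_id", "title"],
               "primary_key": ["course_id"], "foreign_keys": {}},
    "Enrolment": {"attributes": ["student_id", "course_id"],
                  "primary_key": ["student_id", "course_id"],
                  "foreign_keys": {"student_id": "Student",
                                   "course_id": "Course"}},
}

# Physical model: table names, column names, data types and keys,
# written for one specific database (SQLite here, as an assumption).
physical_ddl = """
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
CREATE TABLE enrolment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript(physical_ddl)

conn.execute("INSERT INTO student VALUES (1, 'Ada')")
conn.execute("INSERT INTO course VALUES (10, 'Databases')")
conn.execute("INSERT INTO enrolment VALUES (1, 10)")

# Follow the foreign keys to bring the linked data back together.
rows = conn.execute(
    "SELECT s.name, c.title FROM enrolment e "
    "JOIN student s ON s.student_id = e.student_id "
    "JOIN course c ON c.course_id = e.course_id"
).fetchall()
```

Each stage describes the same information with more detail than the one before it. Only the last stage depends on the chosen database, so moving to MySQL would mainly mean adjusting the data types and syntax in `physical_ddl`.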
Within the Conceptual, Logical and Physical schemas there are numerous ways of modelling data, and the design can vary depending on how the data will be used, for example for comparison or for tracking correlations. Hoberman reminds us that the order in which the models are built can also vary: “In some efforts, the database design is completed, and then the logical and conceptual are built for documentation and support purposes.” While familiarity with one's data set is needed for interpretation, particular techniques of displaying data suit particular purposes. Personally, I find visual data modelling techniques much easier to work with, particularly when comparing data. “The underlying benefit of creating a data model is that the data actually becomes understandable, as others can read it and learn about it.” (Hoberman)
Different data modelling techniques that we should be familiar with include spreadsheets and visual diagrams.
Spreadsheets, for example, can be used to model data. Depending on the purpose this can be adequate, as information can be grouped in rows and columns. Steve Hoberman gives the example of spreadsheets as the data notation used with financial business experts. However, he also highlights the importance of definitions when modelling data, as every data set needs to be treated differently. The key understanding here is the relation between different types of data.
Visual representation of data can be very useful, especially when looking for comparisons and correlations. Diagrams help when designing the structure that will hold your data, setting out its links and layout. Query languages for databases are important too, and data can have ontologies assigned using W3C standards such as RDF. There are numerous software programs that can be useful for data modelling, from spreadsheets to diagram-drawing software for explaining the concept, but the data itself must be held on database platforms designed to support data models, like MySQL, which we’ve used in class.
Hoberman, Steve. “Data modeling techniques explained: How to get the most from your data”. Date of Access: 11 May 2017.
“Data Modeling – Conceptual, Logical, And Physical Data Models” from Data Warehousing. Online. Date of Access: 11 May 2017.