Comparing data modelling techniques

Introduction

What is data modelling?

Data modelling essentially defines the structure in which data (information and knowledge) is collected, managed and represented. It is intended to describe the concepts or objects of concern to an individual or organisation and the relationships between them.

Data modelling techniques

There are currently two main data modelling techniques used in computer systems: database systems and graph systems. Among database systems, the relational model, proposed by Codd in 1970, has become the de facto standard for information representation (Martinez-Cruz, Blanco, and Vila). In the last decade, however, ontologies, expressed as graphs, have emerged and grown in popularity as a viable alternative to relational databases, particularly because of their application in the Semantic Web.

Comparing techniques – similarities and differences

As the basic intention of data modelling is to describe things and their relationships to each other, it is not surprising that there is a strong degree of correlation between the organisation of databases and ontologies.  Both use a formal language and have types, properties and constraints.  A relational database ‘entity’ can correspond to an ontology ‘class’, a relational attribute to an ontology ‘property’.  However, the focus of databases is the data, while the focus of ontologies is communicating meaning and shared understanding.
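The correspondence between entity and class, and between attribute and property, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the "Person" entity, its columns, the "type" property and the helper row_to_triples are all hypothetical, and real ontology tooling would use proper URIs and a vocabulary rather than plain strings.

```python
def row_to_triples(entity, key_column, row):
    """Turn one relational row into (subject, property, value) statements."""
    # The row's primary key identifies the individual within its class.
    subject = f"{entity}/{row[key_column]}"
    # The relational entity (table) plays the role of the ontology class.
    triples = [(subject, "type", entity)]
    # Each remaining attribute (column) becomes a property of the individual.
    for column, value in row.items():
        if column != key_column:
            triples.append((subject, column, value))
    return triples


# Hypothetical row from a hypothetical "Person" table.
person_row = {"id": 1, "name": "Ada", "born": 1815}
person_triples = row_to_triples("Person", "id", person_row)

for triple in person_triples:
    print(triple)
```

The row carries the same facts in both forms. What changes is that each fact becomes a separate statement about an identified thing, rather than a cell whose meaning depends on the table around it.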

Relational databases, generally considered fully normalised when normalised to third normal form, are highly suitable for organising and structuring data. Because normalisation reduces or eliminates redundancy, they are also very effective for data collection. Normalisation, however, requires the creation of multiple tables with joins between them. Querying across tables can be technically complex, can cause efficiency problems and can be expensive, so databases are often de-normalised to improve performance for data warehousing and extraction. The result is multiple highly specialised databases, each developed by an individual organisation to manage specific information. The databases and the data in them generally sit behind a firewall and are often only available on the internet through an application that has to be built or customised for the individual database. Both the normalisation process and online access through another application can result in a reduction of specificity of the original dataset.
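To make the point about tables and joins concrete, here is a minimal sketch using Python's built-in sqlite3 module. The author and book tables and their contents are invented for illustration. The author's name is stored once, so there is no redundancy, but reading a book together with its author already needs a join across two tables.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical normalised schema: author details live in one table only,
# and each book refers to its author by key.
conn.executescript("""
CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE book (
    book_id   INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES author(author_id)
);
INSERT INTO author VALUES (1, 'Author One');
INSERT INTO book VALUES (10, 'First Title', 1);
INSERT INTO book VALUES (11, 'Second Title', 1);
""")

# Answering "which author wrote which book?" requires joining the tables.
rows = conn.execute("""
    SELECT book.title, author.name
    FROM book
    JOIN author ON book.author_id = author.author_id
    ORDER BY book.book_id
""").fetchall()

for title, name in rows:
    print(title, "by", name)
```

With only two tables the join is trivial. The complexity and cost described above come from schemas with many tables, where a single question may need several joins chained together.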

 

Ontologies are also highly structured. They are based on the Resource Description Framework (RDF), a standard method for defining things and the relationships between them, which was designed for the Web and can refer to any thing or any concept. An entire ontology can therefore be viewed globally without restrictions or layers of interfaces, and viewers are given access to the data itself rather than to HTML documents. Because HTTP URIs are used as globally unique identifiers for data items and vocabulary terms, an ontology can be amended and added to at any time, whereas databases can be technically difficult and expensive to modify. Being inherently scalable, ontologies can also enable much quicker searching of vast quantities of data.
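The idea of adding to a graph at any time can be sketched with plain Python tuples standing in for RDF statements. Everything named here is my own assumption for illustration: the example.org URIs, the vocabulary terms and the match helper. A real project would more likely use an RDF library and a published vocabulary.

```python
# Hypothetical HTTP URIs used as globally unique identifiers.
DATA = "http://example.org/data/"
VOCAB = "http://example.org/vocab/"

person = DATA + "person/1"

# A graph is simply a set of (subject, property, object) statements.
graph = {
    (person, VOCAB + "name", "Ada"),
    (person, VOCAB + "born", "1815"),
}

# Adding a property nobody planned for is just one more statement:
# there is no table to alter and no existing data to migrate.
graph.add((person, VOCAB + "livesIn", DATA + "place/galway"))


def match(graph, subject=None, prop=None, obj=None):
    """Return statements matching the given parts; None matches anything."""
    return sorted(
        t for t in graph
        if (subject is None or t[0] == subject)
        and (prop is None or t[1] == prop)
        and (obj is None or t[2] == obj)
    )


# Everything the graph says about this person.
for statement in match(graph, subject=person):
    print(statement)
```

In a relational database the equivalent change would usually mean adding a column or a new table, which is where the difficulty and expense of modification come from.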

Conclusions

While the strength of databases is in data capture and structuring, the strength of graph models lies in their ability to visually represent data and the relationships between data, and in the ease of sharing data on the web. To access the data, however, both models require some prior knowledge of the database schema or ontology structure. Choosing one over the other will ultimately depend on the particular project and end-user requirements. Graph database technology is comparatively new and much less familiar to potential end users than relational databases, which are now commonplace and have stood the test of several decades. Further research will explore and develop methods of effectively translating data in databases to graph models and, as Martinez-Cruz et al. suggest, it is likely that databases will remain important for some time for the capture and structuring of large datasets.

 

Bibliography

Martinez-Cruz, Carmen, Ignacio J. Blanco, and M. Amparo Vila. “Ontologies versus Relational Databases: Are They so Different? A Comparison.” Artificial Intelligence Review 38.4 (2012): 271–290. CrossRef. Web.

 


Some thoughts on Rural Broadband

In the digital age, there is unprecedented access to information for more people than ever before, but is this a true democratisation of access to data and of the possibilities that data promise for how we live and work?

I was interested to read a recent opinion piece in the Irish Times, written in reaction to a controversy over the cost of subsidising a rural rail line, in which the columnist put the focus instead on rural broadband (Taylor). Taylor sees provision of broadband as a highly significant long-term investment by the government. As a rural dweller, I agree. We live in a digital world and having broadband is now not a choice but a necessity for work, life and play.

The National Broadband Plan, announced in 2012, has had a slow start. The 2012 plans were altered to extend the reach to 927,000 households and businesses. This figure represents approximately 35% of the population. Put another way, 35% of the population currently have poor broadband connectivity. The national plan is still in the procurement phase and it is now likely to be mid to late 2017 before roll-out can begin, over a period that will extend to the end of 2022. Even then, the target speed is likely to need to be increased.

Earlier in the same week, another Irish Times columnist (Burke-Kennedy, Rural broadband speeds are up to 36 times slower) drew attention to broadband speeds in some rural areas that are up to 36 times slower than in some towns and cities, with only one quarter of households having speeds of 30 megabits per second (Mbps), the minimum target set out in the National Broadband Plan. Burke-Kennedy has been writing on this topic for some time and sees the urban/rural broadband divide as ‘digital apartheid’, something that is having a devastating effect on small businesses, on education, on quality of life and on rural isolation (Burke-Kennedy, Can broadband plan end our ‘digital apartheid’?).

Those of us who live in areas of low broadband speed know the frustrations of not being able to see or access what other people can. Whether it’s online banking or access to Netflix, as digital businesses grow and predominate, the analogue options shut down. It is a trade-off that goes unnoticed by the many but leaves some of us wishing we could forget about our troubles by watching the latest DVD releases from the local rental shop, if only it hadn’t closed last year!

Bibliography

Burke-Kennedy, Eoin. “Can broadband plan end our ‘digital apartheid’?” 2 June 2016. www.irishtimes.com. 1 December 2016.

—. “Rural broadband speeds are up to 36 times slower.” 15 November 2016. www.irishtimes.com. 1 December 2016.

Taylor, Cliff. “Broadband not railway lines the key to rural survival and development.” The Irish Times 15 November 2016: 16.
