When it comes to databases, developers have choices and preferences. For some, the relational model is tried and tested; for others it is outdated and confining. So which is the best choice? Where do you hedge your bets? This post will define and explore the connection between the two models in order to get to the crux of this question. While data modelling is continually evolving, and therefore continually complicated, for this discussion at least we will stick to the basics.
Relational databases are not a “new thing”. While the technology narrative instils in us that the world wide web and all its glories are a very recent invention, the relational database has been around since the 1970s. Conceived by Edgar Codd in response to the data-management demands of the day, this system is anything but new. The model is built on a system of tables, and the references or relationships between those tables are defined by “keys”. To connect the tables, that is, to surface a relationship, a process called “joining” is required, and join operations are usually considered quite server- and memory-intensive. To produce the most efficient model, then, the developer must choose to what level they will “normalise” their data. Normalisation is a process of standardisation in which data is split into separate, well-defined tables so that each fact is stored only once. Ranging from first to fifth normal form, the degree of normalisation is usually at the discretion of the developer, though cost, server space and query intensity all factor into the decision. While normalisation can strengthen the capabilities of a database, it has been argued that the need for rigid classification can cause difficulties for non-quantitative fields of study, such as the humanities. Conversely, for simple data structures which require little probing, the relational database can serve as a truly tested and reliable model.
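To make the tables-keys-joins idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema and the author/book data are invented purely for illustration: two normalised tables hold each fact once, a foreign key records the relationship, and a JOIN reassembles it at query time.

```python
import sqlite3

# Two normalised tables: authors and books live separately,
# linked by a foreign key rather than by duplicating the author's name.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, "
    "author_id INTEGER REFERENCES authors(id))"
)
cur.execute("INSERT INTO authors VALUES (1, 'Mary Shelley')")
cur.execute("INSERT INTO books VALUES (1, 'Frankenstein', 1)")

# The JOIN walks the key relationship to display author and title together.
cur.execute(
    "SELECT authors.name, books.title "
    "FROM books JOIN authors ON books.author_id = authors.id"
)
rows = cur.fetchall()
print(rows)  # [('Mary Shelley', 'Frankenstein')]
```

Note that the join is recomputed on every query; this per-query cost is exactly the “server and memory intensive” work described above.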
As stated above, the relational database is tried and tested, both successful and useful for certain types of data structures, and arguably could be utilised for others if incorporated properly. However, it is possible that the success of relational databases rests on shakier foundations: for decades a lack of real competition has made the strength of RDBs look greater than it is. Graph databases now present a challenge to this longstanding hegemony, and with a viable alternative on the market, a decision must be made.
It could be argued that graph databases simply build on the relational model. Both rely on their ability to connect or “join” data in response to queries. In graph databases, however, tables and keys have been replaced by a system of nodes and edges. Essentially, the nodes represent classes or entities. These entities are “joined” by relationship records, or edges, which can be defined by type, direction or other additional attributes. When performing the graph equivalent of a “joining” operation, a graph database uses this list of edges, or “predicates”, to find the connected nodes. The benefit of the nodes-and-edges system is the intuitive way in which it allows you to store and manipulate data. In a graph database your data is more flexible; while it should still be small and normalised to a degree, the verb-defined connection process allows for greater adaptability. This is particularly relevant for humanities databases, as it allows expansion beyond the confines of the RDB when handling difficult data.
In conclusion, if your data is relatively neat and quantitative, there is no need to feel pressured into moving it into a graph database. In these scenarios RDBs have been proven an effective model for uncomplicated data storage. However, if your data is becoming more complex and requires more detail, it may be worth upgrading to the newer graph database. Not only will it allow for greater flexibility with your data, but it will also ease the strain of “joining” operations thanks to its subject, object and predicate basis.