Quantitative History: An Annotated Bibliography

Humanities Computing, and later the Digital Humanities, may be traced back to 1949 and to what might now be called a quantitative data, or quantitative history, project. Its origin was a project of Father Roberto Busa who, in collaboration with IBM, sought to ‘make an index verborum of all the words in the works of St Thomas Aquinas and related authors, totalling some 11 million words of medieval Latin’ (Hockey).

In the field of history, quantitative data has since taken on many forms, such as online census returns, parliamentary papers, digitised archives of manuscripts, primary sources and information databases based on prosopography, ships’ logs and various other compilations of historic data.

In the following annotated bibliography, ten entries relating to quantitative history are presented which, upon examination, may illustrate the benefits, challenges, perceptions, visions, debates, and processes evident in the development and use of digital quantitative history.

Bradley, John, and Harold Short. Texts into Databases: The Evolving Field of New-style Prosopography. Literary and Linguistic Computing. (2005) 20 (Suppl): 3-24. Web. 24 Nov. 2014.

In this article John Bradley and Harold Short of King’s College London discuss three projects which sought to compile and digitise vast amounts of historical material into online prosopography databases. The article describes these projects in detail. One of them, The Prosopography of the Byzantine World, sought ‘to record in a computerized relational database all surviving information about every individual mentioned in Byzantine sources during the period from 641 to 1261, and every individual mentioned in non-Byzantine sources during the same period who is ‘‘relevant’’ (on a generous interpretation) to Byzantine affairs’. The article provides valuable insight into the thoughts and processes of those involved in creating a vast quantitative history database, and importantly includes discussion of the development of the end user experience.

Cohen, Daniel J., and Roy Rosenzweig. Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web. Web. 25 Nov. 2014.

Cohen and Rosenzweig’s Digital History is, as per the title, a broad guide to gathering, preserving, and presenting history online. While it does not go into great depth on the technical aspects of creating digital history, it provides traditional historians with an easy access point into the theory behind going digital. With regard to quantitative history, the chapter titled Exploring the History Web: Archival Websites explores the history, development, forms and funding of large online history archives. This chapter, like much of the book, does not over-reach in discussing the theory or technicalities of quantitative history projects, but it is valuable in providing a basic understanding of the development and forms of quantitative history.

Graham, Shawn, Ian Milligan, and Scott Weingart. The Historian’s Macroscope: Big Digital History – Working Title. Under contract with Imperial College Press, Open Draft Version. 2013. Web. 23 Nov. 2014.

These three authors are compiling a book in public view: an online draft version is readily available to read and comment upon. The purpose of their study is Exploring Big Data through a Historian’s Macroscope. While the work is not yet complete, the online draft contains interesting pieces on the use, benefits and perils of ‘Big Data’ from both the authors and the readers of the draft, whose comments can be viewed online. Chapters are available which discuss The Joys of Abundance and The Limits of Big Data, along with chapters on practical computational methods in Building the Historian’s Toolkit. The draft already holds a great deal of theoretical and practical information on quantitative digital data, and the completed work has the potential to become a great resource for historians who wish to understand and engage with the tools and resources of quantitative digital data in their field.

Hitchcock, Tim. “Academic History Writing and the Headache of Big Data.” Historyonics. 30 Jan. 2012. Web. 22 Nov. 2014.

In this blog post Tim Hitchcock grapples with the relationship between academic history writing and quantitative digital history resources. Hitchcock details his involvement in the creation of many websites consisting of big data relating to the field of history. The post describes how such sites are, for Hitchcock, ‘fragments of a single coherent research agenda and project’, which seek to support the writing of a new form of history, ‘history from below’. In constructing these projects Hitchcock considers their potential value in providing access to sources for a larger audience and a new breed of historian. Yet the post also discusses the problem that, as these sites’ methodologies developed, they became ‘reasonably technically challenging’. Hitchcock describes how big data tools and resources may in fact create historians drawn from a ‘top down, technocratic elite’, and may result in the writing of history which is neither humanistic nor very humane. Hitchcock does not dismiss digital tools for quantitative history, but provides an interesting argument for a re-assessment of how we might construct and use these tools in the future.

Piersma, Hinke, and Kees Ribbens. “Digital Historical Research: Context, Concepts and the Need for Reflection.” BMGN – Low Countries Historical Review 128.4 (2013): 78–102. Web. 25 Nov. 2014.

At the beginning of this article the authors point out that the prospect of close collaboration between humanities and computer science was described between 2009 and 2012 as a ‘promising cross-fertilisation’, a ‘great leap forward’, and a ‘revolutionary movement’ in Dutch academic circles. In the course of this article Piersma and Ribbens set out to analyse how these hopes have materialised in reality by examining the results of this ‘promising cross-fertilisation’. After analysing two quantitative history projects, the authors suggest that there is a struggle between the traditional and digital approaches, yet meanwhile quantitative and digital processes have furthered other fields which the humanities compete with in producing results and seeking funding. The article concludes by suggesting a re-emphasis on understanding and willing collaboration between both fields towards what is still perhaps a collaboration in progress.

Prescott, Andrew. “The Deceptions of Data.” Digital Riffs: Extemporisations, Excursions and Explorations in the Digital Humanities. 13 Jan. 2013. Web. 22 Nov. 2014.

In “The Deceptions of Data” Andrew Prescott is keen to highlight the dangers which the reproduction of, and reliance on, data in a new format can pose in representing history. With regard to quantitative historical data, Prescott highlights a large database constructed out of ‘hundreds of log books’ which was used to create a digital map of British trade routes from 1750 to 1800. However, on closer analysis Prescott shows that this database was not sufficient to create a ‘complete’ digital map, as it did not contain shipping information from large parts of Britain during that period. Prescott suggests that some amongst his peers consider themselves ‘no longer curators or scholars but makers and consumers of data’ and are in great haste to convince the traditional humanist of the ‘cool’ digital formats. Prescott reminds digital advocates to maintain their scholarly critical thinking in the face of a confidence in quantitative data that may distort the creation of reliable quantitative digital history.

Reed, Ashley. “Managing an Established Digital Humanities Project: Principles and Practices from the Twentieth Year of the William Blake Archive.” Digital Humanities Quarterly 8.1 (2014). Web. 23 Nov. 2014.

The William Blake Archive may be described as an artistic archive before an historical one, yet this article is of interest to quantitative digital history because it discusses the processes and challenges of a quantitative humanities archive whose data continues to grow and expand. Many historical archives regularly uncover new sources and must decide whether and how to present them; this article addresses these very questions in relation to the maintenance and growth of a quantitative humanities archive.

Zaagsma, Gerben. “On Digital History”. BMGN – Low Countries Historical Review, vol. 128, no. 4 (2013), pp 3-29. Web. 24 Nov. 2014.

In this article it is on pages 23-27, in the sections headed Historical Practice 2.0, that Zaagsma addresses issues regarding the use of quantitative history. The author looks firstly at digitisation and the archive, discussing the progression and forms of digital archive that come to create ‘big data’. Secondly, in discussing digital historical analysis, Zaagsma acknowledges the fears of traditionalist humanists towards the use of computational methods in a humanist field, and also that quantitative data analysis ‘is far from objective or neutral’, as suggested by Rieder and Röhle. However, while acknowledging these issues, Zaagsma argues for a greater integration of the ‘historian’s interpretive and hermeneutic work’ in engagement with quantitative historical data and analysis, adding that ‘the challenge is to apply our critical faculties to digital resources’.

“Interchange: The Promise of Digital History”. Journal of American History, vol. 95, no. 2 (Sept. 2008). Web. 23 Nov. 2014.

This article is an edited version of an online discussion which took place over several months in 2008. While the discussion was broad, in that its topic was the promise of digital history as a whole, it is interesting to read the thoughts of these prominent contributors to the fields of digital humanities and digital history, and to see how quantitative history was perceived in this broad discussion. Michael Frisch comments that quantitative history ‘has won, and many historians routinely and effectively deal with quantitative data when they want to or need to in a fluid and responsive inquiry-driven way’. Dan Cohen later adds that the full potential of digital history will be realised by ensuring ‘that digital history is not simply an echo of quantitative history’. Through such comments on quantitative history within the broader discussion, this article is useful for understanding the academic perception of quantitative history.

London Lives 1690 to 1800 ~ Crime, Poverty and Social Policy in the Metropolis. Web. 25 Nov. 2014.

As an interesting addition to this bibliography, the London Lives website, which is an excellent example of a digital quantitative history project, provides a great deal of information, not just on how quantitative history can be presented as a valuable resource for historians but also on how such a resource is developed and constructed. One section in particular, titled About this Project, provides a wealth of information about the project’s rationale, funding and technical methods, along with hyperlinks to connected sister sites and to web pages for the individuals, sources and bodies more deeply involved in the construction of a quantitative history resource.

Cited in Introduction

Hockey, Susan. “The History of Humanities Computing.” A Companion to Digital Humanities. Ed. Susan Schreibman et al. Oxford: Blackwell, 2004. Web. 20 Nov. 2014.

Thoughts on the Digitisation of Resources for Historians


In September 2011 I entered Maynooth University as a first year history student. I envisaged that the three years which lay ahead of me would involve many hours trawling through volumes of dusty old books in darkened archives at the far reaches of the beautiful and historic Russell Library. The stereotype of a history lecturer who was as old and dusty as the history and books being studied was firmly in my mind, albeit slightly exaggerated for my own romantic purposes, and seeing history as the study of the past I expected, and hoped, that it would be conducted primarily by examining and using the tools of the past. The study of history through stacks of old books and manuscripts was an experience I was looking forward to embracing, leaving those in other fields to pursue their studies using laptops, e-books and other tools of the expanding Digital Age. Yet within weeks of my first lectures one particular assignment would quickly turn my thoughts towards the advantages of digital resources in history. That October I was set an assignment to review an online digital resource, a website called The Valley of the Shadow: Two Communities in the American Civil War.

The Valley of the Shadow was an idea conceived in 1991 and, although meant for publication as a book, it developed as a digital resource before going live in 1993, an early example of a digital history resource. In presentation it resembles a traditional archive, with depositories of contemporary letters, diaries, images and newspapers from two communities on opposite sides during the American Civil War. It does not, however, remain silent as a traditional archive would, as it contains what Cohen and Rosenzweig call ‘implicit interpretation of the materials’, so while the site resembles an archive it may be better described as an edited collection. However, it still functions as a valuable collection of source material which allows users to study primary sources independently as part of their own research.

In constructing the site this way the project allowed the user to enter an archive that looked familiar in the traditional sense, but the site also had several of Cohen and Rosenzweig’s seven qualities of new digital media. One was capacity, with the site containing ‘tens of thousands of newspaper articles, 1,400 letters and diaries, full census records from 1860, 45 Geographic Information Systems (GIS) maps, and more than 700 photographs and images’ (Cohen and Rosenzweig), which would have required vast volumes in printed form. The digital format provided global accessibility for an audience who otherwise may not have had access to these historical documents, and it also provided flexibility and options for content, such as the ability to add sound and video files. For this student it opened up the possibilities which digitisation can provide for much wider access to, and utilisation of, primary source materials.

In the following years of my undergraduate degree much of my time followed my pre-conceived method of studying history, with many hours spent in Ireland’s National Archives in Dublin. Opening boxes containing 17th century land grants with the seal of King Charles II provided a great thrill to a budding historian, yet it also made me consider the condition and availability of these resources. While these documents were well maintained, many came with archivist notes attached which rightly highlighted the care that needed to be taken with such old, valuable and irreplaceable resources. It also reminded me that while my topic and these documents may have been of interest to many historians, they were only accessible by attending the National Archives in person, and so access came at great expense to those based outside of Dublin and indeed Ireland. I began to consider what resources lay scattered in various archives I myself was unable to access. Furthermore, if I were able to access these archives, would my physical handling of these resources contribute unwittingly to their deterioration for future historians?

With these concerns in mind, digital resources such as The Valley of the Shadow seemed to answer a great need. The digital format allowed global access to these documents, while the ability to study a digital version allowed the original documents to lie in a more constant state of preservation. The Valley of the Shadow therefore stood as an early example of what Roy Rosenzweig set out to accomplish with the establishment of the Center for History and New Media: “To use digital media and computer technology to democratize history—to incorporate multiple voices, reach diverse audiences, and encourage popular participation in presenting and preserving the past.”

Yet in the digital format, in which the user can quickly search archives and collections by keyword to access targeted information, might the digital historian be in danger of missing a greater narrative? Might information be missed or misrepresented in the digitisation process, but still be contained without interpretation within those dusty old archives? Might digitisation result in a field of study overly reliant on resources which, as Cohen and Rosenzweig pointed out, can contain ‘implicit interpretation’? In The Human Presence in Digital Artefacts, Alan Galey opens with a passage from a letter by Erasmus to the Archbishop of Canterbury:

‘The reader wanders at leisure over smiling fields; he plays and runs and never stumbles; and he never gives a thought to the time and tedium it has cost me to battle with the thorns and briars, while I was clearing the land for his benefit. He does not reckon […] how great the discomforts that secured his comfort, how much tedium was the price of his finding nothing tedious.’

While Erasmus writes of the lack of appreciation for the extent of his work, his research and his editing, the passage also reveals that his work had removed what he considered tedious material. It shows that, as Erasmus sought to produce a work to benefit the reader, ‘thorns and briars’ were removed, and thus his work was an interpretation of a greater body of information. This is a long-standing process of information management which Ann Blair identifies as the four S’s: storing, sorting, selecting and summarizing.

In conclusion, while digitisation can allow a great body of original sources to be made accessible and can contribute to further study and preservation, it still involves a process of information management as described by Blair. Therefore, while the digital resource can certainly be a valuable component of historical study, it must be noted that implicit interpretation may be present and that the ‘thorns and briars’ removed by the compiler may in fact be vitally important to another student of the information. The process of constructing and using digital editions, archives and collections therefore entails a great deal of consideration and responsibility.

