Modelling Humanities Data Blog Post #2: Different Methods of Modelling Data

This blog post will focus on the 1641 Depositions project, based at Trinity College Dublin. The aim of the project was to digitise and publish online approximately 8,000 depositions dealing with the 1641 uprising in Ireland, amounting to 19,010 pages of text bound in 31 volumes. Each page was photographed in high resolution, transcribed and marked up in TEI.

The transcription preserved variant and incorrect spellings, as well as subsequent emendations, such as struck-out words or marginalia. These are formatted in a way which emphasises their separateness from the ‘main’ text. The accounts were initially taken spontaneously, as a means of gathering information about the uprising from those who were affected by, or witnessed, the disturbances. This first wave of depositions is more discursive in character and was taken within two years of the initial events. Subsequent witness statements, taken in the 1650s, were more focused on damage to property and loss of life, with a view to charging those guilty of such acts in court. Though these statements were marked up in TEI, the code itself is inaccessible, due to concerns about people making use of the transcribed manuscripts without permission. This hinders the markup’s functionality, as it makes it impossible for scholars to search, process or analyse the text in ways that markup would otherwise allow.
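To give a sense of the kind of processing that accessible markup would allow, here is a minimal sketch in Python. The TEI fragment is invented for illustration and is not taken from the depositions; the only assumptions about the project's encoding are that deletions use the standard TEI `<del>` element and marginal additions use `<add place="margin">`, which may not match what the project actually did. The function names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# HYPOTHETICAL fragment written for illustration; it is not taken from the
# 1641 Depositions, whose markup is not publicly available.
SAMPLE = f"""
<p xmlns="{TEI_NS}">
  the said deponent was <del rend="strikethrough">robed</del> robbed of his
  goodes <add place="margin">worth xx li.</add> by the rebells
</p>
"""


def collect(root, tag):
    """Return the text of every element with the given TEI tag name."""
    qualified = f"{{{TEI_NS}}}{tag}"
    return ["".join(el.itertext()).strip() for el in root.iter(qualified)]


def reading_text(root):
    """Rebuild the running text, leaving out anything inside <del>."""
    pieces = []

    def walk(el):
        # Skip the content of deletions, but keep the text that follows them.
        if el.tag != f"{{{TEI_NS}}}del":
            pieces.append(el.text or "")
            for child in el:
                walk(child)
        pieces.append(el.tail or "")

    walk(root)
    # Collapse the line breaks and indentation of the source into single spaces.
    return " ".join("".join(pieces).split())


root = ET.fromstring(SAMPLE.strip())
deletions = collect(root, "del")
additions = collect(root, "add")
clean = reading_text(root)

print("Struck out:", deletions)
print("Added:", additions)
print("Reading text:", clean)
```

With the markup to hand, a scholar could list every emendation in a volume, or search a reading text with the deletions removed, in a few lines like these.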

The data schema that was used within the context of the project website is also idiosyncratic in many respects. The tagging system which facilitates searches of the depositions uses twenty-four separate terms, among them ‘apostasy’, ‘arson’, ‘captivity’, ‘witchcraft’ and ‘death’. There is a significant amount of overlap within this system; the question arises as to what precise differences there are between ‘death’, ‘killing’, ‘multiple killing’ and ‘massacre’ as subjects. Further, tags such as ‘witchcraft’ disproportionately emphasise the sensational nature of some of the depositions, despite the fact that references to supernatural phenomena feature in a relatively small number of them.
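If the tag assignments could be exported, the degree of overlap could be measured rather than asserted. The sketch below is one way of doing so, using the Jaccard index (the number of depositions two tags share, divided by the number carrying either tag). The deposition identifiers and tag assignments are made up for illustration, and the function names are my own; none of this reflects the project's actual data or tooling.

```python
from itertools import combinations

# HYPOTHETICAL tag assignments, keyed by made-up deposition identifiers.
tagged = {
    "dep-001": {"death", "killing"},
    "dep-002": {"death", "killing", "multiple killing", "massacre"},
    "dep-003": {"arson", "captivity"},
    "dep-004": {"death", "multiple killing", "massacre"},
    "dep-005": {"witchcraft", "death"},
}


def depositions_by_tag(tagged):
    """Invert the mapping: for each tag, the set of depositions carrying it."""
    index = {}
    for dep_id, tags in tagged.items():
        for tag in tags:
            index.setdefault(tag, set()).add(dep_id)
    return index


def jaccard(a, b):
    """Shared members divided by all members; 1.0 means the sets are identical."""
    return len(a & b) / len(a | b)


def overlap_table(tagged):
    """Jaccard score for every pair of tags, keyed by the alphabetically sorted pair."""
    index = depositions_by_tag(tagged)
    return {
        (first, second): jaccard(index[first], index[second])
        for first, second in combinations(sorted(index), 2)
    }


scores = overlap_table(tagged)

# List the pairs from most to least overlapping, omitting pairs with nothing in common.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    if score > 0:
        print(pair, round(score, 2))
```

A pair of tags scoring close to 1.0 across the real corpus would suggest that the schema is drawing a distinction it does not, in practice, maintain.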

This is somewhat ironic considering the uses to which the depositions were put at the time they were first written: as a means of fuelling anti-Catholic prejudice in England, in order to further entrench the plantation project and justify the representation of Catholicism as ‘a proven tyrannical force’. The emphasis on the sensational may have been chosen with a view to the potential impact of the project; Elizabeth Price’s deposition was dramatised on RTÉ, presumably because it offers a vivid account of a massacre, though no attention was given in the broadcast to the unreliability of the depositions as a resource. As the depositions were devised by a governing infrastructure attempting to prosecute insurrectionists and quell rebellions in non-compliant parts of the country, they could hardly be considered disinterested investigations.

There is an argument to be made that a panel of historical experts on Tudor and Stuart Ireland would be capable of devising a sequence of topics in order to provide a guiding mechanism for any prospective reader, particularly within the context of a digital scholarly edition such as this, in which there is such a huge amount of material. However, it is clear that in this case, this has not been achieved.


Canny, Nicholas, Making Ireland British 1580-1650 (Oxford University Press: 2003)

Foster, R.F., Modern Ireland 1600-1972 (Penguin: 1989)

Heffernan, David, The Emergence of the Public Sphere in Elizabethan Ireland (The Tudor and Stuart Ireland Conference 2012: 2012)

Hughes, Anthony, The Stuart Post Office: Not Just for Delivering Letters (The Tudor and Stuart Ireland Conference 2012: 2012) Accessed: 4 May 2017.

Ohlmeyer, Jane, Bartlett, Thomas, Ó Siochrú, Micheál, Morrill, John, 1641 Depositions, Available at: Accessed: 4 May 2017

Modelling Humanities Data: Deleuze, Descartes and Data

While we were dealing with the distinctions between data, knowledge and information in class, a pyramidal hierarchy was proposed, which can be seen on the left. This diagram discloses the process of making data (which have been defined as ‘facts’ which exist in the world) into information, and thereafter knowledge. These shifts from one state to another are not as neat as the diagram might suggest. It is just one interpretation giving shape to a highly dynamic and unsettled process, and any movement from one of these levels to another is fraught. It is ‘a bargaining system,’ as every dataset has its limitations and aporias, not to speak of the process of interpretation or subsequent dissemination. This temporal dimension to data, its translation from a brute state, is too often neglected within certain fields of study, fields in which data are more often understood as unambiguous, naturally hierarchical, and not open to contextualisation or debate.

This blog post aims to consider these issues within the context of a dataset obtained from the Central Statistics Office (CSO). The dataset contains information relating to the relative risk of falling into poverty based on one’s level of education between the years 2004 and 2015 inclusive. The data were analysed using the statistical analysis package SPSS.

The purpose of the CSO is to compile and disseminate information relating to economic and social conditions within the state in order to give direction to the government in the formulation of policy. Therefore it was decided that the most pertinent information to be derived from the dataset would be the correlations between level of education and the likelihood of falling into poverty. The results appear below.

[Tables: Correlation Between Risk of Poverty and Level of Education Achieved; Correlation Between Consistent Poverty (%) and Level of Education Received; Correlation Between Deprivation Rate (%) and Level of Education Received]

[Bar charts: Poverty Risk Based on Education Level; Deprivation Rate Based on Education Level; Consistent Poverty Rate Based on Education Level]

It can be seen that there is a very strong negative correlation between one’s level of education and one’s risk of exposure to poverty; the higher one ascends through the education system, the less likely it is that one will fall into economic liminality. This is borne out both in the bar charts and in the correlation tables, the latter of which yield p-values that SPSS reports as .000 (that is, p < .001), indicating that the correlations are statistically significant. It should be noted that both graphing the data and detecting correlations through use of Spearman’s rho are elementary statistical procedures, but as the trend revealed here is consistent with more elaborate modelling of the relationship,[1] the parsimonious analysis carried out here is all that is required.
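For readers unfamiliar with Spearman’s rho, the sketch below shows what the calculation involves: each variable is converted to ranks, and the ordinary (Pearson) correlation of those ranks is taken. This is a plain-Python illustration rather than the SPSS procedure used for this post, and the numbers are invented for the example; they are not the CSO figures. It does not compute a p-value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - my) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# HYPOTHETICAL values for illustration only; these are not CSO figures.
education_level = [1, 2, 3, 4, 5, 6]  # ordinal code, 1 = lowest level attained
poverty_risk_pct = [24.0, 21.5, 22.0, 12.5, 8.0, 5.5]

rho = spearman_rho(education_level, poverty_risk_pct)
print(round(rho, 3))
```

A value close to -1 indicates that as one variable rises the other almost always falls, which is the pattern described above. Because the measure works on ranks, it suits an ordinal variable such as level of education, where the gaps between categories are not meaningful quantities.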

It should not be assumed that, just because these graphs are informative, it is impossible to garner information from data in any other way. Even in its primary state, as it appears on the website, one could obtain information from a dataset through qualitative means. It is unlikely that this information will be as coherent as that which can be gleaned from even the most basic graph, but it is important to emphasise the fact that the border that separates data from information is fluid.

It is unlikely to be a novel finding that those who have a third-level education have higher incomes than those who do not; there is a robust body of research detailing the many benefits of attending university.[2] Therefore, can it be said that the visualisation of the dataset above has contributed to knowledge? One would answer this question relative to one’s initial research question, and how the information complicates or advances it. If the causal relationship between exposure to poverty and level of education has been confirmed, and a government agency makes the recommendation that further investment in educational support programmes is necessary, it is somewhere in this process that the boundary separating information from knowledge has been crossed.

The above diagram actualises the temporal nature of data to a greater extent than the pyramid, but in doing so it perpetuates a linearisation of the process, a line along which René Descartes’ notion of thought could be said to align. Descartes understood thought as a positive function which tends towards the good and towards truth. This ‘good sense’ allows us to ‘judge correctly and to distinguish the true from the false’.[3] Gilles Deleuze believes Descartes instantiates a model of thought which is oppressive, and which perceives thinking relative to external needs and values rather than in its actuality: ‘It cannot be regarded as fact that thinking is the natural exercise of a faculty, and that this faculty is possessed of a good nature and a good will.’[4]

In Deleuze’s conception, thought takes on a sensual disposition, reversing the Cartesian notion of mental inquiry beginning from a state of disinterestedness in order to arrive at a moment at which one recognises ‘rightness’. Deleuze argues that there is no such breakthrough moment or established methodology to thought, and regards it instead as something more invasive or unwelcome: a point of encounter when ‘something in the world forces us to think.’[5]

Rather than taking the neat, schematic movement from capturing data to modelling to interpreting for granted, Deleuze is engaged by these moments of crisis, points just before or just after the field of our understanding is qualitatively transformed into something different:

How else can one write but of those things which one doesn’t know, or knows badly? … We write only at the frontiers of our knowledge, at the border which separates our knowledge from our ignorance and transforms the one into the other.[6]

Deleuze’s comments have a direct bearing upon our understanding of data, and how they should be understood within the context of the wider questions we ask of them. Deleuze argues that ‘problems must be considered not as ‘givens’ (data) but as ideal ‘objecticities’ possessing their own sufficiency and implying acts of constitution and investment in their respective symbolic fields.’[7] While applying Deleuze’s theories to this dataset may risk overstating the case, it is nonetheless crucial to recall that data, and the methodologies we use to unpack and present them, participate in wider economies of significance, ones with indeterminate horizons.


[1] Department for Business, Innovation and Skills, ‘BIS Research Paper №146: The Benefits of Higher Education and Participation for Individuals and Society: Key Findings and Reports’, (Department for Business, Innovation and Skills: 2013)

[2] OECD, Education Indicators in Focus, (OECD: 2012)

[3] Descartes, René, Discourse on the Method of Rightly Conducting the Reason, and Seeking Truth in the Sciences (Gutenberg: 2008)

[4] Deleuze, Gilles, Difference and Repetition (Bloomsbury Academic: 2016), p.175

[5] Ibid.

[6] Ibid, p. xviii

[7] Ibid, p.207


Deleuze, Gilles, Difference and Repetition (Bloomsbury Academic: 2016)

Department for Business, Innovation and Skills, ‘BIS Research Paper №146: The Benefits of Higher Education and Participation for Individuals and Society: Key Findings and Reports’, (Department for Business, Innovation and Skills: 2013)

Descartes, René, Discourse on the Method of Rightly Conducting the Reason, and Seeking Truth in the Sciences (Gutenberg: 2008)

OECD, Education Indicators in Focus, (OECD: 2012)