Modelling Humanities Data: Deleuze, Descartes and Data

While we were dealing with the distinctions between data, knowledge and information in class, a pyramidal hierarchy was proposed, which can be seen on the left. This diagram discloses the process of making data (which have been defined as ‘facts’ which exist in the world) into information, and thereafter knowledge. These shifts from one state to another are not as neat as the diagram might suggest; it is just one interpretation giving shape to a highly dynamic and unsettled process, and any movement from one of these levels to another is fraught. It is ‘a bargaining system,’ as every dataset has its limitations and aporias, not to speak of the process of interpretation or subsequent dissemination. This temporal dimension to data, its translation from a brute state, is too often neglected within certain fields of study, fields in which data is more often understood as unambiguous, naturally hierarchicalised, and not open to contextualisation or debate.

This blog post aims to consider these issues within the context of a dataset obtained from The Central Statistics Office. The dataset contains information relating to the relative risk of falling into poverty based on one’s level of education between the years 2004 and 2015 inclusive. The data was analysed through use of the statistical analysis interface SPSS.

The purpose of the CSO is to compile and disseminate information relating to economic and social conditions within the state in order to give direction to the government in the formulation of policy. Therefore it was decided that the most pertinent information to be derived from the dataset would be the correlations between level of education and the likelihood of falling into poverty. The results appear below.

[Table: Correlation Between Risk of Poverty and Level of Education Achieved]

[Table: Correlation Between Consistent Poverty (%) and Level of Education Received]

[Table: Correlation Between Deprivation Rate (%) and Level of Education Received]

[Figure: Poverty Risk Based on Education Level]

[Figure: Deprivation Rate Based on Education Level]

[Figure: Consistent Poverty Rate Based on Education Level]

It can be seen that there is a very strong negative correlation between one’s level of education and one’s risk of exposure to poverty; the higher one ascends through the education system, the less likely it is one will fall into economic liminality. This is borne out both in the bar charts and the correlation tables, the latter of which report p-values of .000 (SPSS’s rounded display of p < .001), indicating that the finding is statistically significant, though not strictly certain. It should be noted that both graphing the data and detecting correlations through use of Spearman’s rho are elementary statistical procedures, but as the trend revealed here is consistent with more elaborate modelling of the relationship,[1] the parsimonious analysis carried out here is all that is required.
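For readers without access to SPSS, the rank correlation described above can be sketched in a few lines of Python. This is only an illustration of how Spearman’s rho is calculated, not a reproduction of the analysis: the ordinal coding of education levels (1 to 5) and the percentages are invented for the example and are not the CSO figures, and the function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# INVENTED illustrative values, not the CSO dataset:
# education level coded 1 (primary or below) to 5 (third-level degree).
education_level = [1, 2, 3, 4, 5]
poverty_risk_pct = [25.0, 21.5, 14.0, 9.5, 5.0]

print(spearman_rho(education_level, poverty_risk_pct))
```

A value close to -1 indicates that poverty risk falls almost monotonically as education level rises. The sketch says nothing about significance; a p-value would need a separate test, which a statistics package handles more reliably than hand-rolled code.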

It should not be assumed that, just because these graphs are informative, it is impossible to garner information from data in any other way. Even in its primary state, as it appears on the website, one could obtain information from a dataset through qualitative means. It is unlikely that this information will be as coherent as that which can be gleaned from even the most basic graph, but it is important to emphasise the fact that the border that separates data from information is fluid.

It is unlikely to be a novel finding that those who have a third level education have higher incomes than those who do not; there is a robust body of research detailing the many benefits of attending university.[2] Therefore, can it be said that the visualisation of the dataset above has contributed to knowledge? One would answer this question relative to one’s initial research question, and how the information complicates or advances it. If the causal relationship between exposure to poverty and level of education has been confirmed, and a government agency makes the recommendation that further investment in educational support programmes is necessary, it is somewhere in this process that the boundary separating information from knowledge has been crossed.

The above diagram actualises the temporal nature of data to a greater extent than the pyramid, but in doing so it perpetuates a linearisation of the process, a line along which René Descartes’ notion of thought could be said to align. Descartes understood thought as a positive function which tends towards the good and toward truth. This ‘good sense’ allows us to ‘judge correctly and to distinguish the true from the false’.[3] Gilles Deleuze believes Descartes instantiates a model of thought which is oppressive, and which perceives thinking relative to external needs and values rather than in its actuality: ‘It cannot be regarded as fact that thinking is the natural exercise of a faculty, and that this faculty is possessed of a good nature and a good will.’[4]

In Deleuze’s conception, thought takes on a sensual disposition, reversing the Cartesian notion of mental inquiry beginning from a state of disinterestedness in order to arrive at a moment at which one recognises ‘rightness’. Deleuze argues that there is no such breakthrough moment or established methodology to thought, and proposes that it be regarded as something more invasive, or unwelcome: a point of encounter when ‘something in the world forces us to think.’[5]

Rather than taking the neat, schematic movement from capturing data to modelling to interpreting for granted, Deleuze is engaged by these moments of crisis, points just before or just after the field of our understanding is qualitatively transformed into something different:

How else can one write but of those things which one doesn’t know, or know badly?…We write only at the frontiers of our knowledge, at the border which separates our knowledge from our ignorance and transforms one into the other.[6]

Deleuze’s comments have direct bearing upon our understanding of data, and how they should be understood within the context of the wider questions we ask of them. Deleuze argues that ‘problems must be considered not as ‘givens’ (data) but as ideal ‘objecticities’ possessing their own sufficiency and implying acts of constitution and investment in their respective symbolic fields.’[7] While it is possible that Deleuze would risk overstating the case, were we to apply his theories to this dataset, it is nonetheless crucial to recall that data, and the methodologies we use to unpack and present them, participate in wider economies of significance, ones with indeterminate horizons.

Notes

[1] Department for Business, Innovation and Skills, ‘BIS Research Paper №146: The Benefits of Higher Education and Participation for Individuals and Society: Key Findings and Reports’, (Department for Business, Innovation and Skills: 2013) https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/254101/bis-13-1268-benefits-of-higher-education-participation-the-quadrants.pdf

[2] OECD, Education Indicators in Focus, (OECD: 2012) https://www.oecd.org/education/skills-beyond-school/Education%20Indicators%20in%20Focus%207.pdf

[3] Descartes, René, Discourse on the Method of Rightly Conducting the Reason, and Seeking Truth in the Sciences (Gutenberg: 2008), http://www.gutenberg.org/files/59/59-h/59-h.htm

[4] Deleuze, Gilles, Difference and Repetition (Bloomsbury Academic: 2016), p. 175

[5] Ibid.

[6] Ibid., p. xviii

[7] Ibid., p. 207

Bibliography

Deleuze, Gilles, Difference and Repetition (Bloomsbury Academic: 2016)

Department for Business, Innovation and Skills, ‘BIS Research Paper №146: The Benefits of Higher Education and Participation for Individuals and Society: Key Findings and Reports’, (Department for Business, Innovation and Skills: 2013) https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/254101/bis-13-1268-benefits-of-higher-education-participation-the-quadrants.pdf

Descartes, René, Discourse on the Method of Rightly Conducting the Reason, and Seeking Truth in the Sciences (Gutenberg: 2008), http://www.gutenberg.org/files/59/59-h/59-h.htm

OECD, Education Indicators in Focus, (OECD: 2012) https://www.oecd.org/education/skills-beyond-school/Education%20Indicators%20in%20Focus%207.pdf

A Deleuzian Theory of Literary Style

I’m always surprised, when I read one of the thinkers generally, and perhaps lazily, lumped into the general category of post-structuralist, to find how great a disservice the term does to their work. To read Derrida, Foucault or Deleuze is not to find a triad of philosophers who struggle to produce a coherent system via addled half-thoughts in order to deconstruct, stymie or relativise everything. In fact, I’m not sure there’s another philosopher I’ve read who displays greater attention to detail in their work than Derrida, and Deleuze, far from being a deconstructionist, presents us with painstaking and intricate schemata and models of thought. The rhizome, to take the most well-known concept associated with Deleuze and his collaborator, Félix Guattari, doesn’t provide us with a free-for-all, but an intricately worked-out model to enable further thought. Difference and Repetition is likewise painstaking, and so involved is Deleuze’s model of difference that applying it in great depth to my theory of literary style might be something to do if one wished to be a mad person, particularly since, at an early stage in the work, he attempts to map his concepts to particular authors, such as Borges, Joyce, Beckett and Proust. But I’ll do my best.

My notion of literary style has been influenced by the fact of my dealing with the matter via computation, i.e. multivariate analysis and machine learning. All the reading I’m doing on the subject is leading me towards a theory of literary style founded on redundancy. When I say redundancy, I don’t mean that what distinguishes literary language from ‘normal’ language is its superfluity, an excess of that which it communicates. For the Russian formalists, this was key in defining literary language, its surfeit of meaning. I don’t like this distinction much, as it assumes that we can neatly cleave necessary communication from unnecessary communication, as if there were a clear demarcation between the words we use for their usage (utilitarian) and the words we use for their beauty (aesthetic). The lines between the two are generally blurred, and both can reinforce the function of the other. The shortcomings of this category become yet more evident when we take into account authors who might have a plain style, works which depend on a certain reticence to speak. Of course, a certain degree of recursion sets in here, as we could argue that it is in the showcased plainness of these writers that the superfluity of the work manifests itself. Which presents us with the inevitable conclusion that the definition is flawed because it’s a tautology; it’s excessive because it’s literary, it’s literary because it’s excessive.

My own idea of redundancy comes from a number of articles in the computational journal Literary and Linguistic Computing, the entire corpus of which, from the mid-nineties until today, I am slowly making my way through. It provides an interesting narrative of the ways in which computational criticism has evolved in these years. At first, literary critics would have been sure that the words that traditional literary criticism tends to emphasise, the big ones, the sparkly ones, the nice ones, were most indicative of a writer’s style. What practitioners of algorithmic criticism have come to realise, however, is that it is the ‘particles’ of literary matter that are far more indicative of a writer’s style: the distribution of words such as ‘the’, ‘a’, ‘an’, ‘and’, ‘said,’ which are sometimes left out of corpus stylistics altogether, dismissed as ‘stopwords,’ bandied about too often in textual materials of all kinds to be of any real use. It’s a bit too easy, with the barest dash of an awareness of how coding works, to start slipping into generalisations along the lines of neuroscience, so I won’t go too mad, but I will say that this is an example of the ways in which humans tend to identify patterns, albeit maybe not necessarily the determining, or most significant patterns, in any given situation.
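To make the idea of ‘particles’ a little more concrete, here is a minimal sketch of the kind of counting involved: the rate at which a handful of function words occur per thousand tokens. It is a toy, not the method of any particular article in the journal; the word list is just the five words mentioned above, the tokeniser is a crude regular expression that only handles unaccented lower-case letters and straight apostrophes, the function name is my own, and the second sample sentence is invented for contrast.

```python
import re
from collections import Counter

# The five 'particles' named in the post; a real study would use a much longer list.
FUNCTION_WORDS = ("the", "a", "an", "and", "said")


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return occurrences per 1,000 tokens for each word in `words`."""
    # Crude tokeniser (an assumption): runs of a-z and straight apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {word: 0.0 for word in words}
    counts = Counter(tokens)
    return {word: 1000 * counts[word] / len(tokens) for word in words}


woolf_sentence = "She tied the tapes."                  # quoted from Between the Acts
invented_sentence = "The cat and the dog said a word."  # made up for illustration

for label, sample in (("Woolf", woolf_sentence), ("invented", invented_sentence)):
    print(label, function_word_profile(sample))
```

On single sentences the numbers are meaningless, of course; the point of working at scale is that across a whole novel these rates settle into a profile, and it is those profiles that multivariate methods then compare between authors.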

We’re magpies when we read, for better or worse. When David Foster Wallace re-instates the subject of a clause at its end, a technique he becomes increasingly reliant on as Infinite Jest proceeds, we notice it, and it comes increasingly to the fore in our sense of his style. But, in the grand scheme of the one-thousand-some-page novel, the extent to which this technique is made use of is, statistically speaking, insignificant. Sentences like ‘She tied the tapes,’ in Between the Acts, for instance, pass our awareness by because of their pedestrian qualities, much like many other sentences that contain words such as ‘said,’ because of the extent to which any text’s fabric is predominantly composed of such filler.

In Difference and Repetition, Deleuze is concerned with reversing a trend within Western philosophy to mis-read the nature of difference, which he traces back to Plato and Kant, and the idealist/transcendentalist tendencies within their thought. They believed in singular, ideal forms, against which the notion of the Image is pitched, which can only be inferior, a simulacrum, as it is a derivative copy. Despite his model of the dialectic, Hegel is no better when it comes to comprehending difference; Deleuze sees the notion of synthesis as profoundly damaging to difference, as the third-way synthesis has a tendency to understate it. Deleuze dismisses the process of the dialectic as ‘insipid monocentrality’. Deleuze’s issue seems to be that our notions of identity only allow difference into the picture as a rupture, or an exception which vindicates an overall sense of homogeneity. Difference should be emphasised to a greater extent, and become a principle of our understanding:

Such would be the nature of a Copernican revolution which opens up the possibility of difference having its own concept, rather than being maintained under the domination of a concept in general already understood as identical.

Recognising this would be the advent of difference-in-itself.

This is all fairly consistent with Deleuze’s sense of Being as being (!) in a constant state of becoming, an experiential-led model of ontology which doesn’t aim for essence, but praxis. It would be fairly unproblematic to map this onto literary style; literary stylistics should likewise depend on difference, rather than similarity which only allows difference into the picture as a rupture; difference should be our primary criterion when examining the ways in which style becomes itself.

Another tendency of the philosophical tradition as Deleuze understands it is a belief in the goodness of thought, and its inclination towards moral, useful ends, as embodied in the works of Descartes. Deleuze reminds us of myopia and stupidity, by arguing that thought is at its most vital when at a moment of encounter or crisis, when ‘something in the world forces us to think.’ These encounters remind us that thought is impotent and require us to violently grapple with the force of these encounters. This is not only an attempt to reverse the traditional moral image of thought, but to move towards an understanding of thought as self-engendering, an act of creation, not just of what is thought, but of thought itself.

It would be to take the least radical aspect of this conclusion to fuse it with the notion of textual deformance, developed by Jerome McGann, which is of particular magnitude within the digital humanities, considering that we often process our text via code, or visualise it, and build arguments from these simulacra. But, on a level of reading which is, technologically speaking, less sophisticated, it reflects the way in which we generate a stylistic ideal as we read, a sense of a writer’s style, whether these be based on the analogue, magpie method (or something more systematic, I don’t want to discount syllable-counts, metric analyses or close readings of any kind) or quantitative methodologies.

By bringing ourselves to these points of crisis, we will open up avenues at which fields of thought, composed themselves of differential elements, differential relations and singularities, will shift, and bring about a qualitative difference in the environment. We might think of this field in terms of a literary text, a sequence of actualised singularities, appearing aleatory outside of their anchoring context as within a novel. Readers might experience these as breakthrough moments or epiphanies when reading a text, realising that Infinite Jest apes the plot of William Shakespeare’s Hamlet, for example, as it begins to cast everything in a new light. In this way, texts are made and unmade according to the conditions which determine them. I, for one, find this to be so much more helpful in articulating what a text is than the blurb for post-structuralism (something like ‘endlessly deferred free-play of meaning’). Instead, we have a radical, consistently disarticulating and re-articulating literary artwork in a perpetual, affirming state of becoming, actualised by the reader at a number of sensitive points which at any stage might be worried into bringing about a qualitative shift in the work’s processes of meaning making.

Deleuze and Guattari’s Geology of Literary Style

When I was drafting my PhD proposal, I read a few sources on literary style, in order to come to a working definition of style, or an academic consensus on the matter to rail against. I didn’t want something simplistically formalistic that referred to vehicles, tenors, modes or what have you, but I also didn’t want a post-Derridean account that described style as a limit-case/fault line/discourse rupture, an everything and nothing at once. These kinds of critical stymieings, excessive nuancing to the point of inertia, have gotten a bit wearying after five years of seeing them deployed, so I was hoping to get to some kind of working definition. Emphasis on ‘working’, considering I would be carrying out pragmatic actual tasks, via computation, which were to be finalised once I had my definition.

It was surprisingly challenging to track one down, and more often than not I was thrown back onto my own reflections on literary style, and what we talk about when we talk about it. Here, I think we stumble across its primary shortcoming as a delineator. People talk about Virginia Woolf’s interior, lyrical style, Jorge Luis Borges’ staid, cold style and Ernest Hemingway’s staccato, pared back style. The difficulty with these simplistic accounts is that an author’s style generally encapsulates what it is that makes them unique in literary discourse in general. This isn’t necessarily surprising; most of what we detect in a writer’s style is what throws us out of our reading habits. When Foster Wallace frenetically re-instates the subject of a clause at its end, a technique he becomes increasingly reliant on as Infinite Jest proceeds, we notice it, and it comes increasingly to the fore in our sense of his style. But, in the grand scheme of the one-thousand-some-page novel, the extent to which this technique is made use of is, statistically speaking, insignificant. Sentences like “She tied the tapes,” in Between the Acts, for instance, pass our awareness by because of their pedestrian qualities, much like many other sentences that contain words such as ‘said,’ because of the extent to which any text’s fabric is predominantly composed of such “filler.”

This dearth of attention directed to the ‘particles’ of literary materials is a lot of what digital humanities projects present themselves as a corrective to; by looking at the macroeconomic, we can transcend our human fixation on shiny objects (read: pretty sentences), and gain a fuller understanding of a text’s style, liberated from the shortcomings of our usual reading habits.

Of course, this newfound command over an entire text does not prevent the critic from mounting flawed arguments; many of the digital humanities’ earlier experiments in literary analysis too frequently gave in to Rubik’s cube thinking, attempting to tame indeterminacy by solving a text via enumerative techniques. This is exactly the kind of objective approach I didn’t want to fall into when visualising and narrating data trends.

Franco Moretti’s work in the Stanford Lit Lab proved beneficial in opening me up to more diffuse and multi-perspectival digital methodologies, which visualise a text on a number of different textual levels. Moretti’s contention that the data shows that the activation of different stylistic features at different scales is directly correlated to the differentiation of textual functions is positively invigorating, as it is as far removed from the Rubik’s cube mentality as it is possible to get; it essentially concedes that what we see when we look at a text depends on the way that we’re looking at it. Yes, Moretti is talking about topic modelling rather than style, but for my purposes I’ll ignore that. I also enjoy that it seems to be a computational analogue to the psychedelic nature of literary criticism: the longer we look at a text, even a shorter one, perhaps even especially a shorter one, the more we see. Diversifying our means of approach therefore provides the critic with a disparate sequence of differentiated visualisations. Enright may be meaningfully analogous to, dunno, Proust from the perspective of the entire text, but on a word to word, sentence to sentence, chapter to chapter, etc. comparison, we may turn up more unexpected results.

I still lacked a conceptual, theoretical system to connect this approach with, until I read the third chapter of Gilles Deleuze and Félix Guattari’s A Thousand Plateaus, ‘10,000 BC: The Geology of Morals (Who Does the Earth Think It Is?)’. In this chapter, Deleuze and Guattari make use of the discipline of geology in order to outline a number of theories concerning form, content, ideology and the articulations thereof. The unorthodox appropriation of geology is part of Deleuze and Guattari’s wider usage of theories and concepts outside of traditional philosophy, in order to subvert the staid formula of normative philosophical argumentation, wherein a summary is given of problem 1, why the solution A posited by philosopher z is insufficient and why solution B posited by philosopher y is even more so, and how both (and every other philosophy in the history of the discipline, by extension) have overlooked a solution that I alone have realised. This is all beside the point and I mention it only to indicate how smart I am.

In any case, the earth and, for my purposes, a literary text are composed of a number of strata, differing layers, which contain, compose and construct otherwise transitory particles, making them subject to more macroeconomic structures of order. In this way, they simplify their contents, as particles move between these strata erratically. One should think of strata as totalising senses of an author’s style, whereas the particles are more subtle, granular features that disappear and re-appear in and outside of particular strata. Form and content are singularly intermingled on the level of the stratum, and are merely a function of primary and secondary articulation.

Strata in turn are composed of epistrata and parastrata, which further undermines any attempt someone, like a mad person, would make to get a stable grasp on exactly what it is Deleuze and Guattari mean when they lay out this seemingly intractable schema. The strata model is a challenge to systematic modes of thought, such as structuralism, so it offers no stability, but for me, this is precisely its appeal. Any interpretation on a particular textual level, such as stratum d, which we could equate to word choice, for instance, samples one among many protean strata, composed of other strata, made relative to a machinic assemblage, itself a stratified metastratum, which becomes involved in the strata’s dual articulations along the lines of form and content. Simple.

The key here is that it avoids closure; it is a theoretical construct that is anathema to pragmatists, and on that basis, even if my numbers add up, any conclusions I reach with them will be, by virtue of association, strictly provisional.