Digital Literature Analysis: Voyant-Tools, bringing the Writer and the Digital Humanist together

On 1 December 2016, Sara Kerr gave a stimulating lecture in the AFF601 module on analytics and analysis in the digital research environment. She suggested several different applications which we could investigate and perhaps use when the time comes to write our theses (something which I don’t much wish to think about with looming first semester deadlines!). One online tool which she pointed towards was Voyant-Tools, which allows a user to input a corpus of work, and which then generates the words most commonly used in that corpus. The program also enables the user to find each instance of these words in the corpus, and to see which other words they are used in relation to. For example, one could, if they wished, input Owen Wister’s The Virginian: A Horseman of the Plains, generate a cloud of the most used words, and then check through the text to see each instance of these words being used.

Who said what? Word cloud of the most used words in Owen Wister’s “The Virginian”, generated by Voyant-Tools
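For the curious, the two ideas here (counting the most used words, then looking up each use of a word alongside its neighbours) can be sketched in a few lines of Python. This is only a rough illustration of the general technique, not how Voyant-Tools itself is built: the function names, the tiny stop-word list and the sample sentence are all invented for the example.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list (an assumption for this sketch;
# real tools use much longer lists of common words to ignore).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "was", "it", "that", "with"}

# Invented sample text, standing in for a real corpus.
SAMPLE = "The horse crossed the plain. The rider watched the horse, and the horse watched the plain."


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n=5):
    """Return the n most frequent words, ignoring stop words."""
    counts = Counter(t for t in tokenize(text) if t not in STOP_WORDS)
    return counts.most_common(n)


def keyword_in_context(text, keyword, window=3):
    """List every use of keyword with up to `window` words on each side."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


if __name__ == "__main__":
    print(top_words(SAMPLE, 3))
    for left, word, right in keyword_in_context(SAMPLE, "horse", window=2):
        print(f"{left} [{word}] {right}")
```

A word cloud is then just a way of drawing these counts, with the more frequent words set in larger type.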

The site also includes a trend graph, showing how the usage of a word changes over the course of the text. This offers some interesting insights into narrative progression via word use, with some words used more frequently at the very moment that others are being used less frequently. It is a highly useful tool for revealing different patterns in the text, and for bringing new insight into how various themes are dealt with.
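A trend of this kind can be approximated by cutting the text into equal-sized segments and measuring how often a word turns up in each one. The sketch below assumes that approach; I can't say exactly how Voyant-Tools computes its graph, and the function name `word_trend`, the segment count and the sample text are made up for illustration.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_trend(text, word, segments=10):
    """Return the relative frequency of `word` in each equal-sized segment.

    The text is split into `segments` consecutive chunks of tokens; each
    value is (uses of the word in the chunk) / (tokens in the chunk).
    """
    tokens = tokenize(text)
    trend = []
    for s in range(segments):
        start = s * len(tokens) // segments
        end = (s + 1) * len(tokens) // segments
        chunk = tokens[start:end]
        # An empty chunk (very short text) simply scores zero.
        trend.append(chunk.count(word) / len(chunk) if chunk else 0.0)
    return trend


if __name__ == "__main__":
    # Invented sample: "winter" fades as "spring" takes over.
    sample = "winter winter winter spring winter spring spring spring"
    print(word_trend(sample, "winter", segments=2))
    print(word_trend(sample, "spring", segments=2))
```

Plotting the two lists against each other gives the crossing lines described above, where one word rises just as another falls away.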

Trend graph of words most used in Owen Wister’s “The Virginian”, generated by Voyant-Tools

While I suspect that Voyant-Tools and other similar programs are highly interesting and advantageous to those involved in literary studies, I, as an amateur short story writer in addition to being a student, decided to play with the tool and see what it could tell me about my own writing. I uploaded all of the short stories I’ve written this year and ran them through the generator, curious to see which words I use the most. Apparently I’m quite concerned with the human body and the passage of time, as well as words in general, though I suspect that looking at the contexts in which these words are used would be more illuminating. If I were not the creator of the works in question, I must confess that I wonder how I might analyse the results. But that’s bordering on too much navel-gazing for a blog post such as this.

The most common words used in my short stories this year, generated by Voyant-Tools

I’m still left with one question, though. What might my results be if I ran my Digital Humanities assignments through the generator? That might be an interesting experiment…

