A Digital Education

Meredith Dabek, Maynooth University

Creating a Digital Scholarly Edition: Lessons from The Woodman Diary Project


In a previous blog post, I wrote about the Woodman Diary project, in which a group of students (myself included) enrolled in AFF606a (Digital Scholarly Editing) are creating a digital edition of a First World War diary under the guidance of Professor Susan Schreibman. The project, which began in earnest in January 2015, is now entering its final weeks. Though the creation of each digital scholarly edition may differ depending on the project team, the timeframe, resources, or other aspects, there are some general lessons we can draw from the Woodman Diary project that may prove helpful for future work.

Teamwork, Communication and Project Management

A digital scholarly edition such as the Woodman Diary has many parts. There is the text itself, along with its transcription and digital images. There are the technical aspects, such as the XML/TEI encoded files and the XSLT used to transform the XML. Digital editions often include extensive contextual and historical information, and there might also be design considerations for the final website. With each part progressing at its own rate during the project timeframe, teamwork and communication among team members have been vital to the Woodman Diary’s progress.

At the start of the project, we established clear goals and a clear division of labor by having each team member assume responsibility for one part of the digital scholarly edition. Doing so gave us clear lines of communication; questions about the annotations, for example, are directed to Noel, whereas Josh handles any issues with the design composites. By assigning one person to take charge of a specific piece of the project, we are striving to eliminate confusion and tasks that work at cross-purposes.

The division of labor also contributes to effective teamwork within the project. Given the scope and timeframe of this project, it simply is not possible to complete the necessary work without each team member contributing to the whole. Moreover, allowing each team member to oversee his or her own area of responsibility helps ensure the continued progress of the project, by separating a seemingly daunting task into manageable pieces.

At the same time, however, the appointment of a project manager and the work he does are essential to ensuring the project advances as intended. Project managers offer structure and a foundational grounding to a specific project, enabling team members to work together to accomplish defined goals. As the Project Management Institute (PMI) states on its website, “hope is not a strategy.” We could not simply cross our fingers and anticipate a positive outcome. When individuals come together as a team to create something, whether it is a digital scholarly edition, a new software program or the construction of a building, they need a strong, solid plan and a leader who can guide the process from start to finish.

While a digital scholarly edition such as the Woodman Diary is the result of the collective efforts of the entire team, ceding overall management and oversight of the project to one person is important for success. Woodman Diary team member Shane McGarry serves as the project manager, and his expertise and previous experience in such a role have proven invaluable. Throughout the last several months, Shane has kept us focused on our long-term goals and deadlines, acted as the primary contact between the project team and Professor Schreibman, and shepherded the project from its early beginnings to this final month. He also ensures we adhere to good project management principles by establishing clear communication processes.

Our team meets in person every week for progress meetings, uses project management tools such as Google Drive, Google Group, and Jira, and keeps a shared Google Calendar to highlight any personal commitments that might interfere with deadlines. These practices enable us to communicate effectively amongst ourselves, whether it is simply to check in or to crowdsource ideas for a particular aspect of the digital scholarly edition.

Effective team communication, though, is more than simply staying in touch. Clear, consistent communication can help identify potential risks before they become problems, determine which areas of the project might need more attention, or reallocate resources based on progress reports. Indeed, project teams that communicate well are more likely to be successful. According to a 2013 report from the Project Management Institute, projects with highly effective communication plans were more likely to meet their original goals (80%, versus 52% of projects with minimal communication) and more likely to be completed on time (71%, versus 37%). With so many different parts to the Woodman Diary project, its ultimate success will be due, in large part, to our team’s ability to communicate well.

Know what the project is – and what it isn’t

Good communication can also mean listening, especially to those who have relevant knowledge. Last month, our team had the opportunity to speak with Gordon O’Sullivan, a former student at Trinity College Dublin who served as the project manager for another digital scholarly edition, the Mary Martin Diary project. Gordon offered a wealth of advice and feedback, but his most valuable piece of guidance was this: know what your project is – and know what your project isn’t.

Scope creep – the unplanned or continuous expansion or extension of a project’s scope – is the bane of many project managers (“Scope Creep”). Particularly in a group environment, when ideas are flowing and creativity peaks, it is easy to get carried away with grandiose visions and “wish list” items. But such ideas often don’t come with the necessary corresponding adjustments in time, resources and/or money. Moreover, many scope creep ideas are often “nice to have” elements in the project, but are not essential components for its completion.

Albert Woodman’s diary contains multiple inserted maps and newspaper clippings, referencing various campaigns and attacks during the war. Additionally, he mentions several towns and cities throughout his entries, which are encoded with a <placeName> TEI tag. In trying to determine how best to include the maps and the references to specific places in the project, we have considered using geo-referencing software to create dynamic images comparing Woodman’s geographic references with present-day Google Earth (see the sample image below).

Example of Geo-referenced Map

Ultimately, though, the geo-referenced maps are an example of scope creep. Their inclusion in the project would be interesting and informative, but the time involved in their creation (as well as the time needed to learn the specific geo-referencing software) shifts attention away from the project’s core components, especially at this critical time in our schedule. Gordon’s advice reminds us to focus on our original project plan. For now, geo-referenced maps do not fit within the scope of what our project is. Rather than attempting too much, we can instead concentrate on completing and refining our initial objectives and goals.


Though the Woodman Diary project may be unique with regard to its purpose, goals and final result, the lessons that the other team members and I have learned throughout the process can be useful and applicable for other digital scholarly edition (DSE) projects. From the appointment of a project manager to minimizing scope creep, the example set by our project team will hopefully prove beneficial for future DSE projects.

Works Cited:

Project Management Institute. The Essential Role of Communications. New York: PMI, 2013. Print. Web. 18 April 2015.

“Scope Creep.” Techopedia. Janalta Interactive Inc., 2015. Web. 18 April 2015.

“Why is Project Management Important?” Project Management Institute. PMI New York, 2015. Web. 18 April 2015.

Encoding Choices in the Woodman Diary Project

TEI and Diplomatic Editions

Developed and first released in 1990, the Text Encoding Initiative (TEI) Guidelines are a specific method of text encoding that allows both computers and humans to read and understand texts, independent of any specific operating system. The Guidelines, which are expressed in the Extensible Markup Language (XML), provide scholars with pre-defined markup tags and elements to establish the structure of a particular text. The complete set of the Guidelines comprises nearly 500 elements, which digital humanists use to indicate what a text is, rather than how it should look or act.

TEI files have two parts: (1) a header, which includes information about the text, such as its title, author, publisher, and other bibliographic items; and (2) the body or text section, which contains the encoding of the actual text. All of the TEI tags and elements are organized into one of these two parts (“Introducing”). In addition to common structural elements such as paragraphs (<p>) and lines (<l>), the TEI Guidelines also include tags that allow encoders to communicate editorial choices (<choice>), account for any apparent errors (<del> or <add>), and reflect decisions about any emendations in the original text (<unclear>). These tags are often used when scholars seek to create a diplomatic edition, a version of an original text which attempts to accurately reproduce any significant features, including spelling, abbreviations, deletions and other alterations (Pierazzo).
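
To make the two-part structure concrete, here is a minimal sketch in Python. It is only an illustration: the sample document, its title, and the helper names `top_level_parts` and `title_of` are invented for this example and are not taken from any real edition. It uses only the standard library and the TEI namespace.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical minimal TEI document; the title and text are placeholders,
# not content from any real edition.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Diary</title></titleStmt>
      <publicationStmt><p>Unpublished sample.</p></publicationStmt>
      <sourceDesc><p>Invented for illustration.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First entry of the sample diary.</p>
    </body>
  </text>
</TEI>"""


def top_level_parts(xml_string):
    """Return the local names of the root element's direct children, in order."""
    root = ET.fromstring(xml_string)
    # Namespaced tags look like "{namespace}name"; keep only the name.
    return [child.tag.split("}")[1] for child in root]


def title_of(xml_string):
    """Read the title recorded in the header's bibliographic description."""
    root = ET.fromstring(xml_string)
    ns = {"tei": TEI_NS}
    path = "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title"
    return root.find(path, ns).text


print(top_level_parts(SAMPLE))
print(title_of(SAMPLE))
```

The header carries the bibliographic description, while everything a reader would think of as "the diary" sits inside the text section.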

Diplomatic editions can range in their adherence to accuracy, from those considered ultra-diplomatic or strictly diplomatic “in which every feature which may reasonably be reproduced…is retained” to editions that feature normalized texts, created with readability in mind (Driscoll). Many scholarly editions fall somewhere in the middle, with an emphasis on a “semi-diplomatic” edition that retains some of the original text’s features, but not all. Such is the case here at Maynooth University, where a group of students enrolled in the Digital Scholarly Editing module are using TEI to encode and create a digital edition of the Woodman Diary.

The Woodman Diary Project

In 1918, Albert “Bert” Woodman was a soldier in the “L” Signal Company of the Royal Engineers, stationed in Dunkirk, France during World War I. After marrying his sweetheart, Nellie, Bert started to keep a diary of his experiences, intending to share it with Nellie when he returned home. Bert’s handwritten entries, starting in January 1918 and continuing until just after Armistice Day in November, fill the fronts and backs of nearly every page in the diary and span two physical journals, known by their respective brand names, Wilson and Butterfly.

Many historical documents, like Woodman’s diary, present unique challenges and opportunities for text encoders. Aside from understanding and transcribing an individual’s specific handwriting style, encoders may also encounter faint or faded writing, ink spills that obscure words, and scribbles and cross-outs. Consequently, text encoders (in this case, the students in the module) must make careful editorial choices regarding the level of accuracy encoded in TEI.

Though the Woodman Diary Project is not a strict or ultra-diplomatic edition, the project team did decide to encode a handful of features often present in diplomatic editions, such as unclear words, additions and deletions, and abbreviated words. These tags and elements not only help preserve Bert’s idiosyncrasies, but they also allow readers in the general public or academic researchers to understand more about the diary and the circumstances under which it was written.

As often happens with handwritten documents, the Woodman diary contains a number of struck out words, phrases and letters, perhaps because Bert misspelled something or incorrectly recorded a number or name. These deletions are frequently accompanied by additions, either above or next to the original text. To accurately represent these features of the text, the Woodman Diary Project team used TEI’s <del> and <add> tags. Additionally, the attribute @rend gave team members the ability to indicate further characteristics, such as the position of the addition (e.g., above, over-written, next to) and even the very nature of the deletion (e.g., scribble, strikethrough, etc.):

… when <add rend="overwritten">the<del>J</del></add> the Union Jack comes along …

(Woodman, 5 March 1918)
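
As a rough sketch of what this markup makes possible, the Python below produces both a reading text (deletions dropped) and a more diplomatic view (deletions shown in brackets) from the same encoding. This is not the project's own code, which relies on XSLT. The sample sentence, the rend values, and the function names `flatten` and `list_alterations` are invented for illustration, and the fragment omits the TEI namespace for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, invented for illustration (not a diary quotation).
FRAGMENT = (
    '<p>We marched <del rend="strikethrough">four</del>'
    '<add rend="above">five</add> miles today</p>'
)


def flatten(elem, show_deleted):
    """Concatenate an element's text; <del> content is dropped or [bracketed]."""
    if elem.tag == "del":
        inner = "".join(elem.itertext())
        return "[" + inner + "]" if show_deleted else ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(flatten(child, show_deleted))
        # Text that follows a child element is stored on the child as .tail.
        parts.append(child.tail or "")
    return "".join(parts)


def list_alterations(elem):
    """Return (tag, rend, text) for every <add> and <del>, in document order."""
    return [
        (el.tag, el.get("rend"), "".join(el.itertext()))
        for el in elem.iter()
        if el.tag in ("add", "del")
    ]


root = ET.fromstring(FRAGMENT)
reading = flatten(root, show_deleted=False)
diplomatic = flatten(root, show_deleted=True)
alterations = list_alterations(root)
print(reading)
print(diplomatic)
print(alterations)
```

Because the deletion and addition are recorded rather than silently resolved, the choice of view can be left to the reader instead of being fixed by the encoder.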

Occasionally, there were words the project team was unable to decipher with absolute certainty. Despite our access to high-quality, high-resolution digital images of the physical diary, some words remain illegible, even when the proposed reading makes sense within the context of the entry. In these cases, TEI’s <unclear> tag is used to contain “a word, phrase or passage which cannot be transcribed with certainty because it is illegible…in the source [document]” (“Elements Available”). The tag signals to readers and researchers that there is still some doubt regarding the transcription of a word or phrase:

I’ll start another as soon as I can get the price of one <unclear>more</unclear>!!!

(Woodman, 8 July 1918)
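
One way a display layer might surface that doubt is sketched below in Python, using the 8 July fragment wrapped in a <p> element. The "[?]" marker and the function names `unclear_readings` and `flag_unclear` are assumptions made for this example; they do not describe how the Woodman Diary website actually presents uncertain readings.

```python
import xml.etree.ElementTree as ET

# The 8 July fragment quoted above, wrapped in <p> so it parses on its own.
FRAGMENT = (
    "<p>I’ll start another as soon as I can get the price of one "
    "<unclear>more</unclear>!!!</p>"
)


def unclear_readings(xml_string):
    """List every uncertain reading, e.g. for an editor's review queue."""
    root = ET.fromstring(xml_string)
    return ["".join(el.itertext()) for el in root.iter("unclear")]


def flag_unclear(xml_string, marker="[?]"):
    """Return plain text with a marker appended after each uncertain reading."""
    root = ET.fromstring(xml_string)

    def walk(elem):
        parts = [elem.text or ""]
        for child in elem:
            parts.append(walk(child))
            parts.append(child.tail or "")
        text = "".join(parts)
        return text + marker if elem.tag == "unclear" else text

    return walk(root)


print(unclear_readings(FRAGMENT))
print(flag_unclear(FRAGMENT))
```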

When scholarly editions want to retain some diplomatic edition features, text encoders may offer the option of switching between the original text and an edited version. TEI’s <choice> tag allows for this, giving encoders the ability to “switch automatically between one ‘view’ of a text and another,” and therefore providing readers and researchers with insight into the encoder’s editorial choices (“Elements Available”). For the Woodman Diary Project, team members used the <choice> tag to contain abbreviations (<abbr>) and expansions (<expan>). Bert seems to have favored economy, given his use of every page available to him in his notebooks, and he also frequently abbreviated Standard English words and phrases in a likely attempt to save precious writing space:

Don’t get any <choice><abbr>ltrs</abbr><expan>letters</expan></choice> at all today

(Woodman, 4 Feb 1918)
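
The "switching" between views can be sketched in a few lines of Python. Again, this is an illustration under stated assumptions rather than the project's XSLT: the function name `render` and its `prefer` parameter are invented here, and the fragment is the 4 Feb example wrapped in a <p> element without the TEI namespace.

```python
import xml.etree.ElementTree as ET

# The 4 Feb fragment quoted above, wrapped in <p> so it parses on its own.
FRAGMENT = (
    "<p>Don’t get any <choice><abbr>ltrs</abbr>"
    "<expan>letters</expan></choice> at all today</p>"
)


def render(xml_string, prefer="expan"):
    """Return plain text, picking one alternative inside each <choice>.

    prefer="abbr" gives the text as written; prefer="expan" gives the
    expanded, easier-to-read version.
    """
    root = ET.fromstring(xml_string)

    def walk(elem):
        if elem.tag == "choice":
            chosen = elem.find(prefer)
            if chosen is None:
                # Fall back to the first alternative that is present.
                chosen = next(iter(elem), None)
            return walk(chosen) if chosen is not None else ""
        parts = [elem.text or ""]
        for child in elem:
            parts.append(walk(child))
            parts.append(child.tail or "")
        return "".join(parts)

    return walk(root)


as_written = render(FRAGMENT, prefer="abbr")
expanded = render(FRAGMENT, prefer="expan")
print(as_written)
print(expanded)
```

Both versions come from a single encoded source, which is why the abbreviation and its expansion never drift apart.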

As demonstrated by the examples above from the Woodman diary, the TEI tags for encoding editorial changes and choices prove particularly useful for scholarly editions. While some encoders may choose to make silent corrections or emendations to enable easier reading of a text, partial or strict adherence to a diplomatic encoding offers accuracy and authenticity when dealing with historical texts. The encoding choices made by the Woodman Diary Project team give readers and researchers further insight into Bert Woodman and provide a more complete representation of his diary.



Works Cited:

Driscoll, M.J. “Electronic Textual Editing: Levels of Transcription.” TEI: Text Encoding Initiative. TEI Consortium, n.d. Web. 18 March 2015.

“Elements Available in All TEI Documents.” TEI: Text Encoding Initiative. TEI Consortium, n.d. Web. 18 March 2015.

“Introducing the Guidelines.” TEI: Text Encoding Initiative. TEI Consortium, 2013. Web. 1 January 2014.

Pierazzo, Elena. “A Rationale of Digital Documentary Editions.” Literary and Linguistic Computing. 26.4 (2011): 463-477. Web. 18 March 2015.

Woodman, Albert. “Diary.” 1918. The Woodman Diary Project. An Foras Feasa, Maynooth University.


Reimagining the Audience for Digital Scholarly Editions

According to the Modern Language Association’s Guidelines for Editors of Scholarly Editions, a scholarly edition’s most basic task is to “present a reliable text,” one that can also contribute to academic research on a particular topic. Traditionally, scholarly editions have had fairly limited audiences, the final printed version intended primarily for other scholars conducting similar research. With the dawn of the digital age, however, the creation of digital scholarly editions is changing the nature of the audience for these works. The availability of scholarly editions online and the use of crowdsourcing to help create these editions are just two ways the digital world is blurring the lines between the traditional academic audience and a much larger, more public audience.

In 2009, at the Association for Documentary Editing Annual Conference, Andrew Jewell presented a paper that explored new ideas around the reading of digital scholarly editions. According to Jewell, “the dominant model for distributing [scholarly] editions in the age of print [was] to sell large volumes at large prices” (1). But the advent of digital publication on the Internet has upended this model by amplifying the reach of a scholarly edition. Where they once would have been available only to a narrowly focused audience, many scholarly editions in digital form can now be accessed by anyone with an Internet connection.

A general audience, however, has different needs than a scholarly one, and may even approach the edition with different intentions. In fact, many casual readers of a scholarly edition may not have even specifically sought out the resource, but rather stumbled across it accidentally. Jewell offers the example of his own Willa Cather Archive, noting that a reader may find the archive “because search engines lead them to hidden bits of knowledge deep in the site” (3). A wider, more diverse audience for a scholarly edition also means the text and content will be consumed in new ways. A printed scholarly edition may follow a traditional, linear format; in a digital world, readers skim, search, scan and skip over parts that may not interest them.

Moreover, readers can access digital editions through any number of Internet browsers, mobile devices or tablets. Each option changes the experience of the edition in subtle ways, even when the content available remains the same. As Jewell correctly points out, “we cannot fully predict how readers will interact with digital publications…[and] we cannot expect every view of that website to be the same for each user” (6). The very nature of the Internet means each visit to a digital edition website will result in a different kind of engagement with the text, with the idea of “the audience” changing each time as well.

The evolving nature of a digital scholarly edition’s audience is not limited to reading and accessing information, though. Some scholarly editions are blurring the boundaries even further by actively involving the audience in the creation of the text itself. In 2010, Cathy Moran Hajo, Associate Editor of the Margaret Sanger Papers, wrote, “Web 2.0 tools are increasing in sophistication and enabling large amounts of people from all walks of life to participate in the creation of editions.” Hajo was, in effect, referring to crowdsourcing, and in the years since, an increasing number of cultural and academic institutions have turned to crowdsourcing to complement and contribute to existing projects.

Crowdsourcing in the humanities (or, indeed, in Digital Humanities) aims, in part, to “expand the scope of the community membership beyond academics, and into the interested and engaged general public” (Siemens, et al.). Crowdsourced projects specifically reach out to the audience and invite them into the scholarly editing process, by having them either enrich existing materials or help create an entirely new resource (Carletti et al.). In doing so, these projects are not simply looking for free labor, but instead, according to Carletti et al., are “collaborating with their public to augment or build digital assets through the aggregation of dispersed resources.”

Transcribe Bentham, one example of a crowdsourced scholarly edition project, has relied on volunteers to help transcribe thousands of manuscripts by philosopher Jeremy Bentham. The project and its scholarly edition were opened up to the larger public partly because the initiative hoped to “democratize the creation of, and access to, knowledge and humanities research” (Causer and Terras). Beyond opening access to the research, however, crowdsourcing connects passionate, interested individuals with these scholarly projects. The vast majority of crowdsourcing volunteers are not rewarded monetarily, and so many participate simply because they have a deep, personal interest in the subject. And as Ricc Ferrante, Director of Digital Services & Information at the Smithsonian Institution Archives, points out, “passion breeds evangelists, breeds new volunteers, and new discoveries,” all of which can, in turn, lead to new knowledge.

There are some who may question the value of an open-access, online digital edition or the use of crowdsourcing to create such an edition. These individuals may maintain that scholarly editions should remain in the realm of the scholar. Ultimately, though, the blurred audience lines can be considered a good thing, as they expand the reach of a particular subject and open up the humanities to new understandings. For Jewell:

“The defining feature of the broader audience that encounters free, online documentary editions is diversity: it comes from around the world, from a variety of perspectives and educational levels, and with a variety of goals.”

With more diversity come more readers, more perspectives, and more people discovering new content that they may not have encountered before. Digital tools and technologies create a larger audience for scholarly editions, providing an enriched, varied and dynamic way of accessing and experiencing humanities data. The challenge, then, for scholarly editors, is to “move beyond the ivory towers of research libraries to high schools, town libraries and even to the comfort of private homes” (Hajo). By extending the reach of a digital scholarly edition and blurring the line between a traditional audience and a more expansive one, researchers and editors can ensure that their work is truly open and accessible.


Works Cited:

Carletti, Laura, Gabriella Giannachi, Dominic Price, and Derek McAuley. “Digital Humanities and Crowdsourcing: An Exploration.” MW2013: Museums and the Web. 17-20 April 2013. Portland, OR. Paper. Web. 2 December 2014.

Causer, Tim and Melissa Terras. “‘Many hands make light work. Many hands together make merry work’: Transcribe Bentham and crowdsourcing manuscript collections.” Crowdsourcing Our Cultural Heritage. Ed. Mia Ridge. Ashgate, 2014. 57-88. Web. 2 December 2014.

Ferrante, Ricc (@raferrante). “@McMer314 @sandilo60 @phcostel #askletters1916 …and passion breeds evangelists, breeds new volunteers, and new discoveries = new knowledge.” 2 December 2014, 1:08 PM. Tweet.

“Guidelines for Editors of Scholarly Editions.” Modern Language Association. MLA, 2011. Web. 2 December 2014.

Hajo, Cathy Moran. “The Sustainability of the Scholarly Edition in a Digital World.” International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML, 2010. Paper. Web. 2 December 2014.

Jewell, Andrew. “New Engagements with Documentary Editions: Audiences, Formats, Contexts.” Library Conference Presentations and Speeches. The Libraries at University of Nebraska-Lincoln, 2009. Web. 30 November 2014.

Siemens, Ray, Meagan Timney, Cara Leitch, Corina Koolen, and Alex Garnett. “Toward modeling the social edition: An approach to understanding the electronic scholarly edition in the context of new and emerging social media.” Literary and Linguistic Computing. 27.4 (2012): 445-461. Web. 2 December 2014.

